Red Riding Hood 2

Emotion and Sentiment Visualisation

Deepen understanding and implementation of data visualization techniques, with a focus on sentiment analysis and the integration of emotional analysis

desktop monitor, tablet and paper laid out on a long table, showing original book pages and data visualisation

This project was longlisted in the Information is Beautiful Awards 2023

Elevating Data Visualisation Skills and Expanding Prior Work

Since I learnt a lot from my last project following a data visualisation course, I started a similar data visualisation course this time with Federica Fragapane. I liked her style and was keen to learn about her processes and methods, which are explained in the course. Initially, I thought of using a new topic and data set, but there were a few ideas I didn't get to try out with my previous Red Riding Hood project, and I liked the idea of digging deeper into it and seeing how it could be presented differently. I used my 10% research time at King's Digital Lab, and my personal time (about 70%).

on left scan of open book page 1, right completed visualisation of page 1, circle with words of page 1 around it
Image on left: scan of page 1 of a book edition from Worthpoint. Image on right: screenshot of page 1 of final visualisation

Incorporating Sentiment Analysis and Exploring Techniques

In my previous project on Red Riding Hood, my colleague Arianna Ciula gave me the idea to add sentiment analysis to the inner circles, which represent each word, to add an extra layer of interpretation and meaning. I had little knowledge of sentiment analysis so had to look it up to understand the basics. As I dug a bit deeper, I thought of different ways to represent the sentiment data that could be visually quite interesting. I wanted to show the sentiment for all the words by putting them around in a circle to be able to identify their position on the page, rather than the random placement of circles I had used previously. I tried a few different ways to implement this but they all changed the look of the "trees" too much, so I left the presentation minimal, just showing the top 20 positive and negative words, represented by light and dark outline circles, and also the page sentiment, represented by arcs above the page "tree". One of the methods in this second course was very similar to how I had tried to visualise the words previously: placing data points around a circle.

on left focusing on legend with a tree, house and character using geometric shapes, on right circles with examples of sunburst like patterns, segments, radar graphs within
Legend of sentiment analysis in previous Red Riding Hood and work in progress examples trying out different styles

I was excited to follow Fragapane's process, which involved putting the data in a radar graph and manually placing each data point to fit. The process was tedious and repetitive, but I didn't mind that. What I did mind, after trying it out with some data, was that it was far too easy for me to make an error with the placement. I knew I had to adapt it to make the process less error-prone for myself. The data also wasn't presented that clearly in the previous project, and I knew I could improve the sentiment analysis and present it more clearly.

Reevaluating Sentiment Analysis Tools

I thought I could fast-track this stage since I was reusing the same data, but I was wrong. I had used the free online sentiment analysis tool MonkeyLearn in the previous project as it was the best I found at the time that fit my requirements (details in previous project). I wanted to use it again so both projects would have the same data, but when I looked at the results it gave for all the words on the first page, I wasn't satisfied. "hood" (91.1%) and "above" (86.3%) were the top negative words, compared to "died" (54.8%) and "lonely" (51.8%), which I intuitively think are more negative. I was rather annoyed I didn't spot this earlier for the previous project. I had already spent quite a while searching for a sentiment analysis tool that would fit my requirements, so I knew the search for an alternative would not be easy. I really wanted to stick with MonkeyLearn but the results just didn't sit well with me. I searched for quite a long time, past the first few pages of search results, and finally came across twinword. The results were far better: "lonely" (-0.927) and "died" (-0.797) were the top negative words for the first page, and the process was also going to be much easier. With MonkeyLearn, I had to copy and paste the page text to get the page sentiment value, then input each word individually to get each word's sentiment value. With twinword, I only had to supply the page text and it gave the page sentiment together with all the positive and negative words on that page in one go, which would save me a lot of time. It didn't give neutral word values, but it was better that way as I would have felt tempted to use and visualise them.
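
The comparison above can be reproduced as a quick sanity check. This is only a sketch using the handful of page 1 values quoted in this post; the function name `most_negative` is made up for illustration, and it assumes the MonkeyLearn percentages are confidence in a negative label (higher means more negative) while twinword scores run from negative to positive (lower means more negative).

```python
# Page 1 values quoted in the text above; nothing else is known about the full results.
monkeylearn_negative = {"hood": 91.1, "above": 86.3, "died": 54.8, "lonely": 51.8}
twinword_scores = {"lonely": -0.927, "died": -0.797}


def most_negative(scores, higher_is_more_negative):
    """Return the words ordered from most to least negative.

    higher_is_more_negative is an assumption about each tool's scale:
    True for a 'confidence in negative' percentage, False for a signed score.
    """
    return sorted(scores, key=scores.get, reverse=higher_is_more_negative)


monkeylearn_ranking = most_negative(monkeylearn_negative, higher_is_more_negative=True)
twinword_ranking = most_negative(twinword_scores, higher_is_more_negative=False)

print(monkeylearn_ranking)
print(twinword_ranking)
```

Ranking the words side by side like this makes the mismatch easy to spot: "hood" and "above" lead one list, while "lonely" and "died" lead the other.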

left: 2 bar charts with positive to negative sentiment on vertical y axis and pages on x axis, center: 2 bubble charts with positive to negative sentiment on vertical y axis and pages on x axis, right: data of page 1 on spreadsheet, column 1 spelling of word, column 2 monkeylearn sentiment results, column 3 twinword sentiment results
Comparison of words of a page with MonkeyLearn and twinword showing Excel spreadsheet, bar chart and RAWGraphs bubble chart

Sketching Ideas and Data Wrangling

Once I was satisfied with the new and additional sentiment data I collected, I started with my usual process of sketching down some ideas and layouts. With the previous project, there was a visual focus on the characters and locations. With this project, I wanted the focus to be on the sentiment and the words.

photograph of the same sketchbook opened up to 4 pages with black fountain pen line drawings of circles, sunburst patterns and scribbled notes
Concept drawings in sketchbook throughout the project

I went back to RAWGraphs to see if there was a graph I could use to get a similar effect. What I needed was a bar chart arranged in a circle - a radial bar chart - but RAWGraphs didn't have that. I tried a few and the sunburst chart was close, but the data structure wasn't quite right. I changed things around in Excel, something that I'm sure is not conventional and a bit crazy to do, but I knew it would give me what I needed. For the first page, I typed out a binary system of the letter count for each word, one word per row, put it into RAWGraphs to make the sunburst chart, exported the SVG to Illustrator, replaced the numbers by typing the relevant letter of each word, and deleted the additional data which had been added for spacing.
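
The idea behind that spreadsheet workaround can be sketched in a few lines of code. This is not what I did (everything was done by hand in Excel, RAWGraphs and Illustrator); it is only a minimal illustration, assuming each letter takes one equal slot around the circle and each word is followed by one empty spacer slot. The function name `radial_slots`, the `gap` parameter and the word order are all hypothetical.

```python
def radial_slots(words, gap=1):
    """Give every letter an angle in degrees, clockwise from the top of the circle.

    Each letter occupies one slot; `gap` empty slots follow each word,
    standing in for the spacing rows that were later deleted in Illustrator.
    """
    total_slots = sum(len(word) + gap for word in words)
    step = 360.0 / total_slots
    slots = []
    index = 0
    for word in words:
        for letter in word:
            slots.append((word, letter, round(index * step, 2)))
            index += 1
        index += gap  # skip the spacer slot(s) after the word
    return slots


# Four words mentioned in this post, in an arbitrary order (not the page order).
sample_words = ["lonely", "died", "hood", "above"]
for word, letter, angle in radial_slots(sample_words):
    print(f"{word:>6}  {letter}  {angle:6.2f}")
```

Each angle could then drive a rotation for the letter, which is roughly what the sunburst segments provided once the numbers were swapped for letters.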

left: words shooting out going clockwise around a circle, right: styled version with sentiment colours and handwritten and script fonts
Export of RAWGraphs sunburst chart and a version after cleaning up in Illustrator

I mocked up a version of how all the pages might look altogether when placed as a scatter plot with the page sentiment on the y axis and pages on the x axis.

top: bubble chart drawn in Illustrator with positive and negative sentiment on vertical y axis and pages on horizontal x axis, 16 circles depicting pages with words circling laid out horizontally with sentiment colours in circle and words, bottom: enlarged view of a circle
Mockup of layout and individual pages in Illustrator and Photoshop

Incorporating Emotion Analysis, Sentences, and Rhyme Patterns

When browsing through twinword's site, I found out it had more tools. One I thought would be interesting to have alongside the sentiment analysis was emotion analysis, so I decided to add that for each page. I also wanted to show the sentences, since there were always 8 on each page, and the rhymes, as it was written as a poem. So, I ended up having a rather long process. For each page my process would be:

  1. Get the text from the original digitised format
  2. Compile in MS Word
  3. Gather the data in Excel
  4. Put the data in RAWGraphs and export as SVG
  5. Remove the numbers and type each letter in Illustrator
  6. Run sentiment analysis on data in twinword
  7. Compile in MS Word
  8. Colour word according to sentiment value in Illustrator
  9. Style sentence and rhyme in Illustrator
  10. Run emotion analysis on data in twinword
  11. Compile in MS Word
  12. Gather in Excel
  13. Put the data in RAWGraphs and export as SVG
  14. Add emotions and clean up in Illustrator
  15. Style words

screenshots of original page, word document, excel sheets, excel graphs, RAWGraphs, web pages, vector graphics
Steps of my initial process of a page

Prototyping: Leveraging CSS for Responsive, Interactive and Accessible Visualizations

It was taking me about 2-3 hours per page and I hadn't even finished 3 when I stopped to think. I could continue and finish them all but I also knew I wanted to eventually make the final piece interactive and responsive as it had quite a lot of information that would make the visualisation quite cluttered if it was shown in a static format. I decided to stop my work in Illustrator and start playing around with code.

I knew a lot of data developers/visualisers were using D3 to make their very impressive data visualisation pieces. I had tried a few times to learn JavaScript but never got it, and usually spent a lot of time debugging my errors. I just wasn’t ready to try again yet. I really liked working with CSS, had done quite a few side projects previously and had seen some amazing work made with just CSS, so I wanted to see how far I could get with it, knowing it has its limitations.

I started making some prototypes in CodePen, which I have found really useful to code with since it shows both code and results in the same browser window and refreshes automatically. What I found particularly useful for this project was its HTML and CSS preprocessors: I used Pug and SCSS so I didn't have to write as much and could structure the code more easily. I managed to come up with a prototype that looked very similar to the design in Illustrator. It was responsive and semi-accessible for screen readers (I think), but some elements just didn’t scale well. The connecting lines and the dots marking the beginning and end of a sentence became misaligned when scaled, because some elements were positioned manually.

After a few more iterations, I came up with a version that removed the beginning and end points. Instead, each sentence is shown on a curved line that starts in a darker colour and gradually fades to a lighter colour at the end of the sentence. I also replaced the rhyme lines with side borders on the rhyming words: one rhyme pair has a single solid border line and the other pair has double border lines.

As the project progressed, I added the page sentiment colour to the centre circle, some page summary information and the rhyming words of the page, and gave the emotion triangles a slight gradient to make them more subtle. I also added some tooltips to include additional information.

three versions showing slight variations of words placed around a circle with sentiment colours on certain words, one with a gradient sentiment colour in the circle and more text, dashed lines with start and end points, connecting dotted lines or side borders, 3 gray or gradient triangles around circle.
Design of 'Page 3' in Illustrator as a mockup, CodePen early and later prototypes

Evolving Approach to Improve Interaction and Responsiveness

I wanted to have a default overview of all sixteen pages and to be able to zoom in to view each page in more detail. At first, I thought I could scale down each page and have it expand on hover, but the expanded pages kept flickering during the interactions. After a few trials, I got it working using the modal interaction I have used previously in several projects, with some added transitions. But I didn't like the way the prototype worked: it took too many clicks or taps just to view each page, the most basic part of the visualisation.

I went back to my sketchbook to draw out a few ideas and thought of making it more of a scrollytelling-like piece, so the user gets an overview of the project, a "how to read" section, and then an overview of all the data visualisations, before moving on to the individual page data visualisations. I had to consider how everything would work on different screens, especially smaller and touch screens. As with my work projects, I now try to have as few variations between the screen sizes as possible, so there is more consistency and less to style and code up. I tried to use only CSS properties but ran into a few that were not supported in all browsers, so I had to add a few JavaScript snippets to compensate. I finally came up with a layout that I was happy with. The initial prototype had the "how to read" section on the same page, then I iterated to having it in a separate modal pop-up. This section was revised slightly as the project progressed and I received feedback.

Sketches and Mockups of layouts
Sketches and Mockups of two layouts

Data Integration Workflow

Until this point, I had only input data for one page and left the other pages until the end, since I knew it was just a case of data entry once I had all the structure in place. What I didn't realise was how long it would take, as there were so many stages in the process I had added along the way. It took about 40 minutes to 1 hour per page. This was my process to input the individual page data into HTML and CSS after the general code structure had been finalised:

screenshots of some of the code process
Screenshots of data entry into code process of the main page sections
  1. Input page summary in HTML of the inner circle
    • Page number
    • Number of words
    • Two sets of rhyming words
    • Page sentiment value
    • Top emotion
  2. Input Emotion types and values in HTML, position height values in CSS
  3. Input individual word and sentiment value in HTML, colour sentiment value in CSS
  4. Add sentence lines in CSS
  5. Input borders for rhyming words in CSS
  6. Add Sentiment colour and opacity in CSS
  7. Add Page Text in HTML
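
Looking back, step 3 in particular is the kind of repetitive entry that a small script could have helped with. The sketch below is not how the project was built (I typed everything by hand in Pug and SCSS); it is only an illustration of generating the word markup from data. The class names, the `--sentiment` custom property and the function names are hypothetical, and treating "hood" as a word with no score is an assumption for the example.

```python
from html import escape


def word_span(word, sentiment=None):
    """Build one <span> for a word; words without a score get no sentiment styling."""
    if sentiment is None:
        return f'<span class="word">{escape(word)}</span>'
    # A score of exactly 0 would be labelled positive here; adjust if needed.
    polarity = "negative" if sentiment < 0 else "positive"
    return (
        f'<span class="word word--{polarity}" '
        f'style="--sentiment: {sentiment}">{escape(word)}</span>'
    )


def page_words(words):
    """Join the spans for a page, one per line, in the order given."""
    return "\n".join(word_span(word, score) for word, score in words)


# The two twinword values quoted earlier in this post, plus one unscored word.
page_one_sample = [("lonely", -0.927), ("died", -0.797), ("hood", None)]
markup = page_words(page_one_sample)
print(markup)
```

The stylesheet could then read the custom property to set the colour and opacity, so the value would only need entering once per word.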

Seeking Feedback

I had entered the first few pages of data so the end was near (or so I thought). Since there were so many parts to this visualisation, I knew I had to ask for feedback. I asked around my usual circle but I didn’t get any replies. I carried on with the data entry and continued testing it out, but had a growing feeling something was lacking. I went back to the previous Red Riding Hood project to recall what it looked like and realised it looked a lot more interesting. Apart from it having more colour, which I made a conscious decision to minimise for this one, I concluded it was because the narrative of the story was much more evident. I couldn’t add much more data as it would add more clutter and probably add confusion to an already unfamiliar and unconventional format. I thought that, at minimum, I could show the narrative. After reviewing the previous project and a bit of thought, I narrowed it down to two characters (Red Riding Hood, Wolf) and two locations (Red Riding Hood’s House, Grandma’s House). I thought those four things could give a sense of the story, and that I could replace the page numbers with the characters and add some sort of house indication, so I made a mockup of it.

replacing numbers with symbols
Replacing numbers with symbols in the navigation

I still needed feedback and, by some good fortune, I had visited the Data Visualisation Society website, as I occasionally do to check on upcoming events, and there was an “Early Career Create: Project Jam” event in 2 days. It read: “During this event, we'll separate into smaller coworking groups based on what aspect of the project cycle you are working on at the moment: ideation, prototyping/design, development, etc. so you can get feedback from others in a similar place. This event is open to all DVS members.”

I had been to a few DVS online events but only lingered in the shadows. I had never spoken about my personal work to anyone other than my family and a handful of colleagues, so I was a bit nervous about having to explain the work to people I didn't know. Fortunately it wasn’t a large group, just four others. I showed the latest version of the prototype I had at the time and the feedback was really useful.

screenshot of visualisation, 16 circles positioned with various heights positioned horizontally left to right depicting pages, a large circular sunburst like chart in the center
Prototype version of what was shown at the DVS Project Jam

It confirmed what I had felt but with more accuracy and valuable suggestions for improvement, so thank you again everyone who was there. There were so many things I should and could improve, and so many solutions and ideas. I sketched out a few and tried to figure out which were the ones that solved it best, but also if I could implement them quickly as I had already gone over my personal deadline to complete. I decided to not give myself another deadline as it would just add unnecessary pressure for a project that was meant to be enjoyable, but maybe have one again when I was closer to the end so as not to drag it out longer than needed, as I did have my next project in mind and was also keen to start that.

Some of the feedback included:

I started adding tooltips to the navigation, to show on hover the word count, and numerical sentiment value with a line linking it to the scale. I also added the emotions but took them back out as it felt too messy and confusing.

showing all the tooltips in the navigation
Adding tooltips to navigation

I still wanted to add the characters and locations so worked on many mockups to find ways to integrate them. I thought an overview of the character and location counts across the story would be useful, then breaking down per page. I also had a suggestion from my partner to add the counts within the individual pages as well, so made them show upon hover.

examples of characters, locations and mentions
Mockup examples of characters, locations and mentions done in Illustrator

How to Read

The "How to read" section was a lot trickier than expected and I had to rewrite it several times, along with the gif animations, as I couldn't think of a good way of presenting it with just static images. Initially I wanted it as part of the main page combined with the scrollytelling function, but thought it would be frustrating for a user who already knew how to read it to have to scroll through it all to get to the visualisation section, so I put it in a separate modal page. But I had feedback from the DVS Project Jam session that it was less likely to be used if it was not in the main section, so I made a tab/accordion element so the information would take up less space but still be accessible for those who needed it.

Stacked section titles on left, image and explanation on right
How to read section

Iterating to improve Accessibility and User Experience

I made the changes and wanted to show my colleagues at work what I had done so asked if I could book a meeting slot to present. It was important to me to show them what I had been working on, especially since part of the time was done using the 10% allocated time at work to work on projects like this, and I also wanted to see if this was the type of work that could be integrated into the work the lab does.

I took a lot of time to prep for the presentation and had a lot of material. I did a trial run with another colleague, then dropped about half of the slides as they felt irrelevant. The feedback I got was very useful. There were questions on why I made certain choices, which made me reflect on them, and suggestions for improvements.

Continuing Iterative Improvements

I was getting very good at tweaking things and felt I could work on improvements forever. But I gave myself a few tasks and worked on making those changes. One of those changes, based on the feedback I had received, was adding a punctuation view to the page overview, and I decided to add the emotions there as well.

Pages depicted by circles on top row, punctuations on second row, emotions on third row, characters on fourth row, locations on fifth row. Each column is a page from 1 to 16.
Overview of the 16 pages showing Punctuations, Emotions, Characters, Locations.

Another thing I wanted to show was how a researcher’s insights could be integrated, so I added the “Notes” section. I copied and pasted material from another resource as I did not want to spend the time and probably wouldn’t have been able to write anything substantial.

The page felt very long and I added a “Menu” that expanded to show the page sections to act as a table of contents and navigation tool. I also restructured the sections, retitled some headings and made an attempt to fit some elements better for viewing on smaller screen devices.

Text box in the middle for 'Notes', Slide out column 'Menu' on the right with table of contents
"Notes" section and expanded "Menu"

Reflections on Lessons Learned from a Challenging yet Rewarding Project

This project took a lot longer than I expected, with lots of late nights, but I have achieved what I set out to do and even more. It definitely takes much longer to code up an interactive data visualisation (best viewed in Chrome or Firefox on a non-touch device with a medium-sized screen) than to make a flat design, because of all the time spent fixing elements that broke each time I added or edited some code. Coding up various breakpoints, and testing and editing for different screen sizes and browsers, also extended the project duration, and there are many I haven't yet covered as I only tested on the devices and browsers I had to hand.

Of course as with all projects there is so much more I could do, lots of different directions I could have taken, many improvements I still want to make... but overall I have learnt so much from this project and need to finish it to move on. I will take new learnings and any unfinished ideas to my next project.



~
Many thanks to several in the KDL team for your feedback along the way, and editorial review by Samantha Callaghan and Neil Jakeman.
~