Red Riding Hood 2

Emotion and Sentiment Visualisation

Project:

I started another data visualisation course, this time with Federica Fragapane. I liked her style and was keen to learn about her process and methods, which are explained in the course. Initially I thought of using a new topic and data set, but there were a few ideas I didn't get to try out in my previous Red Riding Hood project, and I liked the idea of digging deeper into it to see how differently it could be presented. I used my 10% time at King's Digital Lab, and also my personal time (about 70%).

on left scan of open book page 1, right completed visualisation of page 1, circle with words of page 1 around it
Image on left: scan of page 1 of a book edition from Worthpoint. Image on right: screenshot of page 1 of final visualisation

Process:

Note: This has turned out to be a really long read, so I have broken it up into sections. After reading and being impressed by Nicholas Rougeux’s "making of" write-ups, I wanted to take more notes and screenshots along the way. This is not only to improve my documentation process; I also find it helps me rationalise my choices during the project.

Background

In my previous Red Riding Hood project, a colleague gave me the idea of adding sentiment analysis to the inner circles, which represent each word, to add an extra layer of data and meaning. I had little knowledge of sentiment analysis, so I had to look it up to understand the basics. As I dug a bit deeper, I thought of different ways to represent the sentiment data that could be visually quite interesting. I wanted to show the sentiment for all the words by arranging them around a circle so their position on the page could be identified, rather than the random placement of circles I had. I tried a few different ways, but they all changed the look of the "trees" too much, so I left it minimal: just the top 20 positive and negative words, represented by light and dark outline circles, and the page sentiment, represented by arcs above the page "tree". One of the methods in this second course, placing data points around a circle, was very similar to how I had previously tried to visualise the words.

on left focusing on legend with a tree, house and character using geometric shapes, on right circles with examples of sunburst like patterns, segments, radar graphs within
Legend of sentiment analysis in previous Red Riding Hood and work in progress examples trying out different styles

I was really excited to follow the process, which involved putting the data in a radar graph and manually placing each data point to fit. The process was tedious and repetitive, but I didn't mind that. What I did mind, after trying it out with some data, was that it was far too easy for me to make an error with the placement. I knew I had to adapt the process to make it less error-prone for myself. The data also wasn't presented that clearly in the previous project, and I knew the sentiment analysis could be improved and shown more clearly.

Data

I thought I could fast-track this stage since I was reusing the same data, but I was wrong. I had used the free online sentiment analysis tool MonkeyLearn in the previous project, as it was the best I found at the time that fit my requirements. I wanted to use it again so both projects would have the same data, but when I listed the results it gave for all the words on the first page, I wasn't satisfied. "hood" (91.1%) and "above" (86.3%) were the top negative words, compared to "died" (54.8%) and "lonely" (51.8%), which I think are more negative. I was rather annoyed I hadn't spotted this earlier for the previous project. I had already spent quite a while searching for a sentiment analysis tool that would fit my requirements, so I knew the search for an alternative would not be easy. I really wanted to stick with MonkeyLearn, but the results just didn't sit well with me. I searched for quite a long time, past the first few pages of search results, and finally came across twinword.

The results were far better: "lonely" (-0.927) and "died" (-0.797) were the top negative words for the first page. The process was also going to be much easier. With MonkeyLearn, I had to input the page text to get the page sentiment value, then input each word individually to get each word's sentiment value. With twinword, I only had to input the page text and it gave the page sentiment along with all the positive and negative words on that page in one go, which would save me quite a lot of time. It didn't give neutral word values, but that was for the better, as I would have felt tempted to use and visualise them.
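The difference between the two tools can be checked with a quick ranking. The sketch below is only an illustration in Python, which was not part of the project. It uses just the scores quoted above; the function name `top_negative` and its `higher_is_more_negative` flag are made up for this example. MonkeyLearn's figures are treated here as negative-confidence percentages, so a higher number ranks as more negative, while twinword's scores are signed, so a lower number does.

```python
# Scores quoted in this write-up for page 1 (only the words mentioned there).
monkeylearn = {"hood": 91.1, "above": 86.3, "died": 54.8, "lonely": 51.8}
twinword = {"lonely": -0.927, "died": -0.797}


def top_negative(scores, n=2, higher_is_more_negative=False):
    """Return the n words ranked most negative (hypothetical helper)."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_more_negative)
    return ordered[:n]


print(top_negative(monkeylearn, higher_is_more_negative=True))  # ['hood', 'above']
print(top_negative(twinword))  # ['lonely', 'died']
```

Seen side by side, the MonkeyLearn ranking puts "hood" and "above" ahead of "died" and "lonely", which is the mismatch that prompted the switch.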

left: 2 bar charts with positive to negative sentiment on vertical y axis and pages on x axis, center: 2 bubble charts with positive to negative sentiment on vertical y axis and pages on x axis, right: data of page 1 on spreadsheet, column 1 spelling of word, column 2 monkeylearn sentiment results, column 3 twinword sentiment results
Comparison of words of a page with MonkeyLearn and twinword, showing Excel spreadsheet, bar chart and RAWGraphs bubble chart

Mockups

Once I was satisfied with the new and additional sentiment data I had collected, I started with my usual process of sketching down some ideas and layouts. With the previous project, there was a visual focus on the characters and locations. With this project, I wanted the focus to be on the sentiment and the words.

photograph of the same sketchbook opened up to 4 pages with black fountain pen line drawings of circles, sunburst patterns and scribbled notes
Concept drawings in sketchbook throughout the project

I went back to RAWGraphs to see if there was a graph I could use to get a similar effect. What I needed was a bar chart arranged in a circle, a radial bar chart, but RAWGraphs didn't have that. I tried a few, and the sunburst chart was close, but the data structure wasn't quite right. I changed things around in Excel, in a way that I'm sure is not conventional and is a bit crazy, but I knew it would give me what I needed. For the first page, I typed out a binary system of the letter count for each word, one word per row, put it into RAWGraphs to make the sunburst chart, exported the SVG to Illustrator, typed out each letter of the word and deleted the additional data that had been added for spacing.
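As a rough illustration of that spreadsheet step, the Python sketch below builds one row per word, with a 1 for every letter position and a 0 as padding up to the longest word. This is only one possible reading of the "binary system", not the actual spreadsheet: the column layout, the function names `letter_count_rows` and `to_csv`, and the sample words are all assumptions.

```python
import csv
import io


def letter_count_rows(words):
    """One row per word: the word, then 1 per letter and 0 as padding."""
    width = max(len(word) for word in words)
    return [[word] + [1] * len(word) + [0] * (width - len(word)) for word in words]


def to_csv(rows):
    """Serialise the rows as CSV text that could be pasted into a charting tool."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


sample = ["red", "riding", "hood"]  # sample words, not the page 1 text
print(to_csv(letter_count_rows(sample)))
```

Each row then has the same number of columns, which is what lets every word occupy an equal segment of the circle regardless of its length.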

left: words shooting out going clockwise around a circle, right: styled version with sentiment colours and handwritten and script fonts
Export of RAWGraphs sunburst chart and a version after cleaning up in Illustrator

I mocked up a version of how all the pages might look together when placed as a scatter plot, with the page sentiment on the y axis and the pages on the x axis.
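The placement in such a scatter plot is a simple linear mapping. The mockup itself was drawn by hand in Illustrator, so the Python sketch below only illustrates the arithmetic. It assumes sentiment scores run from -1 to 1; the canvas size and the function name `page_position` are hypothetical.

```python
def page_position(page, sentiment, pages=16, width=1600.0, height=400.0):
    """Map a 1-based page number and a sentiment in -1..1 to x, y in pixels."""
    x = (page - 0.5) * width / pages        # centre of the page's column
    y = (1 - (sentiment + 1) / 2) * height  # +1 at the top, -1 at the bottom
    return round(x, 1), round(y, 1)


print(page_position(1, 0.0))  # (50.0, 200.0)
```

The y value is flipped because screen coordinates grow downwards, so the most positive pages end up at the top.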

top: bubble chart drawn in illustrator with positive and negative sentiment on vertical y axis and pages on horizontal x axis, 16 circles depicting pages with words circling laid out horizontally with sentiment colours in circle and words, bottom: enlarged view of a circle
Mockup of layout and individual pages in Illustrator and Photoshop

More Data

When browsing through twinword's site, I found out it had more tools. One I thought would be interesting to have alongside was emotion analysis, so I added that for each page. I also wanted to show the sentences, since there were always 8 on each page, and the rhymes, as it was written as a poem. So I ended up with a rather long process. For each page, my process would be:

  1. Get the text from the original format
  2. Compile in MS Word
  3. Gather the data in Excel
  4. Put the data in RAWGraphs and export SVG
  5. Clean up and type the letters in Illustrator
  6. Get sentiment analysis data from twinword
  7. Compile in MS Word
  8. Colour sentiment to word in Illustrator
  9. Style sentence and rhyme in Illustrator
  10. Get emotion analysis in twinword
  11. Compile in MS Word
  12. Gather in Excel
  13. Put the data in RAWGraphs and export SVG
  14. Add emotions and clean up in Illustrator
  15. Style words
screenshots of original page, word document, excel sheets, excel graphs, RAWgraphs, web pages, vector graphics
Steps of my initial process of a page

Prototype

It was taking me about 2-3 hours per page and I hadn't even finished 3 when I stopped to think. I could continue and finish them all but I also knew I wanted to eventually make the final piece interactive and responsive as it had quite a lot of information that would make the visualisation quite cluttered if it was shown in a static format. I decided to stop my work in Illustrator and start playing around with code.

I knew a lot of data developers and visualisers were using D3 to make their very impressive data visualisation pieces. I had tried a few times to learn JavaScript but never got it, and usually spent a lot of time debugging my errors. I just wasn’t ready to try again yet. I really liked working with CSS, had done quite a few side projects with it previously and had seen some amazing work done with just CSS, so I wanted to see how far I could get with it, knowing it has its limitations.

I started making some prototypes in CodePen, which I have found really useful to code with: the code and results are in the same window and the preview refreshes automatically. What I found particularly useful for this project were its HTML and CSS preprocessors. I used Pug and SCSS, so I didn't have to write as much and could structure the code more easily. I managed to come up with a prototype that looked very similar to the design in Illustrator. It was responsive and semi-accessible for screen readers (I think), but some elements just didn’t scale well. The connecting lines, and the lines and dots marking the beginning and end of a sentence, became misaligned when scaled because some elements were positioned manually.

After a few more iterations, I came up with a version that removed the begin and end points. Instead, the curved sentence lines start with a darker colour and gradually fade to a lighter colour at the end of the sentence. I also replaced the rhyme lines with side borders on the rhyming words, one pair with solid borders and the other pair with double borders.
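The fade along a sentence amounts to a linear interpolation from a dark shade to a light one. The prototype was written in Pug and SCSS; the Python sketch below only illustrates the arithmetic. The lightness values (30% and 85%), the grey `hsl()` colours and the function name `sentence_shades` are assumptions made for the example.

```python
def sentence_shades(n_words, dark=30.0, light=85.0):
    """Return one CSS colour per word, fading from dark to light lightness."""
    if n_words < 1:
        return []
    if n_words == 1:
        return [f"hsl(0, 0%, {dark}%)"]
    step = (light - dark) / (n_words - 1)  # equal lightness step between words
    return [f"hsl(0, 0%, {round(dark + i * step, 1)}%)" for i in range(n_words)]


print(sentence_shades(3))
# ['hsl(0, 0%, 30.0%)', 'hsl(0, 0%, 57.5%)', 'hsl(0, 0%, 85.0%)']
```

Because the first word is always the darkest, the start of a sentence can be found without separate begin and end markers.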

As the project progressed, I added the page sentiment colour to the centre circle, some page summary information and the rhyming words of the page, and gave the emotion triangles a slight gradient to make them more subtle and subdued. I also added some tooltips for additional information.

 three versions showing slight variations of words placed around a circle with sentiment colours on certain words, one with a gradient sentiment colour in the circle and more text, dashed lines with start and end points, connecting dotted lines or side borders, 3 gray or gradient triangles around circle.
Design of 'Page 3' in Illustrator as a mockup, CodePen early and later prototypes

Layout

I wanted a default overview of all sixteen pages, with the ability to zoom in to view each one in more detail. At first I thought I could scale down each page and have it expand on hover, but it kept flickering between the two versions during the interaction. After a few trials, I got it working using the modal interaction I have used previously in several projects, with some added transitions. But I didn't like the way the prototype worked: it took too many clicks or taps just to do the most basic thing in the visualisation, viewing each page.

I went back to my sketchbook to draw out a few ideas and thought of making it more of a scrollytelling-like piece, so the user gets an overview of the project, a "how to read" section and an overview of all the data visualisations, then moves on to the individual data visualisations. I had to consider how everything would work on different screens, especially smaller and touch screens. As with my work projects, I now try to have as few variations between screen sizes as possible, so there is more consistency and also less to style and code up. I used CSS properties as much as I could, but a few were not supported in all browsers, so I had to add a few JavaScript snippets. I finally came up with a layout that I was happy with. The initial prototype had the "how to read" section on the same page, then iterated to having it in a separate modal pop-up as I thought it would be an------ was iterated slightly as the project progressed and with feedback.

Sketches and Mockups of layouts
Sketches and Mockups of two layouts

Data Entry

Until this point I had only input data for one page and had left the other pages until the end, since I knew it was just a case of data entry as I had all the structure in place. What I didn't realise was how long it would take, as there were so many stages I had added along the way. It took about 40 minutes to 1 hour per page. This was my process for inputting individual page data into HTML and CSS after the general code structure had been finalised:

screenshots of some of the code process
Screenshots of some of the data entry into code for the main page sections
  1. Input page summary in HTML of the inner circle
    • Page number
    • Number of words
    • Two sets of rhyming words
    • Page sentiment value
    • Top emotion
  2. Input Emotion types and values in HTML, position height values in CSS
  3. Input individual word and sentiment value in HTML, colour sentiment value in CSS
  4. Add sentence lines in CSS
  5. Input borders for rhyming words in CSS
  6. Add Sentiment colour and opacity in CSS
  7. Add Page Text in HTML
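Since the steps above repeat for all sixteen pages, part of the entry could in principle be scripted. The Python sketch below is not how the project was built, as the data was entered by hand; it only shows the idea for step 3. The class names, the `--angle` custom property, the `data-sentiment` attribute and the function `word_span` are all hypothetical.

```python
from html import escape


def word_span(index, total, word, sentiment=None):
    """Build one HTML span for a word placed at an equal angle around the circle."""
    angle = round(360 * index / total, 2)  # position of this word in degrees
    classes = ["word"]
    attrs = ""
    if sentiment is not None:
        if sentiment > 0:
            classes.append("positive")
        elif sentiment < 0:
            classes.append("negative")
        attrs = f' data-sentiment="{sentiment}"'
    class_attr = " ".join(classes)
    return (
        f'<span class="{class_attr}" style="--angle: {angle}deg"{attrs}>'
        f"{escape(word)}</span>"
    )


print(word_span(0, 4, "lonely", -0.927))
# <span class="word negative" style="--angle: 0.0deg" data-sentiment="-0.927">lonely</span>
```

Words without a sentiment value get only the base class, which matches the fact that twinword returned no neutral scores.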
Feedback

I had entered the first few pages of data, so the end was near (or so I thought). Since there were so many parts to this visualisation, I knew I had to ask for feedback. I asked around my usual circle, but I didn’t get any replies. I carried on with the data entry and continued testing, but had a growing feeling something was lacking. I went back to the previous Red Riding Hood project to recall what it looked like and realised it looked a lot more interesting. Apart from it having more colour, which I made a conscious decision to minimise for this one, I concluded it was because the narrative of the story was much more evident. I couldn’t add much more data, as it would add clutter and probably confusion to an already unfamiliar and unconventional format. I thought about what the minimum addition would be that could show the narrative. After looking at the previous project again and a bit of thought, I narrowed it down to two characters and two locations:
- Red Riding Hood
- Wolf
- Red Riding Hood’s House
- Grandma’s House

I thought those four things could give a sense of the story. I could replace the page numbers with the characters and add some sort of house indication, so I made a mock-up of it.

replacing numbers with symbols
Replacing numbers with symbols in the navigation

I still needed feedback and, by some good fortune, I had visited the Data Visualisation Society website, as I occasionally do to check on upcoming events, and there was an “Early Career Create: Project Jam” event in 2 days. It read:
“During this event, we'll separate into smaller coworking groups based on what aspect of the project cycle you are working on at the moment: ideation, prototyping/design, development, etc. so you can get feedback from others in a similar place. This event is open to all DVS members.”



Data Visualisation Society Project Jam
Event image from the Project Jam of Data Visualisation Society

I had been to a few DVS online events but only lingered in the shadows. I had never spoken about my personal work to anyone other than my family and a handful of colleagues, so I was a bit nervous about having to explain the work to people I didn’t know. Fortunately it wasn’t a large group, just 4 others. I showed the latest version of the prototype I had at the time, and the feedback was really useful.

screenshot of visualisation, 16 circles positioned with various heights positioned horizontally left to right depicting pages, a large circular sunburst like chart in the center
Prototype version of what was shown at the DVS Project Jam

It confirmed what I had felt, but with more accuracy and with valuable suggestions for improvement, so thank you again to everyone who was there.

There were so many things I should and could improve, and so many solutions and ideas. I sketched out a few and tried to figure out which ones solved the problems best, but also which I could implement quickly, as I had already gone over my personal deadline. I decided not to give myself another deadline, as it would just add unnecessary pressure to a project that was meant to be enjoyable. I might set one again when I was closer to the end, so as not to drag it out longer than needed, as I had my next project in mind and was keen to start on it.


Some of the feedback included:

I started adding tooltips to the navigation to show, on hover, the word count and the numerical sentiment value with a line linking it to the scale. I also added the emotions but took them out, as they felt too messy and confusing.

showing all the tooltips in the navigation
Adding tooltips to navigation

I still wanted to add the characters and locations, so I worked on many mockups to find ways to integrate them. I thought an overview of the character and location counts across the story would be useful, followed by a breakdown per page. My partner then suggested adding them within the individual pages as well, so I made them show on hover.

examples of character and location mentions
Mockup examples of character and location mentions done in Illustrator

How to Read

The "How to read" section was a lot trickier than expected. I had to rewrite it several times, along with the GIF animations, as I couldn't think of a good way to explain it with just static images. Initially I wanted it as part of the main page, combined with the scrollytelling function, but I thought it would be frustrating for a user who already knew how to read it to have to scroll through it all to get to the visualisation section, so I put it in a separate modal page. Feedback from the DVS Project Jam session, however, was that it was less likely to be used if it was not in the main section, so I made it a tab/accordion element so that the information would take up less space.

Stacked section titles on left, image and explanation on right
How to read section

More Feedback

I made the changes and wanted to show my colleagues at work what I had done, so I asked if I could book a meeting slot to present. It was important to me to show them what I had been working on, especially since part of the work was done using the 10% time allocated at work for projects like this. I also wanted to see if this was the type of work that could be integrated into the work the lab does.

I took a lot of time to prepare for the presentation and had a lot of material. I did a trial run with another colleague, then dropped about half of the slides as they felt irrelevant. The feedback I got was very useful. There were questions on why I made certain choices, which made me reflect on them, and suggestions for improvements.

Final changes

I was getting very good at tweaking things and felt I could work on improvements forever, so I gave myself a few tasks and worked on making those changes. One of them was adding punctuation to the page overview, based on the feedback I had received, and I also decided to add the emotions.

Pages depicted by circles on top row, punctuation on second row, emotions on third row, characters on fourth row, locations on fifth row. Each column is a page from 1 to 16.
Overview of the 16 pages showing Punctuation, Emotions, Characters, Locations.

Another thing I wanted to show was how a researcher’s insights could be integrated into the visualisation, so I added the “Notes” section. I copied and pasted material from another resource, as I did not want to spend the time and probably wouldn’t have been able to write anything substantial.

The page felt very long, so I added a “Menu” that expands to show the page sections and acts as a table of contents and navigation tool. I also restructured the sections, retitled some headings and made an attempt to fit some elements better for viewing on smaller-screen devices.

Text box in the middle for 'Notes', Slide out column 'Menu' on the right with table of contents
"Notes" section and expanded "Menu"

Conclusion:

This project took a lot longer than I expected, with lots of late nights, but I have achieved what I set out to do and even more. It definitely takes much longer to code up a data visualisation and make it interactive than to produce a flat design, due to all the time spent fixing elements that broke each time I added or edited some code. There was also the work of coding up various breakpoints and testing and editing for different screen sizes and browsers, many of which I haven't covered, as I only tested on the devices and browsers I had to hand.

Of course as with all projects there is so much more I could do, lots of different directions I could have taken, many improvements I still want to make... but overall I have learnt so much from this project and need to finish it to move on. I will take new learnings and any unfinished ideas to my next project.

Visualisation:

https://ongtiffany.github.io/project/rrh2-viz.html
(best viewed in Chrome or Firefox on a non-touch device with a medium-sized screen)