Workshop for Love's Labour's Lost, June 2017

Note that my previously-nifty 10-minute Shakespeare play to network graph hack now no longer works due to the Google Fusion Tables 2019 turn-down. Pour out a glass of mead for Google Fusion Tables.

This site is for the actors of the Shakespeare in the Summer troupe, and it's intended to help make the methods and decisions behind the different types of analysis more transparent, and to give you opportunities to play around more with these techniques, and make them your own, if you like.

Download the Actor Information Packets.

Table of Contents
Activity 1: Words
Activity 2: Networks
Activity 3: Feelings
WordLab Workshop at UPenn/Raw Text and Data Files
Further Reading, and Thanks

Activity 1: Words

Pretty much all types of textual analysis start with counting words. For the first activity, I used Voyant Tools, an open-source, web-based tool for analysis and visualization of digital texts. It's one of the foundational tools which is often used to introduce textual analysis in academic settings.

The bigger the word is, the more times it appears.

The interactive, below, is set up to let you look closely at our text. (Adjust the "scale" at the bottom left corner to see your own character's top words.) If you want to look at a different text, go to voyant-tools.org. You can copy-and-paste text right in, or, if you want to look at a bunch of texts together, the way we did here, make them into a zip file.

Seeing just a screen to load, or want to see more info? Check out the Voyant Page for our play.

A note on stopwords
A stopword list is a list of the the/and/it/is-type words that dominate our language. When they're in the textual analysis, they can make it hard to see anything else, so Voyant removes them by default. (Though because Voyant isn't designed specifically for Shakespeare, it doesn't catch them all--anyone notice "doth" in our word cloud?)

Stopwords can be confusing, but also really interesting. Did you know that over 20% of the play's uses of "I" are from Biron?

Colorful word cloud of stopwords, most prominetly,

Activity 2: Networks

Imagine actors in a giant circle, and every time you say a line to someone, you step closer to them. Ultimately, you'd move closer to the people you have more interaction with, and other actors would drift to other sides of the room. That's approximately what a network graph represents. They're used a lot to analyze things like Twitter networks and to see how communities are connected (for example, check out this visualization showing two distinct communities based on political party).

Every dot is a character, and every line is a time the character speaks following another. (Note that I haven't made any attempt here to determine whether characters are actually speaking to each other--they just have lines next to each other.) Play around with it, and see what the data looks like.

Network graph shows dots (nodes) for different characters, attached with lines. Characters cluster in communities---the ladies of the court are off to the left, with Boyet, their servant, in a transitional role, while the lords close around them. On the right side, the scholars and Dull cluster, and Moth, Armado and Jaquenetta cluster. Biron is most central, and Costard is evenly spaced between courtly and clown worlds.

Love's Labour's Lost

For the workshop, I used Gephi, which is a free open-source software program for network analysis. It does the same thing as Fusion tables, but it's got some functionality built in to do more analysis of the network and push the characters closer or further away based on the strength of the connections.

It doesn't live online so I can't post a live visualization here, but here's the image file of one way the network looks.

Keep in mind that this isn't cut-and-dried--the network will look different depending on the parameters you set. But you'll see that it's very similar to the more basic one in the Fusion tables, just a little more exaggerated.

For contrast, here are some network graphs I made of our previous shows. Do you see anything that surprises you?

Network graph shows dots (nodes) for different characters, attached with lines. Characters cluster in communities--Titania, Oberon and Puck are closely connected, as are all four lovers, and the clowns. Bottom (connected both to clowns and Titania) is most central.

A Midsummer Night's Dream

Network graph shows dots (nodes) for different characters, attached with lines. Characters cluster in two main groups, one of the courtly characters, one of the clowns and villains, with Don John as the main connector.

Much Ado About Nothing

Activity 3: Feelings

This analysis is based on a method called "sentiment analysis." You know how social media sometimes targets ads at you based on your interest? They're using a form of sentiment analysis--if you say you really like Shakespeare, it might pick up the words "like" and "Shakespeare" and pop some theater ads into your newsfeed.

For this project, I did the analysis with a software called LIWC. I cheated a bit with this one by using a proprietary software--there are free open-source ones, but they tend not to let you dive as deeply into the specifics of a speech.

The software compares each word to a human-created dictionary of words in particular categories. English words are complicated and nuanced, so this won't give you 100% accuracy in any case. Plus, the dictionaries are designed for modern English. Still, it can give you some interesting jumping-off points for analysis.

Fun "Facts"* From Sentiment Analysis (read on to see why these aren't facts at all)

Armado uses the highest percentage of words associated with digestion.
Biron has the highest percentage of “seeing” words.
Boyet uses the highest percentage of body-part words.
Costard uses the highest percentage of words associated with money.
Dull has the highest percentage of words associated with discrepancy (would/should/could), and also with certainty (every, all, entirely).
Dumain uses the highest percentage of sexually-associated words.
Ferdinand has the highest percentage of interrogative words (i.e., questions).
Holofernes uses the highest percentage of words of six letters or over.
Jaquenetta has 0% words associated with seeing, but the play’s highest percentage of words associated with hearing.
Katharine has 0% words associated with anxiety. Rosaline and Maria also have 0%!
Longaville scores as the most authentic in tone, and has the most words associated with risk, and has the highest percentage of words associated with anxiety.
Maria uses the highest percentage of words associated with feeling.
Moth has the highest percentage of tentative words.
Nathaniel uses the highest percentage of words associated with “insight,” i.e., cognitive processes.
Rosaline scores the least analytic in tone (i.e., more informal, personal, narrative)

*Why these aren't really facts: Here's the full report for our play. (Explanations for the categories are at the end of the LIWC Operator's Manual). This is not very accurate--it shows what percentage of that file matches a particular category, and because the files have very different wordcounts, the numbers get quite skewed. Also, because the dictionaries are proprietary, it's hard to get under the hood and see why things might have turned out the way they did. But it's kind of fun to look through.

See the full report on your character's speech here:

Armado speech and analysis
Biron speech and analysis
Boyet speech and analysis
Costard speech and analysis
Dumain speech and analysis
Ferdinand speech and analysis
Holofernes speech and analysis
Katharine speech and analysis
Longaville speech and analysis
Maria speech and analysis
Moth speech and analysis
Nathaniel speech and analysis
Princess speech and analysis
Rosaline speech and analysis

WordLab Workshop

(This is some material and slides I put up for a 6/21 workshop/brainstorming session at WordLab, a textual analysis group at the University of Pennsylvania.)

Plain-text uncut versions of text are pulled from this version (the HTML tagging structure made it a little easier to manipulate). I decided to use uncut versions of the text for Activity 1 and 2 because they're making generalizations about the play and characters, which I felt might be slightly more authentic with the uncut text. I used the cut text for the more specific analysis of Activity 3.

Workshop for Love's Labour's Lost, June 2017

Activity 1: Words

Activity 2: Networks

Activity 3: Feelings

WordLab Workshop

Further Reading and Thanks