Get with the Programming – Michael Ullyot

(This continues my previous post on this research project, about my questions and initial steps.)

This week I’m away to the Pacific Northwest Renaissance Conference to deliver a paper on rhetorical figures in early modern drama. (Wait! Don’t stop reading, it gets better.) I feel like a legit digital humanist for the first time in my life, because I’ve written my own computer program to analyze texts – a bash script in Unix that you can try for yourself on GitHub.

Okay, so my program just prepares my text files to run a far more complex program by Marie Dubremetz at Uppsala University (chiasmusDetector), but getting it to run on my files took some work.

Marie’s program is written for Python 2.7, so I installed pyenv to switch between versions of Python. I also had to download Stanford’s CoreNLP, which processes plain-text files to prep them for chiasmusDetector. Among other things, it encodes a lemma for every word, so chiasmusDetector can see that “drew” is the past tense of “draw,” or that “days” is the plural of “day,” and so on. Thus it will find repetitions whose words differ on the surface, like “{He} grieves {much} — | And me as {much} to see {his} misery,” from Shakespeare’s Two Noble Kinsmen, where “his” echoes “He.”
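To make the lemma idea concrete, here is a minimal Python sketch of how matching on lemmas rather than surface forms can catch an A-B-B-A repetition. It is not how chiasmusDetector or CoreNLP actually work: the `TOY_LEMMAS` table and the `lemmatize` and `find_abba` functions are hypothetical names invented for this illustration, and a hand-written dictionary stands in for a real lemmatizer.

```python
import re
from itertools import combinations

# Hypothetical, hand-written lemma table standing in for a real lemmatizer.
TOY_LEMMAS = {"his": "he", "drew": "draw", "days": "day", "grieves": "grieve"}


def lemmatize(word):
    """Return the toy lemma for a word, or the word itself if unlisted."""
    return TOY_LEMMAS.get(word, word)


def find_abba(text):
    """Return every (A, B, B', A') word tuple whose lemmas form an ABBA pattern."""
    words = re.findall(r"[a-z']+", text.lower())
    lemmas = [lemmatize(w) for w in words]
    hits = []
    # combinations() yields index tuples in increasing order: i < j < k < l.
    for i, j, k, l in combinations(range(len(words)), 4):
        outer_match = lemmas[i] == lemmas[l]
        inner_match = lemmas[j] == lemmas[k]
        if outer_match and inner_match and lemmas[i] != lemmas[j]:
            hits.append((words[i], words[j], words[k], words[l]))
    return hits


line = "He grieves much | And me as much to see his misery"
print(find_abba(line))
```

Without the lemma table, “he” and “his” would never match, and this line would slip past. A real detector also has to rank candidates, since brute-force matching like this turns up plenty of accidental repetitions in longer passages.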

After those installations, it was a matter of getting the files ready. First, I focused on dramatic texts – mostly because I needed a proving ground, and I could get well-edited texts of 69 plays from two projects: 38 Shakespeare plays from Folger Digital Texts, and 31 by his contemporaries from Early Modern English Drama. I needed only the words of those plays, not extraneous bits like character lists or speech prefixes: i.e. “To be or not to be,” not “HAMLET: To be or not to be.” (For those files, I’m indebted to Mike Poston and Meag Brown at the Folger Shakespeare Library.)
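For a sense of what stripping speech prefixes involves, here is a small Python sketch. It assumes a plain-text layout like the example above (an all-caps speaker name, a colon, then the speech), which is only a guess at a convenient format; the files I actually used were prepared at the Folger. The names `SPEECH_PREFIX` and `strip_speech_prefix` are made up for this illustration.

```python
import re

# Assumed layout: an all-caps speaker name, then a colon, then the speech.
# The character class allows spaces, periods, apostrophes, and hyphens so
# that names like "FIRST CLOWN" also match.
SPEECH_PREFIX = re.compile(r"^[A-Z][A-Z .'-]*:\s*")


def strip_speech_prefix(line):
    """Remove a leading speaker name from a line, if there is one."""
    return SPEECH_PREFIX.sub("", line)


sample = [
    "HAMLET: To be or not to be,",
    "that is the question:",
    "FIRST CLOWN: Is she to be buried in Christian burial?",
]
for line in sample:
    print(strip_speech_prefix(line))
```

The pattern is anchored to the start of the line, so a colon inside a speech is left alone. It would still misfire on a speech that opens with a one-letter capitalized word followed by a colon, which is one reason to prefer files that mark speakers explicitly.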

Finally, I was ready to run chiasmusDetector. I wrote a bash script with help (to put it mildly) from Kourosh Banaeianzadeh, who works for the University of Calgary’s digital-humanities lab, LabNext, in the Taylor Family Digital Library. A bash script is just a series of Terminal commands (on my Mac, running macOS High Sierra) that run in sequence.

If you read my script, you’ll see comments on every line; remember they were written by both a newbie programmer and a verbose English professor, who documents everything down to file movements and directory changes. Not exactly thrill-a-minute reading, but the results are worth it.

In my next post, I’ll describe some of those results, and my conclusions.