Get with the Programming

(This continues my previous post on this research project, about my questions and initial steps.)

This week I’m away to the Pacific Northwest Renaissance Conference to deliver a paper on rhetorical figures in early modern drama. (Wait! Don’t stop reading, it gets better.) I feel like a legit digital humanist for the first time in my life, because I’ve written my own computer program to analyze texts – a bash script in Unix that you can try for yourself on GitHub.

Okay, so my program just prepares my text files to run a far more complex program by Marie Dubremetz at Uppsala University (chiasmusDetector), but getting it to run on my files took some work.

Marie’s program is written for Python 2.7, so I installed pyenv to switch between versions of Python. I also had to download Stanford’s CoreNLP, which processes plain-text files to prep them for chiasmusDetector. Among other things, it encodes a lemma for every word, so chiasmusDetector can see that “drew” is the past tense of “draw,” or “days” is the plural of “day,” and so on. Thus it will find repetitions like “{He} grieves {much} — | And me as {much} to see {his} misery,” from Shakespeare’s Two Noble Kinsmen.
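To make the lemma idea concrete, here is a toy sketch in bash. It is not how CoreNLP works internally: the lookup table is hand-written and covers only the words mentioned above, and the function name lemma_of is made up for this illustration. It only shows why two words that look different as strings can still count as a repetition once each is reduced to its lemma.

```shell
#!/usr/bin/env bash
# Toy lemma lookup, hand-written for illustration only.
# A real lemmatizer covers the whole language; this table covers a few words.
lemma_of() {
  local word
  # Lowercase the word so that "He" and "he" are treated alike
  word="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$word" in
    drew|drawn|draws) echo "draw" ;;
    days)             echo "day" ;;
    his|him)          echo "he" ;;
    *)                echo "$word" ;;   # unknown words are returned unchanged
  esac
}

# "He" and "his" differ as strings but share a lemma
[ "$(lemma_of He)" = "$(lemma_of his)" ] && echo "He / his: same lemma"
```

Comparing lemmas rather than raw words is what lets a detector pair “He” with “his” in the line quoted above.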

After those installations, it was a matter of getting the files ready. First, I focused on dramatic texts – mostly because I needed a proving ground, and I could have well-edited texts of 69 plays from two projects: 38 Shakespeare plays from Folger Digital Texts; and 31 by his contemporaries from Early Modern English Drama. I needed only the words of those plays, not extraneous bits like character lists or speech prefixes: i.e. “To be or not to be,” not “HAMLET: To be or not to be.” (For those files, I’m indebted to Mike Poston and Meag Brown at the Folger Shakespeare Library.)
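The files described above were prepared at the Folger, so the following is not how they were made. It is only a minimal sketch of what removing a speech prefix could look like, assuming (hypothetically) that every prefix is written in capitals and ends with a colon. Real editions mark speakers in other ways, and the function name strip_prefix is invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop a leading speech prefix such as "HAMLET: ".
# Assumes the prefix is all capitals (spaces allowed) followed by a colon.
strip_prefix() {
  sed -E 's/^[[:upper:]][[:upper:] ]*:[[:space:]]*//'
}

# Reads lines on standard input and writes the cleaned lines to standard output
printf '%s\n' 'HAMLET: To be or not to be.' | strip_prefix
```

Lines that do not begin with a capitalised prefix and a colon pass through unchanged.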

Finally I was ready to run chiasmusDetector. I wrote a bash script with help (to put it mildly) from Kourosh Banaeianzadeh, who works for University of Calgary’s digital-humanities lab, LabNext, in the Taylor Family Digital Library. A bash script is just a series of Terminal commands (on my macOS High Sierra) that run in sequence.
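For readers who have never seen one, here is a small sketch of the shape such a script can take. It is not the script on GitHub: the function name process_plays and the file names are made up, and the echo lines merely stand in for the real work of annotating each play and then searching it for chiasmus.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: visit every .txt file in a directory, in order,
# and run the same sequence of steps on each one.
process_plays() {
  local dir="$1" file name
  for file in "$dir"/*.txt; do
    [ -e "$file" ] || continue          # skip if the directory has no .txt files
    name="$(basename "$file" .txt)"     # e.g. "hamlet" from "hamlet.txt"
    echo "step 1: annotate $name"
    echo "step 2: detect chiasmus in $name"
  done
}

# Demo with two empty placeholder files in a temporary directory
demo_dir="$(mktemp -d)"
touch "$demo_dir/hamlet.txt" "$demo_dir/macbeth.txt"
process_plays "$demo_dir"
rm -rf "$demo_dir"
```

The point is only that each command runs after the previous one finishes, once per file, which is what saves typing the same commands 69 times.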

If you read my script, you’ll see comments on every line; remember they were written by both a newbie programmer and a verbose English professor, who documents everything down to file movements and directory changes. Not exactly thrill-a-minute reading, but the results are worth it.

In my next post, I’ll describe some of those results, and my conclusions.
