TEI for Close-Readings

This is the paper that I delivered on 13 November 2017 at the Text Encoding Initiative (TEI) annual meeting at the University of Victoria (British Columbia). Here’s the PDF of my slideshow, whose images intersect my script below.


This talk is prescriptive and theoretical, rather than descriptive and practical. I’m going to advocate for the benefits of implementing TEI to standardize students’ markup of close-reading terms, but I haven’t actually done this. The closest I’ve come is using Google Docs, and then LitGenius, to compile student annotations, but not to channel them through a standard like TEI.

I know I can do better, both for better learning outcomes and better research outcomes. So I’m here to present a plan, one that needs your advice and guidance to realize. (This past weekend I’ve taken two workshops and had many conversations about implementing different parts of this plan: first with Janelle Jenstad and Joey Takeda, and then with Martin Holmes.)

So here it is: a framework to standardize TEI markup of the terms that scholars use when close-reading texts.

Why would we want to do that? There are two main purposes:

  1. first is the pedagogical purpose of training readers, namely students, to annotate texts with interpretive metadata at the word level (often, multi-word level);
  2. and second is the research purpose of building a training set for supervised machine learning systems (and someday, unsupervised machine learning systems) to recognize those text features that we human readers can find more or less naturally.

Now, I recognize I’ve just made quite a leap: from trusting novice readers (#1), to trusting robots to automate me out of my job as a literary critic (#2).

I’m not advocating either of those things, not without a lot of human experts to examine and verify what the students and the machines annotate:

  • this system will need humans to verify students’ metadata (so they don’t mis-label terms — mistaking synecdoche for metonymy, say); this is crucial because errors in that metadata will propagate in the next stage;
  • and then, in that next stage, this system will need humans to guide the machine learning process, to correct its errors and confirm its results with each iteration of that process.

But I’ve still not specified the ultimate payoff. Why teach a machine to close-read texts? (Or more precisely, to encode texts with close-reading terms that mimic human annotations?)

I hear the humanists protesting: “Is nothing sacred in your techno-deterministic future? Doesn’t close reading make us human?”

It does, and that won’t change. All that will change is what we will read.

Right now, literary critics are prisoners of context. We take a book from the shelf or we load a play from the ISE library, we read it sequentially, and we make an argument from the patterns we recognize in it. We choose those books based on their cultural capital, maybe their authorship or their canonical status or their recommendation by authorities we trust.

And our arguments are brilliant in their particularity: we can understand Shakespeare’s rhetoric because readers from Miriam Joseph to Frank Kermode to Jonathan Hope have described its forms in detail.

But we don’t understand rhetoric, or metaphor or personification or symbol, in trans-contextual ways. We interpret these phenomena as particularities assembled by a given author or text, but not as abstracted, trans-textual phenomena.

Whether or not you agree with that methodological goal, I argue that we should have the option. Or at least, we should have the ability to see how Shakespeare’s rhetoric compares with other writers’ rhetoric. And our choice of those writers shouldn’t be arbitrary; it should be wide-ranging and it should be as objective as possible.

This is what my collaborators and I call ‘augmented criticism’: taking what critics do naturally, noting textual features, and expanding our grasp of comparable features.

Is nothing sacred? Of course: time-honoured critical habits are sacred. We read to gather examples, and to make arguments from them. This extends our reach to more examples.

We uphold those habits through practice, but also by teaching them to the next generation.

No matter how often we illustrate its terms and tropes, there’s no better way to teach close-reading skills than making students try it: reading with a pen in their hand, noting local features and identifying broader patterns.

Their individual results will vary, but the aggregate result will be a working consensus about the patterns and variations in a text that seem to reveal the writer’s deliberate choices, or have the strongest effect on readers.

Here is a set of text features, grouped into four categories. (This is a subset of a much longer list, available here.)

For TEI encoders, the categories aren’t too important; but for readers, the categories help distinguish between different modes of address, different mental habits or (you might say) filters that they bring to a text each time they read it:

  • structural terms for the relationships between words, mostly in poetry;
  • linguistic terms for a text’s surface-level features;
  • semantic terms for more connotative features, below the surface; and finally
  • cultural terms for broader features, some of them pointing outside the text.

By the way, I’m using the word ‘text’ as if it applies equally to all forms and genres. But a glance through this list will uncover my biases and teaching habits: I developed it as a teaching tool for students reading Shakespeare’s sonnets (and it’s actually longer than this, but I’m already cramming too much text on a slide).

My list of terms and categories is both provisional and incomplete; my aim is only to start a taxonomy of interpretive tags.
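To make that concrete: one way to declare those four categories formally would be a taxonomy in the TEI header’s classification declaration, something like the sketch below. The xml:ids are invented, and the glosses just repeat my list above.

```xml
<classDecl>
  <!-- a provisional taxonomy of close-reading categories; the ids are invented -->
  <taxonomy xml:id="close-reading">
    <category xml:id="structural">
      <catDesc>relationships between words, mostly in poetry</catDesc>
    </category>
    <category xml:id="linguistic">
      <catDesc>a text's surface-level features</catDesc>
    </category>
    <category xml:id="semantic">
      <catDesc>connotative features, below the surface</catDesc>
    </category>
    <category xml:id="cultural">
      <catDesc>broader features, some pointing outside the text</catDesc>
    </category>
  </taxonomy>
</classDecl>
```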

Now, some of you will protest: how can we encode something so interpretive as ‘tone’ or ‘irony’ or ‘paradox’? And even if we could agree that it obtains somewhere in a text, exactly which words or characters would be contained in these tags?

I don’t have an answer, but if a text has a given tone it can only exist at the level of words (what other level is there?); so agreeing on which words is an interesting, but secondary, problem.

Okay. Forgive me for this rough division: but if we imagine that every term is somewhere between objective and subjective, then let’s start (at least) by encoding more objective features, a few of which are here on the left.

  • A simile is a metaphor using the word “like” or “as”, so “my love is like a rose” is undeniably a simile.
  • Whereas a metaphor (“my love is a rose”) is much subtler: it only requires a writer to yoke together two unconventional images.
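If we did want to encode that pair, one minimal (and by no means definitive) option is a typed segment; the @type values here are my own, not an established TEI vocabulary:

```xml
<!-- a simile is lexically marked by "like" or "as", so it's relatively objective -->
<seg type="simile">my love is like a rose</seg>
<!-- a metaphor has no such marker; this label records a reader's judgement -->
<seg type="metaphor">my love is a rose</seg>
```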

Incidentally, it’s really tempting to think that if we fed enough metaphors into a machine we could ‘teach’ it through juxtapositions of synonyms (or something) to detect metaphors automatically. Maybe we could, but the problem with unconventional devices is that they’re, well, unconventional: there’s no universal formula for something that’s by nature anti-formulaic.

And if you think that challenges my second purpose, about machine learning, you’re right. It’s the core problem with algorithmic criticism: texts are slippery.

Okay, then: so start with the low-hanging fruit. Set aside your irony and symbol tags and start with something like enjambment: that is, a line of poetry that flows over the line-end barrier into the next line. Or start with repetition: words repeated, sometimes with variation, for effect. Surely these are features we can agree on, right?
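For enjambment, at least, the TEI seems to offer something close to off-the-shelf: if I read the Guidelines correctly, the line element takes an @enjamb attribute. A minimal sketch, using the opening of Sonnet 15:

```xml
<lg>
  <!-- @enjamb flags a line whose sense runs over the line-end into the next -->
  <l enjamb="yes">When I consider every thing that grows</l>
  <l>Holds in perfection but a little moment,</l>
  <!-- ... -->
</lg>
```

Repetition could get a typed segment like the simile sketch above, or, where it overlaps other features, the stand-off treatment I’ll come to below.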

Let’s look briefly at two categories that seem objective, and usually behave conventionally: rhetorical figures, and rhyme.

My work for the past few years has been automating the detection of rhetorical figures: those repetitions and variations of diction and syntax that lodge themselves in your memory, that sound deliberate and purposeful, that are beautiful and compelling.

Consider chiasmus, the inverted repetition of two words or ideas, AB|BA:

  • “Fair is foul, and foul is fair.”
  • “Ask not what your country can do for you, but what you can do for your country.”

Or gradatio, the sequential chain of words at the beginnings and ends of clauses, AB|BC|CD:

  • “Pleasure might cause her read, reading might make her know, knowledge might pity win, and pity grace obtain.”
  • Or, more simply: “She swallowed the bird to catch the spider, she swallowed the spider to catch the fly.”

We can find these readily enough; you can read my other posts on how we did that.

What about rhyme? Is it objective or subjective? Does it behave conventionally?

Rhyme isn’t a simple binary; there are eye-rhymes and sound-rhymes, and pronunciations change over time. I think of the final couplet from Shakespeare’s Sonnet 116, rhyming “proved” and “loved”: in the Elizabethan Original Pronunciation those words audibly rhymed. So there are degrees of rhyme certainty.

Here’s a more recent example, from Module 4 of “TEI by Example.” I’ve added its rhyme scheme, ABAB CDCD EFG EFG.

And here is how TEI by Example advises we encode that scheme: as the value of the rhyme attribute on the line-group element; and for good measure, as the value of the label attribute on the rhyme element, wrapped around the rhyming part of each line.
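Roughly, then, the pattern looks like this. I’ve substituted the Sonnet 116 couplet mentioned earlier rather than reproduce the TEI by Example poem, so treat this as a sketch of the convention, not a quotation from their module:

```xml
<!-- the scheme sits on the line group; each rhyming part carries its label -->
<lg type="couplet" rhyme="gg">
  <l>If this be error and upon me <rhyme label="g">proved</rhyme>,</l>
  <l>I never writ, nor no man ever <rhyme label="g">loved</rhyme>.</l>
</lg>
```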

Let’s return to basics: reading and writing. I can’t interpret anything I read without annotating it: marginalia is the original markup.

This is my copy of Shakespeare’s Henry V that I annotated for my students, to demonstrate my habits of close-reading.

I used this to write a short model essay on the same passage (which was harder to write than I remembered!), and then assigned my students other passages in the play.

This is just one way to demystify close-reading. Others from disciplinary guides (like Wolfe and Wilder’s Digging into Literature) include the think-aloud, whereby you record yourself doing the thinking behind these annotations.

Another is to aggregate students’ annotations. I’ve tried the Lit Genius model, a web interface designed for music lyrics (as you can see: I promise this is the first time I’ve used Taylor Swift in a conference paper!); but it can easily be repurposed with an educator account.

The advantage is that it’s a well-designed web interface; but the catch is that you lock the annotations into their hosted system, and they’re not exportable; nor is their encoding transparent; and the functionality is limited. You can’t standardize the annotations, or toggle them by category, for instance.

And then there’s XML. Far more customizable, as we well know; but far less smooth and serene than the Taylor Swift interface.

With a custom TEI schema, and an expanded and repurposed tagset, I can compile my students’ annotations of a common text’s features in order to compare and verify them. TEI standards will ensure they’re interoperable with those beyond my classes, eventually.
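As a first pass at that custom schema, I imagine an ODD along these lines: it closes the value list for @type on the seg element so students can only apply the agreed terms. The module selection and the value list are illustrative guesses, not a tested customization.

```xml
<schemaSpec ident="close-reading" start="TEI">
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="verse"/>
  <moduleRef key="analysis"/>
  <moduleRef key="linking"/>
  <!-- constrain seg/@type to the agreed close-reading vocabulary -->
  <elementSpec ident="seg" mode="change">
    <attList>
      <attDef ident="type" mode="change">
        <valList type="closed" mode="add">
          <valItem ident="simile"/>
          <valItem ident="metaphor"/>
          <valItem ident="chiasmus"/>
          <valItem ident="gradatio"/>
          <valItem ident="enjambment"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```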

That helps me address a question that’s come up a few times at this conference: when do you use TEI rather than a home-cooked markup language?

My answer: Whenever your learning or research outcomes require the rigour of standardization, either across the classroom or across the larger field.

So what combination of TEI elements, attributes, and values will get us there? This is where I need your advice most.

I learned this weekend that stand-off markup will be necessary because of overlapping hierarchies: you’ll have (say) repetitions that spill over the edges of rhetorical figures, and enjambments that overlap with rhymes.

This is an instance of chiasmus (the rhetorical figure AB|BA) in Henry V: first in the text, and second in the TEI.

I’ve used an element from the TEI’s core module, with the type attribute’s value naming the figure and its number in the text; I’ve wrapped that around the self-closing span element from the analysis module. Its two attributes’ values (@from and @to) point to the beginning and end of that red line in the text; you may have noticed in the last image that each word had a unique xml:id.
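Since I can’t reproduce the slide here, here’s the same pattern applied to the Macbeth line quoted earlier; the xml:ids are invented, and I’ve used seg to stand in for the wrapping element (it isn’t the only possible choice):

```xml
<!-- every word in the passage carries a unique xml:id (simplified here) -->
<l>
  <w xml:id="w1">Fair</w> <w xml:id="w2">is</w> <w xml:id="w3">foul</w>,
  <w xml:id="w4">and</w> <w xml:id="w5">foul</w> <w xml:id="w6">is</w> <w xml:id="w7">fair</w>.
</l>

<!-- a typed wrapper names the figure and its number in the text; inside it,
     a self-closing span points at the figure's first and last words -->
<seg type="chiasmus_1">
  <span from="#w1" to="#w7"/>
</seg>
```

A stricter stand-off approach would move the span elements into a separate spanGrp, outside the transcription altogether, which is exactly what lets overlapping figures coexist.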

I can’t say this is the best system. There’s also the interp element in the analysis module, and in the tei module there’s something called the att.interpLike class (which I don’t understand) that offers a @type attribute whose values can include image, character, theme, or allusion.
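If I understand it rightly (and I may not), the attraction of interp is that a category gets declared once, say in the back matter, and then referenced from anywhere via the global @ana attribute. A sketch, again with invented ids:

```xml
<!-- interpretive categories declared once, e.g. in the back matter -->
<interpGrp type="semantic">
  <interp xml:id="metaphor">metaphor</interp>
  <interp xml:id="irony">irony</interp>
</interpGrp>

<!-- any element, or a stand-off span, can then point at them -->
<l><seg ana="#metaphor">my love is a rose</seg></l>
<span from="#w1" to="#w7" ana="#irony"/>
```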

So, clearly there’s the potential for me to adapt systems that already exist.

Finally, what interface would best capture this markup from students? I don’t know yet. Students have a productive anxiety about raw XML, because they (and we) live in an edited world, with coded reality behind the serene surfaces.

So do we make students swallow the red pill, and lift the veil? Or the blue pill, and follow the LitGenius path?

I’ve said much more today about student learning than about machine learning; this is a pedagogy panel, after all. And honestly, it’s premature to make more ambitious plans before I get this aggregation system right.

But indulge me for a moment, and consider why we would want to make those plans.

A well-annotated library of close readings could serve as a training set for machine learning, to enable machines to detect these features automatically. It’s easy to imagine starting with low-hanging figures of speech, as I have with rhetorical figures, before progressing to higher-level figures of thought: the metaphors and allusions that require human readers, at least for now.

Set aside the pragmatics for now. (“Damn it, Jim! I’m a critic, not a computer scientist.”)

Think instead about how automated detection of text features might train human readers. Not overtly doing the interpretive work for them; this isn’t an answer-key or shortcut for lazy readers. No: think about how you might mark up a passage in a digital edition, or annotate a sonnet, and trigger a recommendation algorithm like Lucene’s MoreLikeThis, along the Netflix/Spotify model: if you like this passage, here are some others with similar features.

You know the Netflix-style textual subgenres (like “gritty coming-of-age heist comedies” or “post-apocalyptic musicals with a strong female lead”). So for poetry, imagine:

  • ironic-tone ABCBA-stanzas with chiasmus
  • personification in prose allusions to King David

Why would you want such recommendations? To escape the narrow, arbitrary particularities of human readings. To break the canonical grip of Shakespeare or Herman Melville on our arguments. And to make more persuasive, definitive arguments with wider-ranging evidence, not just with the books we happen to have read.
