Encoding Exercise Description for English 203
Digital text analysis relies on a layer of encoded information between the text and the algorithms that analyze it. Encoding is the necessary first step toward making the elements of a text show us interesting things.
If we want to analyze all the questions in Hamlet that include nouns, for instance, we need to identify or tag every noun in the play, and isolate those that appear in the sentences identified as questions. Likewise, if we want to count Hamlet’s neologisms, we need to tag both a set of speeches as Hamlet’s and a set of words as neologistic.
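To make the first example concrete, here is a minimal sketch in Python of how two layers of tags (sentence type and part of speech) might combine. The three sentences are from Hamlet 1.3, but the tags are hand-assigned for illustration, and the field names (`speaker`, `kind`, `words`) and tag labels are assumptions rather than any standard encoding scheme. Treating "is't" as a single verb is a simplification.

```python
# Each sentence carries a structural tag ("kind") and a list of
# (word, part-of-speech) pairs. All tags here are hand-assigned
# for illustration; the labels themselves are hypothetical.
sentences = [
    {"speaker": "Ophelia", "kind": "question",
     "words": [("Do", "VERB"), ("you", "PRON"), ("doubt", "VERB"), ("that", "PRON")]},
    {"speaker": "Polonius", "kind": "question",
     "words": [("What", "PRON"), ("is't", "VERB"), ("Ophelia", "NOUN"),
               ("he", "PRON"), ("hath", "VERB"), ("said", "VERB"),
               ("to", "PREP"), ("you", "PRON")]},
    {"speaker": "Ophelia", "kind": "statement",
     "words": [("I", "PRON"), ("shall", "VERB"), ("obey", "VERB"),
               ("my", "PRON"), ("lord", "NOUN")]},
]

# Keep only the sentences tagged as questions that contain at least one noun.
questions_with_nouns = [
    sentence for sentence in sentences
    if sentence["kind"] == "question"
    and any(tag == "NOUN" for _, tag in sentence["words"])
]

for sentence in questions_with_nouns:
    text = " ".join(word for word, _ in sentence["words"])
    print(sentence["speaker"], "asks:", text)
```

Neither layer of tags could answer the question alone: the filter only works because every sentence has a type and every word has a part of speech.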
The purpose of these tags is to enable computers to give us new, unexpected, and quantified readings of texts we thought we knew. ‘Quantified’ is an important word, because these codes are about numbers: not just the 0s and 1s of binary code, but the numbers and positions of verbs or repetitions or sentences in a text, and how they compare. A text whose verbs are all tagged as verbs allows a computer to put all of those verbs in a list, and then to count which ones appear most frequently.
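As a rough illustration of that last point, the following Python sketch lists and counts the verbs in three lines of Polonius's advice in Hamlet 1.3. The part-of-speech tags are hand-assigned for this example, and the tag labels ("VERB", "NOUN", and so on) are assumptions, not a prescribed tag set.

```python
from collections import Counter

# Hand-tagged (word, part-of-speech) pairs from three lines of Hamlet 1.3:
# "Give thy thoughts no tongue", "Give every man thy ear",
# and "Neither a borrower nor a lender be".
# The tag labels are illustrative assumptions.
tagged_words = [
    ("Give", "VERB"), ("thy", "PRON"), ("thoughts", "NOUN"),
    ("no", "DET"), ("tongue", "NOUN"),
    ("Give", "VERB"), ("every", "DET"), ("man", "NOUN"),
    ("thy", "PRON"), ("ear", "NOUN"),
    ("Neither", "CONJ"), ("a", "DET"), ("borrower", "NOUN"),
    ("nor", "CONJ"), ("a", "DET"), ("lender", "NOUN"), ("be", "VERB"),
]

# Put every word tagged as a verb in a list (lowercased so that
# "Give" and "give" count as the same word).
verbs = [word.lower() for word, tag in tagged_words if tag == "VERB"]

# Count how often each verb appears, most frequent first.
verb_counts = Counter(verbs)
print(verb_counts.most_common())  # [('give', 2), ('be', 1)]
```

The counting itself is trivial; the interpretive work lies in deciding which words get the verb tag in the first place.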
So, what can we tag? Tags can identify a text’s structure (lines, speeches, sentences), its language and parts of speech (nouns, verbs), and other information (e.g. repetitions). All of these are quantitative or (relatively speaking) objective features of a text. What they miss are all the qualitative features of a text, like a speaker’s tone of voice, or a metaphor, or a collection of words that seem thematically linked. Those are a lot more difficult to tag, because they involve more subjective interpretation. I’ve written a little more about that problem elsewhere.
For this exercise in English 203, we’ll stick to the questions of why we do text encoding, and what we might encode in Hamlet 1.3 — rather than how to do it, which we covered the week before.
You’ll address these questions of why and what in a 45-minute in-class writing exercise. In an essay of about 5 to 6 double-spaced pages, using three or more quotations from Hamlet 1.3, answer the following question:
What are the three categories of tags you would add to different words, lines, and speeches in this scene, and why? Give a few examples from each category, and identify the word or words you would tag with them. Discuss how each category is either quantitative/objective or qualitative/subjective. Does it matter?
You can choose from among these suggested categories (each followed by examples), or invent your own:
- parts of speech (noun, verb, adjective, adverb, preposition, article, …)
- repetitions of words, anywhere in the text
- tone/performance (advice, anger, affection, respect, resentment, …)
- name/reference (“my lord,” “my father” = Polonius; “you,” “sister” = Ophelia; …) (depends on context/speaker)
- thematic clusters:
  - parts of the body (head, heart, ear, …)
  - financial terms (brokers, investments, bonds, …)
- structure (speech, line, sentence, enjambment, caesura, interruption, …)
- interactions (statement, question, answer, …)
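Of the categories above, repetitions of words is the most straightforwardly quantitative, and a short Python sketch can show why. It assumes a small excerpt of four phrases from Hamlet 1.3 and a deliberately crude definition of a word (any run of letters and apostrophes); the line numbers are positions within this excerpt, not line numbers in the play.

```python
import re
from collections import defaultdict

# Four phrases from the exchange between Ophelia and Polonius in Hamlet 1.3.
lines = [
    "He hath, my lord, of late made many tenders",
    "That you have ta'en these tenders for true pay",
    "Tender yourself more dearly",
    "you'll tender me a fool",
]

# Record the position (line within this excerpt) of every word form.
positions = defaultdict(list)
for line_number, line in enumerate(lines, start=1):
    for word in re.findall(r"[a-z']+", line.lower()):
        positions[word].append(line_number)

# A repetition is any word form that occurs more than once.
repetitions = {word: where for word, where in positions.items() if len(where) > 1}
print(repetitions)  # {'tenders': [1, 2], 'tender': [3, 4]}
```

Note that the sketch treats "tenders" and "tender" as two different words, and cannot tell the noun from the verb. Deciding whether they belong together as one repetition is exactly the kind of choice where a quantitative tag starts to depend on interpretation.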