Text Creation Partnership

How does an invisible system shape the experience of an end user?

I found myself pondering this question again and again throughout the 2011 Chicago Colloquium on Digital Humanities and Computer Science, which took place November 20-21. This event is jointly sponsored each year by Northwestern University, Loyola University, University of Chicago, and the Illinois Institute of Technology. This year, Loyola University hosted the colloquium at its Water Tower campus.

Some papers explicitly asked us to look more closely at the underlying structures that frame our interactions with digital resources. Nick Montfort’s opening keynote talk, “Platform Studies,” got us thinking in this direction. According to Montfort and Ian Bogost, who co-edit a series in this area for the MIT Press:

“Platform Studies investigates the relationships between the hardware and software design of computing systems and the creative works produced on those systems.”

Scholars in this field might be interested, for example, in what it means to re-create a video game originally designed for an arcade machine on a home video game system. How does the experience of the game change when it is re-imagined for an entirely different kind of machine and style of computing? This talk was essentially my first introduction to platform studies, but the questions being asked there strike me as quite similar to those investigated by scholars of book history and print culture, who seek to understand how the technologies of printing and the book have shaped reading and writing practices over the last 500 years.

Other papers also discussed computing platforms: George K. Thiruvathukal, Professor of Computer Science, and Steven E. Jones, Professor of English, Co-Directors of Loyola University’s Center for Textual Studies and Digital Humanities and organizers of the colloquium, presented on “Platform Studies and the Construction of Game Space: The Nintendo Wii as a social platform,” and Nathan Altice gave the paper “Tool-Assisted: Console Emulation and Platform Plasticity.”

All of this might sound quite far removed from the work we’re doing here at the TCP, but what I enjoy about DHCS is the way consistent themes emerge from very different sorts of research. Other papers did not focus on computing platforms, but still raised questions and suggestions about all that can be hidden–or, conversely, made clear–by the systems through which we engage with digital content.

In “The Opportunities and Challenges of Virtual Library Systems,” Kyle Roberts described the process of customizing software designed for a modern library catalog in order to use it to reconstruct historic library collections online. His project team discovered many mismatches between library practices of the 19th century and a system designed for a present-day library. Library patrons may take cataloging and circulation practices for granted as invisible, natural processes, and software designed to facilitate these practices only reinforces these assumptions. However, these collisions between 19th-century practices and the assumptions made by 21st-century software remind us that library practices, old and new, sharply impact a patron’s experience. As emphasized in platform studies, one environment can never be seamlessly interchanged for another.

Acknowledging that interfaces can hide a great deal of “what lies beneath,” Richard Whaling presented “Faceted Search for a Corpus Query System,” which described new development on PhiloLogic, the search engine developed by ARTFL and used by projects such as the Perseus Digital Library. Currently, many sites support complex querying of encoded texts by offering sophisticated search interfaces, sometimes with an intimidating number of fields and not a lot of information about how to use them. If the user doesn’t already understand the encoding of the text (which is often invisible to them!), this searching may be more like stabbing in the dark. Whaling proposes instead a system of faceted browsing, which would show users what tags occur in each division of text and allow them to choose how to drill down into the texts. The ECCO-TCP texts are already indexed by ARTFL, so I look forward to exploring them in this new way!
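The core idea of surfacing “what tags occur in each division of text” can be sketched in a few lines. This is a hypothetical illustration in Python with made-up sample markup, not PhiloLogic’s actual data model or implementation:

```python
# Illustrative sketch of faceted browsing over an encoded text: for each
# <div>, count the tags it contains, so a reader can see what kinds of
# content are available before constructing a query. The sample document
# and tag names are invented for this example.
import xml.etree.ElementTree as ET
from collections import Counter

sample = """<text>
  <div type="chapter">
    <head>Chapter 1</head>
    <p>Some prose with a <name>proper noun</name>.</p>
    <lg><l>A line of verse</l><l>Another line</l></lg>
  </div>
  <div type="chapter">
    <p>More prose.</p><p>Still more prose.</p>
  </div>
</text>"""

def facets_by_division(xml_string):
    """Return, for each <div>, a Counter of the tags it contains."""
    root = ET.fromstring(xml_string)
    facets = []
    for div in root.iter("div"):
        counts = Counter(el.tag for el in div.iter() if el is not div)
        facets.append(counts)
    return facets

for i, counts in enumerate(facets_by_division(sample), 1):
    print(f"division {i}: {dict(counts)}")
```

A real system would precompute these counts as an index over the whole corpus, so the facets update instantly as the user drills down.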

I was pleased to hear updates from Stephen Ramsay and Brian Pytlik-Zillig on their ongoing work to improve the interoperability of XML-encoded text collections. It is a well-known problem that collections of texts, even when encoded to the same standard (say, the TEI Guidelines), typically have so much variation that they can’t be joined together and processed at once without significant work to make them more homogeneous. At the University of Nebraska-Lincoln, a suite of tools has been developed (based in large part on working with the TCP texts) to handle this work. Ramsay and Pytlik-Zillig’s paper, “Code Generation Techniques for Document Collection Interoperability,” discussed how the Clojure programming language made it possible for them to automatically generate the XSLT needed to transform tens of thousands of heterogeneous XML documents so that they all conform to a single schema.
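The code-generation idea can be illustrated with a toy example: rather than hand-writing a stylesheet for each collection, a program emits the XSLT from a declarative mapping of source tags to target tags. This sketch is in Python with invented tag names; it is not the Clojure tooling the paper describes:

```python
# Generate a simple XSLT stylesheet from a per-collection tag mapping.
# The identity template copies everything; one generated template per
# mapping entry renames a source element to the target schema's element.
# The tag names ("divGen" -> "div") are hypothetical examples.

XSLT_HEADER = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">\n'
    '  <!-- identity template: copy anything not handled below -->\n'
    '  <xsl:template match="@*|node()">\n'
    '    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>\n'
    '  </xsl:template>\n'
)

def generate_xslt(tag_map):
    """Emit an XSLT stylesheet that renames each source tag in tag_map."""
    parts = [XSLT_HEADER]
    for src, dst in sorted(tag_map.items()):
        parts.append(
            f'  <xsl:template match="{src}">\n'
            f'    <{dst}><xsl:apply-templates select="@*|node()"/></{dst}>\n'
            f'  </xsl:template>\n'
        )
    parts.append('</xsl:stylesheet>\n')
    return "".join(parts)

# e.g. one collection uses <divGen> where the target schema wants <div>
print(generate_xslt({"divGen": "div"}))
```

Because the stylesheet is generated rather than written, the same program can produce thousands of collection-specific transforms from thousands of small mappings.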

The TCP texts also featured in Ted Underwood and Jordan Sellers’ paper, “Combining topic-modeling and time-series approaches to reveal 18th-century trends.” Underwood and Sellers used the ECCO-TCP texts to experiment with computational approaches to finding sets of words that have a similar pattern of frequency over time, perhaps revealing a pattern worthy of further study. This is a vast improvement, Underwood notes, over “plugging promising keywords into Google’s ngram viewer, one by one, and seeing what comes up.”
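The underlying idea of “sets of words that have a similar pattern of frequency over time” can be sketched by correlating word-frequency series. The data, words, and threshold below are all made up for illustration; this is not Underwood and Sellers’ actual pipeline:

```python
# A minimal sketch: given a relative-frequency series per word (one value
# per time slice), report word pairs whose curves rise and fall together,
# measured by Pearson correlation. All numbers here are invented.
from math import sqrt
from itertools import combinations

freqs = {  # hypothetical per-decade frequencies across the 18th century
    "sentiment":   [1, 1, 2, 4, 7, 9, 12, 14, 15, 16],
    "sensibility": [0, 1, 1, 3, 6, 8, 11, 13, 15, 15],
    "phlogiston":  [5, 6, 7, 8, 7, 6, 4, 3, 2, 1],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def similar_pairs(series, threshold=0.95):
    """Return word pairs whose frequency trajectories correlate strongly."""
    return [
        (a, b) for a, b in combinations(series, 2)
        if pearson(series[a], series[b]) >= threshold
    ]

print(similar_pairs(freqs))  # the two rising words pair up
```

Run on a whole corpus, clusters found this way become candidates for closer reading, which is the “pattern worthy of further study” the paper points toward.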

I was grateful to have the opportunity to present my paper, “Making the Most of Free, Unlimited Texts: A First Look at the Promise of the Text Creation Partnership,” in this environment. This paper, about which I’ll say more in a separate post, discusses what the TCP has learned (lots of interesting stuff!) since restrictions were lifted on the ECCO-TCP texts six months ago. It also begins to wrestle with some of the same questions about systems and platforms that popped up throughout the colloquium. As restrictions are lifted from these texts, we must decide how best to present and distribute them to the public.

All in all, DHCS 2011 was an energizing and well-organized event, with many more exciting presentations than I was able to address here. The program (with links to full papers) is available on the colloquium’s website. I’m looking forward to DHCS 2012 at the University of Chicago next November!