On July 18, the JISC Digitisation Programme published a thought-provoking blog post about the growing number of platforms and portals through which users can access (and potentially edit) digital resources, such as those produced by the TCP.
The post asks,
If there are multiple versions of the original content, then which one is the one you use? In fact it’s not only about the content. Which platform works quickest? Which gives the most ‘accurate’ search results? Which one provides enhanced tools for analysis? Which gives the best results for your particular area of research? Where do you send your students? Which one do you cite?
Most importantly, which one do you trust? And why?
It seems to me that this post is really talking about two distinct factors that users should consider when working with digital text archives:
- How do I evaluate the platform (i.e., the access point)?
- How do I evaluate the data (i.e., the source)?
These are really important questions, and they’re on our minds here at the TCP, too. Today, I’ll address our stance on platforms. Tomorrow, look for another post with our thoughts on data. Please note that these posts will focus only on the TCP texts themselves, and not on the EEBO, ECCO, or Evans Early American Imprints databases that provide the sources for our transcriptions.
Historically, the TCP has made a strong distinction between data and platform, because our mission pertains to only one of these. Our aim is to create a large amount of useful data that might be served through any number of platforms, and that is exactly what has happened.
For example, whether you access the EEBO-TCP texts through ProQuest, the University of Michigan, PhiloLogic at Northwestern University, or the MONK Workbench (or load the texts onto your own computer and dig right in) will depend on what exactly you want to do. But the texts themselves all come from the same source—the TCP—and are created, reviewed, and distributed in the same way.
While we have gratefully received feedback from our users over the years, and as a result made corrections to our data and processes, the creation and publication of the TCP texts has (so far) moved in one direction only: from the TCP outward. This is in large part because we are continuing to produce and distribute new texts all the time.
However, the end users working with these texts have always been free to transform them in any way they like—to create “variant digital editions.” For example, the MONK Workbench normalizes all of the text files and adds markup to them in order to perform different kinds of analysis on the corpus. Alternatively, a scholar might use a single text as the basis for a specialized scholarly edition.
This is exactly what we want to see happening. If the texts produced by the TCP remain inert and static—if we end up with a single destination rather than a starting point for new scholarship—we will not have done our job.
We see the navigation of this landscape as a natural extension of the way scholarly editions have been produced, evaluated, and used for many years. For at least the last 500 years, the history of texts has also been the history of editions of books, and we have had to make choices about which editions are preferable to others. The author of the JISC post is right that the skills required to evaluate digital resources are different from those we’ve come to rely on for print, but we believe the need to do this kind of critical assessment is nothing new.
In short, we are thrilled to see the TCP texts made available through many different platforms, and glad to deliver the texts to our partners (and, in the years to come, to anyone) who want to use them in just about any project or platform they can imagine. We don’t see multiple access points, in and of themselves, as a problem.
But we do recognize that this model becomes a great deal more complicated when these various platforms support the editing, and not just the consumption, of texts. Tomorrow, more on how the TCP is thinking about this kind of project.