Tuesday, June 15, 2010

C.I.T.E - The Infrastructure of the Homer Multitext (Part 1 - Introduction)

The Infrastructure of the Homer Multitext

     C · I · T · E

The Homer Multitext (HMT) is a project of the Center for Hellenic Studies of Harvard University (CHS). It is best described in the words of its editors, Casey Dué and Mary Ebbott:
“The Homer Multitext project, the first of its kind in Homeric studies, seeks to present the textual transmission of the Iliad and Odyssey in a historical framework. Such a framework is needed to account for the full reality of a complex medium of oral performance that underwent many changes over a long period of time. These changes, as reflected in the many texts of Homer, need to be understood in their many different historical contexts. The Homer Multitext provides ways to view these contexts both synchronically and diachronically.” (From the CHS website)
Dué and Ebbott, in collaboration with the Director of the CHS, Gregory Nagy, and the CHS’s Head of Publications, Leonard Muellner, initiated research toward this project with an eye to advancing particular arguments about the nature of Homeric poetry. But anyone interested in epic poetry, Greek poetry in general, and the intellectual history of the Greco-Roman world, the cultures that came into contact with it, and those that succeeded it, stands to profit from the project.


The HMT aims to collect, as comprehensively as possible, all of the sources for our knowledge of the Homeric epics, and to publish these online, freely accessible to any interested reader.

These sources include versions of the Iliad and Odyssey, and the surviving pieces of lesser-known epic poems born in the Greek Bronze Age. These versions may be fragments of papyrus found in the sands of Egypt or manuscripts produced under the Byzantine Emperors of Constantinople. These sources also include texts of later Greek and Roman writers who quote from Homer, writers such as Plato, Aristotle, Herodotus, and Thucydides. A particularly rich body of evidence comes from the writings of the literary scholars who worked in the Libraries of Alexandria and Pergamum; the works of these writers do not survive intact, but thousands of excerpts from them and references to them do survive, as comments written in the margins of manuscripts.

Dué and Ebbott are committed to providing the most useful access possible to these sources. This means offering texts of those sources in the original Greek and translated into modern languages where possible. It also means providing high-quality digital facsimiles of the actual manuscripts wherever possible.

It is impossible to overstate the value of digital facsimiles. The Greek and Latin texts that we can check out of libraries, or find online, are highly processed documents. Editors will compare different manuscripts of a work – which always differ – and produce a uniform text that is identical to no single medieval or ancient “witness” to the work. Responsible editors will provide notes explaining in what ways their edited text differs from particular manuscripts, but these notes – even the most meticulous – fall far short of providing the depth of information that can be gleaned from direct access to good images of the manuscripts themselves.

Scholarship based entirely on edited texts is fundamentally handicapped. However brilliant the scholars working from these texts may be, their insights will be limited by the absent editors of their source-texts, by their assumptions, and by the innumerable details that disappear on the journey from the hand-written manuscript, through generations of editions, to the shelves of the library. 

For the past century, scholars of Greece and Rome have been content for the most part to work from edited texts. There were justifiable reasons for this – practical, technological, and economic reasons. None of those justifications survived the turn of the 21st Century.

In addition to texts and images, other kinds of data might shed light on Homeric poetry: morphological and lexical data, lists of persons, geographic information (Where is “Sandy Pylos” or “Horse-rolling Thessaly”? Is a reference to Thebes pointing to Seven-Gated Thebes, or to Hundred-Gated Thebes in Egypt?), and so forth.

The Challenge

Bringing these disparate materials online in a useful way posed a challenge. The collaborators on the HMT wanted an all-purpose infrastructure that would both contribute to end-user applications for browsing, searching, and reading, and make the raw data available for discovery and retrieval.

Some kind of digital library infrastructure was necessary, but the complexity of the anticipated contents of that library posed another problem. A digital library that contains highly diverse data, and that is expected to expand indefinitely, must be exposed through protocols that define requests and responses. Those requests and responses should allow discovery of contents, access to objects, retrieval of parts of objects – passages of texts, data elements, parts of images – and querying, manipulation, and other kinds of processing.
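To make the first three kinds of request concrete (discovery of contents, access to objects, retrieval of parts of objects), here is a minimal sketch in Python. It is an illustration only and not the C.I.T.E. protocol or any HMT code: the class name `TinyLibrary`, its method names, the object identifiers, and the sample data are all hypothetical, and the "library" is just an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TinyLibrary:
    """Hypothetical in-memory library; each object is an ordered list of parts."""

    objects: Dict[str, List[str]] = field(default_factory=dict)

    def list_contents(self) -> List[str]:
        """Discovery: report which object identifiers the library holds."""
        return sorted(self.objects)

    def get_object(self, object_id: str) -> List[str]:
        """Access: return a copy of a whole object as a list of its parts."""
        return list(self.objects[object_id])

    def get_part(self, object_id: str, start: int, end: int) -> List[str]:
        """Retrieval of a part: return parts start..end (1-based, inclusive)."""
        parts = self.objects[object_id]
        if not (1 <= start <= end <= len(parts)):
            raise ValueError(f"range {start}-{end} is outside {object_id}")
        return parts[start - 1:end]


# Invented sample data, standing in for a text and a list of entries.
library = TinyLibrary({
    "sample-text": ["line one", "line two", "line three"],
    "sample-list": ["entry a", "entry b"],
})

print(library.list_contents())
print(library.get_part("sample-text", 2, 3))
```

The point of the sketch is the shape of the interface: a small, fixed set of requests that stays the same however varied the objects behind it become.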

Since the data is highly varied and the possible uses of the data potentially infinite, the protocol could easily become correspondingly complex. If it did, the infrastructure would become, essentially, an end-user application, usable only by its creators, fragile and difficult to maintain, and increasingly vulnerable to obsolescence as time goes by.

Almost a decade of thinking and experimentation went into defining a generic, scalable protocol that enables scholarly access to and use of these materials in a networked environment, as simply as possible.
This was mainly the task of the HMT’s Project Architects, Neel Smith and me, Christopher Blackwell.

Our answer is C.I.T.E., that is, Collections, Indices, Texts, and Extensions.

This looks like four things, but it is really only three: texts, collections, and indices. In our conception of the requirements of the Homer Multitext, we have reduced scholarship to these three kinds of digital object, have defined protocols for working with each, and have working code that implements each.
In the next installments of this series of postings, I will describe each element in the C.I.T.E. architecture in some detail. Finally, I will describe how they can be brought together to build rich applications for scholarship.

A Final Note

Any discussion of a “generic infrastructure for scholarship” will inevitably sound like the beginning of an evangelical spiel about how everyone needs to adopt the speaker’s pet approach to data. That is not our intention here. 

Our dear friend, the late Professor Ross Scaife, was once playing advocatus diaboli as I was describing our protocol for texts. “How many other projects need to adopt this protocol for it to be useful?” My colleague Neel had the answer: “One, ours.”

We have developed C.I.T.E. because we needed something like it in order to do what we want to do with the history of Homeric texts. I am describing it here because it is the foundation for much of the ongoing research of the HMT team, which we will also document here, and it might be of interest to other scholars working on similar projects.

All computer code developed for the HMT is free and open-source; all data published by the project is open-content under a Creative Commons or similar license.

Next… Part 2 - Texts

Ongoing Research, Summer 2010

Christopher Blackwell here:

I have begun a series of blog posts aimed at describing and narrating one corner of the constellation of research that surrounds the Homer Multitext. These posts will appear on my blog: http://nobleswineherd.blogspot.com .

They will focus on the work of this summer, 2010, both the projects in Europe, and what my undergraduate collaborators are doing in Greenville, SC.

I am hoping to use these as tools for recruiting good students to study Classics at Furman University, so they will tend to have a local focus.

However, I also want to give an overarching view of how the Homer Multitext is progressing, what we have done, and what we hope to do in the near term. I will post those pieces here, and link to them from my “Eumaeus, the Noble Swineherd” blog.


UH High Performance Computing hosts Homer Multitext data

The Homer Multitext is a publication of Harvard's Center for Hellenic Studies. From the beginning, however, the project has been a collaboration among colleagues with various strengths and abilities, from a variety of different kinds of institutions all over the United States and Europe. My own particular research focus has always been the Homeric epics and the oral tradition in which they were composed, but our team includes computer scientists, conservators, photographers, philologists, art historians, codicologists, papyrologists, and historians. I would like to record my appreciation here for the constant assistance and support of the University of Houston's Research Computing Center, its director Keith Crabb, and especially staff member Alan Pfeiffer-Traum. All image data for the Homer Multitext Project is also hosted by the UH RCC, and can be found at http://amphoreus.hpcc.uh.edu/.

Friday, June 4, 2010

The Homer Multitext and undergraduate research

The Homer Multitext is a large, collaborative research project, and will require the contributions of many researchers to achieve its goals. We have therefore developed ways for undergraduate researchers to be involved in producing original research, published and credited as their own but contributing to the larger endeavor. This summer five undergraduates will be contributing to the project. At Furman University, three undergraduates are working on digital diplomatic editions of Homeric papyri, some of our oldest witnesses to the Homeric epics. At the College of the Holy Cross we have two students working with the high-resolution digital photographs of the Venetus A manuscript that we acquired in 2007 (see the images via the Manuscript browser here) to create digital texts of the manuscript's Iliad, the scholia (marginal commentary), and all other features of each page. The texts will all be linked through structured mark-up to the images themselves. The goal for the summer project is to complete this task for two books of the Iliad. In future posts I will give updates on their progress.