NINES Project Set to Unveil New Version of Text Analysis Software

May 30, 2012 — Professor Andrew Stauffer was taken by surprise when a recent theater review in The New Yorker referenced Juxta, software developed at the University of Virginia to help scholars trace changes to literary texts.

"At first I thought, 'This can't be right. This has got to be a different Juxta,'" said Stauffer, an associate English professor in the College of Arts & Sciences and director of the Networked Infrastructure for Nineteenth-Century Electronic Scholarship, or NINES – the primary developer of Juxta since 2008.

The New Yorker review was of a recent performance of a Tennessee Williams play in which the show's creators used Juxta to help generate a script drawn from several versions of an unfinished work.

The review was the first Stauffer and the NINES team had heard of the program being used that way, and the news came as they were preparing to unveil a new Web-based version of Juxta designed to make sharing literary analysis quick and easy.

Juxta, available as a free download, was designed to assist in one of the basic tasks of scholars who work with text: tracking changes over time. The program's users upload text files that represent multiple copies of the same work, and the program compares the content of those files against each other, looking for differences.
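To make the comparison step concrete, here is a minimal sketch in Python of how two versions of a passage might be compared word by word. It is an illustration only, not Juxta's actual collation method: it uses the standard library's `difflib` module, and the function name `word_differences` is a hypothetical helper invented for this example. The two sample strings are the phrases from the 1855 and 1856 editions quoted later in this article.

```python
import difflib


def word_differences(base, witness):
    """Return (base_text, witness_text) pairs for each span where the texts differ.

    Hypothetical helper for illustration; Juxta's own algorithm may differ.
    """
    base_words = base.split()
    witness_words = witness.split()
    # autojunk=False keeps difflib from discarding frequent words on long inputs.
    matcher = difflib.SequenceMatcher(a=base_words, b=witness_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(base_words[i1:i2]),
                            " ".join(witness_words[j1:j2])))
    return changes


v1855 = "the mockingbird in the swamp"
v1856 = "the mocking-bird in the swamp"
print(word_differences(v1855, v1856))
# [('mockingbird', 'mocking-bird')]
```

Splitting on whitespace is the simplest possible tokenization; a real collation tool would also need to decide how to treat punctuation, capitalization and line breaks.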

"You have multiple versions of the same document, and you have to authenticate them and determine patterns of change, determine the order in which they were created, things like that," Stauffer said. "It's something scholars have always done by hand, with our eyes and pencils, but Juxta is a way to leverage computer power to do it much more powerfully and to visualize it in a way that's continuous with the reading experience."

Collating a document by hand often requires consulting an apparatus or index separate from the text, while Juxta's "heat map" visualization highlights the changed passages inside the document itself, so a reader doesn't have to look away to compare versions. Differences that exist between only two versions are highlighted in a lighter color, while differences that span more versions appear darker.
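The heat-map idea can be sketched in a few lines of Python. The example below counts, for each word in a base text, how many other versions disagree with it; a higher count would correspond to a darker highlight. This is a rough sketch under stated assumptions, not Juxta's implementation: the function name `heat_counts` is hypothetical, the sample sentences are invented for illustration, and words that appear only in another version (pure insertions) are ignored because they have no position in the base text.

```python
import difflib


def heat_counts(base, witnesses):
    """Pair each base word with the number of witnesses that differ at that word.

    Hypothetical helper for illustration; Juxta's own scoring may differ.
    """
    base_words = base.split()
    counts = [0] * len(base_words)
    for witness in witnesses:
        matcher = difflib.SequenceMatcher(a=base_words, b=witness.split(),
                                          autojunk=False)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            # "replace" and "delete" cover base words the witness lacks or changes.
            # "insert" spans have i1 == i2, so they mark no base word here.
            if tag in ("replace", "delete"):
                for i in range(i1, i2):
                    counts[i] += 1
    return list(zip(base_words, counts))


# Invented sample texts, not drawn from any real work.
base = "the cat sat on the mat"
others = [
    "the cat sat on the rug",
    "the dog sat on the rug",
    "the cat sat on the mat",
]
for word, count in heat_counts(base, others):
    print(word, count)
```

In this sample, "mat" differs in two of the other versions and "cat" in one, so a renderer built on these counts would shade "mat" more darkly than "cat" and leave the remaining words unhighlighted.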

Though it's easy to think of a book, play or poem as having a finished, concrete form, many exist in multiple versions. Walt Whitman's "Leaves of Grass" poetry collection is a famous example. Whitman first published it in 1855 but kept revising it and releasing new editions until he died. As a demonstration tool, Juxta comes preloaded with versions of "Leaves of Grass" from 1855, 1856 and 1867.

In one example from Whitman's "Song of Myself," a Juxta user can see at a glance that the 1855 poem's reference to "the mockingbird in the swamp" is highlighted in deep blue. Clicking on the phrase shows that the mockingbird became "the mocking-bird" in 1856 and "the jay" in 1867.

Juxta's use isn't limited to famous literary works. In another example available online, a user tracked the changes across 11 versions of a Wikipedia entry on the digital humanities. Often-changed passages are highlighted in dark blue, while less-changed passages are lighter.

Joe Jeffreys, the dramaturge on the recent off-Broadway production of Tennessee Williams' "Masks Outrageous and Austere," said he and his colleagues on the production turned to Juxta to help reconcile multiple versions of the play, which Williams never completed to his satisfaction.

Juxta "was an ideal tool for this work and opened up vast new vistas for the study and interpretation of Tennessee Williams' final full-length play," Jeffreys said.

Dana Wheeles, project manager at NINES, said she hopes the new Web-based version of Juxta will promote the use of the program for projects that scholars can easily share with colleagues or friends via email or a blog post.

"What we've done with the new version of Juxta is take what you could do in the desktop application and make it easier to share that workflow online," she said.

In the new version, a scholar could generate a link from Juxta that would lead to a visualization of a particular passage or change referenced in a literary analysis, she said.

The new version is in the beta-testing phase now and should be fully released around the end of summer, Wheeles said. The Juxta project originated in 2004 with funding from the Mellon Foundation. Funding for the Web-based Juxta has been provided by Google, via two digital humanities grants, as well as by the University, which supports the operations of NINES.

– by Rob Seal