U.Va. Library Evolves to Support Research, Scholarship in the ‘Big Data’ Era

By Rob Seal, rseal@virginia.edu

November 14, 2012

As scholars across the University of Virginia increasingly turn toward quantitative, data-intensive research methods, the University Library is adjusting to support them.

These adaptations include hiring more experts in data use and management, entering a new data-services partnership with the Curry School of Education and working to make sure digital research results and scholarship generated today are available for future generations.

The “big data” label applies to the voluminous amounts of digital information generated and interpreted in modern scholarship, ranging from the massive data sets produced in the search for the Higgs boson to a collection of digital videos used by a humanities scholar studying modern dance.

“What we’ve seen over the course of recent years is substantial growth in interest in using data in all fields and across all disciplines,” Deputy University Librarian Martha Sites said. “That includes not just the natural sciences, but the social sciences and humanities as well.”

Advances in technology and a general move toward quantitative research methodology have forced scholars in recent years to confront many issues that come with large amounts of digital information: What’s the best way to access or analyze it? How do you keep it for posterity?

Though the term “big data” is a relatively recent addition to the academic lexicon, libraries are well suited to these challenges, as they have long been in the business of storing information captured, cataloged and interpreted during the scholarly process, said Donna Tolson, library strategist.

“If you went down into our government documents section, you’d find volumes of old census data,” Tolson said. “Those are large sets of quantitative data, but they were bound in a book form and it was clear that the library was the place to keep them. That data is still being generated, it just doesn’t live in a book any more. But scholars still need it, and libraries still want to make sure it’s available for future scholars.”

To help with data needs, the library is filling three new positions: a data acquisition expert, a social sciences research data specialist and a data mining specialist. These new positions complement an existing staff of data experts that includes GIS specialists from the Scholars’ Lab, an ITS-funded software consultant, and the Scientific Data Consulting Group, which helps scholars develop the data management plans increasingly required by grant-funding agencies such as the National Science Foundation and the National Institutes of Health.

The library is also shifting some of its data services to consolidate them and make them more widely known to the University community, said Andrew Sallans, the library’s head of strategic data initiatives.

“We hope to better position the services we have to support data-intensive research and instruction,” he said. “I think we’re going to work a lot better through collaboration.”

The realignment includes working with organizations such as the Quantitative Collaborative, an initiative of the College of Arts & Sciences, as well as a new partnership with the Curry School, which is particularly interested in making data-related research support available to its scholars, Sallans said.

Mark Hampton, Curry’s associate dean for administration and planning, said many education scholars are already involved in data-intensive research, and that the use of such methods will only increase.

“In fields such as educational policy, you really need to be able to track and analyze school data, as well as other large data sets like Social Security information or unemployment data,” Hampton said. “Our faculty members have a lot of expertise in that area, but too often it’s individuals working on their own. Partnering with the library gives us a physical place to go for data resources and a critical mass of services for our faculty and graduate researchers.”

Ruffner Hall, Curry’s home, is on the brink of a two-year renovation. The education library there has relocated to the third floor of Bavaro Hall, which will serve as a sort of hub for the library’s data support services, Sallans said. Some of the new positions will be located there, and the Bavaro Hall operation will move into an expanded data services location in Ruffner Hall when the renovation is completed.

Hampton said he hopes the space can develop into a data services hub for the greater University, as scholars increasingly rely on libraries to ensure the long-term preservation of the data sets they develop and research.

“Preservation is sort of the hangover after the big party,” he said. “Everybody is out collecting data and doing wonderful things with it. But if you aren’t paying attention to data management and preservation, digital data might not last. I know in my own case I can probably find a copy of a paper I did in high school more easily than one I did in grad school, because my grad school work was saved on floppy disks that are now obsolete.”

Though many professional organizations and publishers help preserve research results in digital format, libraries are best positioned to make sure data are available for subsequent generations, Sites said. U.Va. already has services such as Libra, an online scholarly archive that recently began accepting data sets.

“One of the things that libraries do that faculty members don’t, necessarily, is to think about students that will be here in 70 years,” Tolson said. “Will they have access to the scholarship that’s being done here now? The archive a faculty member deposits to today might or might not be around in 70 years. We want to save that work, so that we can help future students access it in the same way that we help students today access work published in the 1920s, or even the 1820s.”