Big Data. It’s in the news and may seem like the newest buzz-term.
But the fact is that a deluge of data increasingly is confounding efforts to make sense of it: to find correlations and trends in business, finance, science and engineering, and, increasingly, in education, the arts and humanities, and to use those correlations to make reliable predictions.
The management of big data affects academia, industry, government – and virtually every area of human endeavor.
To address a growing need to manage and analyze this rapidly growing data stream, the University of Virginia has embarked on a big data initiative designed to help faculty, staff and students across academic disciplines and administrative units come together and develop services, curricula and new research activities related to complex data.
“The data are pouring in at a breathtaking pace; but we need better ways to handle, analyze and visualize it. We need new tools to manage that data,” said Rick Horwitz, associate vice president for research and biosciences, who is facilitating and coordinating big data activities across Grounds.
To address the big data challenge nationwide, the Obama administration this year announced a $200 million request for grant applications to develop big data infrastructure. Some of these funds will be granted to universities, and U.Va. will be fiercely competing for money against peer institutions.
In preparation for these opportunities, and as a result of a grant proposal from U.Va. President Teresa A. Sullivan, the Jefferson Trust last month awarded $100,000 to the University for student-initiated, interdisciplinary collaborations in big data.
There is another impetus: the National Science Foundation, the National Institutes of Health and other funding agencies increasingly are requiring researchers and institutions to demonstrate that they can handle big data. Funding now depends upon it.
Because of advances in computing, almost every discipline is becoming data-intensive, Horwitz said, not only the traditional data-heavy fields of engineering, physics and the biosciences, but also an array of fields outside the sciences, including the humanities, education and architecture. New tools are needed to extract the new knowledge that lurks in massive data sets.
Sullivan considers the big data initiative a priority for the future of the University. At a U.Va. Big Data Summit last May, sponsored by the University Alliance for Computation in Science and Engineering and the Office of the Vice President for Research, Sullivan said, “Society’s most pressing challenges and nature’s deepest mysteries frequently… are best understood by handling large amounts of data… We need to develop new ways to acquire, analyze and make sense out of big data…” The summit brought together 170 people from 32 departments across Grounds.
A goal of the summit was to catalyze interactions among colleagues across disciplines, people working on big data problems who might benefit from new connections, and also to assess strengths, needs and opportunities. The summit even included fields that generally may not be considered part of the big data revolution, such as media studies, drama and law, because of emerging ethical and legal issues involving the management and use of data.
The summit revealed data and analytic parallels, and possibilities for cooperation, among diverse and sometimes seemingly disparate disciplines, from science, medicine and engineering to the arts and humanities. It identified shared needs for computing infrastructure, showed that the University is an eager community for big data, and laid the groundwork for planning cross-Grounds collaborations and other activities.
“The University is positioned well to address this challenge and benefit from unusual synergies due to our unique breadth and depth,” Horwitz said, “but we need an overall plan to manage, mine, model and manipulate enormous volumes of data. That is what we now are developing.”
In the wake of the summit, an interschool group of faculty, organized by Kevin Skadron, chair of the Department of Computer Science, is developing an overall plan for big data that will identify hiring needs, address curricular issues, coordinate service and research needs, create connections with industry and enable fundraising.
Among other activities, the group is identifying needs for curricula, certificates and majors, both to train quantitative data and information researchers and to provide data literacy to anyone whose research might touch big data. One notion is a general, introductory course sequence that would bring quantitative literacy in math, statistics and computation to a large body of students. Another goal is to form a virtual institute that would coordinate big data activities and promote cutting-edge research in big data.
The Big Data Summit has already led to tangible outcomes, such as the Jefferson Trust grant. And in response to the Obama initiative, Don Brown, a professor in systems and information engineering, has pulled together a large interdisciplinary team of faculty and submitted a proposal to the National Science Foundation to take advantage of new funding opportunities for training students on the handling and integration of large data sets. The overarching goal is to develop the infrastructure and skills needed to handle big data and advance U.Va.
"We are very excited about the Big Data initiative,” said Kim Tanzer, dean of the School of Architecture. “We will benefit from the expertise our colleagues across Grounds can provide through a coordinated effort, and we have much to contribute: Faculty in the School of Architecture manipulate large data sets, especially through visualization processes such as GIS [Geographic Information Systems] and BIM [Building Information Modeling]. We generate visual representations of data, useful in understanding quantitative information. And we use large data sets to design by prescribing data-driven parameters to inform design choices."
Examples of the ways architects use big data, Tanzer said, include landscape architects using different parameters to illustrate how land areas might change under different sea-level rise scenarios, showing how a building would weather as it ages, or how a neighborhood would change as a result of population growth or loss. All of this requires big data management and manipulation.
The School of Engineering and Applied Science, like other science and technology units at the University, works with vast amounts of data for research and development of technologies. James Aylor, dean of the school, cites the pervasiveness of big data in everyday life – from sensors used to monitor medical devices and machinery on aircraft and other vehicles, to electronics and advanced manufacturing – as proof that universities are and must be on the leading edge of developing techniques for big data acquisition, storage, management and analysis.
“The use and management of big data increasingly will involve everybody at the University,” he said. “We are a great experimental station because of our comprehensiveness – with our libraries, medical center and research units, all generating and using enormous amounts of data – we have the opportunity for collaboration and for developing new ways to reduce that data into understandable and valuable information.”
Aylor noted that the school’s introductory engineering course is being redesigned and, most likely, will be “all about big data,” giving students an early opportunity to learn how to manipulate data and extract meaning from it.
Michael McPherson, associate vice president and deputy chief information officer for Information Technology Services, who is also involved in big data, is looking at ways to improve infrastructure to support data uses. He said that data storage and management is “not just a problem of a few well-funded scientists and engineers; it is facing everybody. We need to develop new and better ways to move and analyze data and put structure to it.”
The big data initiative is being designed to provide the structure to make that happen.