‘Big Data’ Traces Language Origins to Track the Americas’ Earliest Migrants

University of Virginia linguistic anthropologist Mark A. Sicoli and colleagues are applying the latest technology to an ancient mystery: how and when early humans inhabited the New World.

Their new research, which uses “big data” techniques to analyze more than 100 linguistic features, suggests complex patterns of contact and migration among the early peoples who first settled the Americas.

The linguistic diversity of the Americas is unmatched anywhere else in the world, with eight times more “isolates” than any other continent. Isolates are “languages that have no demonstrable connection to any other language with which it can be classified into a family,” Sicoli said.

New linguistic research methods with “big data” computing compare similarities and differences between early languages, some of which are extinct. (Photo by Dan Addison, University Communications)

Linguists have identified 26 isolates in North America and 55 in South America, mostly strung across the western edge of the continents, compared to just one in Europe and nine in Asia.

“The high number of isolates in America suggests that there may have been related languages that went extinct without documentation,” Sicoli said. “Linguists are now asking the question of how to infer the existence of these missing languages from their effects on languages they were in contact with.”

“Scientists in the past decade have rethought the settlement of the Americas,” Sicoli said, “replacing the idea that the land which connected Asia and North America during the last ice age was merely a ‘bridge’ with the hypothesis that during the last ice age, humans lived in this refuge known as ‘Beringia’ for up to 15,000 years and then seeded migrations not only into North America, but also back into Asia.”

In a presentation Friday at the annual meeting of the American Association for the Advancement of Science in Boston, Sicoli joined other scientists discussing “Beringia and the Dispersal of Modern Humans to the Americas.” Because much of Beringia – theorized to have lain generally between northwestern North America and northeastern Asia – has been under water for more than 10,000 years, it has been challenging to find archaeological and ecological evidence for this “deep history,” as Sicoli called it.

Recent ecological, genetic and archaeological data support the notion of human habitation in Beringia during the latest ice age, between 25,000 and 14,000 years ago. The new linguistic research methods, which use “big data” to compare similarities and differences between languages, suggest that such a population would have been linguistically diverse, Sicoli said.

In “Linguistic Perspectives on Early Population Migrations and Language Contact in the Americas,” Sicoli’s analysis points to the existence of at least three now-extinct languages of earlier migrations that influenced existing Dene and Aleut languages as they moved to the Alaska coast. The data comparing dozens of indigenous languages support phases of migration and multilingual language contact along the Alaska coast, which potentially involved languages related to current linguistic isolates.

In his presentation, Sicoli described several comparisons from computational work across multiple languages, including the Yeniseian languages of Siberia, the Dene family of North America, and the Eskimo, Aleut and Haida (an isolate) languages. He combined geographical maps with language networks from the database to show relationships between languages and where language-mixing with the three “ghost” languages occurred.

“The computational methods give us traction on questions that have been unanswered,” said Sicoli, who is continuing to work with Anna Berge of the University of Alaska and Gary Holton of the University of Hawaii on early language contact. “They help us understand how people migrated and languages diversified, not simply through isolation, but through multilingual contact.

“Such language contacts support that the mixing populations also mixed their languages as part of human adaptation strategies for this region and its precarious environment,” Sicoli said.

Media Contact

Anne E. Bromley

Office of University Communications