Replication is integral to the scientific method. It’s the lather-rinse-repeat approach that either increases the likelihood that a theory holds legitimacy, or forces reconsideration. If results fail to replicate, was something off in the experimental process? If not, are the conclusions wrong?
But with careers and grant monies so often in the balance, the incentives align with generating new discoveries over running old experiments repeatedly.
“Replication is a core concept of science, and yet it’s not very often practiced,” said Nosek, who is also co-founder and executive director of the independent nonprofit Center for Open Science, which advocates for transparency in research methods and data. “Innovation without verification gets us nowhere. But any individual researcher has choices to make about how they spend their time. The answer is pretty easy when thinking about career advancement: Try something new.”
Ten years ago, mindful that research was increasingly being generated but rarely revisited, Nosek and colleagues decided to re-run a series of published scientific experiments, creating the “Many Labs” project. The global effort, which at times was both headline-grabbing and apple-cart-upsetting, wrapped up at the end of April.
“It’s hard to overstate how central Brian Nosek’s role in the reform movement, sometimes called the ‘replication crisis’ or ‘credibility revolution,’ has been,” commented Simine Vazire, a professor at the University of Melbourne who studies psychology ethics and science’s ability to self-correct. “Brian has been a leader in the movement, a diplomat reaching out across sharp dividing lines in our field, and someone who gets things done.
“Among the many projects that Nosek and the Center for Open Science made possible, the Many Labs projects, which Brian collaborated on, but which were individually led by Richard Klein and Charlie Ebersole, are among the most impressive and important for science. Each of these five projects was a gargantuan effort, and tackled a question that had been the topic of much debate, but virtually no empirical tests.”
‘#Repligate’ Lights Up the Scientific Community
The Many Labs revolution has a complex history, with Nosek running similar projects simultaneously. He began his initial Many Labs replication studies in 2011. It wasn’t long before the scientific community was feeling “twitchy,” The Guardian reported.
Nosek reflected, “Because performing replication was so unusual, when someone said, ‘I want to replicate your study,’ instead of the person taking it as a compliment, it was often considered a threat. Like, ‘What, you don’t trust me? What’s up?’”
He thinks of 2012 as when Many Labs officially started, however. That’s when the journal Social Psychology accepted his and Dutch researcher Daniël Lakens’ pitch to serve as guest editors for a special issue with a unique approach. They would invite researchers to submit proposed replications and put the designs through peer review before knowing the results, so that no one would be biased for or against the studies based on the outcomes.
The first Many Labs project was one of the papers in this special issue, and its approach was slightly different: It tested the replicability of 13 classic and contemporary studies.
“Many Labs 1 (Klein et al., 2014) was part of a special issue of Social Psychology that introduced Registered Reports edited by @lakens and I. The issue spawned the infamous #repligate, summarized here: https://t.co/xNSPRaZii4” — Brian Nosek (@BrianNosek), April 29, 2022
A sampling of the questions that project examined: Can people’s behavior be “primed” by visual cues? Would a quote be perceived differently if someone thought it came from Thomas Jefferson versus Soviet founder Vladimir Lenin? Can a chance observation – in this case a dice roll – influence thoughts about what might have happened beforehand?
Overall, the crowd-sourced Many Labs experiments spanned 36 independent research sites, utilizing a total of 6,344 participants.
Recognizing the depth of the challenge, Nosek began with a simple design.
“With Lab 1, we wanted to look at a number of different findings in an easy-to-transfer protocol,” Nosek said. The studies combined online and in-person methods: “We chose research findings in which the study and procedure could be done in 1 to 3 minutes.”
At first blush, the results might not have seemed revolutionary. There was arguably good news: 10 of the 13 studies replicated their original findings, a much higher percentage than Nosek and colleagues might have expected.
But that was bad news for the remaining three studies. Support was weak for one theory, that imagining contact with someone can reduce prejudice, and for “priming,” the theory integral to the other two studies.
In one of the two, participants were shown a subtle image of an American flag to see whether it would influence their sense of conservatism. No strong evidence emerged that the visual cue affected their subsequent behavior. Because priming had previously been a well-accepted theory, social media users wagged their tongues at this and related findings from the special issue, often tagging the conversation “#repligate.”
Nosek, with 269 collaborators, followed up Many Labs the next year with related research titled “The Reproducibility Project: Psychology.”
While not formally part of Many Labs, the work continued its themes. The researchers conducted replication attempts on research published in three journals in 2008. They were only able to reproduce results in fewer than 50 of the 100 cases – worse odds than flipping a coin.
The marquee observation was the irony that 97 of the 100 original studies had claimed significant effects. Even the studies that did successfully replicate usually did so with less oomph than originally touted. In a few worst cases, the new researchers found effects opposite to the initial findings.