The Myth of Self-Correcting Science

Recent academic scandals highlight a history of data falsification and questionable research practices in social psychology, and serve as a call to action.

[Image: Carlos Jasso/Reuters]

Over the last two years, the field of psychology has endured a wave of scandal bookended by fraud cases involving Harvard primatologist Marc Hauser and Dutch social psychologist Diederik Stapel. Even researchers desensitized by scandal-fatigue did a double take when the final report on Stapel's case came out last month. The extent of his creative misinterpretation of the facts makes the Hauser case look like child's play. Stapel not only manipulated and fabricated data; he invented entire schools where said data was allegedly collected.

As if the fraud files weren't enough, then came the mea culpas -- salt in the wounds for students and colleagues still recovering from shattered reputations and a shaken faith in science. The two men released two very different statements telling very similar stories of reckless, ruthless ambition and playing the odds against getting caught. Stapel's "narcissistic wail" was so emotional and contrite as to seem a bit unhinged, while Hauser's read as a cold, calculating non-admission of guilt.

Hauser deftly expresses chagrin for errors made within his lab "whether responsible for them or not," implying that the same students bullied into committing academic fraud were somehow responsible for the car veering off the cliff. Stapel faults a noxious combination of publication pressures, addictive tendencies, and assorted personality issues for his downfall. And while publication pressure was among those issues, he caps off his mea culpa with a plug for his new book -- Derailment, a collection of his therapeutic diaries.

The Slippery Slope

It's easy to revel in the high drama surrounding the downfall of a Hauser or Stapel, but what about the journals that published these scholars? Stapel was a widely cited and highly revered figure. His fraud went undetected for decades in spite of eerily perfect data sets and improbable statistical values. According to Tilburg University's final report, Flawed Science, "There was a general neglect of fundamental scientific standards and methodological requirements from top to bottom."

Scientists fought back, noting that it is rare for reviewers in any field to detect fraud and demanding an apology for the 'slanderous conclusions' drawn in the report. Social psychologist Kate Ratliff, who was teaching at Tilburg when the scandal broke, noted, "It's a small community and people considered Diederik a friend and mentor...No one understands why these young researchers didn't realize that it was weird that Diederik was giving them datasets. But you learn from watching others. And if there are no others, how would you know what's weird or not? I think that people started out being really sympathetic toward them and have gotten more and more punitive as time passes and hindsight bias kicks in. I think that's really, really unfair."

Perhaps more alarming than the few individuals committing outright fraud is the high percentage of researchers who admitted to more common questionable research practices, like post-hoc theorizing and data-fishing (sometimes referred to as p-hacking), in a recent study led by Leslie John.

For the uninitiated: post-hoc theorizing involves creating or revising a hypothesis after you've collected the data; data-fishing entails running a study, continually checking the data after each participant, and stopping as soon as you see a significant result. These practices are eschewed by some, but plenty of others embrace them. Joseph Simmons and colleagues ran a simulation showing how unacceptably easy it was to attain statistical significance using these 'degrees of researcher freedom.' By employing four of these questionable practices at once, they managed to find statistically significant evidence for the absurd hypothesis that listening to a Beatles song could make you 1.5 years younger.
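
To make the data-fishing problem concrete, here is a minimal simulation sketch in Python (assuming NumPy and SciPy are available; the group sizes, alpha level, and number of runs are arbitrary illustrative choices, not the setup Simmons and colleagues used). Both groups are drawn from the same distribution, so every "significant" result is a false positive -- yet peeking after each new participant and stopping at the first p < .05 produces far more than the nominal 5 percent of them.

```python
# Illustrative simulation of "data-fishing" via optional stopping.
# There is NO true effect: both groups come from the same distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def optional_stopping_study(min_n=10, max_n=50, alpha=0.05):
    """One simulated study: peek at the p-value after every added
    participant per group and stop as soon as it dips below alpha."""
    a = list(rng.normal(size=min_n))
    b = list(rng.normal(size=min_n))
    while True:
        if stats.ttest_ind(a, b).pvalue < alpha:
            return True            # "significant" -- stop and write it up
        if len(a) >= max_n:
            return False           # out of budget; no effect found
        a.append(rng.normal())     # otherwise, add one participant
        b.append(rng.normal())     # per group and test again

n_studies = 5_000
hits = sum(optional_stopping_study() for _ in range(n_studies))
print(f"False-positive rate with peeking: {hits / n_studies:.1%}")
# A fixed-n design tested once would hover near the nominal 5%;
# in runs of this sketch, peeking pushes the rate well above that.
```

Stacking further "degrees of freedom" on top of optional stopping -- dropping conditions, trying multiple dependent variables, adding covariates -- inflates the rate even more, which is how Simmons and colleagues arrived at their absurd Beatles result.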

"Clearer identification of the problems associated with some research practices is incredibly helpful," writes Linda Skitka, who sits on numerous journal editorial boards. "Because I'm guessing at least some scholars who engaged in questionable practices did not recognize the full implications of doing so. Given the intense attention these issues are now getting in the field, they certainly know better now."

So are the social sciences more prone to misconduct and fraud than biomedicine and other fields? A recent study titled "Scientific Misconduct and the Myth of Self-Correction in Science" found no such evidence. Even Stapel's wildly narcissistic mea culpa can't make you forget Yoshitaka Fujii, the Japanese anesthesiologist with a record-breaking 172 retractions.

Biomedicine shares some of the more nebulous concerns regarding data transparency, collection, and dissemination as well. Citing the current drama surrounding Tamiflu, Nick Genes notes, "This is a hot topic [in medicine] right now ... There's a movement to bypass what's published and dive into the original data that's kept by drug companies and/or given to regulatory agencies like the FDA." Much as with the social sciences, the raw data from clinical trials is not made publicly available, and many fear that the temptation to tell a self-serving story with the data in journal articles (for individuals or pharmaceutical companies) will be too great.

The Old Guard and the New

Though a wave of ignominy is cresting at the moment, these problems are not new. Back in the "golden era" of the NIH during the fifties and sixties, David Guston writes, if fraud occurred, the director would make a few phone calls, look into the alleged misconduct, and the offending scientist would be "quickly and quietly removed from the map of science." At that time, "the social contract for science was highly informal and contained entirely within the community."

It's safe to say this gentleman's-agreement approach to scientific integrity had some issues. Philip Handler, then president of the National Academy of Sciences, insisted that the charges of misconduct sparking the first hearings on scientific integrity in 1981 were overblown, defensively declaring complete confidence in a "system that operated in an effective, democratic, and self-correcting mode." Today the cast of characters is different, but the claims and tensions are the same.

Then as now, underlings and younger scientists were often at the forefront of reform, trying to convince their elders to take the problem more seriously. The old guard tends to claim that critiques are overblown, that outside reforms and practices will hinder or hurt science, and that science is a self-correcting process. The new guard tends to embrace transparency and openness, seeing reform as the best way to salvage damaged reputations and keep the field from falling into disrepute. Incidentally, most of the recent fraud cases were unearthed by whistle-blowers (usually graduate and undergraduate students) working within the lab or... Uri Simonsohn. But none were revealed by the "self-correcting process of science."

Brian Nosek has emerged as one of the key reform figures with his work on the Open Science Framework and the Reproducibility Project. His professional commitment to ferreting out injustice and implicit bias (and a lifelong obsession with Star Wars) would seem to undergird a lifelong fixation on good and evil. He's the kind of man you can see investing considerable amounts of time and energy trying to save science from its own dark side.

Even before Nobel Laureate Daniel Kahneman issued an open letter telling researchers to embrace reform and set up replication protocols, Nosek was hard at work on his Open Science Framework. In one of his "Scientific Utopia" articles, he imagines a world where researchers will pre-register their hypotheses, openly share and archive raw data in one central location, and check one another's work through replication. While fraud has captured the lion's share of attention as of late, the more mundane matters of keeping track of research done by migrating students and postdocs, tracking patient records, and modernizing the archival infrastructure for the digital age are among the essential undertakings of 21st-century science.

The blowback to Nosek's efforts has mostly centered on the Reproducibility Project. "People are mostly afraid that the replications won't pan out, and that could look bad for the field; but no one is opposed to the Open Science Framework," Nosek says. "I don't know how any scientist in good conscience could be opposed to transparency."

The Problem of Frogishness

Social psychology is perhaps best known for examining the implicit roots of human error and bias, so there's a sad but all-too-human irony in the wave of suspicion emerging around scandals based on human error and bias. But with vulnerability also comes strength. "Social psychology already has the tools at its disposal to confront these issues and lead the pack when it comes to reform," Nosek enthuses. But even if every study were conducted in a digital-age utopian orgy of scientific openness and transparency, some would doubt the accuracy of the field's claims, likening it to pseudoscience -- a faddish line of inquiry walking a fine line between frontier and fringe.

The social sciences don't have the luxury of physical object variables like frogs; the components of studies are often more abstract concepts like morality or intelligence. "There's no such thing as 'frogishness,'" sighs Nosek, addressing the issue. "Well," he recants, ever the scientist, "I suppose you could have differing degrees of frogishness; but basically, everyone agrees on what a frog is."

People have different concepts of what intelligence is. "There are more and less useful ways of trying to define these things," says Nosek. But basically, the intellectual subjectivity inherent in the social sciences leaves more room for self-serving interpretation of the data than with hard variables. "When you're operating on the frontiers of what is known, you're going to make mistakes," Nosek explains. "Knowledge acquisition is messy...but science doesn't become pseudoscience unless people stop questioning themselves, stop seeing the need for criticism and correction."

Science 2.0

While it's too early to tell where the chips will fall, signs of consensus between the old guard and the new are on the horizon. Even researchers who think the scandal blowback is overblown are fairly well convinced that the issue isn't going away on its own. As Barbara Spellman, the editor of Perspectives on Psychological Science writes, "The tumbrels have rolled, the guillotines have dropped, and I'm hoping that the publication of the Stapel report represents the end of Revolution 1.0."

If Revolution 1.0 was about head-rolling, Spellman (like the vast majority of "law-abiding" scientists) hopes that Revolution 2.0 will be the quieter work of enacting reforms while getting back to Science 2.0 -- a science with greater emphasis on replication and transparency. Maybe these scandals can result in a little scientific utopia to ring in the New Year.