Can you trust a Harvard dishonesty researcher?

The hard problem of faked data in science.

Kayakers on the Charles River, with the Harvard University campus in the background. Sergi Reboredo/VW Pics/Universal Images Group via Getty Images
By Kelsey Piper, senior writer at Future Perfect

Francesca Gino is a Harvard Business School professor who studies, among other things, dishonesty. How often do people lie and cheat when they think they can get away with it? How can people be prompted to lie or cheat less often?

Those are some great questions. But it’s been a rough few years for the field of dishonesty studies because it has turned out that several of the researchers were, well, making up their data. The result is a fascinating insight into dishonesty, if not the one that the authors intended.

This story starts with a 2012 paper about dishonesty co-authored by Gino. The paper claimed that if you ask people to sign an honesty pledge before doing a task they have the opportunity to cheat on, they’re much less likely to cheat than if they sign the same pledge at the end of the experiment.

“Signing before — rather than after — the opportunity to cheat makes ethics salient when they are needed most and significantly reduces dishonesty,” the paper claimed. It featured three experiments: two in a lab setting and a field experiment in which customers reported their odometer mileage when applying for car insurance.

In 2021, that paper was retracted after the data from the third experiment, the one about car insurance, turned out not to add up. Other researchers had tried to replicate the paper’s eye-popping results, failed, and then found a bunch of inconsistencies in the underlying data.

The spotlight then quickly fell on one of the paper’s authors, Dan Ariely, a behavioral economist at Duke University and the author of The Honest Truth About Dishonesty. Ariely admitted that he “mislabeled” some data but denied that he deliberately falsified anything, proposing it may have been falsified by the insurance company he partnered with. But records show that he was the last to modify the spreadsheet in which the falsified data appeared.
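What does it mean for insurance data to not add up? One anomaly the investigators reported: the number of miles customers supposedly drove was spread almost perfectly evenly from 0 to 50,000, the signature of a random-number generator rather than of real driving, which clusters around typical values. A check like that takes only a few lines; here’s a minimal sketch, with simulated numbers standing in for the real file:

```python
import numpy as np
from scipy import stats

# Real mileage clusters around typical values; numbers produced by
# something like rand() * 50_000 come out flat. A Kolmogorov-Smirnov
# test against a uniform distribution makes the difference concrete.
rng = np.random.default_rng(0)
realistic = rng.normal(12_000, 5_000, size=5_000).clip(min=0)  # bell-shaped
suspicious = rng.uniform(0, 50_000, size=5_000)                # flat

for label, miles in [("realistic", realistic), ("suspicious", suspicious)]:
    stat, p = stats.kstest(miles, "uniform", args=(0, 50_000))
    print(f"{label}: KS distance from uniform(0, 50k) = {stat:.3f}, p = {p:.3g}")

# The realistic sample is decisively rejected as uniform (tiny p); the
# suspicious one is statistically indistinguishable from random output.
```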

That seemed to be the end of it. With the paper more than a decade old, it’d be hard to reach any definitive conclusions about what exactly happened. But it turns out that was only the beginning. In a report published last week, a team of independent investigators laid out evidence that there was a lot more fraud in dishonesty research than that.

“In 2021, we and a team of anonymous researchers examined a number of studies co-authored by Gino, because we had concerns that they contained fraudulent data,” the new report begins. “We discovered evidence of fraud in papers spanning over a decade, including papers published quite recently (in 2020).”

Gino has been placed on administrative leave at Harvard Business School, and Harvard has requested that three more papers be retracted. In a statement on LinkedIn, Gino said: “As I continue to evaluate these allegations and assess my options, I am limited into what I can say publicly. I want to assure you that I take them seriously and they will be addressed.”

I highly recommend the series of blog posts in which the report authors explain, paper by paper, how they detected the cheating. Some impressive work went into proving not just that the data must have been tampered with, but that the tampering was deliberate. The investigators used Microsoft Excel’s version control features to demonstrate that the initial versions of the data looked quite different and that someone went in and changed the numbers.
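One trick from those posts is worth spelling out. An .xlsx file is really a zip archive, and a file inside it, xl/calcChain.xml, records the order in which Excel calculates formula cells, an order that tends to track the order rows were created in. A row that was later dragged from one group to another can show up out of sequence. Here’s a rough sketch of that check; the file name and the simple backward-jump heuristic are my own illustration, not the investigators’ exact procedure:

```python
import re
import zipfile

def calc_chain_cells(path):
    """Return formula-cell references in Excel's internal calculation order.

    An .xlsx file is a zip archive; xl/calcChain.xml inside it lists the
    workbook's formula cells (it's absent if the sheet has no formulas).
    """
    with zipfile.ZipFile(path) as xlsx:
        xml = xlsx.read("xl/calcChain.xml").decode("utf-8")
    return [(col, int(row)) for col, row in re.findall(r'<c r="([A-Z]+)(\d+)"', xml)]

def backward_jumps(cells):
    """Flag cells whose row number jumps backward within their column.

    Formulas filled straight down a column usually appear consecutively in
    the chain; a backward jump hints that a row was moved after the fact.
    This is a crude heuristic for illustration, not proof on its own.
    """
    last, flags = {}, []
    for col, row in cells:
        if col in last and row < last[col]:
            flags.append(f"{col}{row} appears after {col}{last[col]}")
        last[col] = row
    return flags

# print(backward_jumps(calc_chain_cells("study1.xlsx")))  # hypothetical file
```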

Take that 2012 study I mentioned above. The data from the third experiment, the insurance one, appeared fabricated. But when researchers looked more closely, so did the data from the first and second experiments. Gino was entirely responsible for data collection for the first experiment and is the one suspected of having a hand in its fabrication; she had nothing to do with the data collection for the third.

This, of course, means that it looks like that single 2012 paper on dishonesty had two different people fabricate data in order to get a publishable result.

What we’ve learned about dishonesty

There’s a lot of discussion about the pressure to publish in academia and how it can lead to bad statistical practices: fishing for a good p-value, or overselling a result as far more impactful and important to the field than it really is.
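It’s worth being concrete about why hypothesis-fishing pays off. If every test has a 5 percent false-positive rate, then 20 independent tests on pure noise yield at least one “significant” result about 64 percent of the time, since 1 - 0.95^20 ≈ 0.64. A quick simulation (my own illustration, not from the report) shows the effect:

```python
import numpy as np
from scipy import stats

# Simulate a determined p-hacker: many "studies," each running 20 t-tests
# on pure noise and reporting whichever test comes out significant.
rng = np.random.default_rng(1)
n_studies, n_tests = 2_000, 20
hits = 0
for _ in range(n_studies):
    pvals = [stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
             for _ in range(n_tests)]
    hits += min(pvals) < 0.05

print(f"Studies that 'found' an effect: {hits / n_studies:.0%}")  # about 64%
```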

There’s less discussion of actual straight-up fraud, even though it’s disturbingly common and can have a huge impact on our understanding of a subject. Early in the Covid-19 pandemic, fraudulent studies fueled bad claims about treatments, and it took lots of good research to knock those claims down.

The problem is that our peer review process isn’t well suited to catching outright, purposeful fabrication. We can reduce many kinds of scientific malpractice by preregistering studies, being willing to publish null results, watching for researchers who test lots of hypotheses without appropriate statistical corrections, and so on. But none of that does anything against someone who simply switches data points from the control group to the experimental group in an Excel spreadsheet, which is what Gino appears to have done.
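Even that kind of tampering can leave fingerprints in the spreadsheet itself. If a file was originally sorted by participant ID within each condition, a row whose ID breaks the ascending sequence may have been moved in from another group. Here’s a simplified sketch of that kind of sort-order check; the column names are hypothetical stand-ins:

```python
import pandas as pd

def rows_out_of_order(df, id_col="id", cond_col="condition"):
    """Flag rows whose participant ID breaks ascending order in a condition.

    Assumes the file was originally sorted by ID within each condition, so
    a backward jump suggests the row was moved between groups. The column
    names are hypothetical stand-ins for whatever a real dataset uses.
    """
    suspects = []
    for _, group in df.groupby(cond_col, sort=False):
        ids = group[id_col].to_numpy()
        suspects += [group.index[i] for i in range(1, len(ids)) if ids[i] < ids[i - 1]]
    return df.loc[suspects]

# df = pd.read_excel("experiment.xlsx")  # keep the file's original row order
# print(rows_out_of_order(df))
```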

That’s not to say that frauds can’t be caught. One thing I hear from experts every single time I cover a scientific fraud case: publishing the data is how the fraud gets detected. It’s not that hard to manipulate some numbers, but it’s hard to do it without leaving a trace. Many of the fraud cases highlighted by the team investigating Gino are downright clumsy.

Some journals now enforce an expectation that you publish your data when you publish your research. Some academics hesitate, since building a dataset takes a lot of work and they may want to write more papers from it without being scooped by other researchers. But I think the pros of data-publishing policies strongly outweigh the cons. It’s bad for everyone when fraudulent science gets published. It’s an injustice to scientists who are really doing the work but can’t manufacture such clean and eye-popping results. And in cases like Covid-19, it resulted in research funding being badly directed and people taking medications that couldn’t help them.
