The question is hugely important for one reason: Intestinal parasites (or worms) are a massive health threat in low- and middle-income countries, afflicting up to a quarter of the world's population, and causing pain, nutritional deficiencies, and cognitive impairment.
Over the past 20 years, many organizations and leaders have pushed large-scale "deworming" initiatives, which often involve giving every child in a school or community deworming pills on a regular basis, regardless of whether they're infected. This effort was influenced by a study that ran in the late '90s in Kenya, in which researchers found that deworming not only treats children, but also improves their school performance and overall health — with spillover effects on kids at nearby schools who didn't get the pills at all.
In other words, programs to give out these very inexpensive and safe pills seemed to be "one of the most potent anti-poverty interventions of our time" and a key example of evidence-based policymaking in global health.
But lately, researchers have been calling the evidence into question. This year, a group of epidemiologists at the London School of Hygiene & Tropical Medicine replicated the study and reanalyzed the original data of that Kenya trial, and uncovered a number of flaws in the research.
Many media outlets and pundits reported on that work this summer with headlines suggesting deworming had been debunked, but what was missing from the debate (including my own original coverage at Vox) was any word from the researchers behind the original Kenya study.
So I reached out to one of them, Harvard's Michael Kremer. Over a series of phone calls and emails, he shared his perspective on what it was like to be on the receiving end of one of the most talked-about replications in recent science, what he thinks about large-scale deworming programs, and what the scientific and journalism communities can learn from this experience.
Julia Belluz: How did it feel to be on the receiving end of such a high-profile and controversial replication?
Michael Kremer: A reader of some of the initial media coverage who did not follow the later discussions would easily walk away with the idea that deworming has been debunked. And I think that’s really unfortunate, because hundreds of millions of children have untreated worm infections.
There is complete consensus, including among skeptics of mass deworming, that people who are known to have worms should be treated. There is also no realistic way to reach the hundreds of millions of people who have worm infections other than through mass treatment campaigns because testing is actually much more expensive than treating, and there are no known side effects of being treated for people who do not have worms.
Replications are great, and I’m glad that the practice of conducting replications is increasing. As this practice increases, the research community as well as the media are going through a learning process on how to do replications well and how to interpret them. Replications need to be scrutinized as carefully as original studies by the research community. So it’s important for the media to get a full sense of exactly what was done in a replication and of the reaction by the scientific community to the replication before printing headlines like "Deworming debunked."
The initial reports in the media, which influenced later coverage, were written without consulting us, and it does not look like they were written with knowledge of the detailed responses we posted online months ago on the website of the International Initiative for Impact Evaluation (3ie), which sponsored the replication. So I do think the media coverage was problematic.
On the plus side, the reaction by the scholarly community was very rapid and very quickly got to the key points. A number of researchers and organizations, including the World Bank, the Center for Global Development, GiveWell, and Columbia University, posted detailed analyses of the replication. A consensus quickly emerged that the key results of the original analysis held and that deworming makes policy sense. The World Health Organization reaffirmed its support for mass deworming in areas with high prevalence.
Media coverage is now improving, as well. For example, the Guardian, which ran the initial story, just published a piece citing the many scholars and organizations which have reaffirmed that the key conclusions of the original study still hold.
JB: Why does it matter if deworming does anything more than get rid of worms?
MK: This question only becomes relevant when we think about the cost-effectiveness of the intervention compared to other possible uses of the money spent on deworming.
Should there be mass treatment in areas with high prevalence, or should children first be tested and then treated only if found to be infected? While testing first may sound intuitive, there are strong reasons in this case not to do that in areas with a high prevalence of worms.
First, there are no known side effects of being treated for deworming without having worms. Second, testing itself is actually more expensive than treatment. Finally, but importantly for a large-scale policy, logistically it’s very expensive and difficult to take samples in schools, test everyone, and treat only the infected. Long before our study, the World Health Organization was recommending mass deworming in high-prevalence areas.
There is evidence on the long-term educational and economic impact of deworming from a number of other studies: for example, Kevin Croke’s work on Uganda, Owen Ozier's work on Kenya, and our own long-term follow-up in Kenya.
There is also a very carefully done historical study by Hoyt Bleakley for the United States. It analyzes an early-20th-century deworming campaign in the US South and finds that following deworming, there were remarkable improvements in literacy and income in counties that had worms compared to other counties.
Bleakley’s work is a carefully done, convincing, observational study; the other three are all experiments. These findings of additional benefits of deworming beyond health increase the overall benefits of treatment, which makes deworming even more cost-effective.
JB: Do you think the pro-deworming community overstated the case for mass deworming programs?
MK: I have of course not read every statement made by every deworming advocate, but in general I feel like the organizations involved in this have not made the sort of sweeping claims about impacts on economic growth that have been associated with advocates for other diseases.
Deworming is not a panacea, but it is a highly cost-effective policy with evidence from multiple studies on educational and economic outcomes. Many of the organizations that have come out with favorable analyses of deworming, like the World Health Organization, the Disease Control Priorities Project, GiveWell, and the Copenhagen Consensus, are focused on evidence and have no institutional reason to favor deworming over any other program.
JB: What do you think this debate says about the entire enterprise of evidence-based global public health built on principles from evidence-based medicine?
MK: More evidence is always useful, but policymakers need to take decisions based on the available evidence. And in this case, the available evidence is sufficiently strong to move ahead.
JB: There were a number of accusations that the reanalysis team deliberately ran tests on the data to get a negative answer. What do you make of those?
MK: I obviously can’t know what was going on in the minds of this particular replication team and don’t want to attribute motives. The key question here is how we can create good protocols for replication processes in general.
I would make a few specific suggestions. For example, it is important to distinguish between identifying mistakes in the original analysis and doing a new analysis. The authors wrote two different papers, but the media did not always distinguish these concepts. Michael Clemens has a useful paper here where he explains these distinctions.
Second, it is important to have a pre-analysis plan to prevent the possibility of running a large number of analyses and then selectively reporting only those that find different results from the original paper. In this case, the team departed from their pre-analysis plan.
Third, any analytic choices should be justified, either by using the unbiased techniques with the most statistical power or through some other standard criterion, such as minimizing mean squared error. In this case, the authors did not do that. They used statistical techniques for which it was clear ex ante that there would not be enough statistical power to arrive at a statistically significant result (for example, because the number of observations they included would be too small), as their own pre-analysis plan makes clear.
Fourth, in cases where the replication leads to different findings from the original results, it is important to be clear about whether these new results are statistically significantly different from the original results.
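As a rough illustration of the statistical-power point in the third suggestion, the Python sketch below uses a textbook normal approximation for a two-sided test comparing two equal-sized groups. It is not the method used in the Kenya study or in the reanalysis. The function name, effect size, standard deviation, and sample sizes are all hypothetical, chosen only to show how power falls when fewer observations are included.

```python
from statistics import NormalDist


def power_two_sided_z(effect, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test on a difference in means.

    Assumes two independent arms of equal size and a known, common
    standard deviation (a simplification made for illustration only).
    """
    std_normal = NormalDist()
    # Standard error of the difference between the two arm means.
    se = sd * (2.0 / n_per_arm) ** 0.5
    # Critical value for a two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # True effect expressed in standard-error units.
    shift = abs(effect) / se
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical inputs: a true effect of 0.2 standard deviations.
full_sample_power = power_two_sided_z(effect=0.2, sd=1.0, n_per_arm=1000)
small_sample_power = power_two_sided_z(effect=0.2, sd=1.0, n_per_arm=100)

print(f"power with 1000 per arm: {full_sample_power:.2f}")
print(f"power with 100 per arm:  {small_sample_power:.2f}")
```

Under these assumed inputs, the same true effect is much less likely to reach statistical significance in the smaller sample, so a non-significant result from an analysis that discards observations says little on its own about whether the effect exists.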