This has been a rough year for pollsters and pundits, with prediction after prediction going painfully awry. Even those supposedly unflappable data journalists have found themselves stepping in it.
But it’s not just the journalists and pollsters. Since I’m a professor of statistics as well as a blogger who often comments on academic papers that I think misuse numbers, I have a front-row seat to some of the least persuasive academic takes on politics and elections. And it’s been a big year for bad studies.
In journalism and polling, premature obituaries of Trump have been one common problem. In July 2015, the New York Times’s Nate Cohn remarked on "a shift that will probably mark the moment when Trump’s candidacy went from boom to bust." (That was a reference to Trump crudely dismissing the war record of John McCain, the former Republican presidential nominee.) "His support will erode," Cohn wrote confidently, "as the tone of coverage shifts from publicizing his anti-establishment and anti-immigration views … to reflecting the chorus of Republican criticism of his most outrageous comments and the more liberal elements of his record."
Whoops. Only a month later, famed number cruncher Nate Silver gave Trump a 2 percent chance of winning the Republican nomination.
Gallup throws in the towel
A couple of months after that, Gallup made the historic announcement that the organization would no longer do horse race–style election polling. You can see why this might be a smart time to get out of the predictions game.
I'd love to claim that I'm above all this myself, but really I too had no idea what would happen during the primary season. Whenever anyone asked me, I'd point them to an article I wrote in 2011 explaining why primaries are hard to predict.
In short, in the general election voters have months to make their decisions, the choice is between two candidates who are ideologically distinct, and most voters can rely on party cues. In contrast, primaries come in a rushed sequence, competing candidates tend to be similar in ideology, and (of course) they come from the same party. And with multiple candidates comes the opportunity for strategic voting (casting a vote for someone you dislike to defeat someone you dislike even more), which is a hard thing to model.
In short, I avoided making any embarrassing predictions about primary election winners only by the tactic of avoiding making predictions, period — an option that was not so available to the Nates Cohn and Silver, who were expected to make real-time predictions (and who, to their credit, examined their errors afterward).
One study alleged that the Democratic primary was rigged
But academia has had no shortage of errant "findings" as well. This year, perhaps more than others, the internet has been swarming with conspiracy theories — some of these defended with statistical arguments.
In June, various people pointed me to a paper by Axel Geijsel and Rodolfo Cortes Barragan, graduate students at Tilburg University and Stanford, respectively, with the portentous title "Are we witnessing a dishonest election? A between state comparison based on the used voting procedures of the 2016 Democratic Party Primary for the Presidency of the United States of America." (Yes, indeed: that presidency.)
The paper, issued before the primary race between Hillary Clinton and Bernie Sanders was decided, made the case that Sanders tended to win in states where electronic voting could be double-checked with a paper trail. Clinton, suspiciously — or "suspiciously" — tended to win when there was no paper trail. Moreover, Geijsel and Barragan wrote, the inaccuracy of exit polling supposedly rose in states without a paper trail, and the official results seemed biased toward Clinton.
The paper itself did not convince me, as there can be all sorts of differences among states, and there’s no reason to pick just one of these factors and give it a causal interpretation. It’s what we call an observational comparison. You never know; fraud could always happen. But the paper supplied no useful evidence that this difference was the one driving the election results. (Not that you’d need an explanation as to why a 74-year-old socialist fails to win a major party nomination in the United States.)
But if going viral among Bernie followers counted in academia, these students would have tenure already.
How much of a kingmaker is Fox News?
Closer to the mainstream, in June, economics professors Ray Fisman and Andrea Prat, of Boston University and Columbia, posted a piece in Slate claiming that Fox News support for Donald Trump "could erase a 12-percentage-point Democratic lead in the popular vote."
I’m skeptical that this number is anything close to reasonable. After looking at the cited study, by professors Gregory Martin (political science, Emory) and Ali Yurukoglu (Stanford Business School), it seems to me that Fisman and Prat improperly extrapolated an estimate that was already probably too high.
Martin and Yurukoglu estimated that watching Fox News an extra 2.5 minutes a day increased a voter’s probability of voting Republican by 0.3 percentage points. But it’s not reasonable to assume that if the time watching the channel continued to grow, the shift in vote preference would continue to be strong and linear — all the way to 12 percentage points!
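To see why the extrapolation strains credulity, here is a back-of-the-envelope calculation — my own sketch, using only the two numbers quoted above — of how much extra daily viewing the linear model would require to produce a 12-point swing:

```python
# Back-of-the-envelope check of the linear extrapolation.
# The two inputs are the figures quoted above; everything else is arithmetic.
effect_per_increment = 0.3   # percentage points of vote shift per increment
increment_minutes = 2.5      # one increment = 2.5 extra minutes of viewing per day

target_swing = 12.0          # percentage-point swing claimed in the Slate piece
increments_needed = target_swing / effect_per_increment
minutes_needed = increments_needed * increment_minutes

print(minutes_needed)  # 100.0 — the linear model needs 100 extra minutes per day
```

In other words, the 12-point claim requires assuming that the marginal effect of minute 98 of daily Fox News viewing is the same as the marginal effect of minute 1 — exactly the kind of linearity that tends to break down far outside the range of the data.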
In addition, while I trust that the authors found what they reported, there is a well-known tendency for small but variable effects to be overestimated in this sort of statistical study. In general, estimates near zero are discarded and high estimates are reported. We call this the "statistical significance filter," which can turn weak results into robust-seeming ones.
Regarding partisan news sources, I have more trust in a study by political scientists Dan Hopkins and Jonathan Ladd of Georgetown University, who analyze data from a 2000 pre-election poll and find a positive effect of Fox News on support for George W. Bush, but only for Republicans and independents. In summarizing this study, Hopkins writes that media influence "fosters political polarization. For Republicans and pure independents, Fox News access in 2000 reinforced GOP loyalties." Not a lot of room for a 12 percent swing in that claim.
Or does Google pick our presidents?
The next month came a piece, based on work by the research psychologist Robert Epstein — Epstein also publicized it last year — called "How Google Could Rig the 2016 Election." It claimed that "Google’s search algorithm can easily shift the voting preferences of undecided voters by 20 percent or more — up to 80 percent in some demographic groups — with virtually no one knowing they are being manipulated. … Given that many elections are won by small margins, this gives Google the power, right now, to flip upwards of 25 percent of the national elections worldwide."
Quite a claim. The numbers, however, came from a highly artificial set of lab experiments in which participants were asked questions about unfamiliar political candidates after being shown unrealistically rigged search results. The researchers put extremely biased articles favoring one candidate on page one, moderately biased articles on page two, and so on, so participants had to go to pages four and five of a five-page search to find anything strongly favoring the other candidate.
Epstein then compounded his exaggerations by claiming, ridiculously, that the real-world impact of Google on elections would "undoubtedly be larger" than in his loaded experiments.
In fact, the real presidential election is not being held in an isolated lab: Voters have many sources of information about Clinton and Trump, beyond those found in (hypothetically) rigged search results. (Full disclosure: Some of my research is funded by Google.)
And it’s still only early September! Just wait till next month, when just about any election-related study will get 15 minutes of fame. In recent years we have seen claims that political attitudes and preferences were determined by menstrual cycles, smiley faces displayed near survey questions for subliminally short durations, and the mood swings caused by the results of college football games (really). All of these studies struck me as flawed, either in design or in the analysis of the data. (Follow the links for more details about my doubts.)
I'm not saying that these studies shouldn't have been done (well, in most cases). Researchers should be free to try out all sorts of outside-the-box ideas, and, indeed, in some of these cases I’m not criticizing the studies so much as the accompanying hype. But respected news organizations should think twice about dramatic claims about voting and elections, even if they are published in reputable scientific journals.
When it comes to research, election season is silly season, and there always seems to be room for one more story about how irrational those voters are. Who knows what else they’ll come up with before November 8?
Andrew Gelman is a professor of statistics and political science and director of the Applied Statistics Center at Columbia University. He blogs at Statistical Modeling.