One of the big stories of the 2016 presidential election is just how wrong the polls were. The surveys in the final days of the campaign settled on Hillary Clinton defeating Donald Trump by around four points and winning a solid majority of Electoral College votes. The votes are still being counted, but she has only a slight edge in the popular vote and, of course, lost the Electoral College and the contest. How did the polls do so poorly?
Pollsters and political scientists will be wrestling with this question for a long time. People's faith in polling, one of our most trusted tools for understanding voters and predicting behavior, has quite legitimately been shaken. But we should first figure out exactly what the polls missed and what they got right.
I looked at the aggregate vote forecasts for both Trump and Clinton across all 50 states and compared them with the actual vote outcome (which is still incomplete). Here's the interesting lesson so far: The polls were quite close in their forecast for Clinton's support. Clinton was expected to get 47 percent of the vote; she has 47.7 percent so far. On average, the polls understated her state-level support by about a percentage point.
Trump, however, was expected to get 44 percent of the vote, and he now has 47.5 percent; the polls undershot his support considerably.
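To make the size of the two misses concrete, here is a small sketch of the arithmetic in Python. It uses only the national figures quoted above; the variable names are my own, and the vote totals are the incomplete counts available so far, so the results are provisional.

```python
# National figures quoted in the article (percent of the vote).
# The actual totals are incomplete counts and may still change.
forecast = {"Clinton": 47.0, "Trump": 44.0}
actual_so_far = {"Clinton": 47.7, "Trump": 47.5}

# Positive error means the polls understated the candidate's support.
errors = {
    name: round(actual_so_far[name] - forecast[name], 1)
    for name in forecast
}

# Share of the vote not assigned to either major-party candidate.
forecast_other = round(100.0 - sum(forecast.values()), 1)
actual_other = round(100.0 - sum(actual_so_far.values()), 1)

for name, err in errors.items():
    print(f"{name}: polls understated support by {err} points")
print(f"Neither major candidate: {forecast_other} forecast, {actual_other} so far")
```

On these numbers the miss on Trump is about five times the miss on Clinton, and the share going to neither major candidate shrank from 9 points in the forecast to under 5 in the count so far.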
The scatter plot below shows Clinton's predicted vote share from polling plotted against her actual vote share. The red line shows where her vote would have been if she'd received exactly her predicted share. As we can see, the state results hew pretty close to the line and appear both above and below it.
Compare this with the same plot for Trump below. Trump met or exceeded his expected vote share in every state except Nevada.
What can we take from this? Nate Silver took some heat recently for giving Trump an unusually high chance of winning. He did so because of the uncertainty in the polls and the unusually high number of voters who said they were undecided or considering a third-party candidate. As undecided voters made up their minds, and as third-party supporters moved toward the major-party candidates (as they tend to do in a campaign's final days), it looks like Trump picked up almost all of them.
It's hard to know whether to call these folks "shy" Trump voters or people who were legitimately torn between support for Trump and another candidate until pretty late in the game. For what it's worth, race seems to matter here. In a regression predicting Trump's share of the vote in each state, both Trump's polling position and the percentage of the state that is white had positive and statistically significant coefficients. But such studies at the state level are likely insufficient to draw useful conclusions.
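For readers curious what a regression like that looks like mechanically, here is a minimal sketch in plain Python. It is not the analysis behind the result above. The six "states" are synthetic placeholders built from coefficients I chose, the function names are hypothetical, and the sketch fits coefficients only, with no standard errors or significance tests.

```python
# Ordinary least squares with two predictors, in plain Python.
# All numbers below are SYNTHETIC placeholders, not real state results.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Fit y = b0 + b1*x1 + b2*x2 via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical (Trump poll share, percent white) pairs for six made-up states.
predictors = [(40.0, 60.0), (45.0, 75.0), (50.0, 85.0),
              (38.0, 55.0), (47.0, 70.0), (52.0, 90.0)]
# Synthetic outcomes generated from known coefficients so the fit can be checked.
actual = [1.0 + 0.9 * poll + 0.05 * white for poll, white in predictors]

intercept, b_poll, b_white = ols(predictors, actual)
print(f"intercept={intercept:.3f}, poll={b_poll:.3f}, pct_white={b_white:.3f}")
```

Because the outcomes here were generated without noise, the fit simply recovers the coefficients I put in. With real state data the estimates would carry uncertainty, and a statistics package would be the better tool for judging significance.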
Regardless of why these people turned out for Trump, our polls missed them. If polling is still going to be a part of our election system, we'll need to figure out how to avoid breakdowns like this in the future.