The terrifying uncertainty at the heart of FiveThirtyEight’s election forecasts

The myth of the election prediction wizard is no more.

Nate Silver. Jeremy Sutton-Hibbert/Getty Images
Andrew Prokop is a senior politics correspondent at Vox, covering the White House, elections, and political scandals and investigations. He’s worked at Vox since the site’s launch in 2014, and before that, he worked as a research assistant at the New Yorker’s Washington, DC, bureau.

The forecasts are in, and they say the 2018 elections can go any number of ways.

If you’re following election coverage and forecasting models, you know the conventional wisdom at this point: Democrats are the favorites to take the House, and Republicans are the favorites to hold on to the Senate.

FiveThirtyEight’s “classic” forecast — which has become the gold standard in election forecasting — gives Democrats an 85.6 percent chance of retaking the House and Republicans an 81.3 percent chance of holding the Senate, as of Tuesday evening.

So both of those are highly likely to happen, right?

Well, one person who’s been trying to complicate that assessment is FiveThirtyEight founder Nate Silver himself.

One point Silver has made over and over again in recent weeks is that even if you take his House and Senate forecasts at face value, when you think about both of them together, there’s around a 40 percent chance that one of them will be wrong.
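As a rough check on that arithmetic, here is my own back-of-the-envelope sketch, treating the two chamber forecasts as independent. With the current numbers it lands near 30 percent; Silver's roughly 40 percent figure is higher partly because the two forecasts are not independent — a uniform national polling shift would tend to confirm one forecast while breaking the other, so both being right at once is less likely than independence implies.

```python
# Rough check (mine): chance that at least one chamber forecast misses,
# if the two forecasts were independent. They are not: a uniform national
# polling shift would help one forecast while hurting the other.
p_house_dem = 0.856   # Democrats take the House (article's figure)
p_senate_gop = 0.813  # Republicans hold the Senate (article's figure)

p_both_right = p_house_dem * p_senate_gop
p_one_wrong = 1.0 - p_both_right
print(f"P(at least one forecast wrong, if independent): {p_one_wrong:.0%}")  # 30%
```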

He elaborated on this on Twitter this week, making a point that’s important to understand — that a “very normal-sized polling error” in either direction could result in a dramatically different outcome.

Let’s pause on this. Republicans holding the Senate and narrowly holding the House would, of course, be an enormous victory for the party. Conversely, a Democratic takeover of both chambers would be a stunning win for them. Either would have seismic consequences for the Trump administration’s future.

And either of those outcomes is just a “very normal-sized polling error” or so away from happening.

These days, savvy election watchers have to keep two ideas in their heads at the same time:

  • The best way to get some sense of what the Election Day outcome will be is to look at polling averages or models like FiveThirtyEight’s.
  • But polls of state or House races often get the final margin wrong by several points, and just because a forecast shows an outcome as unlikely, that doesn’t mean it’s impossible.

In other words: uncertainty, uncertainty, uncertainty.

What FiveThirtyEight’s model does

Before the advent of model-mania, a common way to think about polling was that if the polls are close, the race is a “toss-up,” and it could go either way. RealClearPolitics, for instance, continues to classify any race where the polls average out to a 5-point lead or less for one candidate as a toss-up, instead of making more specific forecasts of some kind.

Though they draw on mostly the same underlying polls, FiveThirtyEight and its kin try to put a numerical percentage on how likely each candidate is to win, with the help of historical data and mathematical modeling.

“Almost all statistical models are grounded in history,” Silver recently explained in an appearance on The Ezra Klein Show. “The implicit idea is that you are hoping that history will repeat itself, at least in a probabilistic way.”

So they start from polling averages of the current races, adjusted with various technical tweaks depending on the version of the model. For instance, in sparsely polled House races, they incorporate other polls that could be helpful (similar districts, state, or national numbers). Some versions of the model also incorporate factors that appear to be historically important, like fundraising: the “fundamentals.”

Out of all that, FiveThirtyEight calculates a candidate’s expected vote share. This looks pretty similar to a polling average — for instance, in the Florida Senate race, they show Democratic Sen. Bill Nelson leading his Republican opponent Rick Scott by 3 percentage points (as of Tuesday afternoon).

Then, though, comes the big leap. Working off that expected vote share, they simulate the election thousands of times — comparing to a wealth of historical data on how past elections turned out, and trying to incorporate uncertainty. What comes out at the other end is the candidate’s projected chance of victory. For Bill Nelson, that’s 69 percent.
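That simulation step can be sketched with a toy Monte Carlo. This is my illustration, not FiveThirtyEight's actual model; the 6-point error spread is an assumed figure chosen for illustration.

```python
import random

# Toy Monte Carlo in the spirit of the simulation step described above.
# NOT FiveThirtyEight's actual model: the normal error distribution and
# 6-point spread are simplifying assumptions of mine.
random.seed(0)  # for reproducibility

expected_margin = 3.0  # Nelson +3, per the article (points)
error_sd = 6.0         # assumed std. dev. of polling error (points)
n_sims = 100_000

wins = sum(
    expected_margin + random.gauss(0.0, error_sd) > 0
    for _ in range(n_sims)
)
win_prob = wins / n_sims
print(f"Simulated chance of a Nelson win: {win_prob:.0%}")
```

With a 6-point error spread, this toy version lands near the 69 percent chance the article cites for Nelson; the real model's error distribution is fatter-tailed and race-specific.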

Once they do that for every House and Senate race up this year, they come up with an overall estimate of the chance each party will win control of each chamber. Something like an 81.3 percent chance Republicans will keep the Senate, and an 85.6 percent chance Democrats will take the House.

What happened with the forecast models in 2016

An 80 percent chance of winning compared to a 20 percent chance seems like a huge advantage. Yet in commentary about their models, Silver and the rest of the FiveThirtyEight team (I recommend their podcast) repeatedly stress that an 80 percent chance of victory is not a done deal — not even close. Their argument is that an outcome that’s 20 percent likely will happen 20 percent of the time.

The most infamous example of something like this was, of course, the 2016 election. FiveThirtyEight’s model gave Hillary Clinton a 71.4 percent chance of winning, and she, er, didn’t win.

But though this put an end to the legend of Silver as the forecaster who “called” all 50 states “correctly” in 2012, he ended up something of a winner anyway. In the days before the election, even though his model showed Trump as the underdog, it gave Trump a substantially higher chance of winning than any other mainstream model out there. For that, some accused him of “panicking the world” and “putting his thumb on the scales,” and others opined he was “cautiously” trying to ensure his forecast looked right whatever happened.


In the days before the election, I dug into FiveThirtyEight’s modeling choices and concluded they made “a whole lot of sense.” Most broadly, I wrote, this was because “Silver’s forecast is just more uncertain that the result will match what the current polling data shows.” He understood quite well that there could be a polling error, a last-minute swing, or both.

In the end, the worst error made by other 2016 forecasters was that they drew far-too-confident conclusions from Clinton’s relatively narrow, single-digit poll leads nationally and in key swing states. Silver did not make this mistake — and so he came off looking, relatively, the best (or, as some have put it, the “least wrong”).

The FiveThirtyEight forecast in 2016: under the hood

Still, a closer look at the FiveThirtyEight model’s pre-2016 forecasts, state by state, should be enough to give any political observer some agita as November 6 approaches.

First, let’s look at the “expected margin of victory” — the amount they thought Clinton or Trump was probably ahead in each state — as compared to Trump’s final outcome. Here I’ll look at a rather broad set of 15 swing states. (These, of course, were many of the most heavily polled races in the country.)

FiveThirtyEight’s 2016 expected victory margins in swing states

| State | 538 expected victory margin | Actual outcome | Whom did 538 underestimate? |
| --- | --- | --- | --- |
| Georgia | Trump +4 | Trump +5.1 | Trump by 1.1 |
| Iowa | Trump +2.9 | Trump +9.5 | Trump by 6.6 |
| Arizona | Trump +2.2 | Trump +3.5 | Trump by 1.3 |
| Ohio | Trump +1.9 | Trump +8.1 | Trump by 6.2 |
| Florida | Clinton +0.6 | Trump +1.2 | Trump by 1.8 |
| North Carolina | Clinton +0.7 | Trump +3.7 | Trump by 4.4 |
| Nevada | Clinton +1.2 | Clinton +2.4 | Clinton by 1.2 |
| New Hampshire | Clinton +3.6 | Clinton +0.3 | Trump by 3.3 |
| Pennsylvania | Clinton +3.7 | Trump +0.7 | Trump by 4.4 |
| Colorado | Clinton +4.1 | Clinton +4.9 | Clinton by 0.8 |
| Michigan | Clinton +4.2 | Trump +0.3 | Trump by 4.5 |
| Maine | Clinton +7.5 | Clinton +2.9 | Trump by 4.6 |
| New Mexico | Clinton +5.8 | Clinton +8.3 | Clinton by 2.5 |
| Wisconsin | Clinton +5.3 | Trump +0.7 | Trump by 6 |
| Virginia | Clinton +5.5 | Clinton +5.4 | Trump by 0.1 |

What are the takeaways?

  • They underestimated Trump’s margin by 6 to 6.6 points in Iowa, Ohio, and Wisconsin
  • They underestimated Trump’s margin by 3 to 5 points in Pennsylvania, Michigan, North Carolina, Maine, and New Hampshire
  • They underestimated Trump’s margin by 1 or 2 points in Georgia, Arizona, and Florida
  • They were within a point of the actual outcome in Virginia and Colorado
  • They underestimated Clinton’s margin by 1 to 3 points in Nevada and New Mexico.

So out of 15 swing states, FiveThirtyEight’s “expected margin of victory” was within a point of the outcome in just two. And it was off by more than 3 points in half of them, all in the same direction (underestimating Trump). Overall, the “misses” were enough to flip the outcome in five states: Wisconsin, Pennsylvania, Michigan, North Carolina, and Florida.
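Those takeaways can be checked directly against the table above, with the signed misses transcribed from its last column:

```python
# Signed misses from the table above, in percentage points
# (positive = FiveThirtyEight underestimated Trump,
#  negative = it underestimated Clinton).
misses = {
    "Georgia": 1.1, "Iowa": 6.6, "Arizona": 1.3, "Ohio": 6.2,
    "Florida": 1.8, "North Carolina": 4.4, "Nevada": -1.2,
    "New Hampshire": 3.3, "Pennsylvania": 4.4, "Colorado": -0.8,
    "Michigan": 4.5, "Maine": 4.6, "New Mexico": -2.5,
    "Wisconsin": 6.0, "Virginia": 0.1,
}

mean_abs = sum(abs(v) for v in misses.values()) / len(misses)
mean_signed = sum(misses.values()) / len(misses)
big_misses = sum(1 for v in misses.values() if abs(v) > 3)

print(f"Mean absolute miss: {mean_abs:.1f} points")        # 3.3
print(f"Mean signed (pro-Trump) miss: {mean_signed:.1f}")  # 2.7
print(f"States off by more than 3 points: {big_misses}")   # 8
```

An average miss of about 3.3 points, with a systematic tilt of about 2.7 points toward underestimating Trump, is exactly the pattern the takeaways describe.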

My point is not to say the model was wrong — it mainly reflects what the polls were showing, and most polls did show Clinton ahead in these states. My advice is just that you really, really should not read this margin of victory number with excessive certainty. It is usually off by a few points and often off by more than that.

So then what about the projected chance of victory in each state? On Election Day 2016, FiveThirtyEight projected the following:

FiveThirtyEight 2016 projected chance of victory, by state

| State | 538 chance of victory | Actual outcome |
| --- | --- | --- |
| Georgia | Trump 79.1% | Trump +5.1 |
| Iowa | Trump 69.8% | Trump +9.5 |
| Arizona | Trump 66.6% | Trump +3.5 |
| Ohio | Trump 64.6% | Trump +8.1 |
| Florida | Clinton 55.1% | Trump +1.2 |
| North Carolina | Clinton 55.5% | Trump +3.7 |
| Nevada | Clinton 58.3% | Clinton +2.4 |
| New Hampshire | Clinton 69.8% | Clinton +0.3 |
| Pennsylvania | Clinton 77% | Trump +0.7 |
| Colorado | Clinton 77.5% | Clinton +4.9 |
| Michigan | Clinton 78.9% | Trump +0.3 |
| Maine | Clinton 82.6% | Clinton +2.9 |
| New Mexico | Clinton 82.6% | Clinton +8.3 |
| Wisconsin | Clinton 83.5% | Trump +0.7 |
| Virginia | Clinton 85.5% | Clinton +5.4 |

The takeaways here are:

  • 21 to 35 percent chance of Clinton winning in Georgia, Iowa, Arizona, and Ohio (all of which she lost)
  • 55 to 58 percent chance of Clinton winning in Florida and North Carolina (which she lost) and Nevada (which she won)
  • 69.8 percent chance of Clinton winning New Hampshire (which she barely won)
  • 77 to 79 percent chance of Clinton winning Colorado (which she did win) and Pennsylvania and Michigan (which she lost)
  • 82 to 86 percent chance of Clinton winning Maine, New Mexico, and Virginia (which she won) and Wisconsin (which she lost)

So the favored candidate won in 10 of these 15 swing states. Since Clinton's chances in Florida and North Carolina were barely above 50 percent, the biggest discrepancies are naturally the famous trio of Michigan, Pennsylvania, and Wisconsin, where practically everyone was shocked by the outcome on election night.

Do FiveThirtyEight’s 77 to 83 percent estimates of Clinton winning those three Rust Belt states look too high in retrospect? Or was this the sort of ordinary outcome we should expect from a probabilistic forecast (after all, something with a one in four chance does happen one in four times)?

I’m not a forecasting or modeling expert, and I don’t have a firm view on this. Still, when you’re poring over this year’s forecast, I think what happened last time is helpful to keep in mind — that of the seven swing states where Clinton was given a 77 to 86 percent chance of winning, she lost three.
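For calibration, here is my own back-of-the-envelope check on what an honest probabilistic forecast implies, assuming seven independent races at a uniform 80 percent win probability. If anything this understates clustered upsets, since polling errors are correlated across states.

```python
from math import comb

# Back-of-the-envelope calibration (my sketch): if each of 7 states is
# an independent 80 percent favorite, how often do 3 or more upsets
# occur? Correlated polling error across states makes clustered upsets
# like 2016's even more likely than this figure suggests.
p_win = 0.80
n_states = 7

p_at_most_2_upsets = sum(
    comb(n_states, k) * (1 - p_win) ** k * p_win ** (n_states - k)
    for k in range(3)
)
p_3_plus_upsets = 1 - p_at_most_2_upsets
print(f"P(3+ upsets out of 7): {p_3_plus_upsets:.2f}")  # 0.15
```

Even with no correlation at all, losing three of seven 80 percent bets happens about 15 percent of the time; correlation pushes that higher.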

Applying 2016’s lessons to 2018

To their credit, the FiveThirtyEight team understands all this and has been trying to sound the uncertainty alarm at every turn, rather than voicing excessive and specific certitude in their forecasts.

In their explanation of this year’s models, they write a great deal about uncertainty, and explain that they try to account for four historically common types of polling mistakes: local error (in particular states or districts), regional or demographic error, incumbency-based error, and the possibility that one party will overperform in the polls nationwide. All four are worth keeping in mind when considering how the polls might go awry.

They also made an excellent visual change. In 2016, FiveThirtyEight topped their presidential forecast with Clinton’s 71.4 percent chance of winning and Trump’s 28.6 percent chance. The visual was a long blue horizontal bar for Clinton that was far bigger than Trump’s small red one. The Senate forecast was illustrated similarly. They seemed to be geared toward answering readers’ simple and most urgent question: Who will win?

But this year’s FiveThirtyEight Senate and House models are now each topped with a bell curve of sorts, showing various possible outcomes of seats for each party and how likely they appear to be. They’ve also shaded in the middle 80 percent of the bell curve, where the likeliest outcomes lie. This is a smart way to drive home that they are not “predicting” one certain outcome here.


To some of FiveThirtyEight’s critics, this may seem to make their forecast maddeningly unfalsifiable. If the outcome ends up anywhere in that rather wide 80 percent confidence interval, does that mean the forecast was “accurate”?

Yet what seems far more ridiculous to me is any pretension to correctly pinpoint the outcome in every race when the polls themselves clearly cannot. And I’m glad the myth of the forecasting wizard who reads the outcomes from a crystal ball is finally a thing of the past.

Now, in an article several months back reviewing polling error, Silver crunched the numbers and concluded that “the average error in all polls conducted in the late stage of campaigns since 1998 is about 6 percentage points.” He continued:

If the average error is 6 points, that means the true, empirically derived margin of error (or 95 percent confidence interval) is closer to 14 or 15 percentage points! ... This means that you shouldn’t be surprised when a candidate who had been trailing in the polls by only a few points wins a race. And in some cases, even a poll showing a 10- or 12- or 14-point lead isn’t enough to make a candidate’s lead “safe.”
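Silver's jump from a 6-point average error to a 14-to-15-point margin of error can be reproduced under an assumption of mine: that polling errors are roughly normally distributed, in which case the mean absolute error is about 0.8 times the standard deviation.

```python
from math import pi, sqrt

# Reproducing the arithmetic in the quote above, assuming roughly
# normally distributed polling errors (my assumption, not Silver's
# stated method). For a normal distribution, mean absolute error
# = sd * sqrt(2 / pi), roughly 0.8 * sd.
mean_abs_error = 6.0  # points, per Silver's figure

sd = mean_abs_error / sqrt(2 / pi)  # implied standard deviation
moe_95 = 1.96 * sd                  # 95 percent margin of error

print(f"Implied std. dev.: {sd:.1f} points")          # 7.5
print(f"95% margin of error: +/- {moe_95:.1f} points")  # 14.7
```

A plus-or-minus 14.7-point interval on the margin lands squarely in the "14 or 15 percentage points" Silver describes.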

Yet Silver’s conclusion from all this was that “the polls are all right” — because, well, this type of polling error is historically normal.

Yes, models like his attempt to account for all this by building in uncertainty and a wide range of potential outcomes.

Still, I look at a 6-point or so average error, and the number of close races up this year, and how dramatically they could swing the outcome in either direction — and none of this really makes me feel “all right.”

For more on election forecasting, listen to Ezra Klein’s recent interview with Nate Silver on The Ezra Klein Show.
