A voter casts his ballot at a polling place at Highland Colony Baptist Church, in Ridgeland, Mississippi, on November 27, 2018.
Drew Angerer/Getty Images

The difference between good and bad state polls, explained

When you don’t weight by education, you might massively underrate Trump.

A poll released Tuesday conducted by the broadly respected Atlanta Journal-Constitution offered the surprising finding that 54 percent of voters in Georgia support the impeachment inquiry into President Donald Trump, with voters split 47-47 so far on the question of actual removal.

Georgia is far from the reddest state in America, but it’s not the bluest either. Trump won a majority of the vote there in 2016, beating Hillary Clinton by 5 percentage points — stronger than his performance in Wisconsin, Michigan, Pennsylvania, Florida, North Carolina, or Arizona.

The state features not one but two Senate races in 2020, making Trump’s apparent unpopularity there a huge deal for the future of American politics. If it held up, it could all but guarantee Democrats a win in the Electoral College and potentially deliver them a majority in the Senate.

After waiting in line for 1.5 hours, Olando Narcisse casts his ballot on Election Day in Atlanta, Georgia on November 8, 2016.
Jessica McGowan/Getty Images

The bad news for optimistic Democrats is that the fine print on the poll contains a sentence that should be a huge red flag to contemporary consumers of political polling: “The data are weighted based on race, age and sex to accurately reflect the demographics of the state.”

There’s nothing wrong with weighting your sample based on race, age, and sex to match the demographics of the state. That’s standard practice in the industry. The problem is what the poll didn’t weight on — educational attainment. Many state-level polls omitted this factor in 2016, leading them to underestimate Trump’s strength in key swing states. The most responsible pollsters responded to 2016 by making sure to improve their weighting. But many pollsters — especially those doing state-level polling — continue not to weight by education.

This omission doesn’t just produce random errors, which averaging across polls could offset. It produces a systematic bias against Trump and the GOP, meaning everyone who publishes or disseminates polls that aren’t weighted by education ends up contributing to misinformation about the real state of American politics.
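The difference between random error and systematic bias can be sketched with a small simulation. Every number here is a hypothetical chosen for illustration (the true level of support, the size of the tilt, the number of polls), and the model is a deliberately crude one in which each flawed poll shares the same tilt. The point is only that averaging many polls shrinks the noise but leaves a shared tilt untouched.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# All values below are hypothetical, not drawn from any real poll.
TRUE_SUPPORT = 0.46    # assumed true share backing a candidate
BIAS = 0.04            # assumed tilt shared by every flawed poll
N_RESPONDENTS = 500    # respondents per simulated poll
N_POLLS = 200          # number of polls being averaged


def run_poll(p, n):
    """Simulate one poll: the share of n respondents who say yes."""
    return sum(random.random() < p for _ in range(n)) / n


# Sound polls sample the true rate; skewed polls sample a tilted rate.
sound = [run_poll(TRUE_SUPPORT, N_RESPONDENTS) for _ in range(N_POLLS)]
skewed = [run_poll(TRUE_SUPPORT + BIAS, N_RESPONDENTS) for _ in range(N_POLLS)]

avg_sound = sum(sound) / N_POLLS
avg_skewed = sum(skewed) / N_POLLS

print(f"average of sound polls:  {avg_sound:.3f}")
print(f"average of skewed polls: {avg_skewed:.3f}")
```

Under these assumptions, the average of the sound polls lands close to the true figure, while the average of the skewed polls stays roughly four points too high no matter how many are added.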

Poll weighting, explained

The most basic idea of polling is that you can get a pretty good idea of what a population of several million people thinks by asking a sample of just a few hundred of them.
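To see why a few hundred respondents can be enough, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It assumes a simple random sample and a 95 percent confidence level; the sample size of 500 is a hypothetical, and real polls that apply weights typically report somewhat larger margins.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample; z=1.96 is the usual
    normal-approximation multiplier for 95% confidence.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


# Hypothetical poll: 500 respondents, opinion split evenly.
moe = margin_of_error(0.5, 500)
print(f"margin of error: about {moe * 100:.1f} points")
```

Note that the formula depends on the size of the sample, not the size of the state, which is why a poll of Georgia does not need many more respondents than a poll of a much smaller state.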

The trick is that for this to work, you want a random sample of the state’s population. If you sample a few hundred people coming out of an exurban megachurch, you’re going to get a sample that’s quite biased toward Republicans. If you sample a few hundred college students, you’ll get a sample that’s quite biased toward Democrats. Traditional telephone polls avoid this by calling people at random. That would work great if everyone you called picked up the phone and agreed to answer your poll. But, of course, they don’t. And experience has taught pollsters that proclivity to answer polls is not randomly distributed across the population.

Voters line up to cast their ballots at a polling station set up at Grady High School in Atlanta, Georgia, on November 6, 2018.
Jessica McGowan/Getty Images

Consequently, pollsters “weight” certain respondents more or less heavily in the sample in order to construct a virtual survey pool that matches what they know about the overall demographics of the state.

Typically, that means giving extra weight to younger and non-white respondents, who are harder to reach, and giving reduced weight to older and white ones.
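A minimal sketch of that adjustment, using a single age split and made-up numbers: each group’s weight is its share of the population divided by its share of the sample. The group names and counts are hypothetical, and real pollsters usually balance several variables at once rather than one.

```python
# Hypothetical figures for illustration only.
population_share = {"under_45": 0.50, "45_plus": 0.50}  # assumed targets
sample_counts = {"under_45": 150, "45_plus": 350}       # assumed respondents

n = sum(sample_counts.values())

# Weight = population share / sample share.
# Under-represented groups get a weight above 1, over-represented below 1.
weights = {
    group: population_share[group] / (sample_counts[group] / n)
    for group in sample_counts
}

# After weighting, each group's share of the sample matches the target.
weighted_share = {
    group: sample_counts[group] * weights[group] / n
    for group in sample_counts
}

for group in sample_counts:
    print(group, round(weights[group], 2), round(weighted_share[group], 2))
```

In this example, each younger respondent counts for about 1.67 people and each older respondent for about 0.71, so the weighted sample is split evenly, like the assumed population.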

A long-time methodological disagreement among pollsters is whether you should use party identification as an additional demographic weight. The case for doing so is that by matching the partisanship of your sample to what you think you know about the underlying partisanship of the state, you can avoid creating noisy or biased samples. The case against partisan weighting is that the popularity or unpopularity of the incumbent president could, in fact, cause people to change which party they identify with.

Education weighting is a newer issue. Pollsters used to skip it because it didn’t seem very important. These days, however, college graduation has become clearly correlated with both the tendency to answer polls and the tendency to vote for Democrats.

The double divide on education

In the AJC poll, about 62 percent of respondents have at least a bachelor’s degree, with 26 percent having completed some graduate study.

Back in the real world, Census Bureau statistics show that in the best-educated state (Massachusetts), just 42 percent of the adult population has a college degree. The national average is 31 percent, and in Georgia it’s 30 percent. College graduates turn out to vote at a higher rate than non-graduates, so it’s not totally wild to imagine a Georgia electorate that’s somewhat better-educated than the census data.

But the AJC poll is way off the mark. And it’s not alone.

For somewhat mysterious reasons, a huge gap has opened up in who is willing to answer pollsters’ questions, with better-educated people much more likely to take surveys. At the same time, the partisan affiliation of white voters has come to be sharply stratified along the lines of educational attainment. These two facts in combination mean that any state poll that does not explicitly weight by education ends up over-counting college graduates and thus over-counting Democrats.
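The arithmetic behind that over-counting can be sketched in a few lines. The 62 percent sample share and the 30 percent Georgia figure come from the numbers cited above; the support rates for each group are purely hypothetical, and using the census figure as the target is a simplification, since the actual electorate is likely somewhat better educated than the adult population.

```python
# Shares of college graduates, as cited in the article.
SAMPLE_COLLEGE = 0.62   # AJC poll respondents with a bachelor's or more
TARGET_COLLEGE = 0.30   # Census figure for Georgia adults (simplified target)

# Hypothetical support for the Democratic position within each group.
SUPPORT_COLLEGE = 0.60
SUPPORT_NON_COLLEGE = 0.40


def overall_support(college_share):
    """Blend the two groups' support by the college share given."""
    return (college_share * SUPPORT_COLLEGE
            + (1 - college_share) * SUPPORT_NON_COLLEGE)


unweighted = overall_support(SAMPLE_COLLEGE)  # sample as collected
weighted = overall_support(TARGET_COLLEGE)    # reweighted to the target

print(f"without education weighting: {unweighted:.1%}")
print(f"with education weighting:    {weighted:.1%}")
print(f"gap: {(unweighted - weighted) * 100:.1f} points")
```

With these assumed support rates, the same set of answers reads as a narrow Democratic majority before education weighting and a clear minority after it, a swing of more than six points.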

A recent Emerson poll, for example, showed Democrats with a huge lead in Michigan — and also showed Michigan with roughly the educational attainment of Massachusetts.

Given that failure to weight by education leads to very predictable problems, it’s unfortunate that so many outlets don’t do it.

One reason may simply be inertia: all else being equal, it’s easier not to change procedures. Another reason is that precisely because non-college people are less likely to answer pollsters, it’s annoying and expensive to get enough of them in your sample to have a reliable survey. Since response rates are falling in general, driving up costs, there’s an understandable reluctance to change methodologies in ways that raise costs even more.

Last but by no means least, the reality is that the people doing this kind of polling have only weak incentives to actually get things right. Unfortunately, bad polling can have a significant impact on the real world.

Good polling matters

National opinion polling, which is available in large quantities from well-known pollsters who do proper weighting, makes it pretty clear that Trump is unpopular nationwide and would likely lose the popular vote were the election held tomorrow.

Both of those things, however, were true of the 2016 campaign, and he became president anyway. So there’s a critical question of how the race looks in the main swing states. Polls that don’t weight by education typically show the pivotal states as mirroring national polling in showing substantial Democratic leads. But in reality, these are states where non-college whites are a larger share of the electorate than you see nationwide. Consequently, properly weighted polls generally show Trump stronger in these states than he is nationwide — exactly the result we saw in 2016.

The question of whether 2020 is likely to be a blowout (as the improperly done polls indicate) or a nail-biter (as the better ones usually show) isn’t necessarily a decisive factor in one’s thinking about the Democratic primary, but it’s definitely relevant. More realistic polls make worrying about electability seem a lot more reasonable than polls that are calling Trump’s political viability in Georgia into doubt.

