
Five mistakes to avoid when looking at the 2024 election polls

With the 2024 presidential election having been “realigned” by the replacement of Joe Biden with Kamala Harris, it looks like we are headed back to the very close contest we originally expected in November. And why not? There has been exactly one presidential election this century that wasn't close (Barack Obama's 2008 victory over John McCain), the two major parties are at near parity, and three-time Republican nominee Donald Trump has polarized American politics to an almost unbelievable degree.

Right now, with a growing share of the public paying close attention to the presidential election (some out of excitement, some out of fear), it's a good time to review some of the mistakes people often make when trying to follow and interpret the polls.

If a poll comes out that is favorable to one candidate or the other, that candidate's “team” will most likely trumpet the numbers and treat them as gospel, predicting an impending landslide (this is especially true of Trump's MAGA fans; Democrats have been burned too many times by poll-driven irrational exuberance). Some pollsters do lean partisan (intentionally or not), but any poll from any pollster can turn out to be an outlier for purely statistical reasons.

There are two simple ways to avoid the temptation to overreact to individual polls: (1) rely on polling averages, which greatly reduce the weight of outliers (as the simulation sketched below illustrates), and (2) look at trends in a given pollster's results over time.
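
To make point (1) concrete, here is a minimal simulation sketch in Python. All of the numbers are hypothetical assumptions (a “true” 50–46 race, 40 polls of 1,000 respondents each); it shows how widely individual polls scatter around the true margin even when no pollster is biased at all, and how much an average tames that scatter.

```python
import random

random.seed(0)

TRUE_HARRIS = 0.50   # hypothetical "true" Harris share
TRUE_TRUMP = 0.46    # hypothetical "true" Trump share
SAMPLE_SIZE = 1000   # respondents per simulated poll
NUM_POLLS = 40       # number of simulated polls

def one_poll() -> float:
    """Simulate one poll and return the Harris-minus-Trump margin in points."""
    harris = trump = 0
    for _ in range(SAMPLE_SIZE):
        r = random.random()
        if r < TRUE_HARRIS:
            harris += 1
        elif r < TRUE_HARRIS + TRUE_TRUMP:
            trump += 1
    return 100 * (harris - trump) / SAMPLE_SIZE

margins = [one_poll() for _ in range(NUM_POLLS)]
# Individual polls can stray several points from the true +4.0 margin...
print(f"single polls range from {min(margins):+.1f} to {max(margins):+.1f} points")
# ...while the average lands very close to it.
print(f"average of all {NUM_POLLS} polls: {sum(margins) / NUM_POLLS:+.1f} (true margin: +4.0)")
```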

There are a variety of polling averages this year. Some (notably RealClearPolitics and, to a lesser extent, Decision Desk HQ) use simple arithmetic averages with no adjustments or weighting, while others (FiveThirtyEight, the New York Times, the Washington Post, and FiveThirtyEight founder and now-independent analyst Nate Silver) use sophisticated methods aimed at giving higher-quality, more recent data greater weight. Personally, I prefer FiveThirtyEight's averages, which are easy to navigate and don't include quite as many worthless polls as RCP's. But any of the averages above is much better than relying on a single pollster or result.

In close elections, where small swings in poll results can seem huge (especially when the lead, however small, keeps changing hands), it's easy to forget that any serious poll comes with a “margin of error” that reflects the size of the sample and thus the likely range of the underlying numbers (stated at a “confidence level,” usually 95 percent). A recent national poll from Emerson College showing Harris ahead of Trump by 50 percent to 46 percent had a margin of error of 3 points, meaning the result for either or both candidates could be off by that much in either direction. So a seemingly clear lead for Harris is actually “within the margin of error” (which is roughly 6 points when applied to the gap between the candidates) and could be misleading. Put another way, no truly narrow lead is safe; it could be an illusion.
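
For readers who want the arithmetic, here is a minimal sketch of the standard formula behind those numbers: at 95 percent confidence, the margin of error for a single candidate's share p with sample size n is 1.96 × √(p(1−p)/n). The sample size below is an assumption chosen to reproduce the roughly 3-point margin Emerson reported.

```python
import math

def moe(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error, in percentage points, for a share p with sample size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

n = 1000  # assumed sample size; the article doesn't give Emerson's actual n
print(f"MOE for a 50% share at n={n}: ±{moe(0.50, n):.1f} points")  # ≈ ±3.1

# The margin on the *gap* between two candidates is roughly double the
# single-candidate MOE, which is why a 50-46 lead with a 3-point MOE is
# still "within the margin of error."
print(f"rough MOE on the Harris-Trump gap: ±{2 * moe(0.50, n):.1f} points")  # ≈ ±6.2
```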

The margin of error gets much larger when you look at subsamples of the electorate (e.g., voters under 30, voters with or without college degrees), which some pollsters compensate for by “oversampling” groups of particular interest. When you see a poll with a really odd result (such as Trump leading among young voters or winning 30 percent of Black voters), always check the subsample size and margin of error. For the same reason, large-sample polls are (all else being equal) generally more reliable, and state-level polls, which usually have smaller samples, tend to be less accurate than national polls.
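
The same formula shows why subsamples are so noisy. The split below (120 under-30 respondents out of a 1,000-person sample) is purely illustrative:

```python
import math

def moe(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error, in percentage points, for a share p with sample size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# A hypothetical poll of 1,000 adults containing 120 voters under 30:
print(f"full sample  (n=1000): ±{moe(0.5, 1000):.1f} points")  # ≈ ±3.1
print(f"under-30 cut (n=120):  ±{moe(0.5, 120):.1f} points")   # ≈ ±8.9
```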

Pollsters long preferred surveys based on live telephone interviews, until (a) cell phones began to replace landlines in American homes and (b) Americans' growing unwillingness to answer telephone polls made it very difficult (and expensive) for old-school pollsters to obtain a representative sample. There are still “gold standard” pollsters that use live interviews (e.g., the New York Times-Siena poll), but there are also perfectly legitimate polls built on sophisticated online panels and other methods. Pew found after the 2022 midterms (when pollsters had an excellent track record) that “17% of national pollsters used at least three different methods of sampling or interviewing people (sometimes in the same poll), up from 2% in 2016.”

FiveThirtyEight's pollster ratings remain an important tool for distinguishing good polls from bad ones; they are based not only on accuracy but also on transparency (pollsters who won't say how they arrive at their results shouldn't be trusted). In general, though, beware of small-sample, one-day polls that are obviously designed to grab headlines.

For various reasons, pollsters (or, more commonly, the media outlets that fund and sponsor polls) don't always release poll data as soon as it is collected, so a “new” poll may contain old data. For example, some media figures pounced on a Fairleigh Dickinson poll of the Harris-Trump race released the day after the Democratic National Convention ended as showing a “post-convention bounce,” even though much of the fieldwork was done before the convention began. It's important to keep the gap between fieldwork and release in mind when looking for a “bounce” after a significant event (especially a candidate debate). In fact, it's wise to wait a few days after such an event before drawing conclusions, since much of the impact is likely to come from secondary coverage rather than live viewing.
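
In practice, this just means checking field dates before attributing a bounce to an event. A trivial sketch (the pollsters and dates below are entirely hypothetical):

```python
from datetime import date

# All pollsters and dates below are hypothetical.
CONVENTION_END = date(2024, 8, 22)  # assumed event date for illustration

polls = [
    {"pollster": "Pollster A", "field_end": date(2024, 8, 20), "released": date(2024, 8, 23)},
    {"pollster": "Pollster B", "field_end": date(2024, 8, 25), "released": date(2024, 8, 26)},
]

# Only polls whose fieldwork ended after the event can show a bounce,
# no matter how fresh the release date looks.
for p in polls:
    usable = p["field_end"] > CONVENTION_END
    print(f'{p["pollster"]}: fielded through {p["field_end"]}, '
          f'{"usable" if usable else "not usable"} for a bounce check')
```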

It obviously matters a great deal whether respondents who favor one candidate or the other actually vote. But it is not easy to separate the voting sheep from the nonvoting goats until Election Day is relatively close. That is why most pollsters stick with samples of registered voters until they “switch” to likely-voter polls shortly before early voting begins (though some, like the Times-Siena poll, offer both registered-voter and likely-voter results much earlier).

There are various kinds of “likely-voter screens,” each with its own strengths and weaknesses (a toy comparison follows below). Some rely on stated voting intentions, which can overestimate turnout because people don't like to admit they might have something better to do on Election Day than fulfill their civic duty. Others emphasize past voting behavior, but that obviously doesn't work for newly eligible voters and can miss a turnout spike among voters who didn't previously participate (a factor frequently cited to explain the under-polling of Trump voters in 2016 and especially 2020). Likely-voter screens matter most in non-presidential elections, where turnout is lower and more volatile.
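
Here is a minimal sketch of those two approaches applied to the same hypothetical respondents. Real pollsters use far more elaborate indexes (Gallup's classic screen combines seven questions), so this only illustrates the trade-off described above:

```python
# All respondents below are hypothetical; each "screen" is a deliberately
# crude one-question version of the real thing.
respondents = [
    # intent: stated likelihood of voting (0-10); past_votes: of the last four elections
    {"intent": 10, "past_votes": 4, "candidate": "Harris"},
    {"intent": 9,  "past_votes": 0, "candidate": "Trump"},   # newly energized, no vote history
    {"intent": 5,  "past_votes": 3, "candidate": "Harris"},  # habitual voter, lukewarm this year
]

# Screen 1: stated intention only -- tends to overstate turnout.
by_intent = [r["candidate"] for r in respondents if r["intent"] >= 8]

# Screen 2: past behavior only -- misses new or newly energized voters,
# the gap often blamed for under-polling Trump voters in 2016 and 2020.
by_history = [r["candidate"] for r in respondents if r["past_votes"] >= 2]

print("intent screen keeps: ", by_intent)   # ['Harris', 'Trump']
print("history screen keeps:", by_history)  # ['Harris', 'Harris']
```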

Presidential elections bring many more registered voters to the polls, so estimates of the likelihood of voting matter less. In the past, likely-voter screens often produced better numbers for Republican candidates, because Republicans were disproportionately drawn from the segments of the electorate most likely to vote (e.g., older voters). That may be less true in the Trump era, with Democrats improving among highly educated and older voters while Republicans gain among voters without college degrees, who are less likely to turn out.

Finally, it should be noted that some polls (typically “issue polls” that do not measure candidate preferences, along with some job-approval and favorability polls) do not even screen for voter-registration status but instead use samples of “adults.” Those results should be viewed with extra caution.

A side effect of this era of close elections and partisan parity is that polls can “mispredict” the outcome even when they are reasonably accurate. It's also important to remember that national presidential polls estimate the national popular vote, not the result in the Electoral College (both George W. Bush in 2000 and Trump in 2016 won the latter while losing the former, and in 2020 Trump came within a whisker of an Electoral College win while losing the popular vote decisively). For example, RealClearPolitics' final polling averages in 2016 showed Hillary Clinton ahead of Trump by 3.2 percent; she won the national popular vote by 2.1 percent. That's a pretty small error. But sparse state polling gave no indication that Trump would win the historically Democratic states of Michigan and Wisconsin, and thus the election. So when Trump did win with the equivalent of an inside straight, many shocked observers felt betrayed by the polls, and some concluded they were worthless. They weren't, not at all, but of course they weren't flawless either.

In 2020, the polling errors were actually more pronounced. RCP's final averages showed Biden ahead of Trump by 7.2 percent; in fact, he won the popular vote by 4.5 percent, a margin small enough to put Trump within reach of another Electoral College win. The postmortems on that relatively weak performance produced no definitive conclusions, but explanations generally focused either on pandemic conditions, which heavily affected both polling and voting, or on a lingering problem pollsters had in identifying Trump voters. Both explanations are consistent with the excellent polling performance of 2022, when the pandemic had subsided and Trump was not on the ballot. So there is no particular reason to assume the 2024 polls will be right or wrong. But Harris's supporters will be praying that she is far enough ahead in the popular vote to have a good chance of winning the Electoral College. And a clear victory could also reduce the very real odds that Trump and his supporters will once again fight the certification of a defeat.
