A Philosopher's Blog

How did the polls get it wrong?

Posted in Philosophy, Politics, Reasoning/Logic by Michael LaBossiere on November 11, 2016

The pundits and polls predicted that Hillary Clinton would win the presidency of the United States. They were, obviously enough, wrong. As would be expected, the pundits and pollsters are trying to work out how they got it wrong. While punditry and polling are generally not philosophical, the assessment of polling is part of critical thinking and this is part of philosophy. As such, it is worth considering this matter from a philosophical perspective.

One easy way to reconcile the predictions and the results is to point out the obvious fact that likelihood is not certainty. While there was considerable support for the claim that Hillary would probably win, this entailed that she could still lose. Which she did. To use the obvious analogy, when it is predicted that a sports team will win, it is still possible for it to lose. In one sense, the prediction would be wrong: the predicted outcome did not occur. In another sense, a prediction put in terms of probability could still be right—the predictor could get the probability right, yet the actual outcome could be the unlikely one. People who are familiar with games that explicitly involve probabilities, like Dungeons & Dragons, are well aware of this. For example, it could be true that there is a 90% chance of not getting killed by a fireball, but it would shock no experienced player if it killed their character. There is, of course, the question of whether the estimated probabilities were accurate—unlike in a game, we do not get to see the actual mechanics of reality. But I now turn to the matter of polls.
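The fireball point is easy to check with a quick simulation (a rough sketch; the 90% survival figure is just the example's number, and the trial count is arbitrary):

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

TRIALS = 10_000
SURVIVE_CHANCE = 0.9  # the 90% chance of surviving the fireball

# Count how often the "unlikely" 10% outcome actually happens.
deaths = sum(1 for _ in range(TRIALS) if random.random() >= SURVIVE_CHANCE)
death_rate = deaths / TRIALS
print(f"Characters killed in {TRIALS} fireballs: {deaths} ({death_rate:.1%})")
```

Over many trials the death rate hovers near 10%, yet each individual death is exactly the "surprising" outcome the prediction allowed for.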

As noted above, the polls indicated that more people said they would vote for Clinton than for Trump; thus her victory was predicted. A critical look at polling shows that things could have gone wrong in many ways. I will start broadly and then move on to more particular matters.

Polling involves what philosophers call an inductive generalization. It is a simple inductive argument that looks like this:

  • Premise: X% of observed Ys are F.
  • Conclusion: X% of all Ys are F.

In a specific argument, the Y is whatever population the argument is about; in this case it would be American voters. The observed Ys (known as the sample) would be the voters who responded to the poll. The F is whatever feature the argument is concerned with. In the election, this would be voting for a specific candidate. Naturally, a poll can address many candidates at once.
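This argument form can be sketched as a simulation: build a made-up electorate, observe a sample of it, and generalize the sample percentage to the whole. The population size and the 48% support figure are invented for illustration, not actual polling numbers:

```python
import random

random.seed(1)  # reproducible illustration

# Hypothetical population of Ys: True means the voter has feature F
# (say, plans to vote for a given candidate).
population = [random.random() < 0.48 for _ in range(1_000_000)]

sample = random.sample(population, 1_000)        # the observed Ys
sample_pct = 100 * sum(sample) / len(sample)     # premise: X% of observed Ys are F
true_pct = 100 * sum(population) / len(population)

# Conclusion (the inductive leap): X% of all Ys are F.
print(f"Sample says {sample_pct:.1f}%; the whole population is actually {true_pct:.1f}%")
```

The two figures are usually close but rarely identical, which is the logical leap in miniature: the conclusion generalizes beyond what was observed.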

Being an inductive argument, it is assessed in terms of strength and weakness. A strong inductive argument is one such that if the premises were true, then the conclusion is probably true. A weak one is such that even if the premises were true, the conclusion would still not be probable. This is a matter of logical support—whether the premises are actually true is another matter. All inductive arguments involve a logical leap from what has been observed to what has not been observed. When teaching this, I make use of an analogy to trying to jump a chasm in the dark—no matter how careful a person is, they might not make it. Likewise, no matter how good an inductive argument is, true premises do not guarantee a true conclusion. Because of this, a poll can always get things wrong—this is the nature of induction, and this unavoidable possibility is known as the problem of induction. Now to some more specific matters.

In the case of an inductive generalization, the strength of the argument depends on the quality of the sample—how well it represents the whole population from which it is drawn. Without getting into statistics, there are two main concerns about the sample. The first is whether or not the sample is large enough to warrant confidence in the conclusion. If the sample is not adequate in size, accepting the conclusion is to fall victim to the classic fallacy of hasty generalization. To use a simple example, a person who sees two white squirrels at Ohio State and infers that all Ohio squirrels are white would fall victim to a hasty generalization. In general, the professionally conducted polls were large enough, so they most likely did not fail in regard to sample size.
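A quick way to see why professional sample sizes are generally adequate is the textbook 95% margin-of-error formula for a proportion (a standard approximation, not any particular pollster's method; 50% support is used as the worst case):

```python
import math

def margin_of_error(n: int, p: float = 0.5) -> float:
    """Approximate 95% margin of error for a sampled proportion p with n respondents."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: \u00b1{margin_of_error(n):.1%}")
```

A poll of around a thousand people already has a margin of error of roughly three points, which is why sheer sample size was probably not the failure.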

The second is whether or not the sample resembles the population. Roughly put, a good sample recreates the breakdown of the population in miniature (in terms of characteristics relevant to the generalization). In the case of the election polls, the samples would need to match the population in terms of qualities that impact voting behavior. These would include age, gender, religion, income and so on. A sample that is taken in a way that makes it unlikely to resemble the population results in what is known as a biased generalization, which is a fallacy. As an example, if a person wanted to know what all Americans thought about gun control and they only polled NRA members, they would commit this fallacy. It must be noted that whether or not a sample is biased is relative to its purpose—if someone wanted to know what NRA members thought about gun control, polling NRA members would be exactly the right thing to do.

Biased samples are avoided in various ways, but the most common approaches are to use a random sample (one in which any member of the population has the same chance of being selected for the sample as any other) and a stratified sample (taking samples from the various relevant groups within the population).

The professional pollsters presumably took steps to ensure the samples resembled the overall population, hopefully using random samples, stratified samples and other methods. However, things can still go wrong. In regards to a random sample, there are obviously practical factors that preclude a truly random sample. Also, even a random sample can still fail to resemble the population. For example, imagine you have a mix of 50 plain M&Ms and 50 peanut M&Ms. If you pulled out 25 at random, it would not be shocking to have more plain or more peanut M&Ms in your sample. So, these random samples could have gotten things wrong.
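The M&M example can be simulated directly (a sketch using the example's own numbers: a 50/50 bag and handfuls of 25):

```python
import random

random.seed(2)  # reproducible illustration

bag = ["plain"] * 50 + ["peanut"] * 50   # the 50/50 mix from the example
counts = []
for _ in range(1_000):
    handful = random.sample(bag, 25)     # a truly random draw of 25
    counts.append(handful.count("plain"))

# How often is the handful nearly even, and how often is it noticeably lopsided?
near_even = sum(1 for c in counts if c in (12, 13)) / len(counts)
lopsided = sum(1 for c in counts if c <= 9 or c >= 16) / len(counts)
print(f"Near-even draws (12 or 13 plain): {near_even:.0%}")
print(f"Lopsided draws (9 or fewer, or 16 or more, plain): {lopsided:.0%}")
```

Even with perfectly random draws from a perfectly even bag, a visibly lopsided handful turns up a meaningful fraction of the time—which is exactly how an honest random poll can still miss.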

In terms of a stratified sample, there are all the usual problems of pulling out the sample members for each stratum as well as the problem of identifying all the strata that are relevant. It could be the case that the polls did not get the divisions in American voters right and this biased the sample, thus throwing off the results.
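A stratified sample with proportional allocation can be sketched as follows; the strata names, population shares, and support rates here are all invented for illustration (real pollsters weight on many more dimensions):

```python
import random

random.seed(3)  # reproducible illustration

# Hypothetical strata: each has a share of the electorate and a support rate.
strata = {
    "urban":    {"share": 0.30, "support": 0.60},
    "suburban": {"share": 0.50, "support": 0.48},
    "rural":    {"share": 0.20, "support": 0.35},
}
SAMPLE_SIZE = 1_000

estimate = 0.0
for info in strata.values():
    n = round(info["share"] * SAMPLE_SIZE)   # proportional allocation to this stratum
    hits = sum(1 for _ in range(n) if random.random() < info["support"])
    estimate += (hits / n) * info["share"]   # weight the stratum estimate by its share

true_support = sum(i["share"] * i["support"] for i in strata.values())
print(f"Stratified estimate: {estimate:.1%} (true value: {true_support:.1%})")
```

The catch the paragraph above points to is visible in the code: the shares are taken as given. If the pollster's assumed shares misdescribe the real divisions among voters, every weighted estimate built on them inherits that bias.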

Polls involving people also obviously require that people participate, that they honestly answer the questions, and that they stick to that answer. One concern that has been raised is that since the polls are conducted by the media and people who supported Trump tend to hate and distrust the media, it could be that many Trump supporters refused to participate in the polls, thus skewing the results in Hillary’s favor. A second concern is that people sometimes lie on polls—often because they think they should give the answer they believe the pollster wants. A third concern is that people give an honest answer at the time, then change their minds later. All of these could help explain the disparity between the polls and the results.

Conspiracy theorists could also claim that the media was lying about its results in order to help Hillary, presumably reasoning that if voters thought Trump was going to lose they would either vote for Hillary to be on the winning side or simply stay home because of a lack of hope. As with all conspiracy theories, the challenge lies in presenting evidence for this.

And that is how the polls might have gone wrong in predicting Hillary’s victory.




11 Responses


  1. ronster12012 said, on November 11, 2016 at 9:57 am

    I would have thought that the bookies would have got it right, but they got it wrong too.

    Here’s an article on how social media patterns predicted a Trump win back in August.


    Perhaps the concept of polling is flawed?

  2. wtp said, on November 11, 2016 at 10:08 am

    We go to tremendous effort to protect the privacy of the voting process. Polling data is an intrusion into what is a sacred right to privacy. Anyone who would tell a perfect stranger, whom they have never met and never seen, who simply called them anonymously on the phone (knowing full well that the pollster knows who they are, while they have no way of knowing who that pollster is), what their political beliefs are and/or who they planned to vote for, is not very bright. And on the other end of the phone, that pollster (or more accurately the polling entity) has no way of knowing if the person answering the questions is being honest in their replies.

    Polling is based on stupid people doing stupid things. Any wonder you get stupid results.

  3. nailheadtom said, on November 12, 2016 at 2:27 pm

    There’s only one meaningful poll. It was taken on November 8. The others are a business of offering predictions that people pay for, like Madame Antonia’s Palm Reading.

  4. DH said, on November 12, 2016 at 8:09 pm

    Actually, the polls got it exactly right. They never claimed to predict who would win, they only said, “X is ahead in the polls as of this date”. Some of them predicted that a candidate had an X % chance of winning the election.

    Long shots win horseraces all the time; underdogs win in sports too.

    • WTP said, on November 12, 2016 at 9:04 pm

      Actually, the polls got it exactly right.

      So did some Tarot card readers. At any given time there are a number of different prognosticators using whatever data is available to predict a broad range of outcomes. Someone, somewhere is bound to be “right”. Doesn’t mean they have a handle on the real situation.

      • DH said, on November 12, 2016 at 11:36 pm

        My point exactly. They never said who would win, they only said who was ahead at any given point.

  5. DH said, on November 13, 2016 at 1:13 pm

    Right, well, I was really only playing a semantic game here. The polls showed only who was ahead as of the time the poll was taken – or to be more precise, they only reported their own data. It was the people interpreting them that got it wrong.

    • nailheadtom said, on November 13, 2016 at 2:00 pm

      Nobody can be ahead at the time the poll is taken. A poll response, assuming there even is one, isn’t the same as a vote. The only time anyone can be ahead is during the counting of the actual votes.

      The amount of media and blog universe verbiage devoted to the polls themselves and their subsequent failure is an astonishingly meaningless use of pixels that could have been more usefully devoted to cat videos.

  6. TJB said, on November 13, 2016 at 2:12 pm

    I think there was plenty of uncertainty in the polls such that a Trump win was well within the envelope of probable outcomes.

    The problem was motivated reasoning on those reporting on the polls.

    • Michael LaBossiere said, on November 13, 2016 at 5:29 pm

      What sort of motivated reasoning was going on? Was it that the data showed Trump leading and they elected to ignore that?
