Originally appeared at MoonOfAlabama
Today I discussed the U.S. election with a friend who studied and practices statistics. I asked about the failure of the polls in this year's presidential election. Her explanation: The polls look at future events but are biased by the past. The various companies and institutions adjust their polls by comparing their past prognoses with the real results of past events. They then develop correction factors, measured from the past, and apply them to new polls. If a correction factor is wrong, possibly because of structural changes in the electorate, then the new polls will be corrected with a wrong factor and thus miss the real results.
Polls predicting the last presidential election were probably off by 3 to 5 points towards the Republican side. The pollsters then corrected the new polls for the Clinton-Trump race in favor of the Democratic side by giving that side an additional 3 to 5 points. They thereby corrected the new polls by the bias that was inherent in the polls during the last race.
But structural changes, which we seem to have had during this election, messed up the result. Many people who usually vote for the Democratic ticket did not vote for Clinton. The “not Clinton” progressives, the “bernie bros” and “deplorables” who voted for Obama in the last election stayed home, voted for a third-party candidate or even for Trump. The pollsters did not anticipate such a deep change, so their correction factor was wrong. The Clinton side thus turned out to be favored in the polls but not in the votes that counted.
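To make the arithmetic concrete, here is a minimal sketch in Python. All numbers are invented for illustration, the function names are hypothetical, and real pollsters use far more elaborate weighting than a single additive factor. It also assumes, for simplicity, that the raw poll carried no bias at all in the new cycle.

```python
# Hypothetical illustration of a correction factor learned from the past.
# Margins are Democratic share minus Republican share, in percentage points.
# All values below are made up; none come from real polls.

def correction_factor(past_poll_margin, past_actual_margin):
    """Bias observed in the last cycle: actual result minus polled result."""
    return past_actual_margin - past_poll_margin

def corrected_margin(raw_poll_margin, factor):
    """Apply the factor learned from the past to a new raw poll."""
    return raw_poll_margin + factor

# Last cycle: polls leaned 3 points too Republican (assumed numbers).
past_poll = 1.0
past_actual = 4.0
factor = correction_factor(past_poll, past_actual)

# This cycle: the raw poll shows a tie and the old factor is applied.
raw_poll_now = 0.0
published = corrected_margin(raw_poll_now, factor)

# Assumed structural change: the old bias no longer exists,
# so the raw poll was already right.
actual_now = raw_poll_now
miss = published - actual_now

print(f"correction factor: {factor:+.1f}")
print(f"published margin:  {published:+.1f}")
print(f"actual margin:     {actual_now:+.1f}")
print(f"polling miss:      {miss:+.1f}")
```

In this toy case the whole polling miss equals the correction factor itself: the adjustment that would have fixed the last election's polls is exactly what pushes the new ones off.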
Real polling, which requires in-depth, in-person interviews with the participants, does not really happen anymore. It is simply too expensive. Polling today is largely done by telephone with participants selected by some database algorithm. It is skewed by many factors which require many corrections. All these corrections carry biases that miss structural changes in the underlying population.
The Clinton camp, the media and the pollsters missed what we had anticipated as “not Clinton”: a basic attitude in a part of the “left” electorate that remembers who she is and what she has done and would under no circumstances vote for her. Clinton herself pushed the “bernie bros” and “deplorables” into that camp. This was a structural change that was based solely on the personality of the candidate.
Had Sanders been the candidate, the now wrong poll correction factor in favor of Democrats would likely have been a correct one. The deep antipathy against Hillary Clinton in a decisive part of the electorate was a factor that the pseudo-science of cheap telephone polls could not catch. More expensive in-depth interviews of the base population used by a pollster would probably have caught this factor and allowed an appropriate adjustment.
There were some twenty to thirty different entities doing polls during this election cycle. Five to ten polling entities, with better budgets and preparations, would probably have led to better prognoses. Some media companies could probably pool their poll budgets, split over multiple companies today, to fund a common one with a better analysis of its base population. One that would have anticipated “not Hillary”.
Unless that happens, all polls will have to be read with a lot of doubt. What past bias is captured in these predictions of the future? What are their structural assumptions and are these still correct? What structural change might have happened?
Even then, polls and their interpretation will always capture only a part of the story. Often a sound grasp of human and cultural behavior will allow for better predictions than any poll. As my friend the statistician says: “The best prognostic instrument I have even today is my gut.”