It has been the recent meme among commentators, especially those in the bag for one side, to claim that the election is already “in the bag”. Besides the insidious capacity of such statements to become a self-fulfilling prophecy by discouraging supporters from voting (certain members of the media with an agenda? Well I never!), such hasty conclusions are often anchored on a very questionable assumption. Although one could very reasonably average the polls and claim that one candidate is more likely to win than the other, many commentators have gone beyond the pale, claiming that Obama has an indestructible, impervious electoral college lock because state polls show him with narrow leads and state polls are inevitably superior to national polls. This has also been the implicit assumption of a certain election probability forecasting model that shall go unnamed. However, not only have state polls been consistently wrong before, they have missed the mark largely because of polling complications that not only still exist in 2012 but have intensified.
In 2008, the state polls had a mixed track record. Although many very intelligent, fairly agenda-less political commentators have also pushed the “state polls are better!” argument, they justify it by arguing that if one had extrapolated national standings from the state polls, the result would have been somewhat more accurate than the national polls. Perhaps one might take the state polls into account when evaluating the national polls. However, it's an entirely different story when we start trying to predict individual states with absolute certainty. Although state polls overestimating Obama in Arizona by 5 points and underestimating him in New Mexico and Nevada by 7.8 and 6 points may average out to a fairly accurate overall result, they speak poorly for the infallibility of state polls. If one looks at the 17 commonly polled states considered plausible battlegrounds in 2008 (AZ, CO, FL, GA, IN, IA, MI, MN, MO, NV, NH, NM, NC, OH, PA, VA, WI), the state polling averages were, on average, 2.85 points off the mark, with a slightly skewed distribution (the median was 2.5, so a handful of larger misses pulled the average up). In short, although the polls were not systematically biased, they were not very accurate either. One might argue that such a result was within the margin of error. But when a poll states that its margin of error is, for example, 3 points, it means the pollster is 95% certain that the real result lies within a six-point-wide confidence interval. An average error of 2.85 speaks somewhat poorly of accuracy, as it suggests that a very large proportion of polls completely whiffed the actual result.
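For readers who want the margin-of-error arithmetic spelled out, here is a rough back-of-the-envelope sketch. It assumes polling errors are unbiased and normally distributed, treats the 3-point margin of error from the example above as applying directly to the miss being measured (it glosses over the difference between error on a candidate's share and error on the lead), and uses helper names that are purely illustrative. It is not how any pollster or forecaster actually computes anything.

```python
from math import pi, sqrt
from statistics import NormalDist

# z-score for a two-sided 95% interval (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

def expected_abs_error(margin_of_error: float) -> float:
    """Average absolute miss implied by a 95% margin of error,
    ASSUMING unbiased, normally distributed errors."""
    sigma = margin_of_error / Z95
    # For X ~ Normal(0, sigma), the mean of |X| is sigma * sqrt(2 / pi).
    return sigma * sqrt(2 / pi)

def share_outside(margin_of_error: float, mean_abs_error: float) -> float:
    """Share of results expected to land outside the stated margin,
    given an observed average absolute miss (same normality assumption)."""
    sigma = mean_abs_error / sqrt(2 / pi)
    return 2 * (1 - NormalDist(0, sigma).cdf(margin_of_error))

# Hypothetical 3-point margin of error, and the 2.85-point average miss cited above.
implied_miss = expected_abs_error(3.0)
share_missed = share_outside(3.0, 2.85)

print(f"Average miss implied by a 3-point margin: {implied_miss:.2f} points")
print(f"Share outside 3 points if the average miss is 2.85: {share_missed:.0%}")
```

Under these assumptions, a 3-point margin of error implies a typical miss of a bit over one point, and an average miss of 2.85 would put roughly four in ten results outside the stated margin rather than the advertised one in twenty. Since the 2.85 figure comes from polling averages, which ought to be tighter than any single poll, the comparison is, if anything, generous to the pollsters.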
What does that mean for tomorrow's race? For one, I'd be suspicious of those who say candidate xxx absolutely needs state xxx, or who suggest that either candidate has a solid firewall. If polls underestimate Romney by 3 points in Ohio and underestimate Obama by 3 points in Colorado, it completely throws off the conventional wisdom around the electoral college math. In addition, in 2008, state polls were completely inaccurate in the inland West (NM, CO, AZ, NV), fairly inaccurate in the Midwest (MI, WI, OH, PA, IA, IN) and almost completely accurate in the New South (VA, MO, FL, NC). Which brings us to the ground situation in 2012. The Obama campaign has largely retreated from North Carolina and Florida to fight a delaying action in Virginia and Colorado and to bolster its so-called impervious firewall in Nevada, Ohio, Iowa, and Wisconsin. Although Obama's lead in his so-called firewall seems (and in all honesty, probably is) quite formidable, it's also a firewall built exclusively on states that have historically had the least accurate polling.
Why did many state polls miss the mark in 2008, especially in the West? Inaccuracy in the West is easily attributable to a variety of factors that are very challenging for pollsters, including the largest proportion of immigrants (poor response rates, language barriers), the highest level of early voting, the largest proportion of no-landline households, and many others. The South, on the other hand, was largely predictable, as pollsters correctly forecast the historic levels of (monolithically Democratic) African-American turnout that lifted Obama in that region. Even so, 2008 was very challenging for pollsters because the historic nature of Obama's candidacy shredded the traditional rules of voter turnout.
The reason this remains relevant is that no new rules have replaced the old ones, and pollsters have vastly different ideas of what will happen. Not only that, but many of the issues that plagued polling in 2008 (consistently declining response rates, growth of no-landline households, etc.) have only intensified coming into 2012. And different polling organizations have actually shown massive, irreconcilable disagreements on where turnout will be. Gallup, which has consistently shown Romney ahead by 4-5 points, assumes that turnout will drop to below even 2004 levels. Many polls showing Obama crushing Romney by a similar margin assume that the turnout trends of 2008 will only grow stronger in 2012. This partially explains why so much polling has been bimodal, with two distinct groups of polls in the major battleground states often clustering around two different points (one generally indicating modest but durable Romney victories and the other indicating crushing Obama victories). These bimodal distributions are the key flaw of the major election forecasting models.
Most probability-crunching relies heavily on “normal distributions”, where the bulk of data points fall in the middle and thin out as you approach the margins (a bell curve). If so, one would naturally (and fairly) assume that the highest points on the curve were the most probable, with the probability of outcomes quickly decreasing as you move away from the center. However, this process becomes very problematic when you are looking at a non-normal distribution. Nate Silver's model looks at this phenomenon and finds that Obama is almost guaranteed to win. Although I typically think fairly well of Silver's work (and don't think he's blatantly pushing an agenda), with all due respect, no matter how good your model is: garbage in, garbage out. Looking at our current polling and suggesting Obama is a “lock” is reminiscent of taking a survey of meteorologists, finding that two-thirds are predicting 3 inches of rain on Monday with one-third predicting clear skies, and then naturally concluding that it is almost guaranteed to rain, with 2 inches of rain being the probable outcome.
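The meteorologist analogy can be put into numbers with a small sketch. The panel below is made up to match the analogy (six forecasters calling for 3 inches, three calling for none), and fitting a single bell curve to it is my own illustration of the pitfall, not a description of how Silver's or anyone else's model actually works.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical panel matching the analogy: two-thirds say 3 inches, one-third say none.
forecasts = [3.0] * 6 + [0.0] * 3

average = mean(forecasts)   # the "consensus" you get by averaging the panel
spread = pstdev(forecasts)  # population standard deviation of the panel

# How many forecasters actually predicted something close to the average?
near_average = sum(1 for f in forecasts if abs(f - average) <= 0.5)

# Share of the panel that predicted essentially clear skies.
empirical_clear = sum(1 for f in forecasts if f <= 0.5) / len(forecasts)

# Chance of "essentially clear" if we pretend the panel is one bell curve
# centered on the average (this is the assumption being criticized).
fitted_clear = NormalDist(average, spread).cdf(0.5)

print(f"Average forecast: {average:.1f} inches")
print(f"Forecasters within half an inch of the average: {near_average}")
print(f"Share of panel predicting clear skies: {empirical_clear:.0%}")
print(f"Clear-skies chance under a single bell curve: {fitted_clear:.0%}")
```

The average comes out to 2 inches, a number that not one forecaster on the panel actually predicted, and the single bell curve gives clear skies well under half the weight that the panel itself does. That is the trouble with treating two camps of polls as one tidy distribution: the "most probable" outcome sits in the valley between them, and the minority scenario gets quietly discounted.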
Simply put, one can quibble about who is more likely to win, and although the Bob Shrums and Paul Krugmans of the world certainly want you to simply not vote (remember, in lefty-world, only evil Rethuglicans have agendas), none of that takes away from the fact that we have a seriously contested election next Tuesday that could very easily surprise some of us or even all of us. It's an absolutely tired cliché at this point (I'm sure every political article ends on this note), but that only underscores how true it is: it's all about turnout, and it won't be over until it's over. Yes, we have a real election this year.
-- Kirk Jing