We Need Better Poll Aggregation, Not Excuses for Using Bad Data

I don't know if it's TPV that's been crawling under Nate Silver's skin, but he's been feeling the need to defend his methods. We're at a point where Nate Silver himself points out the horrific track record of Gallup and insists on including it in his models anyway, and including it as the single largest factor in his trendline calculations. But it was on Tuesday that he actually came up with ways to justify it, responding to the critics of his poll aggregation with a strawman argument: those of us calling for a closer examination of a poll's internal data before it is included in an aggregate without question are "cherry-picking" polls or, at the least, doing a less obvious form of the same.
There is a more subtle form of bias, however, that a lot more of us are prone to. That bias is to look at all the data — except for the two or three data points that you like least, which you dismiss as being “outliers.” [...]
Strawman. Nobody is seriously making the case that you only include polls that we "like," Nate. We are arguing that poll aggregators have a responsibility to examine a poll's internals and demographic data in the context of actual voting patterns of the recent past and, to an extent, current early voting. No one is arguing that you throw out a poll if it favors Romney. We're saying that you throw out a poll, for example, if it shows conservatives dominating this election by a margin 6 points higher than in the ultraconservative electorate of 2010.

But Nate doesn't like to do that either. I suspect that's because it makes too much work for him, but he has his excuse:

That is not quite as biased as cherry-picking the best results — but it gets you halfway there, and it is much easier to rationalize. There is something that can be criticized about almost every poll: the methodology, the demographics, the sample size, the pollster’s history or something else.

Often, these critiques have some truth in them. Not all polls are as methodologically sound as others. But frequently people come up with reasons, valid or otherwise, to avoid looking at the polls they don’t like — while giving a pass to those they do.

Likewise, people sometimes make too much of demographic or geographic subsamples within a poll that make their preferred candidate look good.
Setting aside the repetition of the aforementioned strawman, this sidesteps the fundamental flaw of poll aggregates by saying, in effect: oh, you people are too concerned with demographics. I don't think the problem is that we are concerned about the demographic breakdowns in the likely voter models. I think the problem is that "analysts" like Nate Silver aren't. Demographic information, especially in likely voter models, should be crucial to determining whether or not a given poll merits inclusion in a poll average. And this isn't a Democratic-Republican or Left-Right issue. Yes, Gallup doesn't give us demographic information, but there are plenty of polling organizations supposedly from the Left that do not tell us who their "likely voters" are, either: Public Policy Polling, for example.

I do understand the "business" part of Nate Silver's argument: if you start parsing polls on their samples and demographics, you will face attacks depending on whom a given poll favored. At a corporate organization like the New York Times, the appearance of impartiality is more important than the validity of sampling data, I suppose. I mean,

Do I have a suggestion as to how this should be done? I'm glad you asked. Yes, I do. It requires some work on the part of our analysts, though. I do not suggest dropping any poll wholesale simply because of its sponsor, reputation or anything else. Instead, I suggest evaluating every released poll against a set of criteria:


  1. Demographic data from the poll needs to be contrasted with recent, like-election demographics. 2008, a presidential year, is the more apt measuring stick, while 2010, a non-presidential-year wave election for Republicans, might not be such a great reference point.
  2. Demographic data from the poll also needs to be contrasted with population and registration trends, as well as with current data on those who are actually voting, when early voting numbers are available. The growing share of voters of color (and their documented disproportionate preference for Democrats), the growing voting power and registration edge for women (especially single women), etc., as well as current turnout, need to be taken into account when determining just who, and in what proportion of the populace, is likely to vote.
  3. Reputable state polling must become part of overall national aggregation data to a greater degree to account for both of the above.
  4. Last, but not least, regional data, like Gallup's assertion that less than a third of the voting population in the South will vote for Obama, needs to be checked against the first two contrast points as well: like-election demographics, and population and registration trends.

Within a reasonable variation, these should be the essential criteria that determine whether or not a given poll should be included in a polling aggregate. Any poll that doesn't fall within a reasonable margin of each of these criteria should be tossed. Am I asking that these analysts and aggregators essentially set their own likely-voter demographic models (both national and region-specific) and only count polls that conform to those models, within a reasonable margin? Yes, that is exactly what I'm asking for.
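For the technically inclined, here is a minimal Python sketch of what that kind of screening could look like. It is an illustration under stated assumptions, not anyone's actual model: the `Poll` class, the function names, the 3-point tolerance and every number in it are made-up placeholders, not real poll or exit-poll figures. It also assumes that a poll which does not disclose a demographic group fails the screen, which is one possible way to treat undisclosed likely-voter models.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Poll:
    name: str
    margin: float   # candidate A minus candidate B, in points
    shares: dict    # likely-voter composition: group -> percent of sample


def within_tolerance(poll, reference, tolerance):
    """True only if the poll reports every reference group and each
    reported share is within `tolerance` points of the reference share."""
    return all(
        group in poll.shares and abs(poll.shares[group] - target) <= tolerance
        for group, target in reference.items()
    )


def screened_average(polls, reference, tolerance=3.0):
    """Average the margins of the polls that pass the demographic screen.

    Returns (average, kept, dropped); average is None if nothing passes.
    """
    kept, dropped = [], []
    for poll in polls:
        (kept if within_tolerance(poll, reference, tolerance) else dropped).append(poll)
    average = mean(p.margin for p in kept) if kept else None
    return average, kept, dropped


# Hypothetical reference model (placeholder percentages, NOT real data).
reference = {"conservative": 34.0, "nonwhite": 26.0}

# Hypothetical polls (placeholder values, NOT real data).
polls = [
    Poll("Poll A", 2.0, {"conservative": 35.0, "nonwhite": 25.0}),
    Poll("Poll B", -6.0, {"conservative": 42.0, "nonwhite": 20.0}),  # far too conservative
    Poll("Poll C", 1.0, {"conservative": 33.0}),                     # no nonwhite share disclosed
    Poll("Poll D", 4.0, {"conservative": 32.0, "nonwhite": 28.0}),
]

average, kept, dropped = screened_average(polls, reference, tolerance=3.0)
print("kept:", [p.name for p in kept])
print("dropped:", [p.name for p in dropped])
print("screened average margin:", average)
```

Note that the screen looks only at who is in the sample and never at which candidate the poll favors. In practice the reference model, the tolerance, and separate regional references would all be judgment calls for the analyst.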

Right now, some polls are moving back in the right direction for us, and that's good. But the fundamental problem with poll aggregator models remains the same: they insist on including every poll out there, regardless of its biases, assuming that the biases will cancel each other out over the long term. In the age of astroturf polling and media-driven narratives rather than fact-driven media, this is no longer a safe assumption.