Monday, October 1, 2012

Really, Really Scary News on Research

Here's the issue - surveys face several sources of error.  The one everyone talks about is statistical sampling error.  That's the +/- x% reported as the margin of error in most news accounts of polls.  It measures how far a sample is expected to differ from the target population it's drawn from, and it's driven solely by sample size - because it assumes that sampling is completely random.
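To make that concrete, here's a minimal sketch of how that +/- x% is usually computed for a proportion under the simple-random-sampling assumption.  The 95% confidence level (z = 1.96) and the sample sizes are just illustrative; the margin a pollster reports depends on the confidence level they choose.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a sample proportion,
    assuming a simple random sample (p = 0.5 is the worst case)."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes: the margin shrinks only as n grows.
for n in (500, 600, 1200, 1500):
    print(f"n = {n:5d}  ->  +/- {margin_of_error(n):.1%}")
```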
  Another source of error comes from what's known as response bias - the result when sampling is not completely random and the people who respond to your survey differ in some meaningful way from the target population you're trying to extrapolate your results to.  The problem is that since we don't know anything about the people who don't participate, or how they might differ from the people who did, there's no direct way of knowing whether a sample shows response bias (or how badly biased it is).  However, survey researchers do know that the likelihood of response bias increases significantly as the response rate (the percentage of people contacted who participate) goes down.  The same relationship holds for the contact rate and the cooperation rate.

  When the Pew Research Center, who do polling about as well as it can be done, reviewed their survey efforts over the last 15 years, they ended up with the table above.  The 1997 numbers are consistent with the historical rates for telephone surveys.  But for 2012, Pew had a 9% response rate - that is, out of every 100 people it tried to contact, 9 participated and 91 did not.  Looking at the table another way, you can calculate that in 2012 pollsters couldn't even reach 38% of the people they tried to contact, and that 53% of everyone in the sample refused to participate.  Of those the pollsters could reach, 86% refused.  Since we don't know why Pew couldn't reach people, or why those they did reach overwhelmingly declined to respond, it's tough to prove response bias.  But it strains credulity to argue that such samples are random and that no response bias is involved.  You can sometimes get an idea of bias by comparing the sample to your target population on various measures - if you know what the numbers for the target population are, and that those measures are the right ones to reflect bias.  Pew did this for their 2012 samples in the table to the left, with some measures spot on, but others off by as much as 28%.
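The rates quoted above hang together arithmetically.  Here's a small sketch - the exact AAPOR rate definitions Pew uses are more involved, so treat this as an approximation - showing that the overall response rate is roughly the contact rate times the cooperation rate:

```python
# Approximate 2012 figures quoted above.
contact_rate = 1 - 0.38       # share of sampled households actually reached
cooperation_rate = 1 - 0.86   # share of those reached who agreed to participate

# Overall response rate ~ contact rate x cooperation rate.
response_rate = contact_rate * cooperation_rate
print(f"contact     ~ {contact_rate:.0%}")       # ~62%
print(f"cooperation ~ {cooperation_rate:.0%}")   # ~14%
print(f"response    ~ {response_rate:.1%}")      # ~8.7%, close to the 9% Pew reports

# Share of the full sample that was reached but refused:
print(f"refusals    ~ {contact_rate * 0.86:.0%}")  # ~53% of all sampled households
```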
But do we know which measures are the critical ones for indicating response bias?

  A side effect of falling contact and response rates is that polling gets more expensive - you have to call more numbers to get the same number of responses.  For Pew, that means attempting to reach about four times as many people today to end up with the same number of survey participants it had in 1997.
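Rough back-of-the-envelope arithmetic, treating the response rate as a simple per-attempt completion probability.  The ~36% figure for 1997 isn't stated above; it's implied by the "about four times" comparison with the 9% rate for 2012.

```python
def numbers_to_dial(target_completes, response_rate):
    """Rough count of numbers to attempt for a target number of completed
    interviews, treating the response rate as a completion probability."""
    return round(target_completes / response_rate)

completes = 1000
dial_1997 = numbers_to_dial(completes, 0.36)   # ~2,778 attempts
dial_2012 = numbers_to_dial(completes, 0.09)   # ~11,111 attempts
print(dial_1997, dial_2012, round(dial_2012 / dial_1997, 1))  # ratio ~ 4x
```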

  You can see the problem emerging in the wide range of election polling results this year.  Polls that in the mid 1990s would routinely have samples of 1200-1500 now report samples of 500-600 - roughly doubling the statistical (random) error from +/- 2-3% to +/- 4-4.5%.  And despite the press's continuing focus on which candidate is leading or trailing, very few of those differences are outside the statistical margin of error for the smaller sample sizes.  Some pollsters try to make the random error seem smaller by weighting their sample results - manipulating the data by giving some participants' responses more impact than others.  Statistically, this can shrink the apparent error, if your weighting formula accurately reflects the target population.  But as with response error, we often don't know what the target population mix is.  So most pollsters who weight their samples use some historical measures and hope the target population hasn't changed since then.  In the absence of accurate and appropriate target population metrics - which regrettably is the norm more often than not - weighting survey results is as likely to increase bias as to minimize it.
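Here's a toy illustration of post-stratification weighting on a single made-up variable.  The group shares and answers are invented; the point is only that the weighted estimate depends entirely on what you assume the target-population mix to be.

```python
from collections import Counter

# Made-up sample: each respondent has a group label and a 0/1 answer.
sample = [("young", 1)] * 20 + [("young", 0)] * 10 + \
         [("old", 1)] * 30 + [("old", 0)] * 40

def weighted_estimate(sample, population_shares):
    """Post-stratification: weight each respondent so that group totals
    match an assumed target-population mix."""
    counts = Counter(group for group, _ in sample)
    n = len(sample)
    total = sum(population_shares[g] * n / counts[g] * answer
                for g, answer in sample)
    return total / n

unweighted = sum(answer for _, answer in sample) / len(sample)
print(f"unweighted:           {unweighted:.1%}")   # 50.0%
# Two different assumptions about the population give two different answers.
print(f"weighted, 50/50 mix:  {weighted_estimate(sample, {'young': 0.5, 'old': 0.5}):.1%}")  # ~54.8%
print(f"weighted, 70/30 mix:  {weighted_estimate(sample, {'young': 0.7, 'old': 0.3}):.1%}")  # ~59.5%
```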

   And that uncertainty about the various sources of sampling bias is the real problem - with low response rates and weighting comes the likelihood of bias in the actual sample, and in the poll results.  It's a bias that, if we were all honest, we'd acknowledge probably exists.  But since we don't often have population numbers to compare the sample to, we can't prove it's there (or not), and we can't tell how big the problem is.  So many pollsters will ignore it, or at best stick a comment about the possibility in a footnote.  And most reporters will continue writing up survey results as if they were accurate and final, unaware of the nuances of sampling and question wording, or of the growing evidence of a real bias problem.

  Here's the really, really scary news on survey research - the levels of error are higher than we report or recognize.  Most survey samples are not random, and most sample results are skewed by response bias.  Worse, we can't be sure how skewed or imprecise the results are.  We can no longer afford to trust that survey research results derived from telephone samples are unbiased and accurate reflections of the larger population.  They might be, they might be close, or they might be way off - there's just no way to know precisely.  But we can be pretty confident that a sample showing a party identification breakdown of 45D/30R/25I is likely to show different voting intentions than a sample with party ID breaking 30D/30R/40I.  Let's at least start recognizing that.
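A quick arithmetic check of that last point.  The within-party preferences below are hypothetical, made up just to show that, holding them fixed, the two party-ID mixes quoted above produce noticeably different toplines:

```python
# Hypothetical within-group support for one candidate (made-up rates).
support = {"D": 0.90, "R": 0.05, "I": 0.45}

def topline(mix):
    """Overall support implied by a party-ID mix and fixed group preferences."""
    return sum(share * support[party] for party, share in mix.items())

mix_a = {"D": 0.45, "R": 0.30, "I": 0.25}   # 45D/30R/25I sample
mix_b = {"D": 0.30, "R": 0.30, "I": 0.40}   # 30D/30R/40I sample
print(f"sample A topline: {topline(mix_a):.1%}")   # ~53.3%
print(f"sample B topline: {topline(mix_b):.1%}")   # ~46.5%
```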

Sources - "We Are The 91%: Only 9% of Americans Cooperate with Pollsters", PJ Media.
Pew Research Center for the People & the Press, "Assessing the Representativeness of Public Opinion Surveys".


Edit - had to correct some typos and restore a dropped sentence fragment.

1 comment:

  1. I'm wondering if the cost and reliability implications of this trend have anything to do with the decision to only do exit polling in 31 states this year?
    http://www.washingtonpost.com/blogs/the-fix/wp/2012/10/04/networks-ap-cancel-exit-polls-in-19-states/
    This article does say cost tradeoffs are a contributing factor.
