Friday, September 14, 2012

A Primer on "cherry-picking" stats

Most people treat numbers as real and reliable - and in most circumstances they are.  But they can also be used to mislead deliberately, or mined for whatever subset of numbers makes the best case for a given position.  It's all part of "How to Lie With Statistics."
  AAPOR, the professional association of pollsters, knows well how this is done through the selective reporting of polls, and has guidelines for the information people really need to disclose when making arguments based on polling results - guidelines that are for the most part ignored in journalistic reporting.  "Cherry-picking" results can be just as problematic, particularly when only a pre-selected set of results is reported.  It's quite natural for someone who wants to use stats to support an argument to pick, from among the myriad variations of numbers and results, those that best support their position.  And, regrettably, it's just as natural for journalists to accept those numbers, or at most verify their accuracy, rather than asking why that particular set of numbers was chosen, looking at the larger picture, and considering whether the cited results accurately reflect the underlying reality.
  There's an interesting video out that illustrates the potential issues of cherry-picking.  (While this particular one dissects an Obama campaign claim, cherry-picking is an almost universal tendency.)

  Journalists in this increasingly competitive news market don't always have the luxury of time to critically evaluate every claim, even with the access to information and analysis that the Internet offers.  So when should they wonder whether a claim accurately reflects the larger reality?  As the video points out in its conclusion, one dead giveaway is a claim that is very precisely phrased, or that uses ranges or measures which don't seem to be the typical usage - in the video's example, picking "27 months" and "7 years" as the time periods to consider.  Another is when different terms are used for the stats being compared.  In another case I recall, it was comparing "new jobs" (only those created - not counting those ending or lost) to "gains in employment" (the net job increase - newly created jobs minus jobs lost).
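  To make the gross-versus-net distinction (and the time-window trick) concrete, here's a small Python sketch.  The monthly figures are made up purely for illustration - they are not real employment data - but they show how "new jobs" can look impressive while net employment actually falls, and how picking only the early months flips the sign of the story:

```python
# Hypothetical monthly figures (thousands of jobs) - illustrative only,
# NOT real employment data.
jobs_created = [300, 280, 310, 290, 320, 305]   # gross new jobs each month
jobs_lost    = [250, 270, 260, 360, 370, 365]   # jobs ending each month

# "New jobs" counts only creations; "gains in employment" is the net change.
gross_new = sum(jobs_created)
net_gain = sum(c - l for c, l in zip(jobs_created, jobs_lost))

print(f"gross new jobs: {gross_new}")        # -> 1805: sounds great on its own
print(f"net employment gain: {net_gain}")    # -> -70: the fuller picture is a loss

# Cherry-picked window: the first three months alone show a healthy net gain,
# even though the full period shows a net loss.
early_net = sum(c - l for c, l in zip(jobs_created[:3], jobs_lost[:3]))
print(f"net gain, months 1-3 only: {early_net}")   # -> 110
```

  Both numbers are "accurate," which is exactly why the choice of measure and window, not the arithmetic, is where the misleading happens.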
   When journalists run into these, it might be useful (not to mention enlightening) to look a bit deeper into the claims and measures.  At the very least, though, they need to be careful to use exactly the same phrasing in their reporting of the claim - even though it may make for better headlines to generalize that narrowly defined claim into a statement like "Obama's created more jobs than Bush did."  That broader statement is not even "technically" accurate, and is easily refuted. Misinterpretations of that sort, intentional or not, can raise questions about the competence or fairness of the reporter and news organization. That kind of error or sloppiness may contribute to the public's growing mistrust of news media, and might well be a contributing factor in its decline.

edited to fix typos in title and body.
