Tuesday, March 19, 2013

The Internet, Big Data, and the Surveillance State

In a recent opinion piece for CNN, Bruce Schneier proclaimed: "The Internet is a surveillance state."

The Internet's never been secure or private - by design.  Its design goals were to be open and shared - to make it universally accessible, flexible, and adaptable.  And for those who remember the DARPA (defense-related) roots, even there the primary goal was survivability rather than security.  There's a reason the military's never relied on it.
  Sure, there are things you can do to make Internet use somewhat more private - use encryption, route through anonymizers, etc.  But still, every packet of data carries source and destination addresses, and all that flexibility and sharing requires that basic information on users and connected devices be readily available.  Add the fact that data packets travel public routes where they can be duplicated, and that ISPs and servers regularly back up content and messages, and you realize that the Internet is a very public place.  As for encryption, industries trying to rely on it for copyright protection (as well as governments) have found that every encryption system is beatable, given enough brains, computing power, and time.  That many governments seek to restrict the use of encryption technology is a matter of laziness and cost rather than a fear of totally private communications.
  For a long time, the sheer volume of Internet traffic provided a bit of privacy protection for common users - searching through the volume of packets and files, identifying and matching traffic through multiple sites, etc. were just too difficult.  But if you had the resources, you could often break through whatever privacy or security roadblocks were in use (if any).  Schneier offers three recent illustrations -
  • the Chinese military hackers who have been attacking U.S. and European government, military, and commercial sites were identified in part because they accessed their Facebook accounts through the same networks and hardware used for the hacking.
  • a leader of the LulzSec hacker collective was identified and arrested, reportedly because he slipped up and once logged into an IRC chatroom in the clear - without masking his IP address as was his normal practice.
  • Paula Broadwell, who had an affair with then-CIA Director David Petraeus, was identified despite only logging into the anonymous email account created and used for the affair from public Internet connections.  The FBI reportedly identified her by matching hotel and service receipt records from the times of the emails, and finding hers was the one name in common.
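The record matching in that last example is, at bottom, a set intersection: collect the names on record at each place and time an email was sent, and see which names show up every time.  A minimal sketch in Python, assuming made-up guest lists (every name and record here is hypothetical, and this is not a description of how the FBI actually did it):

```python
# Hypothetical data: for each time an anonymous email was sent, the set of
# names on record at the public location it came from. All names are invented.
records_by_email = [
    {"A. Jones", "P. Smith", "R. Lee"},
    {"P. Smith", "T. Brown", "M. Garcia"},
    {"K. Patel", "P. Smith", "A. Jones"},
]


def common_names(record_sets):
    """Return the names that appear in every one of the record sets."""
    if not record_sets:
        return set()
    # set.intersection accepts any number of sets and keeps only shared members.
    return set.intersection(*record_sets)


print(sorted(common_names(records_by_email)))
```

Each additional record set can only shrink the pool of candidates, which is why a handful of logins from different places may be enough to single out one person.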
Schneier's point is that Internet traffic is widely tracked, and not only by governments and counter-espionage organizations.  Google does it on everything running through one or another of its sites.  Google also tracks and records websites and content for its search engines.  Blogger.com, for example, lets me know who's visited this blog, where you're from, what OS you're running, and how you found me.  Apple tracks user behaviors on iPhones and iPads.  Facebook tracks its members and their behaviors, and backs up their content and submissions.  It has also admitted tracking its members' non-Facebook activities, and using cookies to track online behaviors of non-members who visit Facebook pages.  Pretty much every commercial site builds profiles of users and customers.  And metrics firms collect and track data (anonymized, they say) on users and their Internet behaviors in terms of data traffic flows.

Now if all these were separate, private, and secure, they might be seen by many as the acceptable cost for the services and benefits provided by the Internet and various online services.  Even if they were shared, it might not be so bad, if it would take significant time and effort to try to link things together (particularly if you're looking for patterns in behavior) - that is, if "surveillance" were too costly or inconvenient to be used regularly or for trivial purposes.
However, that's increasingly not the case, due to technology advances and the rise of Big Data.  If you haven't heard the phrase before, Big Data refers to a range of programs and techniques for trawling extremely large data sets (such as online tracking data) to tease out and identify patterns and links.  With Big Data to help, the sheer volume of online data is no hindrance.  Automated systems can scan millions of emails in real time looking for key words or phrases.  Automated systems can match online searches, or the use of certain apps, to purchasing behaviors and location data from mobile devices to send users a coupon for a nearby store or restaurant.  And data storage costs keep falling.  (And while not exclusively an Internet matter, facial recognition software and the myriad private and public video cameras can be used to track a person's movements.)
  As Schneier puts it,
"This is ubiquitous surveillance: All of us being watched, all the time, and that data being stored forever. This is what a surveillance state looks like, and it's efficient beyond the wildest dreams of George Orwell."
Nor does there seem to be an easy solution, or a means of opting out.
"There are simply too many ways to be tracked. The Internet, e-mail, cell phones, web browsers, social networking sites, search engines: these have become necessities, and it's fanciful to expect people to simply refuse to use them just because they don't like the spying, especially since the full extent of such spying is deliberately hidden from us and there are few alternatives being marketed by companies that don't spy."
So, Schneier concludes, welcome to an Internet without privacy; welcome to the Internet surveillance state.  While public interest groups try to raise concerns about privacy, and individuals rant, the public doesn't seem to mind - as long as Amazon and Netflix make good recommendations, YouTube lets you know about the latest "cute kitty" viral video, and social media don't charge fees.  The Internet was never truly private in the first place, and isn't likely ever to shift significantly in that direction, in part because one of the significant public values of the Internet comes from the lack of privacy and the ability to find and make connections.  Such international and national regulatory moves as exist are about giving governments more control over the Internet, and more access to the information it transmits and generates.  Which means even fewer real privacy protections.

  If that worries you - and it should - you could go offline.  But in a modern global information society, that comes at a high cost.  Or you could try to level the playing field, as David Brin suggests in The Transparent Society - let us, as citizens, have the same access to surveillance of government activities as the government has over our activities.  Make government truly transparent, rather than settling for "transparency" being defined as giving people access to information the government wants to provide them.  Turn the cameras around, open records, and let the public see what government actually does, rather than only what the government claims it's doing (true or not), and rather than hoping that an (increasingly scarce) honest and aggressive press will investigate and report, and do the monitoring for the public.

Sources -  "The Internet is a surveillance state," CNN Opinion
David Brin's Transparency website

Edit - fixed some typos

