Tuesday, August 14, 2012

Two Posts on Big Data & Social Media

The Internet allows for the collection of immense amounts of information - on users, on outlets, and on content and its flows.  Aside from continuing concerns about privacy issues, there's been a rise in attempts to make sense of, and use, all this constantly-generated "Big Data."  Furthermore, the rise of social media, and its transparency, is creating huge amounts of comments, thoughts, and recommendations from millions of users - as well as their follies and foibles - all of which can be mined for data.  
  We're clearly past the first stage - the explosion of information.  That's been happening for decades, and the amount of information generated around the world continues to explode exponentially.  We're also well into the second stage - developing tools and techniques for sorting, managing, and even filtering information.  We're also in the early stages of developing analytics - the means to measure and analyze all that information (although measuring compounds the information explosion as it continuously creates new data about information and data).  And that leaves us with the continuing problem of making sense and finding value.  We're making inroads, but have yet to fully step into the third wave - using all those tools to find value in the information haystack.

Dion Hinchcliffe, posting at The Brainyard, argues that
the social world, by dint of a billion people engaging with each other around the clock, is now the richest source of open innovation, product ideas, marketing and sales opportunities, customer care capacity, and much more. One thing we've learned in the last eight years of the mass collaboration era is that, whatever an organization cares about, crowds can help us conceive of it, build it, test it, market it, support it, and fix it--and do all of that at scale.
The problem's been to find the gems or spot the trends in this morass of information and data.  Thankfully, there have been a lot of people and companies working to develop analytics and techniques to sift and sort Big Data. This has opened Pandora's Box - the potential of finding value for Big Data users, as well as the potential for harming social media and Internet users.  Hinchcliffe focuses on the positive, positing that social media's Big Data can generate positive returns on organizations' investments in utilizing and analyzing social media.  Hinchcliffe suggests that we're well past the first wave - that organizations are finding and making use of social media.
With the continuing rapid growth of social media, Hinchcliffe suggests that some organizations will transition to social businesses.  In an information economy, knowledge workers recognize the benefits of the opportunities the Internet and social media provide - greater access to information, enhanced opportunities for collaborations beyond your own "silo" of expertise/focus, and the ability to focus on project-related tasks - particularly if enterprises can transcend internal barriers, such as embedded legacy enterprise-specific applications.  Still, the fairly rapid adoption of social media provides hope that we'll make the transition.

Still, it's not all blooming roses out there in the world of Big Data.  At a recent Kontagent Konnect user conference, Josh Williams talked about the "Seven Deadly Sins of Data Science."  As with any analysis, you can do it well, or poorly (particularly if you don't understand the limits inherent in any analytic technique).  Here are some of the possible ways to mess up.
  1. Sloth - Lazy Data Collection:  Also known as GIGO (garbage in, garbage out), the first limit on analysis is the quality of the data.  It's easy to grab and use numbers that are there, rather than the numbers that you need.
  2. Negligence - Misapplied Analysis: It's easy to use an inappropriate technique - one that doesn't provide relevant results. (An example from one of my first stats classes - "The average American is 47% male and three days pregnant."  Think about it.)
  3. Gluttony - Too Many Reports: Too much information and too many tools can result in too much analysis - and the key results can get lost in the mix.
  4. Polemy - Data Definition, User Disagreements:  One of the problems with "Big Data" at this stage of development is that there are few widely-accepted metrics or analytics, which contributes to arguments over the definition and utility of data measures and procedures.
  5. Imprudence - Jumping to Conclusions: When you have a lot of data and a lot of results, it's easy for something to jump out and seem significant.  Stats people will remind you that even at a 95% confidence level, there's a 1 in 20 chance of a false positive when nothing is really going on (i.e., what you see isn't really there).  If you place too much importance on a single finding, without examining the broader context, definitions, or limits of measures, you could be making a big mistake.
  6. Pride - Decision-Driven Data Making:  If you look at Big Data to confirm your beliefs, it's easy to construe or manipulate things, even subconsciously.  You might define things a certain way, pick supportive data, manipulate datasets - all of which might bias the analysis to foster confirmation.  The true scientist looks for answers rather than confirmation, and is more likely to get to the truth (or reality or whatever).
  7. Torpor - Learning and Acting Slowly:  Historically, data and analysis have been delayed - collecting, publishing, and analyzing data took time (in academia you can easily have delays of 1-2 years in getting results out and being able to apply them).  However, the Internet, social media, and Big Data are all racing in real-time.  Being successful there requires collecting and examining data in real time, and being able to react quickly to what you see happening.  A delay can put you on the wrong end of the trend.
What all of these suggest is that when seeking to use Big Data or social media, you need to learn the lessons of data definition, data collection, and data analysis that decades of research and practice in other areas have yielded.  But you also have to be aware of the unique aspects of Big Data and social media - particularly the speed at which things occur online.  Being aware of the 7 Deadly Sins and how they might lead to poor results can help.
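
To put a rough number on sin #5: if each test has a 1 in 20 chance of a false positive, running many tests makes it likely that at least one will "jump out" by chance alone.  The short Python sketch below is only an illustration, not anything from Williams' talk; the function name is made up for this example, and it assumes the tests are independent and that no real effect exists in any of them.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests looks
    significant purely by luck, assuming no real effect exists.

    alpha is the per-test false positive rate (0.05 for 95% confidence).
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # Chance that every test correctly shows nothing: (1 - alpha) ** num_tests.
    # The complement is the chance that at least one test misleads you.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        rate = chance_of_false_positive(n)
        print(f"{n:>3} tests: {rate:.0%} chance of at least one false positive")
```

Under those assumptions, 20 separate comparisons at the 95% level carry roughly a 64% chance of at least one spurious "finding," which is why a single striking result out of a big pile of reports deserves a second look.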

Sources - Why Big Data Will Deliver ROI for Social Business, The Brainyard
The Three Waves of Enterprise 2.0: Climbing the Social Computing Maturity Curve, ebizQ
Social business holds steady gap behind consumer social media, ZDNet
7 Deadly Sins of Big Data Users, Information Week

1 comment:

  1. Big Data management has a significant place in many
    different organizations.
