My 2011 Enterprise 2.0 Conference Notes: Big Data Analytics for Social Media
Posted by Bill Ives on Thu, Jun 23, 2011
Here is another in a series of notes on the 2011 Enterprise 2.0 Conference in Boston. This covers the session: Big Data Analytics for Social Media. The moderator is David Carr, Editor, The BrainYard (InformationWeek.com enterprise social media). The panelists are: Zach Hofer-Shall, Analyst, Forrester Research, Inc.; David Gutelius, Chief Social Scientist, Jive Software; and Michael Wu, Principal Scientist, Lithium.
The session description states: “Big Data analytics technologies like Hadoop first took hold in the realm of very large scale Internet operations such as search engines, but the need for them is becoming more widespread as enterprises seek to make sense of an overwhelming volume of social media data. We will look at how social media monitoring and management vendors are tapping these once exotic technologies and making them accessible to mere mortals, as well as the analytic skills their customers need to put Big Data to work.”
David Carr began by noting they are taking two hot buzzwords and mashing them together. Big data is a term for internet-scale data that does not fit into a traditional database. It would explode. The other issue is the amount of unstructured data that does not fit into a traditional database structure.
David Carr started by asking David Gutelius in which situations an organization needs its own big data expertise. David Gutelius said big data has scale, volume, and acceleration. Some is structured and some unstructured. There is a growing need to make sense of these data sets. He sees a new set of capabilities and layers in the enterprise architecture to address this. You will want to combine the unstructured data with existing structured data. There can be a competitive advantage to this, so having the right expertise is important.
David Carr then asked Michael the same question. He agreed with what David Gutelius had just said. How can you get value from big data? One issue is interest and expertise in the content. Context can get you important information. For example, location with mobile devices can supplement the unstructured content coming from that location.
David now asked Zach to address the issue. He sees more companies hiring data scientists to deal with all this data. Organizations need to do more than rely on the vendors to create the right questions.
Audience question: Do you see different categories that vendors are falling into, or is there a common base of capabilities? Michael said it is the latter case, for now at least. Zach added that there is the potential for winners. There will be specialists and generalists as we move forward.
Audience question: Are you seeing trends between social media demographics and what makes sites sticky? David said that there are different use cases. There are demographics based on user characteristics, then other factors such as time. Zach added that accessing different aspects requires the same technology. Michael added that having context can help make predictions more accurate. Zach said this question is the type of thing we should be thinking about to provide value.
Zach mentioned that one of the questions is when you have enough data to answer the question. When is there too much, so that you are just drowning in data? You can start simple and then scale. That is better than starting too big and getting overwhelmed.
David Carr said all the vendors say they can access everything. He asked whether there is a framework for looking at different approaches. David Gutelius said they have built an ingestion engine that can look at subsets of metadata. They can also look at the full set of metadata to have the full context. They are using this metadata to map out the dynamics and the social context.
Zach, the non-vendor on the panel, said that every vendor says they can look at Twitter, but there is actually a short list of those who have full access. But is full access to all the tweets a necessity? That is a question to answer. Michael said you do not need all the data to answer most business questions. Start with the business question and then look to your data requirements. David added that if you are a financial services organization and need to track everything your employees do, then you need the full fire hose.
Audience question: How do you get the data scientists and the business people asking questions to work together best? Zach said some schools are starting programs in this area. It is a real challenge. Michael added that there are different levels of understanding and you need people who are good at translating, and this can be hard. David said he spends a lot of time in the field to understand how people look at problems. It means building a new set of skills to bridge the gaps between scientists and business people. Zach said the data scientist has become the cool guy. Michael added that he used to be the un-cool guy and this has changed.
David pointed out how big data work saved a firm over 100 million on a product rollout, so there is an increasing understanding of the value here. Zach added a story on tracking influencers through big data who were sent marketing messages with some benefits once they were identified. There was a huge return on these highly targeted emails.
David challenged the concept of big influencers who remain so over time. This can be a myth. The influence is fluid. Searching for these big influencers can be a false quest. Michael said the issues are credibility, bandwidth, relevance, time, channel alignment, and trust. You have to look at all six. Zach said that if you go to the Palms in Vegas you will be upgraded based on your Klout score. I would add that it used to be based on whether you were a big loser at the tables, as places like Harrah's did traditional data mining inside rather than looking outside for things like Klout scores.
David said that big data analytics gives us the ability to experiment quickly to compare things like marketing campaigns and do continuous refinement and measurement.