Big Data is big business. Companies like Google and LinkedIn make no secret of their practice of mining their data to gain insights about customers and their behaviors. The people who do this at organizations like Facebook hold the job title “data scientist,” and consumers are increasingly wondering: Who are these data magicians, and what do they do?
Job requirements for data scientists are both concrete and nebulous: the person must know at least one programming language such as Python or JavaScript, be able to work with large sets of data, and be able to extrapolate from that data. These data wizards must be strong in statistics, be skilled communicators, and have a mind flexible enough “to ask questions you didn’t know you had” about data. For these skills, they are paid an average salary of almost $132,000.
Ooyala data scientist Matt Pasienski says his job requires proficiency with a myriad of visualization tools and with processing data using tools like MySQL, Hadoop and Storm. He stresses that data scientists must understand the business and marketing side of a company, as well as the technology side. If a company cannot afford to hire someone to ask key questions about its data, it can use software instead. Metamarkets founder Mike Driscoll describes a data scientist as “a person who can take big data, analyze it, and tell stories from the data.”
Finding people with such a broad bag of skills, and having the budget to pay them, is often a challenge for smaller start-ups, especially as data scientists generally work in teams. LinkedIn’s teams consist of five people while Facebook originally had a team of twelve. Big Data companies like Tableau and Metamarkets provide software and services that fill the data scientist niche. These companies provide dashboards that allow business analysts and company executives to analyze their own data, without the need for such deep technical skills.
Scores of companies are advertising for data scientists, especially in areas like Silicon Valley. A 2012 McKinsey and Company report predicted a huge shortage of talent in this area, stating that “the United States alone will need 1.5 million new analysts and data managers to consume and question Big Data.” Experts are seeing a shortage of data scientists worldwide.
Companies want to know why customers are making buying decisions, how they are moving through websites, and how companies can engage these customers. Dr. Hadley Wickham, a statistics professor at Rice University, cites an example of “great data science” at Progressive Car Insurance. One of the company's data scientists noticed an upswing in web searches for “how to make a Flo costume for Halloween.” The company set up a “Dress like Flo” page that received tons of traffic, and Progressive saw a spike in sales as a result.
Technology Review stated in 2012 that Facebook planned to double its data team in 2013. So who are Facebook’s data scientists, and what are they doing to influence the way you use the site? This is the team that helps Facebook make money off its users’ personal information, be it life achievements, photos or conversations. These sociologists and data scientists also created features like “people you may know” to encourage friending of acquaintances. Facebook’s data scientists also routinely conduct experiments examining user behavior in areas as diverse as bullying and shopping.
Earlier this summer, Facebook came under fire for its now-famous “emotion manipulation study.” One of Facebook’s ex-data scientists, Andrew Ledvina, told a Wall Street Journal reporter that the company’s data scientists had free rein to run tests, “as long as they didn’t annoy Facebook’s users.” Ledvina was with the company from 2012 to 2014, and was part of the study, which was conducted on 700,000 unwitting Facebook customers. For seven days, Facebook manipulated News Feeds to show users more positive or negative content in order to study whether customers responded more positively or negatively themselves. Spokespeople for the company state “their user agreement has always stated that user information could be used to enhance services,” but a formal complaint has been lodged by the Electronic Privacy Information Center. Forbes stated the word “research” was added to the user agreement four months after this study was completed.
The science of analyzing Big Data, particularly at Facebook, is more complex than users might think. Justin Moore, a Facebook engineering manager in New York, describes the data teams as diverse and “full of problem solvers.” In answer to “who we are,” Moore states that Facebook’s data science team is made up of PhDs who “combine machine learning with crowd-sourcing to create a better user experience. There might be a sociologist working alongside a web product engineer.” The important thing to remember about Big Data: when the software is free, users’ information is the real product.
By Jenny Hansen