A recent survey conducted by the McKinsey Global Institute indicates that as a company prepares a “big data” project, properly structuring the project team is a major determinant of that project’s likelihood of success. Using big data to glean useful business intelligence from large quantities of stored information is a different practice from many other corporate IT tasks, and it requires a different way of thinking about the roles of the professionals who work on it. Instead of just hiring computer science or statistics experts, companies would be well served to look for people who fit into five distinct roles:
- Data cleaners. This front-line position works with incoming data to ensure that it’s not only accurate but also consistently formatted. For instance, if a company merges two sets of accounting data, one stored quarterly and one stored yearly, without first reconciling the reporting periods, the combined records will be inconsistent and any totals drawn from them unreliable. Cleaners also ensure that field names are consistent throughout the entire database.
- Data spotters. Many “big data” projects start out with unreasonably big data pools. Spotters go through the database to identify which data is useful to the overall project and which data adds very little useful information or is stored in a way that makes it unsuitable for use in the project.
- Analysis organizers. Once clean, accurate data is in the database, it needs to be indexed and structured so that it can be accessed and analyzed. The organization team is also responsible for setting policies on how frequently the data is updated.
- Model builders. The model building team has the expertise to create sophisticated models that turn the mountains of data into useful information that explains and predicts customer behavior. Once the models are created, the builders also continue updating and tweaking them so that they produce results of consistent quality and reliability.
- Implementation experts. Big data projects don’t exist in isolation. Implementation experts take the output from the model builders and develop systems that put the new information to use. For instance, they would be responsible for designing new and more effective cross-sell prompts on e-commerce product listings or at the point of sale in a retail business, based on output from the big data models.
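
To make the data cleaners' accounting example more concrete, here is a minimal Python sketch of the kind of reconciliation involved. It assumes two small in-memory data sets and is not drawn from the survey. Every name in it (`FIELD_ALIASES`, `normalize_fields`, `quarterly_to_yearly`, `merge_yearly`, and the field names themselves) is hypothetical and chosen only for illustration. The idea is to make field names consistent first, roll quarterly figures up to yearly totals, and only then merge.

```python
from collections import defaultdict

# Hypothetical mapping from inconsistent source field names to canonical ones.
FIELD_ALIASES = {
    "Rev": "revenue",
    "Revenue": "revenue",
    "FY": "year",
    "Year": "year",
    "Qtr": "quarter",
}


def normalize_fields(record):
    """Rename known aliases to canonical names; leave other keys unchanged."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def quarterly_to_yearly(records):
    """Sum quarterly revenue into a single total per year."""
    totals = defaultdict(float)
    for record in records:
        totals[record["year"]] += record["revenue"]
    return [{"year": year, "revenue": total} for year, total in sorted(totals.items())]


def merge_yearly(*sources):
    """Combine yearly records, refusing to silently overwrite conflicting values."""
    merged = {}
    for source in sources:
        for record in source:
            year = record["year"]
            if year in merged and merged[year] != record["revenue"]:
                raise ValueError(f"conflicting revenue for {year}")
            merged[year] = record["revenue"]
    return [{"year": year, "revenue": revenue} for year, revenue in sorted(merged.items())]


# Made-up sample data: one source stores quarters, the other stores whole years.
quarterly = [{"FY": 2022, "Qtr": q, "Rev": 100.0} for q in (1, 2, 3, 4)]
yearly = [{"Year": 2021, "Revenue": 350.0}]

cleaned_quarterly = quarterly_to_yearly([normalize_fields(r) for r in quarterly])
cleaned_yearly = [normalize_fields(r) for r in yearly]
combined = merge_yearly(cleaned_yearly, cleaned_quarterly)
print(combined)
```

Raising an error on conflicting values, rather than letting one source overwrite the other, is one possible design choice; a real project might instead log the conflict and route it to a person for review.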