Published: 02 Oct 2017
Big data can mean big insights, big trends, big forecasts and big returns. And it can also mean big problems in managing the data that delivers all those goodies. Metadata -- the data that describes all the collected big data -- is the first line of defense. But if your metadata program isn't well planned and properly implemented, a big problem can get worse.
Let's suppose you're mining historical sales data to plot buying trends within a certain demographic, and your results will be passed on to marketing to plan a new campaign. If your demographic is millennials, and marketing defines that group by a different age range than the one you're using, then marketing's plans -- built on your results -- will be off-target. Unless you and marketing share one definition, your analysis won't deliver what marketing needs.
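To make the mismatch concrete, here is a minimal Python sketch. The two age ranges, the customer ages and every name in it are hypothetical, chosen only for illustration; none of them come from a real definition of millennials or from real sales data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeRange:
    """An inclusive age range used to define a demographic segment."""
    low: int
    high: int

    def contains(self, age: int) -> bool:
        return self.low <= age <= self.high


# Hypothetical definitions: assume the two teams never compared notes.
ANALYTICS_MILLENNIAL = AgeRange(21, 36)
MARKETING_MILLENNIAL = AgeRange(18, 34)

# Made-up customer ages standing in for historical sales records.
customer_ages = [17, 19, 22, 25, 29, 33, 35, 36, 40]


def segment(ages, definition):
    """Return the ages that fall inside the given definition."""
    return [age for age in ages if definition.contains(age)]


analytics_segment = segment(customer_ages, ANALYTICS_MILLENNIAL)
marketing_segment = segment(customer_ages, MARKETING_MILLENNIAL)

# Customers counted by one team but not by the other.
disputed = sorted(set(analytics_segment) ^ set(marketing_segment))
print(len(analytics_segment), len(marketing_segment), disputed)
```

With these invented numbers, the two teams disagree about three of the nine customers, so any trend computed on one segment describes a different population than the one marketing intends to target.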
The point is clear: A business operates best when everyone who depends on the data that describes data works from the same page. Here are several rules to consider in a metadata management process.
Work from the top down. Metadata was probably a localized corporate tool in the past. But as organizations de-silo their stores of information and the data is shared across several departments and lines of business, it's increasingly important to create an institutional metadata management process and taxonomy for your entire business with an eye toward eliminating small usage differences between departments. If that sounds bureaucratic, well, maybe it is -- but it's the kind of roll-up-your-sleeves effort that's ultimately worth the pain. This top-down approach means parsing data according to how it's used by the entire company, between departments and in concert with unstructured outside data. Interdepartmental variations should be addressed, and custom usages eliminated or replaced.
Make it a party. To achieve a level of agreement between the various departments, it's not enough to issue a proclamation from the mountaintop. It's essential to gather the people who actually use the terms in the same room to hash things out. They need to explain how and why they use a certain data description. Subtle uses of metadata date back to the days when every corporate and government office was filled with rogue Microsoft Access databases, which were built to circumvent an overworked IT department. Before the advent of big data, the people in the trenches invented clever uses for metadata. Be sure to invite those intrepid warriors to the party.
Centralize. Create a centralized metadata store that can be accessed by all the major systems your big data touches. The trend these days is toward cloud-based metadata stores, which major cloud vendors can provide.
Big data's big push
Forty-one percent of North American respondents to a TechTarget survey said big data and business analytics were among their broad initiatives for the year.
Plan for changes and updates. A strong institutional metadata store will be used heavily and inspire new uses and innovations for existing processes. In anticipation of that, design a process for easy submission of new ideas, thorough evaluation of merit and rapid deployment when necessary.
Don't forget your partners. Remember that you're increasingly sharing your data and therefore opening your systems to partner companies, which are most likely doing all the things you're doing with a metadata management process to handle your collected information. Think about any overlap with your partners and how they define the data that both parties consider important. Those conversations are at least as critical as the ones you have in-house.
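The rules above about centralizing definitions, planning for changes and checking overlap with partners can be sketched in a few lines of Python. This is an in-memory illustration only, not a real metadata product: the `MetadataStore` class, its methods and all the term definitions are hypothetical, and a production store would add persistence, access control and a real review step.

```python
from dataclasses import dataclass, field


@dataclass
class TermDefinition:
    name: str
    definition: str
    owner: str
    version: int = 1


@dataclass
class MetadataStore:
    """In-memory stand-in for a centralized metadata store (hypothetical)."""
    terms: dict = field(default_factory=dict)
    proposals: list = field(default_factory=list)

    def register(self, name, definition, owner):
        """Record the agreed, company-wide definition of a term."""
        self.terms[name] = TermDefinition(name, definition, owner)

    def propose(self, name, definition, submitted_by):
        """Queue a suggested change so it can be evaluated before deployment."""
        self.proposals.append((name, definition, submitted_by))

    def approve(self, index):
        """Deploy a queued proposal, bumping the version of an existing term."""
        name, definition, submitted_by = self.proposals.pop(index)
        current = self.terms.get(name)
        version = current.version + 1 if current else 1
        owner = current.owner if current else submitted_by
        self.terms[name] = TermDefinition(name, definition, owner, version)
        return self.terms[name]

    def conflicts_with(self, partner_terms):
        """Names both parties define, but with different definitions."""
        return sorted(
            name for name, definition in partner_terms.items()
            if name in self.terms and self.terms[name].definition != definition
        )


store = MetadataStore()
store.register("millennial", "age 18-34", owner="marketing")
store.register("active_customer", "purchase in last 90 days", owner="sales")

# A department submits a change; it is applied only once approved.
store.propose("active_customer", "purchase in last 60 days", submitted_by="analytics")
updated = store.approve(0)

# Made-up partner glossary, used to find terms worth a conversation.
partner_terms = {
    "millennial": "age 21-36",
    "active_customer": "purchase in last 60 days",
    "region": "ISO 3166-1 country code",
}
conflicts = store.conflicts_with(partner_terms)
print(updated.version, conflicts)
```

Comparing definitions as plain strings is a deliberate simplification; it flags only exact mismatches, which is enough to show where an in-house term and a partner's term need to be reconciled.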
Well-managed metadata and well-managed big data are inseparable. Doing a great job with one requires doing a great job with the other. Clean, well-defined metadata makes all the difference in delivering actionable business intelligence results.