Doctors say that "prevention is the best medicine." This holds true for data quality management, too. The sooner you identify potential problems, the sooner you can handle them, and at lower cost. This is the fundamental underpinning of any data quality management initiative in business.
Preventing data quality problems may seem like a "no-brainer," but you'd be surprised at how many data warehouse and business intelligence (BI) project teams don't plan for data quality issues. When you ask them why, the conventional response is that data quality management is the problem of the source systems. Although on one level that may be true, it's an awfully cavalier response from a team that's responsible for providing accurate data to business users.
Data quality management recommendations
So where does data quality management fit in your data warehouse and BI project plans? You need to make it an integral part of every phase of your project – from requirements through design and production. A few specific recommendations:
- Ask for data quality performance measures as part of your business requirements gathering and prioritizing process. The business needs to define data quality and its metrics. You can't fix it until you know there is a problem that you can measure.
- Determine, along with the business, how you are going to handle data quality issues both during the development process and when your processes are operational. The business must prioritize the problems that need to be monitored and fixed. And, most importantly, the business needs to agree on the price it will pay, in terms of resources, time and costs, to achieve its desired levels of data quality. If business users want high levels of data quality, they have to be willing to pay for it.
- Monitor your data quality using the agreed-upon performance measures from data sourcing through information consumption. This includes all your data extraction, transformation and loading processes, as well as all the data stores used by your reporting and BI environments. These data stores may include data warehouse(s), operational data stores, data marts, cubes and data shadow systems. You need to be able to measure the data quality at every stage where the data is touched, right up until the business user consumes it in a report or during an analysis.
- Create a data quality management dashboard to monitor the agreed-upon data quality performance measures. This allows business users and IT to understand your current data quality levels and then take the appropriate actions. These dashboards should include data quality trending reports that show whether the data is getting better or worse. Also create data quality alerts so that corrective action can be taken proactively. Don't wait for the business user to discover that the numbers were wrong after they have already used them to make critical business decisions.
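To make the monitoring, trending and alerting recommendations above more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption and not part of any particular tool: the `Measure` and `Result` classes, the `evaluate` and `trend` functions, the two sample measures, their thresholds and the sample rows are all hypothetical. The idea is to score each data store against the agreed-upon measures, raise an alert when a pass rate falls below the threshold the business accepted, and compare runs to see whether quality is getting better or worse.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


@dataclass
class Measure:
    """One business-agreed data quality performance measure (hypothetical)."""
    name: str
    check: Callable[[Row], bool]  # returns True when a row passes
    threshold: float              # minimum acceptable pass rate, 0.0 to 1.0


@dataclass
class Result:
    stage: str
    measure: str
    pass_rate: float
    alert: bool  # True when the pass rate is below the agreed threshold


def evaluate(stage: str, rows: List[Row], measures: List[Measure]) -> List[Result]:
    """Score one data store (stage) against every agreed measure."""
    results = []
    for m in measures:
        passed = sum(1 for row in rows if m.check(row))
        rate = passed / len(rows) if rows else 1.0  # empty input: nothing failed
        results.append(Result(stage, m.name, rate, rate < m.threshold))
    return results


def trend(previous: float, current: float) -> str:
    """Classify the change between two runs for a trending report."""
    if current > previous:
        return "improving"
    if current < previous:
        return "worsening"
    return "unchanged"


# Assumed measures and thresholds; in practice the business defines these.
MEASURES = [
    Measure("customer_id_present", lambda r: r.get("customer_id") is not None, 0.95),
    Measure("email_present", lambda r: r.get("email") is not None, 0.70),
]

# Made-up sample rows standing in for a source extract.
SOURCE_ROWS: List[Row] = [
    {"customer_id": "1", "email": "a@example.com"},
    {"customer_id": "2", "email": None},
    {"customer_id": "3", "email": "c@example.com"},
    {"customer_id": None, "email": "d@example.com"},
]

# A stand-in transformation: the load step drops rows without a customer_id.
WAREHOUSE_ROWS = [r for r in SOURCE_ROWS if r["customer_id"] is not None]

if __name__ == "__main__":
    for stage, rows in (("source", SOURCE_ROWS), ("warehouse", WAREHOUSE_ROWS)):
        for r in evaluate(stage, rows, MEASURES):
            flag = "ALERT" if r.alert else "ok"
            print(f"{r.stage:<10} {r.measure:<20} {r.pass_rate:6.1%} {flag}")
```

Measuring at both the source and the warehouse stage matters here: in this made-up data, the load step repairs the `customer_id_present` measure but pushes `email_present` below its threshold, which a check at only one stage would miss. A real dashboard would persist each run's results so that `trend` can compare them over time.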
Data quality shouldn't be an afterthought in your data warehouse and BI projects. If you aren't following these recommendations already, you have a lot of company: many organizations struggle with data quality management. That doesn't excuse you, though. The best course of action is to take steps to prevent data quality problems up front, so you can avoid the bitter medicine of dealing with them later.
About the author
Rick Sherman has more than 18 years of business intelligence and data warehousing experience, having worked on more than 50 implementations as an independent consultant and as a director/practice leader at a Big Five accounting firm. He founded Athena IT Solutions, a Stow, Mass.-based business intelligence consulting firm.
- Check out the complete list of Rick Sherman's contributions to SearchDataManagement.com -- including articles, podcasts and more.
This was first published in April 2007