Guide to managing a data quality assurance program
Forward-thinking business executives recognize the value of establishing and institutionalizing best practices for enhancing data usability and information quality. But problems can arise if companies make piecemeal investments in aspects of data cleansing and correction. The absence of comprehensive data quality assurance and management processes leads to replicated efforts and increased costs; worse, it impedes the delivery of consistent information to the community of business users in an organization.
What's needed is a practical approach for aligning disparate data quality activities with one another to create an organized program that addresses the challenges of ensuring and maintaining high quality levels. Aside from engaging business sponsors and developing a business case for data quality investments -- both requirements in their own right -- here is a list of five tasks and procedures that are fundamental to effective data quality management and improvement efforts.
Document data quality requirements and define rules for measuring quality. In most cases, data quality levels are related to the "fitness" of information for the purposes of business users. Begin by collecting requirements: Engage business users, gain an understanding of their business objectives and solicit their expectations for data usability. That information, combined with shared experiences about the business impact of data quality issues, can be translated into rules for measuring key dimensions of quality, such as consistency of data value formats in different systems, data completeness, currency and freshness, and consistency with defined sources of record. As part of the process, create a central system for documenting the requirements and associated rules to support the development of data validation mechanisms.
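As a rough illustration of what documenting requirements alongside measurable rules could look like, here is a minimal Python sketch. The rule names, the customer fields (email, postal_code, last_updated), the five-digit postal code format and the one-year freshness threshold are all hypothetical; real rules would come out of the conversations with business users described above.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable


@dataclass(frozen=True)
class QualityRule:
    name: str
    dimension: str      # quality dimension the rule measures
    requirement: str    # business requirement the rule traces back to
    check: Callable[[dict, date], bool]  # True when the record passes


# Hypothetical rules for a hypothetical customer record.
RULES = [
    QualityRule(
        "email_present", "completeness",
        "Marketing needs a contact email for every customer",
        lambda r, today: bool(r.get("email")),
    ),
    QualityRule(
        "postal_code_format", "format consistency",
        "Shipping expects five-digit postal codes in every system",
        lambda r, today: re.fullmatch(r"\d{5}", r.get("postal_code") or "") is not None,
    ),
    QualityRule(
        "updated_within_year", "currency",
        "Sales wants customer details confirmed within the past year",
        lambda r, today: r.get("last_updated") is not None
        and today - r["last_updated"] <= timedelta(days=365),
    ),
]


def failed_rules(record: dict, today: date) -> list:
    """Return the names of the documented rules that a record violates."""
    return [rule.name for rule in RULES if not rule.check(record, today)]
```

Keeping the business requirement next to each check is one way to make the rule catalog double as documentation; a production program would more likely store it in a shared repository than in source code.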
Assess new data to create a quality baseline. A repeatable process for statistical data quality assessment helps to augment the set of quality-measurement rules by checking source systems for potential anomalies in newly created data. Statistical and data profiling tools can scan the values, columns and relationships in and across data sets, using frequency and association analyses to evaluate data values, formats and completeness and to identify outlier values that might indicate errors. In addition, profiling tools can feed information back to data quality and governance managers about things such as data types, the structure of relational databases and the relationships between primary and foreign keys in databases. The findings can be shared with business users to help in developing the rules for validating data quality downstream.
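To make the profiling idea concrete, the following Python sketch computes completeness, distinct counts and a frequency analysis of value formats for a single column, then flags values whose format is rare. It is a simplified stand-in for a commercial profiling tool; the function names and the 25% rarity threshold are assumptions chosen for illustration.

```python
import re
from collections import Counter


def format_pattern(value: str) -> str:
    """Reduce a value to its shape: digits become 9, ASCII letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def profile_column(values: list) -> dict:
    """Summarize completeness, distinct values and format frequencies."""
    non_null = [str(v) for v in values if v not in (None, "")]
    patterns = Counter(format_pattern(v) for v in non_null)
    return {
        "count": len(values),
        "completeness": len(non_null) / len(values) if values else 0.0,
        "distinct": len(set(non_null)),
        "patterns": patterns.most_common(),  # most frequent format first
    }


def rare_pattern_values(values: list, max_share: float = 0.25) -> list:
    """Return values whose format covers less than max_share of populated
    values. These are candidates for review, not confirmed errors."""
    non_null = [str(v) for v in values if v not in (None, "")]
    if not non_null:
        return []
    counts = Counter(format_pattern(v) for v in non_null)
    return [v for v in non_null
            if counts[format_pattern(v)] / len(non_null) < max_share]
```

Run against a postal code column, for instance, a profile like this would surface the share of empty values and any entries that do not match the dominant five-digit shape, which can then be reviewed with business users.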
Implement semantic metadata management processes. As the number and variety of data sources grow, there is a corresponding need to limit the risk that end users in different parts of an organization will misinterpret the meanings of common business terms and data concepts. Centralize the management of business-relevant metadata and enlist business users and data management practitioners to collaborate on establishing corporate standards to reduce the number of situations in which inconsistent interpretations lead to data usage problems.
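One small piece of this work that lends itself to automation is spotting business terms that different departments define differently. The Python sketch below assumes a hypothetical input of (term, department, definition) entries collected into a central glossary; it only detects conflicts, and resolving them remains a matter for the people involved.

```python
from collections import defaultdict


def conflicting_terms(entries) -> dict:
    """Group definitions by term and return the terms that have more than
    one distinct definition across departments.

    entries: iterable of (term, department, definition) tuples.
    Terms and definitions are compared case-insensitively, with runs of
    whitespace collapsed, so trivial differences are not reported.
    """
    by_term = defaultdict(dict)
    for term, department, definition in entries:
        key = term.strip().lower()
        by_term[key][department] = " ".join(definition.lower().split())
    return {term: defs for term, defs in by_term.items()
            if len(set(defs.values())) > 1}
```

A report like this could serve as the agenda for the collaborative sessions in which business users and data practitioners agree on a corporate standard for each disputed term.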
Check data validity on an ongoing basis. Develop automated services for validating data records against the quality rules you've defined. A strategic implementation enables the rules and validation mechanisms to be shared across applications and deployed at various locations in an organization's information flow for continuous data inspection and quality measurement. The results can be fed into a variety of reporting schemes -- for example, direct notifications and alerts sent to data stewards to address acute anomalies and high-priority data flaws, and data quality dashboards and scorecards with aggregated metrics for a wider audience.
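The two reporting paths mentioned here, alerts for acute problems and aggregated scorecard metrics, can be sketched in a few lines of Python. This is an illustrative outline rather than a production service: the Rule class, the "high" and "low" severity labels and the pass-rate metric are assumptions, and a real deployment would route alerts to data stewards instead of returning them in a list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str                  # "high" triggers an alert; "low" is scorecard-only
    check: Callable[[dict], bool]  # True when the record passes


def inspect(records: list, rules: list):
    """Validate every record against every rule.

    Returns (alerts, scorecard): alerts is a list of (record index, rule name)
    pairs for high-severity failures, and scorecard maps each rule name to
    the fraction of records that passed it.
    """
    failures = Counter()
    alerts = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name] += 1
                if rule.severity == "high":
                    alerts.append((index, rule.name))
    total = len(records)
    scorecard = {
        rule.name: 1 - failures[rule.name] / total if total else 1.0
        for rule in rules
    }
    return alerts, scorecard
```

Because the rules are passed in as data, the same inspect function could be called from several points in an information flow, which is the kind of sharing across applications the paragraph above describes.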
Keep on top of data quality problems. Develop a platform for logging, tracking and managing data quality incidents. Measuring compliance with your data quality rules won't lead to improvements unless there are standard processes for evaluating and eliminating the root causes of data errors. An incident management system can automate processes such as reporting and prioritizing data quality issues, alerting interested parties, assigning data quality improvement tasks and tracking the progress of remediation efforts.
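A bare-bones version of such an incident log might look like the Python sketch below. The class names, status values and numeric priority scheme are hypothetical, and an actual incident management system would add persistence, notifications and an audit trail. One design choice worth noting: an incident cannot be closed without a recorded root cause, which reflects the point that measurement alone does not produce improvement.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count
from typing import Optional


class Status(Enum):
    OPEN = "open"
    ASSIGNED = "assigned"
    RESOLVED = "resolved"


@dataclass
class Incident:
    id: int
    summary: str
    priority: int                     # 1 is the most urgent
    status: Status = Status.OPEN
    assignee: Optional[str] = None
    root_cause: Optional[str] = None


class IncidentLog:
    """In-memory log for reporting, assigning and resolving incidents."""

    def __init__(self):
        self._ids = count(1)
        self._incidents = {}

    def report(self, summary: str, priority: int) -> Incident:
        incident = Incident(next(self._ids), summary, priority)
        self._incidents[incident.id] = incident
        return incident

    def assign(self, incident_id: int, steward: str) -> None:
        incident = self._incidents[incident_id]
        incident.assignee = steward
        incident.status = Status.ASSIGNED

    def resolve(self, incident_id: int, root_cause: str) -> None:
        # Require a root cause so fixes address causes, not just symptoms.
        if not root_cause.strip():
            raise ValueError("a root cause is required to resolve an incident")
        incident = self._incidents[incident_id]
        incident.root_cause = root_cause
        incident.status = Status.RESOLVED

    def backlog(self) -> list:
        """Unresolved incidents, most urgent first, then oldest first."""
        pending = [i for i in self._incidents.values()
                   if i.status is not Status.RESOLVED]
        return sorted(pending, key=lambda i: (i.priority, i.id))
```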
Done properly, these activities form the backbone of a proactive data quality assurance and management framework with controls, rules and processes that can enable an organization to identify and address data flaws before they cause negative business consequences. In the end, fixing data errors and inconsistencies and making sure their root causes are dealt with will enable broader and more effective utilization of data, to the benefit of your business.
About the author:
David Loshin is president of Knowledge Integrity Inc., a consulting, training and development company that focuses on information management, data quality and business intelligence. He also is the author of four books, including The Practitioner's Guide to Data Quality Improvement and Master Data Management.