
Data quality issues management

27 Feb 2007 | Jack E. Olson


The following excerpt from Data Quality: The Accuracy Dimension is printed with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2003.

Data quality investigations are all designed to surface problems with the data. This is true whether the problems come from stand-alone assessments or through data profiling services to projects. It also does not matter whether assessments reveal problems from an inside-out or an outside-in method. The output of all these efforts is a collection of facts that get consolidated into issues. An issue is a problem with the database that calls for action. In the context of data quality assurance, it is derived from a collection of information that defines a problem that has a single root cause or can be grouped to describe a single course of action.

That is clearly not the end of the data quality effort. Just identifying issues does nothing to improve things. The issues need to drive changes that will improve the quality of the data for the eventual users.

It is important to have a formal process for moving issues from information to action. It is also important to track the progress of issues as they go through this process. The disposition of issues and the results obtained from implementing changes as a result of those issues are the true documentation of the work done and value of the data quality assurance department.

Figure 5.1 shows the phases for managing issues after they are created. It does not matter who performs these phases. The data quality assurance department may own the entire process. However, much of the work lies outside this department. It may be a good idea to form a committee to meet regularly and discuss progress of issue activity. The leader of the committee should probably be from the data quality assurance department. At any rate, the department has a vested interest in getting issues turned into actions and in results being measured. They should not be passive in pursuing issue resolution. This is the fruit of their work.

An issue management system should be used to formally document and track issue activity. There are a number of good project management systems available for tracking problems through a workflow process.

The collection of issues and the management process can differ if the issues surface from a "services to project" activity. The project may have an issues management system in place to handle all issues related to the project. It certainly should. In this case, the data quality issues may be mixed with other issues, such as extraction, transformation, target database design, and packaged application modification issues. It is helpful if data quality issues are kept in a separate tracking database or are separately identified within a central project management system, so that they can be tracked as such. If "project services" data profiling surfaces the need to upgrade the source applications so that they generate less bad data, this work should be broken out into a separate project or subproject and managed independently.

Turning facts into issues

Data quality investigations turn up facts. The primary job of the investigations is to identify inaccurate data. The data profiling process will produce inaccuracy facts that in some cases identify specific instances of wrong values. In other cases the facts show that wrong values exist without revealing which value is wrong, and in yet other cases they only raise suspicions about the presence of wrong values.

Facts are individually granular. This means that each rule has a list of violations. You can build a report that lists rules, the number of violations, and the percentage of tests performed (rows, objects, groups tested) that violated the rule. The violations can be itemized and aggregated.
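A report of this kind can be sketched in a few lines of Python. This is only an illustration: the `RuleResult` record, the `violation_report` function, and the two sample rules are hypothetical names invented here, not part of any profiling tool described in the book.

```python
from dataclasses import dataclass


@dataclass
class RuleResult:
    rule: str        # hypothetical rule name
    tests: int       # rows, objects, or groups tested
    violations: int  # tests that violated the rule


def violation_report(results):
    """Return one (rule, violations, percent_of_tests) row per rule."""
    report = []
    for r in results:
        # Guard against rules that were never executed.
        pct = 100.0 * r.violations / r.tests if r.tests else 0.0
        report.append((r.rule, r.violations, round(pct, 2)))
    return report


# Made-up sample data, for illustration only.
results = [
    RuleResult("ship_date >= order_date", tests=2000, violations=50),
    RuleResult("customer_id is unique", tests=500, violations=5),
]

for rule, violations, pct in violation_report(results):
    print(f"{rule}: {violations} violations ({pct}% of tests)")
```

Keeping the itemized violations alongside these counts lets the same facts be aggregated later in other ways.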

Metrics

There is a strong temptation for quality groups to generate metrics about the facts and to "grade" a data source accordingly. Sometimes this is useful; sometimes not. Examples of metrics that can be gathered are:

  • number of rows containing at least one wrong value
  • graph of errors found by data element
  • number of key violations (nonredundant primary keys, primary/foreign key orphans)
  • graph of data rules executed and number of violations returned
  • breakdown of errors based on data entry locations
  • breakdown of errors based on data creation date
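Several of these metrics fall out of simple counting over the itemized violations. The sketch below assumes each violation is stored as a `(row_id, data_element, entry_location)` tuple; that layout and the sample values are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical violation records: (row_id, data_element, entry_location).
violations = [
    (1, "zip_code", "Austin"),
    (1, "phone", "Austin"),
    (2, "zip_code", "Boston"),
    (5, "birth_date", "Austin"),
]

# Number of rows containing at least one wrong value.
rows_with_errors = len({row_id for row_id, _, _ in violations})

# Errors found by data element (the numbers behind a bar graph).
errors_by_element = Counter(element for _, element, _ in violations)

# Breakdown of errors by data entry location.
errors_by_location = Counter(location for _, _, location in violations)

print(rows_with_errors)
print(errors_by_element.most_common())
print(errors_by_location.most_common())
```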

The data profiling process can yield an interesting database of errors derived from a large variety of rules. A creative analyst can turn this into volumes of graphs and reports. You can invent an aggregation value that grades the entire data source. This can be a computed value that weights each rule based on its importance and the number of violations. You could say, for example, that this database has a quality rating of 7 on a scale of 10.
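One possible way to compute such a grade is shown below. The excerpt does not give a formula, so the weighting scheme here (a weighted average of per-rule violation rates, mapped onto a 0 to 10 scale) is an assumption, as are the function name `quality_rating` and the sample numbers.

```python
def quality_rating(rules, scale=10):
    """Grade a data source from (weight, violations, tests) tuples.

    Assumed formula: scale * (1 - weighted average violation rate).
    """
    # Ignore rules that were never executed (tests == 0).
    executed = [(w, v, t) for w, v, t in rules if t > 0]
    total_weight = sum(w for w, _, _ in executed)
    if total_weight == 0:
        return float(scale)  # nothing measured, so nothing to penalize
    weighted_error = sum(w * (v / t) for w, v, t in executed)
    return scale * (1 - weighted_error / total_weight)


# Made-up sample: an important rule (weight 3) and a minor one (weight 1).
sample_rules = [
    (3, 250, 1000),  # 25% of tests violated the rule
    (1, 250, 500),   # 50% of tests violated the rule
]
rating = quality_rating(sample_rules)
print(f"Quality rating: {rating:.1f} on a scale of 10")
```

The choice of weights is a judgment call, which is part of why a single grade can be more or less useful depending on who reads it.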
