This article originally appeared on the BeyeNETWORK.
Successful companies are usually skilled at wringing as much value as possible from the majority of their identified assets. These days, the amount of data being created and managed by even small companies is enormous. Data was once measured in megabytes and gigabytes; terabytes and petabytes are now the norm. Many companies are now realizing that this information is a substantial potential asset, if it can be harnessed properly. Unfortunately, all too often, data quality gets in the way. However, there are ways to improve data quality substantially and, with it, business results.
That’s good news because data quality is important, especially with today’s executives forced to make vital decisions quickly. High quality data – complete, accurate and timely – helps companies accomplish goals such as improving customer service and client satisfaction. It can also help accomplish other initiatives, such as improving profitability by targeting the right customers and prospects.
Conversely, there are consequences for poor data – sometimes dire ones. Organizations that are uninformed about the marketplaces in which they compete, for example, make ill-advised business decisions; they pursue empty leads or ignore potentially fruitful ones.
Data Quality Defined
Data quality is, simply put, the practice of providing complete, accurate, timely and consistent information. It can be divided into two primary concepts: quality and consistency. Quality measures data’s correctness, completeness and accuracy. Does a mailing address, for example, feature the correct name? And does the postal service accept the address as valid? Consistency refers to how closely a data element matches across the applications that store it.
From a consistency standpoint, different versions of the same name, such as Robert, Bob or Bobby Smith of 104 East Main St., are fine. In the same vein, 104 E. Main Street would be an acceptable variation of Mr. Smith’s address. However, if Mr. Smith’s name were Robert in one system and Francis in another, the data would be inconsistent.
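The consistency rule above can be sketched in a few lines of Python. This is only an illustration: the lookup tables `NICKNAMES` and `STREET_ABBREVIATIONS` and the record layout are assumptions made for the example, and a production system would rely on much larger reference lists.

```python
# Hypothetical lookup tables, deliberately tiny; not a real reference list.
NICKNAMES = {"bob": "robert", "bobby": "robert", "rob": "robert"}
STREET_ABBREVIATIONS = {"e": "east", "st": "street"}


def canonical_first_name(name: str) -> str:
    """Map a nickname to its formal form; unknown names pass through."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def canonical_address(address: str) -> str:
    """Lowercase, drop periods, and expand known abbreviations."""
    words = address.lower().replace(".", "").split()
    return " ".join(STREET_ABBREVIATIONS.get(word, word) for word in words)


def is_consistent(record_a: dict, record_b: dict) -> bool:
    """Two records are consistent if name and address agree after normalization."""
    same_name = canonical_first_name(record_a["first_name"]) == canonical_first_name(
        record_b["first_name"]
    )
    same_address = canonical_address(record_a["address"]) == canonical_address(
        record_b["address"]
    )
    return same_name and same_address


crm = {"first_name": "Robert", "address": "104 East Main St."}
billing = {"first_name": "Bobby", "address": "104 E. Main Street"}
support = {"first_name": "Francis", "address": "104 East Main St."}
```

Here `is_consistent(crm, billing)` holds because both records normalize to the same name and address, while `is_consistent(crm, support)` does not. Note that a blanket rule such as "st means street" would misread "St. Louis", which is one reason real address matching is usually left to dedicated software.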
Considering key data characteristics collectively and individually further helps organizations understand data quality. These characteristics include:
Testing for data existence is simply determining if it is present when it should be. Has a field on a form been populated, or left empty? Sometimes, a non-blank value represents blank data. That is, if a “date” field shows “January 1, 1900,” it could be a default entry – no data was entered in that field. Although there is a value, the data may not exist.
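A minimal sketch of an existence test follows, assuming the organization knows which placeholder values its systems write when nothing was entered. The sets `SENTINEL_DATES` and `BLANK_STRINGS` are illustrative assumptions, not a standard list.

```python
from datetime import date

# Assumed placeholders; each organization would list its own defaults.
SENTINEL_DATES = {date(1900, 1, 1)}
BLANK_STRINGS = {"", "n/a", "none"}


def exists(value) -> bool:
    """Return True only if the field holds real data, not a blank or a default."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip().lower() not in BLANK_STRINGS
    if isinstance(value, date):
        return value not in SENTINEL_DATES
    return True
```

With these assumptions, `exists(date(1900, 1, 1))` is false even though the field is populated, which matches the "January 1, 1900" case described above.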
Companies can assign criteria to certain data fields to determine form. Social security numbers, for instance, are not valid unless they have nine digits. Addresses may be considered improper without a street number or name, or an identifier such as Road, Way or Drive.
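Form checks like these are often expressed as patterns. The sketch below is one possible reading of the two rules above; the suffix list `STREET_SUFFIXES` is an assumption and far from exhaustive, and the address rule (a leading street number plus a recognized identifier) is a simplification.

```python
import re

NINE_DIGITS = re.compile(r"[0-9]{9}")
# Assumed, incomplete list of street identifiers and common abbreviations.
STREET_SUFFIXES = {"road", "rd", "way", "drive", "dr", "street", "st", "avenue", "ave"}


def has_valid_ssn_form(ssn: str) -> bool:
    """Accept exactly nine digits, ignoring spaces and hyphens used as separators."""
    digits = re.sub(r"[\s-]", "", ssn)
    return NINE_DIGITS.fullmatch(digits) is not None


def has_valid_address_form(address: str) -> bool:
    """Require a leading street number and a recognized street identifier."""
    words = address.replace(".", "").replace(",", "").lower().split()
    has_number = bool(words) and words[0].isdigit()
    has_suffix = any(word in STREET_SUFFIXES for word in words)
    return has_number and has_suffix
```

A form check says nothing about whether the number belongs to the person or whether the address exists; it only filters out values that cannot possibly be right.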
Does the data seem legitimate? Does the space marked for a person’s first name feature something that resembles, or could resemble, a name? Does it feature consonants and vowels? Or is it populated by digits and other non-alphabetic characters?
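The questions above translate into a rough heuristic, sketched below. The function name `looks_like_name` and the choice to treat "y" as a vowel are assumptions for illustration. Because some legitimate names contain only vowels or unusual characters, a failed check is better treated as a flag for review than as proof of bad data.

```python
VOWELS = set("aeiouy")


def looks_like_name(value: str) -> bool:
    """Heuristic: letters only (hyphens, apostrophes, spaces allowed),
    with at least one vowel and one consonant."""
    letters = value.strip().replace("-", "").replace("'", "").replace(" ", "")
    if not letters or not letters.isalpha():
        return False
    lowered = letters.lower()
    has_vowel = any(ch in VOWELS for ch in lowered)
    has_consonant = any(ch not in VOWELS for ch in lowered)
    return has_vowel and has_consonant
```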
Verification that data is correct means determining if its content is accurate. Correctness can often be verified through software programs. Postal software, for instance, helps companies determine if data represents a Coding Accuracy Support System (CASS)-certified address, i.e., one to which mail can be delivered.
- Domain Testing
Companies can often validate data quality by establishing parameters. For instance, a field marked “gender” is acceptable only if the value is Male, Female or Unknown. Other kinds of values, such as dates and numbers, can also be validated, for example against a permitted range or a standard deviation threshold. A standard deviation domain test could reject pads of paper that cost more than $100, for example.
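Both kinds of domain test can be sketched briefly. The reference prices in `PAD_PRICES` are invented for illustration, and the three-standard-deviation cutoff is an assumed default rather than a recommendation.

```python
from statistics import mean, stdev
from typing import Sequence

ALLOWED_GENDERS = {"Male", "Female", "Unknown"}


def in_gender_domain(value: str) -> bool:
    """Enumerated domain: only the listed values are acceptable."""
    return value in ALLOWED_GENDERS


def within_std_dev(
    value: float, reference: Sequence[float], max_deviations: float = 3.0
) -> bool:
    """Statistical domain: accept values close to the mean of known-good data."""
    center = mean(reference)
    spread = stdev(reference)
    if spread == 0:
        return value == center
    return abs(value - center) <= max_deviations * spread


# Hypothetical historical prices for a pad of paper, in dollars.
PAD_PRICES = [1.99, 2.49, 2.25, 3.10, 2.75]
```

Against these reference prices, a pad priced at $2.60 passes, while one priced at $100.00 falls far outside three standard deviations and is rejected.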
Results of one characteristic frequently enable an organization (or system) to infer other values. Someone named Betsy can be inferred to be female, while someone named Frank can be assumed to be male. Or, a product and quantity can determine a price. As such, even data that does not exist can often be uncovered. In fact, using data to derive other data is an excellent way to improve data quality. A word of caution though: inferring values sometimes creates data quality problems of its own. The name Bobby, for example, may suggest that the person’s gender should be set to male; however, occasionally, that inference may be incorrect.
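One way to respect that caution is to infer a value only when the source data is unambiguous and to fall back to "Unknown" otherwise. The sketch below assumes small hypothetical lookup tables (`NAME_TO_GENDER`, `AMBIGUOUS_NAMES`, `UNIT_PRICES`); none of them comes from a real reference source.

```python
from typing import Optional

# Hypothetical reference tables for illustration only.
NAME_TO_GENDER = {"betsy": "Female", "frank": "Male"}
AMBIGUOUS_NAMES = {"bobby"}  # names the organization chooses not to guess on
UNIT_PRICES = {"paper pad": 2.50}


def infer_gender(first_name: str) -> str:
    """Infer gender from a first name, refusing to guess on ambiguous names."""
    key = first_name.strip().lower()
    if key in AMBIGUOUS_NAMES:
        return "Unknown"
    return NAME_TO_GENDER.get(key, "Unknown")


def infer_price(product: str, quantity: int) -> Optional[float]:
    """Derive a missing price from product and quantity, if the product is known."""
    unit_price = UNIT_PRICES.get(product)
    if unit_price is None:
        return None
    return round(unit_price * quantity, 2)
```

Returning "Unknown" or `None` keeps an uncertain guess from being stored as if it were a fact, and it is usually worth recording that a value was inferred rather than entered.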
The Value of Quality Information
High quality information offers real, often considerable, business value. If it did not, it would not be worth the effort to make data quality improvements. However, the term high quality is subjective and does not usually mean perfect. If the business need is relatively general, such as a market analysis to determine trending, a lower accuracy level may suffice. Financial transaction processing requires a higher level of quality, but only for critical fields. The difference in cost and effort between creating records that are 100 percent accurate and complete and 80 percent accurate and complete can be dramatic. Instead of chasing perfection, companies should identify the level of data quality that meets their business needs, and then focus on the tasks and methods associated with achieving it.
Concrete examples of quality data making a profound difference are not uncommon. Consider a bank with a customer named Mrs. Margaret Casey who has a checking account with $500, a relatively modest sum that would not lead most banks to consider her a critical customer. She may even be considered problematic if her account requires significant maintenance (that is, if she calls frequently). However, Mrs. Casey also has a brokerage account worth $5 million under the name Meg Casey, which features a different mailing address. Finally, she is CFO of a company with $100 million in assets with the bank.
With a business focus on data quality, the bank will recognize Margaret, Meg, and CFO Casey as the same person, and interact with her accordingly. As a result, Margaret Casey may become a more satisfied customer and lock in her business with the institution.
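Recognizing the two Casey records as one customer could be sketched as follows. Because the brokerage account carries a different mailing address, this example matches on name alone, using an assumed nickname table. That is a strong simplification: name-only matching will merge unrelated people who share a name, so a real system would compare further identifiers before linking records.

```python
from collections import defaultdict

# Hypothetical nickname table; illustrative only.
NICKNAMES = {"meg": "margaret", "maggie": "margaret", "peggy": "margaret"}


def customer_key(first_name: str, last_name: str) -> tuple:
    """Build a matching key from the formal first name and the last name."""
    first = first_name.strip().lower()
    return (NICKNAMES.get(first, first), last_name.strip().lower())


def total_relationship(accounts: list) -> dict:
    """Sum balances across all accounts that share a customer key."""
    totals = defaultdict(float)
    for account in accounts:
        key = customer_key(account["first_name"], account["last_name"])
        totals[key] += account["balance"]
    return dict(totals)


accounts = [
    {"first_name": "Margaret", "last_name": "Casey", "type": "checking", "balance": 500.0},
    {"first_name": "Meg", "last_name": "Casey", "type": "brokerage", "balance": 5_000_000.0},
]
```

Viewed separately, the checking record looks like a $500 customer. Once linked, the same person holds $5,000,500 with the bank, before her role at the corporate client is even considered.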
Similarly, just as quality data helped the bank identify Mrs. Casey as an important customer, it can help an organization pinpoint its most profitable customers or brightest prospects. That kind of information can reduce the costs of winning new business dramatically and make a company far more focused and efficient.
Consequences of Poor Data Quality
When companies cannot make such connections, trouble ensues. Often, poor data quality shows up in mailings. Mailings, which can cost a company $10 to $15 per piece, require accurate data to reach their target audience. When items are sent to an undeliverable or wrong address, or when several identical pieces accumulate in the same customer’s mailbox, unnecessary expenses add up quickly. For large-scale mailings, poor quality data can waste tens of thousands of dollars, not to mention the annoyed customers who either do not receive the piece or receive multiple copies of the same solicitation.
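The arithmetic is easy to check. The piece counts below are invented for illustration; only the $10 to $15 cost per piece comes from the figures above.

```python
def wasted_mailing_cost(
    undeliverable_pieces: int, duplicate_pieces: int, cost_per_piece: float
) -> float:
    """Cost of pieces that never reach a reader or reach the same one twice."""
    return (undeliverable_pieces + duplicate_pieces) * cost_per_piece


# Hypothetical 50,000-piece mailing: 2,000 undeliverable, 1,500 duplicates.
low_estimate = wasted_mailing_cost(2_000, 1_500, 10.0)   # 35000.0
high_estimate = wasted_mailing_cost(2_000, 1_500, 15.0)  # 52500.0
```

Under these assumed counts, 7 percent of the mailing is wasted, which amounts to between $35,000 and $52,500.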
Failing to recognize a customer properly can also be disastrous. If Mrs. Casey received poor customer service on her $500 checking account or if she received three copies of the same brochure, she might be angry enough to transfer all of her funds to a competitor. Perhaps worst of all, the bank may never know what drove her away.
Although data enhancement is almost always a good idea, there are exceptions. For instance, when a company undertakes a data quality initiative, it must weigh “lift,” or data improvement, against the errors that invariably occur. Software packages can help an organization achieve dramatic gains in data completeness and accuracy. However, if the software introduces inaccuracies that were not previously there, which sometimes happens, the problems it creates may negate the value it provides. There are tradeoffs between improving data accuracy and introducing new errors.
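The tradeoff can be made concrete with a simple calculation. This sketch treats every error as equally costly, which is an assumption; in practice an error in a critical field may outweigh many minor fixes. All record counts are invented for illustration.

```python
def net_lift(records_fixed: int, errors_introduced: int) -> int:
    """Records improved minus records the tool made worse."""
    return records_fixed - errors_introduced


def accuracy_after(
    total_records: int, errors_before: int, records_fixed: int, errors_introduced: int
) -> float:
    """Share of accurate records once fixes and new errors are both counted."""
    errors_after = errors_before - records_fixed + errors_introduced
    return (total_records - errors_after) / total_records
```

For a hypothetical file of 100,000 records with 20,000 errors, a tool that fixes 8,000 records but introduces 1,200 new errors yields a net lift of 6,800 records and raises accuracy from 80 percent to 86.8 percent. If the new errors outnumbered the fixes, the net lift would be negative and the initiative would leave the data worse off.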
The vast majority of the time, however, data quality programs are more than worthwhile. They help turn data into a strategic asset, something organizations can use to optimize sales, marketing and almost every other function. When a company can depend on its data, the benefits are felt throughout the enterprise, as well as by customers, partners and suppliers. To be successful, a company must begin with a clear business objective, define the appropriate parameters for data quality, measure the investment versus the benefit, and understand the issues critical to maximizing the value of information assets.