Mastering Master Data

There is a lot of information out there about master data, yet many still find it confusing. This is the first part of a series on strategic thinking about master data.

This article originally appeared on the BeyeNETWORK.

After my last article was posted, I heard from a data professional in a large financial services company that is “solving” master data with a technical fix that will be expensive, but it still won't get to the heart of the data issues – such as knowing who the customer is. This person is frustrated because, once again, an IT-driven initiative very focused on the technology platform is seen as the solution to the problem. I value hearing about readers’ concerns, so this first article on master data defines what it is and why it is a business problem, first and foremost.

Master data is emerging as an essential business capability. It can provide accuracy and control over critical business information and objects. Yet success with master data is not being realized because it remains poorly understood in the business community. The root cause is that master data is viewed as a data problem rather than a business problem.

What is master data? Unfortunately, this simple question is difficult to answer because master data has consistently been viewed as a technical problem arising from duplicated data used in an inconsistent manner by business units and application systems. Consider this definition from Wikipedia: “Master Data Management (MDM), also known as Reference Data Management (NB: I’ll make the distinction between master and reference data later in this analysis), is a discipline in Information Technology (IT) that focuses on the management of reference or master data that is shared by several disparate IT systems and groups. MDM is required to warrant consistent computing between diverse system architectures and business functions.”

Or, consider this view of master data from S. Jae Yang: “A new way to correct the age-old problem in companies that the left hand does not know what the right hand is doing. The goal: Merge all the disparate, oft-conflicting records you have on customers and transactions into one authenticated master file.”

And, consider one last definition from David Loshin, a respected expert on data and data quality: “Master data sets are synchronized copies of core business entities used in traditional or analytical applications across the organization, and subjected to enterprise governance policies, along with their associated metadata, attributes, definitions, roles, connections and taxonomies.”

With the prevailing view that master data is a technical issue, it is no wonder that master data is poorly understood by businesspeople. Master data is first and foremost a business issue, and treating its technical aspects treats the symptoms, not the underlying problem. Unless this basic fact is accepted and then acted upon, a master data initiative will fail because it will influence the information technology environment only, not the business and its operation.

Why Master Data Is a Business Problem

Where can a businessperson go for an accurate list of all the company’s customers, products, suppliers, or contracts? The business needs access to pertinent data on these entities for many purposes, including validating customers and suppliers, orders and invoices, and receivables and payables. Every business has this critical information distributed among several transaction-processing and back-office systems, but having it available in one place allows the business to deal effectively with these issues and more. If no such single place exists, these business entities cannot be controlled or managed effectively. Creating this single place of business information for critical business entities is the purpose of master data.

My definition of master data is this: “Master data is the official data representation of the real entities that are part of the business, where real entities are those that physically exist in the world.” Let’s look at this definition piece by piece:

  • Official means that master data is always, always the correct, accurate, complete, and, for all purposes, official information about the physical business entity. As such, master data is a critical element for regulatory compliance and audit support.

  • Data representation means that master data is data, not the thing itself. However, images of products, offices, employees, contracts, and so forth can also be elements of master data.

  • Real entities are those that physically exist in the world, including customers, suppliers, products, contracts, office locations, employees, and anything else a company creates or uses in doing business.

  • That are part of the business means that only the entities and facts pertinent to the business are included in master data.

Looked at in this manner, master data is the accurate recording of data attributes about critical business entities that can be correlated with the “real world.” Master data is the representation of all that is real in the business.
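
As a rough illustration of this definition, the sketch below models one official record for a customer together with the cross-references that tie it to the copies held in other systems. The class names, fields, system names (`billing`, `crm`), and the sample customer are all assumptions invented for the example; they do not come from any particular product or company.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceReference:
    """Pointer from the master record to a copy in another system (hypothetical)."""
    system: str    # illustrative system name, e.g. "billing" or "crm"
    local_id: str  # identifier used inside that system


@dataclass
class MasterCustomer:
    """Official data representation of one real customer (illustrative fields)."""
    master_id: str
    legal_name: str
    country: str
    sources: list[SourceReference] = field(default_factory=list)


class CustomerMaster:
    """Single place to look up the official record for a customer."""

    def __init__(self) -> None:
        self._by_master_id: dict[str, MasterCustomer] = {}
        self._by_source: dict[tuple[str, str], str] = {}

    def register(self, customer: MasterCustomer) -> None:
        """Store the official record and index every system-local identifier."""
        self._by_master_id[customer.master_id] = customer
        for ref in customer.sources:
            self._by_source[(ref.system, ref.local_id)] = customer.master_id

    def resolve(self, system: str, local_id: str) -> MasterCustomer | None:
        """Return the official record for a system-local identifier, if known."""
        master_id = self._by_source.get((system, local_id))
        if master_id is None:
            return None
        return self._by_master_id.get(master_id)


# Made-up sample data: one customer known under two different local identifiers.
master = CustomerMaster()
master.register(
    MasterCustomer(
        master_id="C-0001",
        legal_name="Acme Industrial Supply",
        country="US",
        sources=[SourceReference("billing", "88231"), SourceReference("crm", "ACME-7")],
    )
)

# Both system-local identifiers lead back to the same official record.
same_customer = master.resolve("billing", "88231") is master.resolve("crm", "ACME-7")
```

The point of the sketch is the business question it answers, not the storage technique: whichever system a record comes from, there is one place that says which real customer it refers to.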

When viewed as an important business element, master data has a clear impact on the business. For example, if you sell a product to a customer, it is increasingly necessary to know what other products that customer has bought from you. If you are a financial services company, the Patriot Act requires this, as does HIPAA for hospitals, and so forth. Even if you sell products that have warranties, such as auto parts, it can be necessary to know that other required parts were purchased so the warranty can be honored. Customers, products, locations, contracts, sales, and all other master data are central to the operation of business today.

Some important business elements are not included in this definition. These include organization (used to organize information, reports, metrics, and so forth), categorizations (such as a hierarchy of products, classification of offices or locations, and so forth), and other means used to understand the business. These elements are used throughout a business, but they are abstract – that is, they don’t physically exist in the real world.

This is reference data, for which my definition is this: “Reference data is the official data representation of the abstract structures used throughout a company to understand the organization, classification, or other perspectives of a business’s real entities.” Business intelligence professionals will recognize reference data as consistent with dimensions in dimensional models.
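
To make the distinction concrete, the following sketch keeps a product hierarchy (reference data) separate from the product records themselves (master data) and uses the hierarchy to roll products up to a category, much as a dimension would. The category names, product identifiers, and function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Reference data: an abstract classification, stored as child -> parent.
# Category names are made up for this example.
PRODUCT_HIERARCHY: Dict[str, Optional[str]] = {
    "All Products": None,
    "Auto Parts": "All Products",
    "Brakes": "Auto Parts",
    "Filters": "Auto Parts",
}


@dataclass(frozen=True)
class Product:
    """Master data: a real product the business sells (illustrative fields)."""
    product_id: str
    name: str
    category: str  # key into PRODUCT_HIERARCHY


def category_path(category: str) -> List[str]:
    """Return the categories from the root of the hierarchy down to `category`."""
    path: List[str] = []
    current: Optional[str] = category
    while current is not None:
        path.append(current)
        current = PRODUCT_HIERARCHY[current]
    return list(reversed(path))


def products_under(category: str, products: List[Product]) -> List[Product]:
    """Select the products classified at or below `category`."""
    return [p for p in products if category in category_path(p.category)]


products = [
    Product("P-100", "Ceramic brake pad", "Brakes"),
    Product("P-200", "Oil filter", "Filters"),
]

auto_parts = [p.product_id for p in products_under("Auto Parts", products)]
brake_path = category_path("Brakes")
```

Changing the hierarchy, for instance moving "Filters" under a new category, changes how the business views its products without altering any product record.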

Master data and reference data cover the real (companies, customers, products, offices, vendors and so forth) and abstract (organization structure, hierarchies and classifications of products, customers, vendors and so forth) elements of a business. Master and reference data are business problems and not data problems because the business, not IT, is responsible for knowing, managing and keeping these elements of the business correct and consistent. Business elements and their associated master data are subject to compliance and audit. Reference data is less exposed to both, but it should be handled as rigorously as master data in all other regards. For this reason, master data will be the term used to refer to both master data and reference data for the remainder of this series.

In my next article, I’ll describe how IT and data management contribute to the problem of master data.

Until then, let me know what you’re thinking. You can e-mail at
