
The Rising Importance of Product Data

This article originally appeared on the BeyeNETWORK.

Among vendors, the concept of master data management (MDM) seems to be tightly bound to customer data. In almost every venue, the acronym MDM is coupled with CDI (customer data integration), implying that (of course!) the only relevant master data is customer data: identities captured as identifiers plus enough demographic data to distinguish one party from another. It would be idiotic to think that “customer” data is not one of the most significant candidates, if not the most significant candidate, for “mastering” (a term that has started to appear in reference to integration into a master data set), but that does not mean that other conceptual data sets should be ignored at its expense.

Consider product data. It is reasonable to think that if customers are the number one priority, then the information about the things they buy should rank pretty high too. The market seems to reflect this, as evidenced by three emerging ideas and trends: the rapid growth of electronic commerce within the context of Web 2.0, the increased economic value of semantic/taxonomic categorization and the continuing growth in web-inspired collaboration.

From a practical standpoint, what is electronic commerce about other than presenting products to consumers in a manner that optimizes the acquisition process? There are a few embedded ideas in that last sentence. In the traditional sales model, the salesperson manages an inventory of items and displays them on shelves, while the consumers parade around the store, looking at what is available and selecting the items they want. Physical shelf space, commodity risk and spoilage risk (among others) are all factors that limit the salesperson’s inventory, which in turn limits the consumer’s choice. In the electronic commerce world, these limits largely disappear. In the age of drop shipping, a large virtual inventory can be managed without the requirement of physical shelf space, prices can be managed in real time, and there is no need to take on either commodity or spoilage risk. In turn, a vendor can configure a product catalog that best meets its consumer profile and, through its website, present the products in a way best suited to closing the sale to the right customer at the right time.

In essence, this has turned the traditional sales model inside out. Customers are analyzed and profiled into categories, and instead of customers parading around inventoried products, the e-commerce vendor arranges for the products to visually parade around the “inventoried” customers. In turn, customers and visitors are presented, in real time, with the products that most closely match their profile characteristics and desires, enabling a much greater variety of opportunities to make sales.

But to do this, the other two trends mentioned earlier must be addressed: semantic/taxonomic categorization and collaboration. Every product has characteristics, ranging from its description and usage to its engineering specifications and constraints. For example, a lightbulb might have a particular shape, might be usable only within a certain type of environment, may only work within a specific temperature range and may only fit into certain kinds of sockets. However, every product manufacturer employs its own style, language and naming convention for the same or similar products.

Consider these different descriptions of compact fluorescent bulbs found on various websites:

“13 watt compact fluorescent. It is tubular with a GX23 pin base and an overall length of 7.5 inches. It has a Kelvin temperature of 4100 and is rated at 10,000 hours. PL13/41.”

“18 Watts energy requirement, 4-pin base lamp, 10,000 hour life, 2700k color temperature, Compact fluorescent space-saving design, Instant-on flicker-free operation, Uses 76% less energy than an equivalent incandescent bulb (100 watts). Lamp will provide 9 years of trouble-free use.”

“13-watt 2-Pin Quad Tube Compact Fluorescent Light at 3500K - FQ13E35.U/2.”

Each of these descriptions carries a lot of the relevant information, but in a non-standard way. On one site, though, I found this style of description for each of the listed bulbs: 

  • Manufacturer's Part Number: FEIIA11W27 
  • 11W/TWIN-TUBE/27K 
  • Life Hours: 6,000 
  • Wattage: 11 
  • Approximate Incandescent Equivalent: 45 Watts 
  • Initial Lumens: 550 
  • Color Temp: 2,700K
  • Warranty: 12 Months
  • Case Quantity: 100

Here, each item’s critical data elements had been parsed and associated with predefined metadata related to this type of product. By doing this, we can gain a lot more value through both the semantic analysis (which parses out the relevant data within each free text description) and subsequent placement within a set of product taxonomies. These taxonomies can either be developed proactively, or perhaps even automatically as a by-product of the semantic analysis. This categorization within the taxonomic structure is, in fact, an enabling factor for customer product presentation.
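As a rough illustration of the parsing step described above, the following sketch pulls a few attributes out of the three free-text bulb descriptions quoted earlier. The attribute names and regular expressions are assumptions made for this example only; a production semantic analysis would need far richer rules and product-specific metadata.

```python
import re

# Hypothetical attribute patterns for this one product type (assumed for
# illustration; not taken from any particular PIM or MDM product).
PATTERNS = {
    "wattage": re.compile(r"(\d+)[\s-]*watts?\b", re.IGNORECASE),
    "color_temp_k": re.compile(
        r"kelvin temperature of\s*(\d{4})|\b(\d{4})\s*k\b", re.IGNORECASE
    ),
    "life_hours": re.compile(
        r"(\d{1,3}(?:,\d{3})+|\d+)[\s-]*hours?\b", re.IGNORECASE
    ),
}

# The three descriptions quoted in the article.
DESCRIPTIONS = [
    "13 watt compact fluorescent. It is tubular with a GX23 pin base and an "
    "overall length of 7.5 inches. It has a Kelvin temperature of 4100 and "
    "is rated at 10,000 hours. PL13/41.",
    "18 Watts energy requirement, 4-pin base lamp, 10,000 hour life, 2700k "
    "color temperature, Compact fluorescent space-saving design, Instant-on "
    "flicker-free operation, Uses 76% less energy than an equivalent "
    "incandescent bulb (100 watts). Lamp will provide 9 years of "
    "trouble-free use.",
    "13-watt 2-Pin Quad Tube Compact Fluorescent Light at 3500K - FQ13E35.U/2.",
]


def parse_description(text):
    """Return a dict of the attributes that could be found in the text."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)  # first (leftmost) occurrence only
        if match:
            # Patterns with alternatives have several groups; use the one that hit.
            raw = next(g for g in match.groups() if g is not None)
            record[name] = int(raw.replace(",", ""))
    return record


if __name__ == "__main__":
    for description in DESCRIPTIONS:
        print(parse_description(description))
```

Attributes that a description does not mention (the third description gives no rated life, for instance) are simply absent from the result, which is itself useful: the gaps show where the source data is incomplete.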

Taxonomies can be used to drive intelligent product searches. When a customer searches on one set of criteria, finding the product also finds its place within the various taxonomies, which then allows the vendor to present similar or related products that might also be of interest to the purchaser. In addition, the terminology used by one manufacturer may not be the same as another’s, but through the semantic analysis, the vendor might discover the equivalence of terms used to describe similar products from different manufacturers. For example, “leatherette,” “faux leather,” “Naugahyde” and “pleather” are among the many descriptive or brand names for synthetic leather. These terms essentially refer to the same concept, and when a customer searches for Naugahyde, the intelligent product catalog will display products carrying any of those equivalent terms.
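The term-equivalence idea can be sketched in a few lines. Everything here is hypothetical: the equivalence table is written by hand (in practice it would come out of the semantic analysis), the SKUs and catalog entries are invented, and plain substring matching stands in for real text analysis.

```python
# Hand-written equivalence classes: concept -> terms that mean the same thing.
# (Assumed for illustration; a real catalog would derive and maintain these.)
EQUIVALENTS = {
    "synthetic leather": {
        "synthetic leather",
        "leatherette",
        "faux leather",
        "naugahyde",
        "pleather",
    },
}

# Invented catalog entries, each using a different manufacturer's wording.
CATALOG = [
    ("SKU-1", "Naugahyde recliner, brown"),
    ("SKU-2", "Faux leather office chair"),
    ("SKU-3", "Pleather jacket, size M"),
    ("SKU-4", "Oak dining table"),
]


def concepts_in(text):
    """Return the set of concepts whose equivalent terms appear in the text."""
    lowered = text.lower()
    return {
        concept
        for concept, terms in EQUIVALENTS.items()
        if any(term in lowered for term in terms)
    }


def search(query, catalog=CATALOG):
    """Match on shared concepts; fall back to a plain substring search."""
    wanted = concepts_in(query)
    if wanted:
        return [sku for sku, desc in catalog if concepts_in(desc) & wanted]
    needle = query.lower()
    return [sku for sku, desc in catalog if needle in desc.lower()]


if __name__ == "__main__":
    print(search("Naugahyde"))
    print(search("oak"))
```

In this sketch a search for Naugahyde returns all three synthetic-leather items, even though only one of them uses that word, because the query and the descriptions are compared by concept, not by literal text.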

Applying semantic/taxonomic analysis to product data simplifies the processes involved in creating master product data systems, which are also referred to as product information management (PIM) systems. This leads to our third trend: collaboration. E-commerce relies on the ability to share information between vendors and suppliers to best enable the materialization of a cohesive product catalog. A close look at mega-vendors such as Amazon, Yahoo! and eBay reveals that the vendor is often providing only limited services on behalf of a range of suppliers. For example, Amazon provides order processing services for many independent vendors, each of which is responsible for its own order fulfillment and subsequent customer service. Other times, the product is drop-shipped from the manufacturer, but the website provides all customer services. But at the higher level, the customer just sees a single “storefront,” and that is all that matters. Enabling this requires collaboration, which yet again relies on ensuring the quality and communicability of product data.

Other super-vendors that manage large product catalogs are seeing the value in creating master product data environments, and, in fact, product data is emerging as one of the most critical data categories for master data management. My upcoming Business Intelligence Network report on MDM and data quality looks at this in some detail and examines a case study of an organization that is making great use of both MDM and data quality applications to deliver high value via its product information management system.

David Loshin
David is the President of Knowledge Integrity, Inc., a consulting and development company focusing on customized information management solutions including information quality solutions consulting, information quality training and business rules solutions. Loshin is the author of The Practitioner's Guide to Data Quality Improvement, Master Data Management, Enterprise Knowledge Management: The Data Quality Approach and Business Intelligence: The Savvy Manager's Guide. He is a frequent speaker on maximizing the value of information. David can be reached at or at (301) 754-6350.
