What is metadata? Solutions for leveraging technical and business metadata

This article introduces Lowell Fryman's Metadata Expert Channel where he will provide insights and solutions on the capture, governance, management, usage, and dissemination of metadata to enhance your organization's knowledge and effectiveness.

This article originally appeared on the BeyeNETWORK.

Yes, I’m borrowing a song title (“While My Guitar Gently Weeps”) from George Harrison for our first article on the Business Intelligence Metadata Expert Channel. I just love the analogy. All of our organizations have a huge amount of metadata that is waiting to be found, has not been integrated, and cannot be delivered to the individuals who need it when they need it to conduct business operations and processes. Most organizations have metadata distributed in many spots; it resides in many technologies and many repositories, without a vision for, or the capability of, managing that metadata. Our metadata is lonely, frustrated and crying to be leveraged in our daily business functions.

I would like this channel to be dedicated to identifying metadata solutions and not just the same old discussion of the problems that have been previously defined. I have a passion for metadata and believe this channel will be a great forum to present solutions for making this valuable resource a happy and productive enterprise asset. I want to take the opportunity of this first article to deviate from my goal, though. We need to stimulate discussion and thought about metadata and present you with my personal dilemma, a dilemma I believe many individuals share.

My dilemma is that I just don’t understand why every organization doesn’t have well defined, actively managed and integrated metadata strategies. Why isn’t every “C” level business and technology executive demanding metadata solutions? Why are we still questioning the value of metadata? Are the scope, depth, and breadth of metadata in our organizations just too complex for us to manage? What has been keeping us from implementing metadata solutions (or what have we been doing while our metadata gently weeps)?

What is Metadata?

First things first, we need to have the same context (or the same metadata) for our understanding of metadata. I expect that readers of this channel already have some understanding of what metadata is. Metadata is not a new concept, yet we often regard the term “metadata” as relatively new and as something created by information technology (IT). The popularity of metadata can certainly be traced to the increased implementation and realization of business intelligence and enterprise-level systems in the last decade. The word “metadata” can be broken down into two parts, meta and data. Thus, the simple definition for metadata was “data about data.” That was a difficult definition for most of us to understand, myself included. By this definition, metadata only exists to describe some data or datum. Many industry authors, such as Inmon, Kimball, Marco, and Tannenbaum, have written books on the subject, and each has an extensive definition of metadata.

For the purposes of this channel, we will use the definition of metadata found in Wikipedia. The simplest definition of metadata is that it is data about data – more specifically, information (data) about a particular content (data). Wikipedia goes on to say: an item of metadata may describe an individual datum (content item) or a collection of data (content items). Metadata is used to facilitate the understanding, use and management of data. The metadata required for this will vary with the type of data and context of use. So, in the context of a library, where the data is the content of the titles stocked, metadata about a title might typically include a description of the content, the author, the publication date and the physical location. In the context of a camera, where the data is the photographic image, metadata might typically include the date the photograph was taken and details of the camera settings. In the context of an information system, where the data is the content of the computer files, metadata about an individual data item might typically include the name of the field and its length. Metadata about a collection of data items, such as a computer file, might typically include the name of the file, the type of file and the name of the data administrator.
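To make the information-system example concrete, here is a minimal Python sketch. It assumes a fixed-width customer file; the class names, field names, file name, and administrator are all hypothetical and chosen only for illustration. The point is that the record itself is just a run of characters, and it is the metadata (field names and lengths) that lets us understand and use it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldMetadata:
    """Metadata about an individual data item: its name and length."""
    name: str
    length: int


@dataclass
class FileMetadata:
    """Metadata about a collection of data items (a file)."""
    file_name: str
    file_type: str
    data_administrator: str
    fields: List[FieldMetadata] = field(default_factory=list)

    def record_length(self) -> int:
        # Total width of one record, derived purely from field metadata.
        return sum(f.length for f in self.fields)


def parse_record(raw: str, meta: FileMetadata) -> Dict[str, str]:
    """Split one fixed-width record into named values using the metadata."""
    values: Dict[str, str] = {}
    offset = 0
    for f in meta.fields:
        values[f.name] = raw[offset:offset + f.length].strip()
        offset += f.length
    return values


# Hypothetical file description (all names are illustrative assumptions).
customer_file = FileMetadata(
    file_name="customers.dat",
    file_type="fixed-width text",
    data_administrator="J. Smith",
    fields=[
        FieldMetadata("customer_id", 10),
        FieldMetadata("customer_name", 40),
    ],
)

raw_record = "0000000042" + "Acme Medical Supplies".ljust(40)
parsed = parse_record(raw_record, customer_file)
```

Without `customer_file`, `raw_record` is an opaque 50-character string; with it, `parsed` maps `customer_id` and `customer_name` to their values.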

Most of our industry thought leaders have discussed metadata as being separated into two categories, business metadata and technical metadata. The solutions for each are very different as they have a very different usage and context.

Technical Metadata

Technical metadata is most often thought of as the definition and technology description of columns in a database. I first understood technical metadata a few decades ago when we called it the “data dictionary.” Our objective then was to document all of the programs, job control, and databases for primarily two reasons: first, so it was easier for another individual to understand what we had created and, secondly, so it would be easier and less costly to maintain. Both of those objectives were aimed at reducing the total cost of ownership (TCO) and increasing the longevity of the application. Of course, there were many reasons why we did not complete the definition or maintain the accuracy of that metadata. However, let me give one example that I vividly recall that forced the organization I worked for to create a technical metadata environment.

I was employed at a very large international medical products firm and just promoted to be the project manager over the manufacturing systems. One of the key applications was shop-floor tracking. It was a custom-developed application built over years within the organization, mostly by one individual. Let’s call him Dave. Dave was a very bright individual and had a lot of responsibility as he was the primary person to maintain and enhance a key application for a multibillion dollar firm. However, absolutely no metadata was captured in any manner other than Dave’s gray matter, which could not be leveraged by anyone other than Dave. We manufactured more than 3,000 finished medical products/components. When new components were added, we generally had to make changes to the programs and databases, as well as the workflow and reporting. The ability for us to implement new changes was dependent upon Dave’s availability and desire. I felt like I had to essentially beg Dave to get implementations finished; and, of course, there was never time to create the metadata that was necessary to have someone else be able to help Dave. All of the metadata was in Dave’s head, accessible to others only if we read the code or analyzed the data and database structures. Sound familiar? And my metadata gently weeps.

By not having the necessary metadata available, our ability to implement business changes was constrained by our human resource, Dave, who had the knowledge critical to managing business change. At the time, we called it an “application maintenance problem,” but it was really a lack of metadata (outside of Dave’s head) problem. Only after the competition beat us to market on a new and innovative product, which we had in the planning stages, did we get to the root of the problem and take the time to build our technical metadata. Capturing the metadata from Dave was a painful process for both Dave and me. Dave was a willing participant, but Dave did not know what was relevant, did not know what he forgot, and did not recall everything that was important. It was unfortunate that it took a catastrophic business loss for us to realize the cost of not having the metadata we needed to effectively manage technology changes.

Our focus then was on the capture of the metadata, not just from Dave’s head, but also from the disparate technologies that existed. Then, and to a great extent today, viewing the technical metadata requires special training, access, and user licenses. Most often, only an individual from the technology organization can use the technical metadata. Technical metadata has great value in conducting technical change analysis, determining where data is used for impact analysis, and providing data lineage analysis. Fortunately, we now have lots of examples for identifying the economic value of technical metadata. I’ll present many of those to you in a future article.
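As a rough illustration of capturing technical metadata from a technology rather than from someone's head, the sketch below reads column definitions out of a database catalog and then runs a very simple "where is this column used" impact check. It is only a sketch under stated assumptions: SQLite (via Python's standard `sqlite3` module) stands in for whatever database is actually in use, the table and column names are hypothetical ones loosely modeled on the shop-floor example, and the helper functions `build_data_dictionary` and `find_column_usage` are my own illustrative names, not part of any product.

```python
import sqlite3
from typing import Dict, List, Tuple

DataDictionary = Dict[str, List[Tuple[str, str]]]


def build_data_dictionary(conn: sqlite3.Connection) -> DataDictionary:
    """Return {table: [(column_name, declared_type), ...]} from SQLite's catalog."""
    dictionary: DataDictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = [(col[1], col[2]) for col in columns]
    return dictionary


def find_column_usage(dictionary: DataDictionary, column: str) -> List[str]:
    """Simple impact analysis: list the tables that contain the given column."""
    return sorted(
        table
        for table, columns in dictionary.items()
        if any(name == column for name, _ in columns)
    )


# Hypothetical schema, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE component (
        component_id INTEGER PRIMARY KEY,
        description  TEXT
    );
    CREATE TABLE shop_floor_event (
        event_id     INTEGER PRIMARY KEY,
        component_id INTEGER,
        station      TEXT
    );
    """
)

dictionary = build_data_dictionary(conn)
impacted = find_column_usage(dictionary, "component_id")
```

Here `impacted` names every table that would be touched by a change to `component_id`. A real impact or lineage analysis would also need to cover programs, jobs, and reports, which a catalog query alone cannot see.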

Business Metadata

The common metadata definition, “data about data,” assumes that data already exists. However, you do not need data in a database to have business metadata. Business metadata came into being with the first business, many thousands of years before databases were first implemented. Today business metadata “provides the context of or for the data.” The essence of business metadata is in reducing or eliminating the barriers of communication between human and human, as well as human and computer, so that the data conveyed from reports, information systems, or business intelligence applications can be crystal clear, can facilitate business operations, and can be leveraged for all business decision-making processes. The critical success factor for business metadata is not in its capture, but in its delivery. Business metadata exists all over the organization, in many fashions, in many technologies, and in many individuals. Finding business metadata is not a challenge. Business metadata must be delivered to all business processes in a manner that is integrated into the workflow of the individual businessperson.

For those of you who are not familiar with business metadata, the following is a limited list of the objects that can be considered business metadata:

  • Company annual report and other unstructured data

  • Policy and procedure manuals, such as the HR hiring practices

  • Application procedures and workflow manuals, such as the Oracle finance manual

  • Organization chart

  • SOX guidelines or other regulatory compliance procedure manuals

  • Business rules

  • Data quality analytics

  • Business terms dictionary

One of the obstacles business people have had in the past has been the technologies available for the delivery of business metadata. Until the last few years, IT was limited in the development tools that could be used to deliver business metadata. Today, portal technologies, blogs, wikis, groupware, and mashups are some of the technologies that are available to deliver business metadata in ways that can be integrated into the workflow of individuals.

Business metadata not only communicates the context of the data on reports and screens, but also communicates how we conduct business, who our partners are, where we conduct business, what our business initiatives are, and the status or progress of our business in meeting those objectives.

One of the very popular business metadata initiatives today is the enterprise business terms dictionary. Many organizations have found that the enterprise business terms dictionary is an extremely valuable foundation for enterprise applications such as customer data integration (CDI) and master data management (MDM) initiatives, as well as sharing business terms internally and externally with partners and contractors.
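To give a feel for what a business terms dictionary holds, here is a deliberately small Python sketch. It assumes each term has a name, a definition, and a list of synonyms, and that a lookup should resolve a synonym to the one agreed term. The class names, the sample term, and its definition are hypothetical illustrations, not a recommended design.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class BusinessTerm:
    """One entry in the dictionary (fields are illustrative assumptions)."""
    name: str
    definition: str
    synonyms: Tuple[str, ...] = ()


class BusinessTermsDictionary:
    """Maps a term, or any of its synonyms, to a single agreed definition."""

    def __init__(self) -> None:
        self._terms: Dict[str, BusinessTerm] = {}
        self._index: Dict[str, str] = {}  # any spelling -> canonical key

    def add(self, term: BusinessTerm) -> None:
        key = term.name.lower()
        self._terms[key] = term
        self._index[key] = key
        for synonym in term.synonyms:
            self._index[synonym.lower()] = key

    def lookup(self, word: str) -> Optional[BusinessTerm]:
        # Case-insensitive; returns None when the word is not defined.
        key = self._index.get(word.strip().lower())
        return self._terms[key] if key is not None else None


glossary = BusinessTermsDictionary()
glossary.add(
    BusinessTerm(
        name="Customer",
        definition="A party that has purchased at least one product.",  # sample text only
        synonyms=("client", "account holder"),
    )
)

term = glossary.lookup("Client")
```

A lookup for "Client" or "account holder" returns the same "Customer" entry, which is the kind of shared vocabulary that CDI and MDM efforts depend on.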

Next month’s article will discuss solutions for the implementation of a business terms dictionary as a means to enable some of your metadata to be a highly valuable asset (and to reduce the weeping).
