Data model patterns: A metadata map

20 Jul 2006 | Written by: David Hay

The following is an excerpt from Data model patterns: A metadata map, by David Hay.

About Metadata Models

There once was a fellow named Corey
Whose career was not covered in glory
He had a bad day
When he just couldn't say
Me-ta-da-ta Re-pos-i-TOR-y.


What are metadata?*

During the 1990s, the concept of data warehouse swept the information technology industry. After many years of trying, it appears finally to be possible for a company to store all of its data in one place for purposes of reporting and analysis. The technology for doing this is still new, and the first attempts have had mixed results, but the effort has been quite serious.

One of the problems that arose from this effort was the realization that if a senior executive is going to ask a giant database a question, it is necessary to know just what is in the database and what types of questions to ask. In addition to the data themselves, therefore, it is necessary to keep data about the data. The term coined for "data about data" during the 1990s was metadata.

Since then, numerous books and magazine articles have been published on this subject, but most have focused on why metadata are important and on technologies and techniques for managing them. What these publications have left out is a clear description of exactly what the stuff is. After a decade, there is still no simple, clear description of metadata in a form that is both comprehensive enough to cover our industry and comprehensible enough that it can be used by people. This book is an attempt to produce such a description.

As with all buzzwords, once invented the term metadata has taken on a life of its own. It is variously described as:

  • Any data about the organization's data resource [Brackett 2000, p. 149].
  • All physical data and knowledge from inside and outside an organization, including information about the physical data, technical and business processes, rules and constraints of the data, and structures of the data used by a corporation [Marco 2000, p. 5].
  • The detailed description of instance data. The format and characteristics of populated instance data: instances and values, dependent on the role of the metadata recipient [Tannenbaum 2002, p. 93].

Several significant points come out of these definitions. First, as Mr. Marco pointed out, there is a difference between business metadata and technical metadata. The business user of metadata is interested in definitions and structures of the language as terms for the types of information to be retrieved. The technician is concerned with the physical technologies used to store and manage data. Both of these points of view are important, and both must be addressed.

Second, the subject is concerned with more than just data. It is, as Mr. Brackett said, "any data about an organization's data resource." Once you have started looking at the structure of an organization's data, you have to also account for its activities, people and organizations, locations, timing and events, and motivation.

Third, as Ms. Tannenbaum pointed out, the "meta" aspect of the question is a matter of point of view. There is metadata relative to the data collected by the business. There is also meta-metadata, which is used to understand and manage the metadata.**

This last point is illustrated in Figure 1–1. Here, the bottom row shows examples of things in the world that are often described in information systems. "Julia Roberts" is a real human being. The "Wall Street branch" of a bank is a physical place where business is performed. Checking account "09743569" is a particular account held in that bank by a particular customer (Julia Roberts, for example). The customer of that account may then perform an actual "ATM Withdrawal" at a specific time.

This Book (Metametadata): elements of metadata (a metadata model)
  • Objects: "Entity Class", "Attribute"
  • Objects: "Entity Class", "Attribute", "Role"
  • Objects: "Table", "Column"
  • Objects: "Program module", "Language"

Data Management (Metadata): data about a database (a data model)
  • Entity class: "Customer". Attributes: "Name", "Birthdate"
  • Entity classes: "Branch", "Employee". Attributes: "Employee.Address", "Employee.Name". Role: "Each branch must be managed by exactly one Employee"
  • Table: "CHECKING_ACCOUNT". Columns: "Account_number", "Monthly_charge"
  • Program module: ATM Controller. Language: Java

IT Operations (Instance Data): data about real-world things (a database)
  • Customer Name: "Julia Roberts". Customer Birthdate: "10/28/67"
  • Branch Address: "111 Wall Street". Branch Manager: "Sam Sneed"
  • CHECKING_ACCOUNT.Account_number: "09743569". CHECKING_ACCOUNT.Monthly_charge: "$4.50"
  • ATM Controller: Java code

Real-world things
  • Julia Roberts
  • Wall Street branch
  • Checking account #09743569
  • ATM Withdrawal

Fig. 1–1: Data and metadata.

The next row up shows, in the first three columns, the data that might describe those three things: (1) A Customer has the name "Julia Roberts" and the "Birthdate" of "10/28/67". (2) A Branch has the address "111 Wall Street" and a manager, "Sam Sneed". (3) The checking account has an account number "09743569" and a monthly charge, "$4.50". In the fourth column, the same row shows that a particular program, called here "Java code", is responsible for a "Withdrawal Transaction". These are the things that would concern a person managing data for a banking business. Note that each of the terms is labeled with what it is: customer name, branch manager, account number, and so forth.

The third row from the bottom collects those descriptors and labels them in turn. This creates what we in the data administration world call the metadata. There are two components to these labels. First are the names of the things of significance being described by the business data, such as the entity classes "Customer" and "Branch". Second, each of these is in turn described by attributes, such as "Name", "Address", and "Birthdate". We also discover, in the case of the bank branch, that there is really an additional entity class, "Employee", and that it is related to "Branch". ("Each Branch must be managed by exactly one Employee.")

In the checking account column, we see that a checking account is actually the subject of a table in a database. The table is called "CHECKING_ACCOUNT" and has columns "Account_number" and "Monthly_charge". The ATM program described in the second row simply as "Java code" is actually a program module with the name "ATM Controller" written in the language "Java". As we can see, the metadata row itself encompasses several different types of objects ("Entity class", "Attribute", "Table", "Column", "Program module", and "Language"). The assignment of this book, represented by the top row, is to show how these objects relate to one another.
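
The three upper rows of the figure can be sketched in a few lines of Python. This is only an illustration, not the model developed in the book: the class layout and the helper function conforms are assumptions made for the example, while the names "Customer", "Name", and "Birthdate" and the sample values come from Figure 1–1.

```python
from dataclasses import dataclass

# Top row (meta-metadata): the kinds of objects a metadata model is made of.
# These two classes are a hypothetical, minimal stand-in for that row.
@dataclass(frozen=True)
class Attribute:
    name: str

@dataclass(frozen=True)
class EntityClass:
    name: str
    attributes: tuple[Attribute, ...]

# Third row from the bottom (metadata): part of a data model for the bank.
customer = EntityClass("Customer", (Attribute("Name"), Attribute("Birthdate")))

# Second row from the bottom (instance data): one record in a database.
customer_row = {"Name": "Julia Roberts", "Birthdate": "10/28/67"}

def conforms(row: dict[str, str], entity: EntityClass) -> bool:
    """Return True if the row has exactly the attributes the entity class declares."""
    return set(row) == {attribute.name for attribute in entity.attributes}

print(conforms(customer_row, customer))  # True
```

Each level is expressed in terms of the level above it: the record is checked against the "Customer" entity class, and that entity class is itself an instance of the object type "Entity class".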

Metadata don't just describe data. They describe how the organization understands not only its data, but also its activities, people and organizations, geography, timing, and motivation. Yes, metadata describe the entity classes and attributes of an entity-relationship model, and the tables and columns by which these are implemented in a computer system. They also provide, however, structure for describing the activities of the organization and the computerized processes that implement these activities. They describe who has access to data, and why. They describe the types of events and responses that are the nature of an organization's activities. They describe where the data and processes are, and they describe the motivation and business rules that drive the entire thing. So, from all of this comes the following definition of metadata.

Metadata are the data that describe the structure and workings of an organization's use of information, and which describe the systems it uses to manage that information.

Printed with permission from Morgan Kaufmann, a division of Elsevier
Data Model Patterns: A Metadata Map
By David Hay

Publication Date: 23 June 2006
ISBN: 0-12-088798-3
For more information about this title and similar books, please visit www.books.elsevier.com.

One anomaly has revealed itself in the line between business data and metadata. The information about what constitutes a legal value for a product category or an account type in the business model is often captured in separate reference tables. To reflect these validation structures, a typical data model often has many "type" entity classes (account type, status, day of the week, and so on) describing legal values for attributes. These are part of the business data model.

But because they are in fact constraints on the values of other attributes in the same data model, they are also included in the category of metadata. Where a table designer would be required to specify the domain of a column, the data modeler (who is instructing the designer) must now provide the values that constitute that domain. Here you have business data acting as metadata.
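
A small sketch may make this dual role concrete. It uses Python's built-in sqlite3 module. The table names, column names, and the account type values "CHECKING" and "SAVINGS" are assumptions chosen for illustration; they do not come from the book. The rows of the reference table are ordinary business data, yet they also define the domain of a column in another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# The "type" entity class: its rows are business data...
conn.execute("CREATE TABLE account_type (code TEXT PRIMARY KEY)")
# ...and they also act as metadata, because they constrain account.type_code.
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        type_code TEXT NOT NULL REFERENCES account_type(code)
    )
    """
)
conn.executemany(
    "INSERT INTO account_type (code) VALUES (?)",
    [("CHECKING",), ("SAVINGS",)],  # hypothetical legal values
)

def add_account(account_number: str, type_code: str) -> bool:
    """Insert an account; return False if type_code is not a legal value."""
    try:
        conn.execute(
            "INSERT INTO account (account_number, type_code) VALUES (?, ?)",
            (account_number, type_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_account("09743569", "CHECKING"))   # accepted: the type is in the reference table
print(add_account("00000001", "BROKERAGE"))  # rejected: no such account type
```

Adding a row to account_type changes what the account table will accept without any change to the table definitions, which is why such rows behave like metadata even though the business maintains them.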

Be aware, of course, that even this line between business data and metadata is not as clear-cut as it seems. Product type, for example, consists of reference data that constrain many attributes in a business model. Even so, specification of the list of product types is very much the domain of the business, not the data administrator. It plays the roles of both business data and metadata. Probably more in the metadata manager's domain would be product category. There should be relatively fewer of these, and the list should be relatively stable.

*Ok, it's true. I studied Latin in high school and have always held that data is the plural form of the word datum. I realize that I may be swimming against the current, but, hey! It's my book!

**While delivering a lecture on cosmology one day, Sir Arthur Eddington gave a brief overview of the early theories of the universe. Among others, he mentioned the American Indian belief that the world rested on the back of a giant turtle, adding that it was not a particularly useful model as it failed to explain what the turtle itself was resting on. Following the lecture, Eddington was approached by an elderly lady. "You are very clever, young man, very clever," she forcefully declared, "but there is something you do not understand about Indian cosmology: it's turtles all the way down!"


