This article originally appeared on the BeyeNETWORK.
At first, it seems like a good idea. What better way is there to start the day than to have a complete selection or library of metadata? But the deeper you go, the more you realize that there is no such thing as completeness of metadata.
Oh, sure. There are the familiar tables, attributes, and indexes. Everybody knows about them. But when you look closely, there are physical attributes, foreign keys, m:n relationships, 1:n relationships, and 1:1 relationships. Then, there are the number of rows in a table, the number of bytes in a row, and variable-length rows. And while you are at it, there are average and maximum row lengths, unused blocks, and reserved row space. And there are date fields, encoded fields, and parent/child relationships. Not to mention networked relationships, inverted files, and sparse indexes. Don’t forget entities, relationships, and subtypes. And what about nontraditional cardinality and definitions? Data item sets, anyone? Or mapping and transformation rules? Aggregation? Summarization? Summarization algorithms? And the list goes on.
For a seemingly innocent description of data, the different types of metadata are endless, or so it seems.
Then what happens when the volume of data and metadata doubles overnight? And as if this weren’t enough, there is the issue of the data changing. By the time you have a good solid definition of metadata, it has changed, and you have to go back to the beginning and start all over again.
So maybe this notion of a complete and accurate description of metadata is just a good theory. Maybe it is not possible to actually create a complete description of metadata.
Perhaps we should be looking at necessary metadata, not complete metadata. Maybe there is just too much metadata that keeps changing. Maybe people keep adding systems so that we will never be finished.
Instead of looking for and documenting ALL metadata, maybe it is simply rational to look for necessary metadata. By looking for necessary metadata, the task of gathering and managing metadata becomes doable. Looking for and documenting all metadata in a project – especially an ever-expanding project – is like a heat sink. It becomes a place where you can spend huge amounts of resources and huge amounts of time and never be finished.