This article originally appeared on the BeyeNETWORK.
Data stewardship is a position that comes with the maturity of the information resource management environment. When organizations are first designing and building
But as the information infrastructure grows and matures, the little sheep start to pop up, and soon there is a need for data stewards.
So exactly what do data stewards do?
In a word, data stewards tend the data. For the most part, this means that data stewards look after the physical well-being of the data. The data steward is the person you go to when a data set needs to be found. The data steward is the person you go to when the database goes down and it needs to be brought back up. The data steward is the person you go to when a database load program has failed to execute.
In many ways, the job (or at least parts of the job) of the database administrator has morphed into that of the data steward. There is so much data and so many things happening to the data that we now need data stewards.
So what doesn’t the data steward do? The data steward is not responsible for the design of the data or the databases. That job is done long before the data steward comes onto the scene. The data steward does not have the responsibility for the accuracy of the data content. If it is found that in a database, Mr. Jones' first initial is “H,” not “A,” then it is the responsibility of the applications manager to make that correction. If it is determined that a new column needs to be added to a table, then it is the responsibility of a database administrator to make that change.
The data steward has the job of tending the sheep, not buying them, selling them or becoming a veterinarian.
One of the big challenges to data stewardship is the sheer volume of data – both the number of bytes and the number of databases to be managed. Once – in the good old days – there were only a few databases and data sets to be managed, and the data steward had the luxury of giving individual attention to each one. But today there are so many databases that the data steward has to live in a world of triage. In a world of triage, the data steward looks after data and databases by monitoring their vital signs. There are standard vital signs – measurements and properties – associated with the successful running of a database, such as running out of capacity, batch loads that fail to execute properly, data exceptions that occur when data inside the database is accessed, and so forth.
In a world of thousands of databases – where something can go wrong at any moment – the data steward must monitor the vital signs of each of the databases under his or her care. One may be reminded of a control panel such as that found at the helm of the starship Galactica: the panel shows, at a glance, the vital signs of hundreds or thousands of databases.
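To make the idea of triage concrete, here is a minimal sketch in Python of how vital signs might be scored so that the sickest databases rise to the top of the control panel. Everything in it is an assumption made for illustration: the `VitalSigns` fields, the 90 percent capacity threshold, the scoring weights and the sample database names are hypothetical and do not come from any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class VitalSigns:
    """Hypothetical snapshot of one database's vital signs."""
    name: str
    capacity_used_pct: float   # how full the database is, 0-100
    failed_batch_loads: int    # batch loads that failed in the period
    data_exceptions: int       # exceptions raised on access in the period


def triage_score(v, capacity_threshold=90.0):
    """Return a severity score; 0 means no attention is needed.

    The weights (3, 2, 1) are illustrative assumptions, not a standard.
    """
    score = 0
    if v.capacity_used_pct >= capacity_threshold:
        score += 3  # about to run out of capacity
    if v.failed_batch_loads > 0:
        score += 2  # a load program failed to execute
    if v.data_exceptions > 0:
        score += 1  # exceptions occurred when data was accessed
    return score


def triage(databases):
    """Return names of databases needing attention, most severe first."""
    scored = [(triage_score(v), v.name) for v in databases]
    flagged = [(s, n) for s, n in scored if s > 0]
    # Highest score first; ties are broken alphabetically by name.
    flagged.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in flagged]


# Hypothetical sample fleet.
fleet = [
    VitalSigns("orders", 95.0, 0, 0),
    VitalSigns("customers", 40.0, 0, 0),
    VitalSigns("billing", 92.0, 1, 4),
    VitalSigns("inventory", 55.0, 0, 2),
]

needs_attention = triage(fleet)
print(needs_attention)
```

With thousands of databases, the steward would look only at the head of the list that `triage` returns. The healthy databases, such as the hypothetical `customers` above, never appear on it at all.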
Another important tool of the data steward is metadata. Metadata allows the data steward to navigate through data efficiently and accurately. Without metadata, the data steward is like Captain Kirk without a compass (or whatever navigation devices one uses in outer space). Metadata is simply an outright necessity for the data steward. It is the shepherd’s crook of data stewardship.
Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations. Bill can be reached at 303-681-6772.