This article originally appeared on the BeyeNETWORK.
The other day, I read an article that stated that data marts have an actual life span of about 18 months. This is somewhat ironic for an architectural structure that has had so much written about it and has provoked so many debates. The data mart versus data warehouse warfare that has gone on in the marketplace for a while never mentioned that data marts have such short life spans.
An interesting question then is this: Why do data marts have such short life spans? There are probably several reasons for this phenomenon, including:
- The structure that the data mart is built on (the star schema and the dimensional model) is not conducive to change. The star schema is good for optimal performance. The problem is that the optimal performance depends on the data being accessed in just one way. The minute the data needs to be accessed in a different manner, the entire structure, or at least most of it, needs to be torn down and reconstructed. Stated differently, the star schema is good for static requirements, not fluid requirements. It is much easier just to build another star schema than to rework an existing one.
- The technology that the data mart sits on, usually OLAP, is not conducive to change. It is easier to build a new data mart than it is to reconstruct an old data mart.
- Requirements for business change all the time. One minute, the business is interested in profit. When this is the case, expenses are scrutinized carefully. The next minute, the company is interested in market share. Money is spent freely and the number of new customers becomes the most important number. Then, the company is interested in competition. A new product has been introduced by the competitor. The emphasis is on the impact of the new product on sales. The truth is that for a viable company, there is no such thing as a static set of key performance indicators. The key performance indicators are ALWAYS changing. And every time the key performance indicators change, an old set of data marts is outdated and there is a demand for a new set of data marts.
- There is little metadata about data marts. When a new organization wants to start to build a data mart or a set of data marts, there is typically very little to tell the organization about what data marts already exist. Therefore, a data mart becomes attached to a relatively small set of users. When the users of a data mart change their minds about its viability, there is normally no one else around to vouch for the data mart. The data mart is abandoned.
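The first reason above, that a structure tuned for one access path resists a second one, can be illustrated with a minimal sketch. It assumes a mart that stores its facts only at the grain chosen at design time; the sample rows, column names, and helper functions are all hypothetical and are not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical source rows: (product, region, month, amount).
SOURCE_SALES = [
    ("widget", "east", "2024-01", 100.0),
    ("widget", "west", "2024-01", 50.0),
    ("gadget", "east", "2024-01", 70.0),
    ("widget", "east", "2024-02", 30.0),
]
COLUMNS = ("product", "region", "month")


def build_fact(rows, grain):
    """Aggregate source rows to the grain (a tuple of column names)."""
    positions = [COLUMNS.index(name) for name in grain]
    fact = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in positions)
        fact[key] += row[3]
    return dict(fact)


def total_by(fact, grain, column):
    """Roll the fact table up along one column that is part of its grain."""
    if column not in grain:
        raise KeyError(f"{column!r} is not part of the grain {grain}")
    position = grain.index(column)
    totals = defaultdict(float)
    for key, amount in fact.items():
        totals[key[position]] += amount
    return dict(totals)


# The mart was designed for one question: sales by product and month.
grain = ("product", "month")
fact = build_fact(SOURCE_SALES, grain)
by_product = total_by(fact, grain, "product")

# A new question (sales by region) cannot be answered from this structure.
try:
    total_by(fact, grain, "region")
    region_answerable = True
except KeyError:
    region_answerable = False

# The only way out is to go back to the source and rebuild at a new grain.
new_grain = ("product", "region", "month")
rebuilt = build_fact(SOURCE_SALES, new_grain)
by_region = total_by(rebuilt, new_grain, "region")
```

In this sketch the region was discarded when the first fact table was built, so the new requirement forces a rebuild from the source rather than a small change to the existing structure.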
There are probably plenty of other reasons why data marts have such short lives. And interestingly, ALL these reasons are at play at the same time. It is not just one factor that causes a data mart to go into disuse. Instead, it is ALL of these factors working at the same time.
The net result of the fast expiration of data marts is that data marts start to accumulate in the corporation in large numbers. First, there are four or five data marts. Then, there are 50 or 60 of them. Then, there are hundreds of them.
Accumulating and living with many data marts is not a good idea. Each data mart carries some real costs, including:
- Wasted machine cycles
- Wasted extracts, and so forth.
For a variety of reasons, having lots of unused data marts is a colossal waste.
The natural inclination then is to get people to tell the IT organization when a data mart goes into disuse. But there is a little problem here: people are reluctant to say that a data mart is no longer being used. Some of the reasons might be:
- The end user is not aware that a data mart has fallen into disuse. The end user has enough to do just keeping on top of what information is being used and what the appropriate decisions are. It is simply not natural for the end user to notice that a source of data is not being used.
- The end user fears that if a data mart is discontinued and its data is needed at some later point in time, the data will not be available.
- The end user looks on data and data marts as the basis for power. The more data marts there are, the more powerful the end user.
So what is the IT department to do? There is one crude but effective approach to tell whether a data mart should be discarded: simply stop feeding data into the data mart and see what the reaction is. If the end user screams when the updates stop, then it is known that the data mart is being used. But if data is not fed into a data mart and there is not a peep from the end user analytical organization, then kill the data mart.
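The decision rule just described can be written down as a minimal sketch. Everything in it is an assumption made for illustration: the `DataMart` record, the `decide` function, and especially the 90-day grace period, since the approach itself does not say how long to wait before concluding that nobody cares.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class DataMart:
    """Hypothetical record that IT keeps for each data mart."""
    name: str
    feed_paused_on: Optional[date] = None  # None means still being fed
    complaints: List[date] = field(default_factory=list)


def decide(mart, today, grace_period=timedelta(days=90)):
    """Apply the stop-feeding test; the grace period is an assumed value."""
    if mart.feed_paused_on is None:
        return "feeding"
    # Any scream after the feed stopped proves the mart is in use.
    if any(day >= mart.feed_paused_on for day in mart.complaints):
        return "resume"
    # Silence for the whole grace period: kill the data mart.
    if today - mart.feed_paused_on >= grace_period:
        return "retire"
    return "wait"


today = date(2024, 6, 1)
quiet = DataMart("old_sales", feed_paused_on=date(2024, 1, 1))
noisy = DataMart("finance", feed_paused_on=date(2024, 5, 1),
                 complaints=[date(2024, 5, 3)])
recent = DataMart("hr", feed_paused_on=date(2024, 5, 20))
active = DataMart("ops")

decisions = {m.name: decide(m, today) for m in (quiet, noisy, recent, active)}
```

Here `old_sales` is retired after months of silence, `finance` has its feed resumed because someone complained, and `hr` is still inside the assumed waiting window.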
This approach is admittedly crude, but it is effective. And at the end of the day, whatever works, works.