This article originally appeared on the BeyeNETWORK.
Star schemas are unquestionably good for data marts. When you have a group of people who want to access the same (or very similar data) in the same (or very similar) way, then star schemas work quite well. They also work well when data needs to be summarized or when data needs to be aggregated, and when one needs very rapid access to data.
But there is a problem with star schemas – they are brittle. When requirements change, star schemas need to be changed as well. In some cases, star schemas can be changed; but in other cases, they have to essentially to be completely rewritten and redesigned when changes occur. Because of their brittleness, star schemas typically do not make a good foundation for a data warehouse.
Yet there are a few circumstances where star schemas can be implemented in a data warehouse with little wear and tear.
The first circumstance arises where star schemas contain exclusively semantically stable data. Semantically stable data occurs where data is the same year in and year out. Take sales data. Sales data consists of the basic elements of:
- Date of sale
- Amount of sale
- Item sold
- Sipping, etc.
With these basic elements of data, we could be talking about sales in the 1700s in the American colonies or we could be talking about sales in 2010. We could be talking about the sales of horse buggies or we could be talking about sales of Ferraris. This basic information is almost indestructible. It is safe to put these data elements in a star schema. No problem.
The second circumstance where we can get away with a star schema in a data warehouse is in an environment where we can quickly and with agility rebuild the data warehouse – where we can take the basic form of a data warehouse and quickly and automatically reshape it. Such tools do exist. For example, Kalido is the best of breed of this class of tool. When you have tools that are designed to manage change, then you can afford to build a star schema.
There is a third case where a star schema can be used. That case is where the star schema contains data at the lowest level of granularity and where the data is separated according to the rules of normalization. In this case, it is recognized that the star schema has a very close (suspiciously close!) resemblance to a normalized structure. In fact, in this case, the star schema has been reduced to a third-normal form and star schema design is the same thing as relational design.