This article originally appeared on the BeyeNETWORK.
Most data warehouses are of the garden flower variety. A petunia here, a rose bush there. Most data warehouses are created for the purpose of supporting a rather standard and predictable set of queries that might come from accounting, finance, marketing, and so forth.
But every now and then, an organization creates what can be termed a “research data warehouse.” The research data warehouse is created almost on a whim, usually to analyze one or two burning issues in the organization. Usually (but not always), it is used for statistical analysis.
The analysts who use the research data warehouse are almost always handpicked. Sometimes they are actuaries. On other occasions, they are simply business users who are savvy in the ways of the business.
The research data warehouse is almost always project-based. This means that it has a defined beginning and a defined end, unlike a standard data warehouse, which is used perpetually.
This raises an interesting question: once a research data warehouse has been created, can its data be placed in a standard data warehouse? In other words, can you create a research data warehouse, do some interesting analysis, and then put the analytical results into the standard data warehouse, thereby using the research data warehouse as a source of data for the standard data warehouse?
From a technological standpoint, of course you can do that. There is no technological reason why such a transfer of data cannot be made. However, there are some good reasons why – architecturally – such a practice is questionable.
Some of the reasons why you may not want to update a standard data warehouse from a research data warehouse include:
The finite life of the research data warehouse. If it is acceptable to you that at some point no more data will be coming from the research data warehouse, then go ahead and load its data into the standard data warehouse.
Lack of discipline or rigor. There is usually no discipline or rigor applied to the data in a research data warehouse. The essence of a research data warehouse is that data is free to be calculated and manipulated in any manner possible. This may or may not make a difference to the standard data warehouse.
No documentation and no audit trail. Generally speaking, a research data warehouse has no audit trail or documentation showing how data got into it. If that is not a problem, then feel free to load the data into the standard data warehouse.
Subsets of data. Research data warehouses are notorious for operating only on subsets of data. If incomplete data is an issue for you, then you probably should not be loading a standard data warehouse from a research data warehouse.
There are probably many more considerations to take into account when deciding whether or not to update a standard data warehouse from a research data warehouse. The ones listed here merely scratch the surface.
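As a rough illustration, the four considerations listed above could be turned into a simple review checklist that runs before any research data is loaded. The sketch below is only one way to express it. The ResearchExtract class, its field names, and the load_concerns function are all hypothetical and are not part of any particular product or of the original discussion. Because some of these points "may or may not make a difference," the function reports concerns for a person to review rather than blocking the load outright.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ResearchExtract:
    """Hypothetical description of a data set built in a research data warehouse."""
    name: str
    project_end: Optional[date]    # None means no planned end of life
    rigorous_derivation: bool      # were the calculations done under defined rules?
    has_audit_trail: bool          # is it documented how the data got there?
    covers_full_population: bool   # False when built from a subset of the data


def load_concerns(extract: ResearchExtract, needs_ongoing_feed: bool) -> List[str]:
    """Return the concerns to review before loading the extract
    into a standard data warehouse (an empty list means none were found)."""
    concerns = []
    if needs_ongoing_feed and extract.project_end is not None:
        concerns.append(
            f"finite life: the feed will stop when the project ends on {extract.project_end}"
        )
    if not extract.rigorous_derivation:
        concerns.append("no discipline or rigor: the data was freely calculated and manipulated")
    if not extract.has_audit_trail:
        concerns.append("no documentation or audit trail for how the data was produced")
    if not extract.covers_full_population:
        concerns.append("subset of data: the results are incomplete")
    return concerns


# Example usage with made-up values.
churn_study = ResearchExtract(
    name="churn_study",
    project_end=date(2025, 12, 31),
    rigorous_derivation=False,
    has_audit_trail=False,
    covers_full_population=False,
)

for concern in load_concerns(churn_study, needs_ongoing_feed=True):
    print(f"{churn_study.name}: {concern}")
```

A list of concerns, rather than a single yes/no answer, leaves the decision with whoever owns the standard data warehouse, and more checks can be appended as further considerations come up.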
So if you are dealing with a research data warehouse, you need to carefully consider whether you really want data from the research environment finding its way into your standard data warehouse.