This article originally appeared on the BeyeNETWORK.
Have you ever wondered how banks keep track of money? In particular, have you wondered how banks keep track of your account?
The answer is that banks have what is called the system of record. For all of the systems that banks have (and they usually have a lot), there is only one system that has the definitive record of what amount is in your account. There is one and only one place where your transactions can be updated in a bank. If the bank ever deviates from this rule, there is trouble. Either you have too much money in your account (which makes you happy) or you have too little money in your account (which does not make you happy). In either case, not having a well-thought-out and tightly controlled system of record leads only to chaos and dysfunction in the bank.
The system of record concept has been borrowed from the bankers as one of the essentials of the world of data warehousing. The system of record is at the heart of the integrity of data found in a data warehouse. Stated differently, a data warehouse without a system of record is not a data warehouse at all. With no system of record firmly established, the data within the warehouse cannot be believed.
So, let’s examine the system of record more closely.
What happens in a bank when there are transaction processing systems? As has been stated, there is one and only one place where a current account bank balance is updated. Now – is the transaction system a part of the data warehouse? No. Transaction processing takes place in the operational environment, not the data warehouse environment.
As data ages (even by a day), it is placed in a data warehouse. The data warehouse then becomes an extension of the system of record. The system of record for the current data is the transaction environment. But the system of record for the historical environment is the data warehouse.
The point is that as time passes, the system of record changes. There still is one and only one place where data is found. Current data is found in one place and historical data is found elsewhere. The system of record is extended. In fact, the system of record definition itself changes.
Under the extended definition, the system of record is no longer simply the place where the definitive source of data resides. Instead, the system of record becomes the place where the definitive source of data at a moment in time resides. Because data resides in different places over the spectrum of time, the system of record is also in different places.
So it is seen that the system of record extends from the transaction, operational environment to the historical data warehouse environment. But does the extension stop there? Hardly.
Data moves out of the data warehouse environment to the archival environment as the probability of access drops. At some point in time, it makes sense to move data out of the data warehouse and into the archival environment. When data is moved out of the data warehouse and into the archival environment, the system of record is extended as well. Now there are three places where the system of record resides: in the transaction processing environment, in the data warehouse, and in the archival environment.
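The idea that the system of record depends on the moment in time can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `system_of_record`, the environment labels, and the one-day and five-year cutoffs are hypothetical, and the age of the data is used as a simple stand-in for the probability of access.

```python
from datetime import date, timedelta

# Hypothetical cutoffs, chosen only for illustration.
WAREHOUSE_AFTER = timedelta(days=1)        # older data leaves the transaction system
ARCHIVE_AFTER = timedelta(days=5 * 365)    # older data leaves the data warehouse


def system_of_record(record_date: date, today: date) -> str:
    """Return the one environment that holds the definitive copy of the data."""
    age = today - record_date
    if age < timedelta(0):
        raise ValueError("record_date is in the future")
    if age < WAREHOUSE_AFTER:
        return "transaction processing"
    if age < ARCHIVE_AFTER:
        return "data warehouse"
    return "archive"


if __name__ == "__main__":
    today = date(2024, 1, 10)
    for d in (date(2024, 1, 10), date(2024, 1, 1), date(2015, 1, 1)):
        print(d, "->", system_of_record(d, today))
```

Whatever the cutoffs are, the function returns exactly one environment for any given date, which is the property the system of record depends on.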
And is this a conflict? Not at all. As long as there is no overlap between the different environments, the definition of the system of record remains intact. The system of record, then, is relative not only to the value of data, but to the value of data over time.
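The no-overlap condition is something that can be checked mechanically. The sketch below assumes each environment can report the first and last date of the data it holds; the `Coverage` tuple and the `find_overlaps` function are hypothetical names introduced for this example.

```python
from datetime import date
from typing import List, Tuple

# (environment name, first date held, last date held), both ends inclusive.
Coverage = Tuple[str, date, date]


def find_overlaps(coverage: List[Coverage]) -> List[Tuple[str, str]]:
    """Return pairs of environments whose date ranges share at least one day."""
    ordered = sorted(coverage, key=lambda c: c[1])  # sort by first date held
    overlaps = []
    for i, (name_a, _, end_a) in enumerate(ordered):
        for name_b, start_b, _ in ordered[i + 1:]:
            # b starts no earlier than a, so they overlap if b starts before a ends.
            if start_b <= end_a:
                overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    environments = [
        ("archive", date(2010, 1, 1), date(2018, 12, 31)),
        ("data warehouse", date(2019, 1, 1), date(2024, 1, 9)),
        ("transaction processing", date(2024, 1, 10), date(2024, 1, 10)),
    ]
    print(find_overlaps(environments))
```

An empty result means that every date has at most one definitive home. Any pair that is returned marks a span of time for which two environments both claim to be the system of record.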
In order to understand that this is not just a great theory, consider the following very practical example.
Mrs. Jones goes to an ATM and makes a withdrawal. The withdrawal is posted to her account. The account is available on an up-to-the-second basis. The system of record for the ATM activity is the online transaction processing system.
A month passes and the ATM activity (along with other transactions) passes into the data warehouse. Now, the bank vice president wants to do an analysis of activity. The ATM activity is used in the bank vice president’s analysis.
Over the next few years, analytical activity is run against the data warehouse, and the ATM activity remains available for analysis. Then, one day, queries and analyses run against the ATM activity diminish greatly. The ATM activity is then moved to the archival environment.
Once in the archival environment, the data sits there and is essentially inactive. Occasionally, an analyst comes along and wants to use the data found in the archival environment. But from an activity standpoint, the data in the archival environment is rather sedate.
As the ATM activity has moved from the transactional environment to the data warehouse environment to the archival environment, the system of record has also changed.