This article originally appeared on the BeyeNETWORK.
There is a curious notion that if a clever and fast bus were available to transport data between applications, there would be no need to integrate the data that resides in the applications – and that without the need to integrate data, a lot of time would be saved. This argument is very appealing to those people who do not want to go back and get their hands dirty with old data. However, just like the other schemes to “buy your way to heaven,” the bus theory for not conducting basic integration of data simply doesn’t hold water.
In order to understand why the bus theory for the avoidance of integration doesn’t work, take a look at Figure 1.
Figure 1: Illustration of the Differences in Three Applications
In Figure 1, there are three applications. These three applications have – among other things – data about revenue and data about people. The revenue data is all in dollars, but the dollars are in U.S. currency in one place, Australian currency in another and Canadian currency in the third application. Additionally, all three applications specify gender, but in one place gender is m/f, in another it is male/female and in the third it is 1/0.
In the bus theory for not having to do integration, data is accessed at the application, integration is done by the bus, and the data is passed on to whoever needs it. So, in the bus theory, at the point of accessing information, gender is changed from male/female or 1/0 to m/f. Then the dollar amounts are collected, and the bus converts them from U.S. dollars, Australian dollars and Canadian dollars to euros (or some other currency).
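To make the mechanics concrete, here is a minimal sketch of what such an on-the-fly conversion might look like. Everything in it is an assumption for illustration: the record layout, the `bus_read` function, the meaning of 1 and 0 in the gender code, and the exchange rates are all made up and do not describe any particular bus product.

```python
# Hypothetical records as the three applications might hold them.
APP_RECORDS = [
    {"app": "A", "gender": "m", "revenue": 100.0, "currency": "USD"},
    {"app": "B", "gender": "female", "revenue": 200.0, "currency": "AUD"},
    {"app": "C", "gender": 1, "revenue": 300.0, "currency": "CAD"},
]

# Assumed mapping: the article does not say whether 1 means male or female.
GENDER_TO_MF = {"m": "m", "f": "f", "male": "m", "female": "f", 1: "m", 0: "f"}

# Made-up rates, expressed as units of the source currency per one euro.
RATES_PER_EUR = {"USD": 1.25, "AUD": 1.60, "CAD": 1.50}


def bus_read(record, rates):
    """Convert one record at access time, as the bus would on every read."""
    return {
        "gender": GENDER_TO_MF[record["gender"]],
        "revenue_eur": round(record["revenue"] / rates[record["currency"]], 2),
    }


# Every consumer request repeats this work against the raw application data.
converted = [bus_read(record, RATES_PER_EUR) for record in APP_RECORDS]
```

Note that nothing is stored: the converted values exist only for the duration of the request, so the same conversions are repeated for every subsequent access.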
There are several fallacies with this approach of using the bus to do conversions “on the fly” instead of actually integrating the data. The first problem is that each time the data is used, the conversion inside the bus must be made. Over time, given enough accesses to the data, the resources for doing the conversions inside the bus add up. Performance is hindered, and the resources needed to manage such a scheme escalate.
The second problem is that there is no guarantee that the conversions inside the bus will be done the same way the second (or any other) time. For example, on a later pass the bus might convert m/f and 1/0 to male/female. The results would then differ from those of the first set of conversions, where male/female and 1/0 were changed to m/f. When a conversion is done many times, there is no guarantee that it will be done the same way each time.
The third problem with trying to integrate data on the fly inside the bus is the most difficult one. Suppose that at the time the first conversion inside the bus is done, the exchange rate from U.S. dollars to euros is $1.25 U.S. to €1. The second time the conversion inside the bus is done, the exchange rate is $1.33 U.S. to €1. Because the exchange rate fluctuated between the two conversions, the same source data yields different results, and the calculations made from one report to another cannot be reconciled – or at least cannot be reconciled easily. (Taken to its logical extreme, each time the conversion is recalculated by the bus, the results may be different.)
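The size of the discrepancy is easy to work out. The sketch below uses the two exchange rates from the example; the revenue figure of $1,000 and the `usd_to_eur` helper are assumptions added purely for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def usd_to_eur(amount_usd, usd_per_eur):
    """Convert U.S. dollars to euros, rounded to the nearest cent."""
    euros = amount_usd / usd_per_eur
    return euros.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


revenue_usd = Decimal("1000.00")  # hypothetical revenue figure

first_report = usd_to_eur(revenue_usd, Decimal("1.25"))   # 800.00 euros
second_report = usd_to_eur(revenue_usd, Decimal("1.33"))  # 751.88 euros
gap = first_report - second_report                        # 48.12 euros
```

The source data never changed, yet the two reports disagree by roughly 6 percent, and nothing in either report records which rate produced it.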
However, when the data is converted – once and for all – in a consistent manner at an appropriate moment in time into a single format, then the foundation of data can be used over and over again, and there is no problem with the repeatability or reconcilability of results. These are some (but not all) of the basic reasons why a bus technology is no substitute for integration.
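By contrast, a convert-once approach might be sketched as follows. The function names, record layout, gender mapping and rates are all hypothetical; the point is only that the conversion happens a single time and the result is stored.

```python
def integrate_once(records, rates_per_eur, gender_map):
    """Convert every record a single time and keep the converted values."""
    integrated = []
    for record in records:
        rate = rates_per_eur[record["currency"]]
        integrated.append({
            "gender": gender_map[record["gender"]],
            "revenue_eur": round(record["revenue"] / rate, 2),
            "rate_used": rate,  # kept so the stored figure can be audited later
        })
    return integrated


def total_revenue(store):
    """Report against the stored, already-converted data."""
    return sum(row["revenue_eur"] for row in store)


# Illustrative inputs only; the values and the meaning of 1/0 are assumptions.
records = [
    {"gender": "male", "revenue": 100.0, "currency": "USD"},
    {"gender": 0, "revenue": 300.0, "currency": "CAD"},
]
rates = {"USD": 1.25, "CAD": 1.50}
gender_map = {"m": "m", "f": "f", "male": "m", "female": "f", 1: "m", 0: "f"}

store = integrate_once(records, rates, gender_map)
first_report = total_revenue(store)

rates["USD"] = 1.33  # the market moves after the data has been integrated
second_report = total_revenue(store)  # unchanged, because nothing is reconverted
```

Both reports read the same stored values, so they reconcile regardless of what the exchange rate does afterward.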