Raymond Karrenbauer understands the value of information.
Specifically, Karrenbauer, the chief architect for ING Worldwide and one of the two main keynote speakers at next week's TDWI data conference in Las Vegas, understands the need for companies to efficiently consolidate various forms of structured and unstructured data so they can be used for both transactional and analytical -- or business intelligence -- purposes.
Karrenbauer is accomplishing this goal at ING with the help of enterprise information integration (EII) technology -- software that unifies data regardless of its form. But the process wasn't exactly a cakewalk, thanks mainly to a lack of qualified EII vendors and the need for several architecture tweaks to make the system hum.
In this SearchOracle.com interview, Karrenbauer, who was recognized as one of the "Top 40 executives under 40" by the Hartford Business Journal in 2002, explains how ING is using EII within a BI architecture to support both analytical and operational applications and processes. Karrenbauer also explains how an unconventional look at traditional data models led ING to a far more effective BI system.
What is enterprise information integration (EII) technology and what does it do for ING?
Raymond Karrenbauer: EII is more or less taking a federated look at the different data repositories and the different forms of data and how we have those dispersed across the organization. What I'll talk about in my speech is that there are different ways to go at this. A lot of the research analysts are now saying that something like 80% of an enterprise's data is not captured because it's in various unstructured forms. You have data that is in Excel spreadsheets. You have data that's in graphical formats because of scanned images and stuff like that. You have data that is structured and in application repositories not interconnected to other sources that need the data for other reporting or operational use.
EII -- at least in the context of how we're using it -- binds technology so that we can get consistent views of data, and not just customer data. [You can get consistent views of all types of data] whether it's for marketing, operational, sales or any other use.
So, EII can help companies access and analyze unstructured data?
Karrenbauer: Unstructured, structured, semi-structured, centrally structured. That's the true intent for it. It's really an abstraction layer between your applications and your data repositories and [works to] hook that all together.
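The abstraction-layer idea described above can be sketched in a few lines of Python. This is a minimal illustration, not ING's implementation or any vendor's API: the `Repository` interface, the `PolicyDatabase` and `SpreadsheetExport` classes, the `FederatedView` name, and the sample data are all hypothetical. It shows an application asking one layer for a consistent view while the layer queries each underlying repository.

```python
from typing import Protocol


class Repository(Protocol):
    """Hypothetical interface: anything that can return records for a customer id."""

    def fetch(self, customer_id: str) -> list[dict]: ...


class PolicyDatabase:
    """Stands in for a structured application repository."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def fetch(self, customer_id: str) -> list[dict]:
        # Map the source's own column names onto the shared record shape.
        return [
            {"source": "policy_db", "customer_id": r["cust"], "detail": r["policy"]}
            for r in self._rows
            if r["cust"] == customer_id
        ]


class SpreadsheetExport:
    """Stands in for semi-structured data, e.g. lines exported from a spreadsheet."""

    def __init__(self, lines: list[str]) -> None:
        self._lines = lines

    def fetch(self, customer_id: str) -> list[dict]:
        records = []
        for line in self._lines:
            cust, note = line.split(",", 1)
            if cust.strip() == customer_id:
                records.append(
                    {"source": "spreadsheet", "customer_id": cust.strip(), "detail": note.strip()}
                )
        return records


class FederatedView:
    """Abstraction layer: applications query this, not the individual repositories."""

    def __init__(self, repositories: list[Repository]) -> None:
        self._repositories = repositories

    def customer_records(self, customer_id: str) -> list[dict]:
        # Ask every repository and return the results in one consistent shape.
        results: list[dict] = []
        for repo in self._repositories:
            results.extend(repo.fetch(customer_id))
        return results


view = FederatedView([
    PolicyDatabase([{"cust": "C1", "policy": "life"}, {"cust": "C2", "policy": "auto"}]),
    SpreadsheetExport(["C1, called about renewal", "C3, address change"]),
])
records = view.customer_records("C1")
```

In this sketch the calling application never learns which repository a record came from unless it inspects the `source` field, which is the point of placing a layer between applications and data stores.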
Can you name some software vendors who specialize in EII?
Karrenbauer: I think that if you ask vendors, they'll tell you that they provide EII. But the reality is that it's kind of a mishmash of multiple vendors' solutions, and there are some spaces that haven't been addressed where we've had to fill in that gap. It's an industry [that] I think is untapped to this day. Vendors have gone at the problem by trying to do things like metadata routing. I've seen a couple of new entrants try to do something like that where you can index data in the different forms and search it very quickly. Well, that can solve a transactional-based problem. But if you're going to do a query, you can literally bring an enterprise to a grinding halt if you don't have a properly organized set of information. In EII, the traditional vendors that are trying to claim this space would be guys like Composite Software, which is a spin-off of Web Methods. You have Avaki, Metamatrix, and there is probably a whole laundry list. They're going at the same problem in different ways. Some are doing it based on performance. Others are doing it with richer metadata layers so that you have better traceability of data and how it's transformed. Others are going at it just by offering the sheer ability to tie different types of data formats together.
How should a company go about choosing which approach is best for them?
Karrenbauer: The very beginning […] is deciding what is the problem of your particular business and what are you trying to solve, and then based on that, how to build the EII structure around it. The trick within ING is that we're faced with all levels of complexity because we have so many operating business units within the grand organization. There are different priorities, competing interests and different needs to try and solve. [You have to decide] how to balance all that stuff [and] there is no one encompassing technology that can do this. You've got to be pretty selective in how to go about this problem.
In your keynote next week, you plan to offer tips on how companies can tweak their architecture to speed up and simplify the process of analyzing disparate forms of data. Can you give an example of one way you accomplished this at ING?
Karrenbauer: If you think of a traditional data path, how data is moved in a traditional style within an organization, you tend to go from an area where you're extracting data from some source system and you're getting it into a structured format that you can use. That first step is typically referred to as staging. Then from staging, you go into something like an information hub. When you go from there you're going into something like an operational data store. So, you're taking that common model, then you're trying to aggregate the data into one model.
The difference in the architecture as we've rolled out EII is to use the information hub as an information router, or a logical router, in many regards. [It's not just an] intermediary point where you have a common model. It determines 'do I go directly to [a] warehouse, do I go directly to an operational data store, or do I bypass those and go to an ODS?' It makes logical decisions like a gateway in terms of how it gets stored. So rather than have to take data and unnecessarily replicate it through this lifecycle, you apply all of your quality techniques in the information hub versus throughout the downstream cycles like you would see in traditional development models.
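The hub-as-router idea can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the `Record` fields, the `needed_in_real_time` routing attribute, the quality rules, and the `InformationHub` class are hypothetical and are not taken from ING's system. The sketch shows quality checks applied once, in the hub, followed by a routing decision that sends each record to a single destination instead of replicating it through every stage.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    customer_id: str
    amount: float
    needed_in_real_time: bool  # hypothetical attribute that drives routing


@dataclass
class InformationHub:
    """Hypothetical hub that validates once and then routes like a gateway."""

    warehouse: list = field(default_factory=list)
    operational_store: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def _passes_quality_checks(self, record: Record) -> bool:
        # Illustrative rules only: a non-blank id and a non-negative amount.
        return bool(record.customer_id.strip()) and record.amount >= 0

    def route(self, record: Record) -> str:
        # Quality is enforced here, not again in each downstream store.
        if not self._passes_quality_checks(record):
            self.rejected.append(record)
            return "rejected"
        # Routing decision: store the record in exactly one destination.
        if record.needed_in_real_time:
            self.operational_store.append(record)
            return "operational_store"
        self.warehouse.append(record)
        return "warehouse"


hub = InformationHub()
destinations = [
    hub.route(Record("C1", 120.0, needed_in_real_time=True)),
    hub.route(Record("C2", 75.5, needed_in_real_time=False)),
    hub.route(Record("", 10.0, needed_in_real_time=False)),
]
```

A real hub would presumably route on richer metadata than a single flag, but the shape is the same: one validation point, then a decision about where the data lives.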
Can you name another way that vendors are going about the problem of accessing disparate forms of data?
Karrenbauer: Teradata Inc. does this thing with an active data warehouse where they have this one giant can that they load everything into and then they do some things with views off of that to change the [user's perspective]. That has its plusses and minuses too. It really all boils down to what are the fundamental requirements your organization is trying to build off of. At ING, the thing we're looking at most is what we call an adaptive architecture, so that you can reuse that technology in as many permutations as you can.