NEW YORK – Organizations should avoid the tendency to take a “one size fits all” approach to data integration projects and start thinking about the best ways to
But don’t run out and purchase every data integration-related technology on the market just yet. Instead, attendees of Wednesday’s conference said, the message of integration diversity is more about choosing “the right tools for the job” and then thinking about innovative yet sensible ways to combine various approaches.
Choosing the proper tools for data integration projects
Methods for data integration and data movement include bulk processes such as extract, transform, load (ETL); granular, low-latency data capture and propagation; message-oriented data movement; and abstracted, federated or virtualized views of data from different source systems, among others. And choosing the right approach – or combination of approaches – can be daunting.
Research from Stamford, Conn.-based Gartner Inc. indicates that bulk data movement is by far the most widely used and most valued choice for data integration projects. However, conference attendees pointed out that oftentimes bulk processes are a lot like throwing a bomb when all that’s needed is a bullet.
Bulk processes are useful and necessary when, for example, a user is trying to store historical information but is dealing with large data sets that lack creation and update timestamps, Mike Linhares, a conference speaker and research fellow at pharmaceutical maker Pfizer Inc., said in an interview. But they’re not always the right choice.
“I think that choosing virtualization – pure virtualization where there’s no caching going on – makes a lot of sense, especially when you have very transactional systems and you need low latency and systems have a very high availability,” Linhares said. “But when you get into a situation where a system’s availability starts to become a little not-so-routine, caching becomes a very selective way of making sure that the data is available. It also becomes very useful if you’re looking at a medium-sized set of data and you actually want to improve query performance but not impact the transactional systems very much.”
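The trade-off Linhares describes can be illustrated with a minimal sketch. It is not drawn from Pfizer's systems or from any vendor's product. The `VirtualView` class, its `cache_ttl` parameter and the idea of modeling each source as a plain Python callable are all assumptions made for illustration. With `cache_ttl=None` the view is pure virtualization and every query goes to the live sources. With a time-to-live set, repeated queries are served from the cache, and the last cached copy is used if a source is unreachable.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
Source = Callable[[], List[Row]]  # hypothetical: a source returns its rows on demand


class VirtualView:
    """Illustrative federated view over several sources, with optional caching."""

    def __init__(self, sources: Dict[str, Source],
                 cache_ttl: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic):
        self.sources = sources
        self.cache_ttl = cache_ttl  # None means pure virtualization (no caching)
        self.clock = clock
        self._cache: Dict[str, Tuple[float, List[Row]]] = {}

    def _fetch(self, name: str) -> List[Row]:
        caching = self.cache_ttl is not None
        cached = self._cache.get(name)
        if caching and cached is not None:
            fetched_at, rows = cached
            if self.clock() - fetched_at < self.cache_ttl:
                return rows  # still fresh: spare the transactional system
        try:
            rows = self.sources[name]()
        except ConnectionError:
            # Source is down: serve the last cached copy if caching is enabled.
            if caching and cached is not None:
                return cached[1]
            raise
        if caching:
            self._cache[name] = (self.clock(), rows)
        return rows

    def query(self) -> List[Row]:
        """Return rows from all sources, each tagged with its origin."""
        result: List[Row] = []
        for name in self.sources:
            for row in self._fetch(name):
                result.append({**row, "source": name})
        return result
```

In this sketch, a view built as `VirtualView(sources)` always reflects current source data but fails when a source is down, while `VirtualView(sources, cache_ttl=60)` accepts data up to a minute old in exchange for fewer hits on the source and a fallback during outages.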
Conference attendee Sravan K. Kasarla, the chief information architect of Springfield, Mass.-based MassMutual Financial Group, said that while he’s seen data virtualization used for reporting and dashboards, he hasn’t seen it widely used to deliver information to a wider array of business applications.
Data virtualization, or data federation, is the process of virtually separating data from the underlying hardware on which it resides, and housing it in a semantic, or middleware, layer that can be easily accessed by applications and processes.
“I know for the [business intelligence] layer it can work very well,” Kasarla said. “But I’m trying to solve the challenge across the board [including] information access for structured and non-structured data. That is my challenge.”
Kasarla, who was at the conference investigating innovative ways to leverage data virtualization, said he ultimately plans to deploy the technology at MassMutual as part of an information architecture revision. While Kasarla sees data virtualization as a “must-have” technology, he warned that it’s easy for users to fall into the trap of investing in integration tools before implementing the organizational structure and acquiring the skill sets needed to manage them properly. Kasarla said he prefers to keep the number of data integration tools he uses to a minimum.
“There is not a single platform which can offer you soup to nuts, from granular data access […] all the way to ETL,” Kasarla said. But don’t “interpret that to mean that I can go and get as many choices as possible.”
Gartner offers keys to successful data integration projects
An increasing number of organizations are spending time and energy to derive greater value from information assets, and a greater focus on different approaches to data integration is a fundamental part of this process, said conference speaker Ted Friedman, a vice president and member of the information architecture team at Gartner.
“There is a renaissance around data,” Friedman told the audience.
Citing Gartner surveys and frequent conversations with clients, Friedman said that there are five keys to data integration success: standardization, diversification, unification, the ability to leverage data integration technology to its fullest, and governance.
In the context of data integration, standardization means that organizations should focus on repeatable processes and approaches for dealing with data integration problems, the analyst said. Diversification, meanwhile, is about employing a wider variety of tools, provided that they meet the needs of the business.
“This discipline of data integration has many facets and many faces, and there are many ways to skin the cat, to use another cliché,” he said.
Unification, Friedman explained, is all about determining how best to link together combinations of available tools and architectures “in a synergistic way.”
Companies that manage to standardize, diversify and unify will then have some good leverage, which means that data integration will have had a positive impact on the business. But those organizations will still need to focus on ways to increase the breadth of business impact, he added.
Organizations are increasingly looking at ways to govern data quality, data privacy, security, lifecycle management, and the list goes on, Friedman said. But they also seem to be “missing the point” when it comes to the governance of integration tools and architecture.
“[Governance] is certainly an insurance policy, in a way, to get the optimal value out of all these investments,” he said.
Friedman said users that steadfastly adhere to his five points will have an easier time with integration in the future.
“If you do these things, I can assure you that you have a very good chance of being a successful data integration practitioner or leader,” he said.