Executives at wireless technology vendor Qualcomm Inc. know that staying competitive in the cutthroat world of semiconductor development means having access to up-to-the-minute information about the status of manufacturing operations, development teams, product delivery schedules and forecasts.
But in many cases, the information that Qualcomm executives needed to make important decisions was coming from a variety of disparate, often siloed sources. The company -- which designs the processors that power many of today's smartphones, tablets and other mobile devices -- needed to be able to provide the executives with a quick snapshot of current operations at any given time, and while traditional business intelligence (BI) and data warehousing tools have their place, they were too clunky to get the job done quickly.
That's when Qualcomm turned to data virtualization products from Composite Software Inc., according to Srinidhi Thirumala, a senior enterprise architect at the San Diego-based company. Today, Qualcomm uses data virtualization to connect disparate systems and fill executive dashboards with all sorts of information about the company's sales, design and manufacturing operations. The technology helps enable "a much faster realization of benefits and better decision making, and executive management doesn't really care about what is the secret sauce behind it," Thirumala said.
Gaining a foothold
Despite inauspicious beginnings and no shortage of subsequent growing pains, data virtualization has emerged as a popular method for delivering important information to C-level business executives and other decision makers.
The technology -- used to gather information from different data sources and deliver it to business users in dashboards, BI reports and other presentation tools -- is hardly considered an outright replacement for more traditional approaches to data integration, such as database consolidation and data warehousing initiatives built around extract, transform and load (ETL) processes. But data virtualization's value lies in its ability to access, combine and transform information into a usable format without the time and expense associated with physically moving that information.
Technology professionals and IT industry pundits predict that data virtualization will become increasingly important as organizations seek better ways to combine information from an ever-growing number of data sources, including cloud-based applications, mobile applications, big data implementations, external data feeds and internal data warehouses and data marts.
Companies like Qualcomm, clinical trial facilitator Quintiles and online training provider Pearson PLC are finding that in certain situations, data virtualization products offer considerable time and cost-related advantages over the data integration mainstays. Realizing those benefits, however, means first overcoming the implementation, performance and political challenges associated with the technology.
The data virtualization market
Data virtualization software adds a virtual services layer to IT architectures. The technology is most often used to integrate data via traditional batch processes at certain points during the day, but it can also pull together information in real or near real time, depending on the user organization's business requirements and IT setup. In addition to BI reporting, data virtualization products are used to power enterprise search and high-performance transaction processing systems.
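As a rough illustration of what a virtual services layer does, the sketch below joins two sources at query time instead of copying them into a warehouse. It is a minimal sketch, not how any vendor's product works: the source functions, field names and sample rows are all hypothetical stand-ins for live connections to siloed systems.

```python
from typing import Dict, Iterator, List


def fetch_design_schedule() -> List[dict]:
    """Hypothetical source 1: made-up rows standing in for a scheduling system."""
    return [
        {"chip_id": "chip-1", "on_schedule": True},
        {"chip_id": "chip-2", "on_schedule": False},
    ]


def fetch_sample_status() -> List[dict]:
    """Hypothetical source 2: made-up rows standing in for a sales-samples system."""
    return [
        {"chip_id": "chip-1", "samples_ready": True},
        {"chip_id": "chip-2", "samples_ready": False},
    ]


def virtual_view() -> Iterator[dict]:
    """Combine both sources on chip_id when queried; nothing is persisted."""
    status_by_chip: Dict[str, dict] = {
        row["chip_id"]: row for row in fetch_sample_status()
    }
    for row in fetch_design_schedule():
        merged = dict(row)  # copy so the source row is left untouched
        merged.update(status_by_chip.get(row["chip_id"], {}))
        yield merged


rows = list(virtual_view())
```

Each call to `virtual_view()` reads the sources again, so a dashboard built on it sees whatever the underlying systems hold at that moment.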
The concept of data virtualization first made its mark in the mid-1990s, when it was known as virtual data warehousing and marketed as an alternative to traditional data warehouses. The technology ultimately failed to gain a strong foothold in the IT marketplace for several reasons, according to studies from Cambridge, Mass., IT research and consulting company Forrester Research Inc.
For starters, early incarnations of the software failed to live up to customer expectations, while large software vendors chose to minimize investments in data virtualization in favor of alternative approaches to integration, which they deemed more profitable. Oracle Corp., for example, has picked up a number of data virtualization packages over the years through acquisitions but hasn't put a lot of effort into marketing and selling the technology. Forrester reports that many initial data virtualization deployments were also used for spot projects -- as opposed to multidepartment or enterprisewide initiatives -- and interest in the technology faded as needs changed.
But the tide appears to be turning. While the technology is not yet widely adopted, IT industry analysts report that interest in data virtualization is on the rise as companies contend with a ballooning number of data sources that need to be integrated quickly. Forrester predicts that global revenues for data virtualization, including licenses, maintenance and related services, will grow to $8 billion by 2014.
Vendors that now offer data virtualization tools include Composite Software, Denodo Technologies, IBM, Microsoft, Informatica and Red Hat. Forrester estimates that initial deployments tend to run between $250,000 and $500,000, depending on the breadth of the implementation.
Getting info to the decision makers
Qualcomm maintains control over the design and development of its chipsets but outsources the manufacturing of its products. The largest provider of 3G chipset and software technology, the company ships its products to more than 50 customers around the globe. Qualcomm also partners with about 60 3G network operators and boasts a large engineering team.
At any given time, the company has dozens of development projects at various points along the development pipeline -- and that means a flood of information from various places is constantly rushing in, according to Thirumala.
With the pace of mobile device adoption picking up speed about four years ago, it became clear to Thirumala that Qualcomm executives needed a quicker way to track various development initiatives -- their existing data warehousing and BI reporting tools were just not cutting it. That ultimately led Qualcomm to data virtualization software.
"The warehouse approach would require a large team, and moving [the information] into one physical data warehouse would be a very labor intensive development process," Thirumala said. "We just couldn't afford it."
Qualcomm uses Composite's software to feed information from various internal department sources into a special dashboard for decision makers. Known internally as Oasis, the dashboard serves as a single point of reference for major chip design projects and includes information about feature sets, design and fabrication schedules and taxonomy and product data.
Qualcomm business leaders can look at the dashboard to find out, for example, if a particular chip design initiative is on schedule and whether product samples are ready for salespeople to use when they visit customers. "That is our lifeblood -- making sure that chip design is tracking to schedule," Thirumala said.
Qualcomm also uses the data virtualization software to power a financial dashboard that consolidates information about budgets, forecasting and financial results, among other things. Looking ahead, it plans to use the technology to power a new project the company has dubbed People Data Services (PDS). Thirumala explained that PDS will offer employees a single place to find information about personnel, locations, organizations and groups throughout the company.
Why data virtualization products?
Data virtualization vendors and users admit that the technology is not the right answer for every data integration problem. Some integration issues call for physical consolidation of information in a data warehouse or a data mart, and sometimes a hybrid approach is the right answer.
As a general rule, data virtualization is a good integration choice when the data sources are so disparate that a traditional warehousing approach would be too time consuming and cost prohibitive, according to John Poonnen, director of global IT at Quintiles, based in Durham, N.C.
Poonnen's company, which boasts about $4.2 billion in annual revenue and about 35,000 employees worldwide, is a contract research organization that specializes in helping pharmaceutical companies carry out clinical trials and gain approval for new drugs. The company has researched many drugs over the years, including the cholesterol-lowering Lipitor and pretty much "all of the major blockbuster drugs that have come through in the last 10 years," Poonnen said.
Quintiles began using data virtualization software from Informatica two years ago in an effort to consolidate important information about clinical trials and send the data back to pharmaceutical makers. As a result, those companies and the many doctors and researchers who work for Quintiles have quick access to information about how certain drugs are performing and how certain patients are faring.
The company initially considered the prospect of integrating the information at the application level but decided that it would be too expensive and time-consuming. Data virtualization offered a much more flexible and less invasive approach that would enable the company to make decisions on the fly.
"Application integration is much more operationally focused," Poonnen said, "but with data virtualization you are much more decision and action focused."
For example, data virtualization enables Quintiles to be more agile and efficient in delivering data that's needed for dealing with regulatory compliance issues, Poonnen said. The pharmaceutical industry is highly regulated, and clinical trials have many parts that often are subject to different regulations, he explained. But the data that's required for compliance varies from trial to trial, and it doesn't always make sense to physically move all of the information into a data warehouse.
"Occasionally, there are use cases where all of the data in the data warehouse is excellent, but we also need two pieces of information from another system, just to put the basic information that we have into perspective," Poonnen said. "So instead of hardwiring a connection, we use data virtualization to [make] that information available."
Mitigating performance woes
Getting a data virtualization system to work properly can eat up time, and when the software needs to access and consolidate information from a multitude of sources, performance and latency issues are bound to pop up.
One way to fight performance problems is to throw more hardware at them. But another, less expensive way is to really take the time to understand the business case, according to Martin Brodbeck, chief technology officer at Pearson PLC, which is headquartered in London.
Pearson, a $9.6 billion company that among other things provides digital courseware for students at all levels, started using Composite's software a year ago to integrate information about customers and enable its "lead-to-cash" process across business units.
Data virtualization software offers the ability to cache data sets in memory, Brodbeck explained, and one way to avoid a performance slowdown is to avoid overdoing that. The amount of data stored in memory can only be stretched so far.
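One way to picture that limit is a cache that keeps only small result sets in memory and passes large ones straight through. The following is a minimal sketch under stated assumptions: the `BoundedCache` class, its `max_rows` threshold and the loader functions are hypothetical and do not reflect Composite's actual caching API.

```python
from typing import Callable, Dict, List


class BoundedCache:
    """Keep a result set in memory only if it is at or below max_rows."""

    def __init__(self, max_rows: int) -> None:
        self.max_rows = max_rows  # hypothetical tuning knob
        self._store: Dict[str, List[dict]] = {}

    def get(self, key: str, loader: Callable[[], List[dict]]) -> List[dict]:
        # Serve from memory when the data set was small enough to cache.
        if key in self._store:
            return self._store[key]
        rows = loader()
        # Large result sets are returned but never held in memory.
        if len(rows) <= self.max_rows:
            self._store[key] = rows
        return rows

    def is_cached(self, key: str) -> bool:
        return key in self._store


cache = BoundedCache(max_rows=100)
profile = cache.get("customer_profile", lambda: [{"id": 1, "name": "example"}])
big_set = cache.get("big_record_set", lambda: [{"n": i} for i in range(500)])
```

Here the one-row profile stays cached, while the 500-row set exceeds the threshold and is fetched from its source on every request.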
"If I want to cache a customer profile, [that is] a small enough data set where you're not going to have performance issues," Brodbeck said. "But if you wanted to store a million-row Hadoop record set in Composite, for example, I would say that is probably not the best use case for the technology."
Quintiles' Poonnen agreed that understanding the motivations behind data virtualization is of paramount importance before getting started. It's also a good idea to know the "lowest common denominator" that is feeding the system. If one data source is slower than the rest, the entire system must be tuned accordingly.
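The "lowest common denominator" idea can be sketched as a freshness check: a combined view is only as current as its stalest source. This is an illustrative sketch only; the `freshness_report` function, the source names and the tolerance value are assumptions, not part of any product mentioned here.

```python
from datetime import datetime, timedelta
from typing import Dict


def freshness_report(
    last_updated: Dict[str, datetime], max_skew: timedelta
) -> dict:
    """Find the stalest source and check the gap between oldest and newest."""
    slowest = min(last_updated, key=last_updated.get)
    newest = max(last_updated, key=last_updated.get)
    skew = last_updated[newest] - last_updated[slowest]
    return {
        "slowest_source": slowest,
        "skew": skew,
        "within_tolerance": skew <= max_skew,
    }


# Made-up timestamps: one source refreshed a minute ago, one a day ago.
now = datetime(2012, 1, 2, 12, 0, 0)
report = freshness_report(
    {
        "trial_database": now - timedelta(minutes=1),
        "lab_feed": now - timedelta(days=1),
    },
    max_skew=timedelta(hours=1),
)
```

A report with `within_tolerance` set to `False` signals that the sources should not be combined until the slower one is refreshed or the view is tuned to its pace.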
"You don't want data that is up-to-the-minute combined with data that is a day old," Poonnen said.
About the author:
Mark Brunelli is news director for the Business Applications and Architecture Media Group at TechTarget. After a five-year stint covering crime and local politics for The Boston Globe, Brunelli joined TechTarget in 2000. He has since covered a wide range of technologies, running the gamut from hardware to software. Email him at firstname.lastname@example.org or follow him on Twitter: @Brunola88.