data virtualization

TechTarget Contributor

Published: Jul 24, 2023

What is data virtualization?

Data virtualization is an umbrella term for an approach to data management that lets an application retrieve and manipulate data without knowing technical details about it, such as how the data is formatted or where it is physically located. The goal of data virtualization is to create a single representation of data from multiple, disparate sources without having to copy or move the data.

Data virtualization software aggregates structured and unstructured data sources for virtual viewing through a dashboard or visualization tool. The software makes metadata discoverable but hides the complexities of accessing disparate data types from different sources. Data virtualization does not replicate data from source systems; it stores only the metadata and integration logic needed to present the data. Vendors that offer this type of software include IBM, SAP, Denodo Technologies, Oracle, Tibco Software and Microsoft.
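To make the "metadata and integration logic only" idea concrete, here is a minimal Python sketch. It is not how any vendor's product is built; the names VirtualSource, catalog and query, and the sample data, are hypothetical. The catalog holds only a name, a format label and a function that reads the source on demand, so no rows are copied into the virtualization layer ahead of time.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class VirtualSource:
    """Metadata only: a name, a format label and an on-demand reader."""
    name: str
    fmt: str
    fetch: Callable[[], Iterator[dict]]


# Stand-ins for two source systems (hypothetical sample data).
CSV_TEXT = "id,name\n1,Ada\n2,Grace\n"
JSON_TEXT = '[{"id": 3, "name": "Edsger"}]'


def read_csv() -> Iterator[dict]:
    # Integration logic: normalize CSV strings into typed records.
    for row in csv.DictReader(io.StringIO(CSV_TEXT)):
        yield {"id": int(row["id"]), "name": row["name"]}


def read_json() -> Iterator[dict]:
    # The JSON source already has the target shape.
    yield from json.loads(JSON_TEXT)


# The catalog stores descriptions of the sources, not their rows.
catalog = {
    "crm_people": VirtualSource("crm_people", "csv", read_csv),
    "hr_people": VirtualSource("hr_people", "json", read_json),
}


def query(source_name: str) -> list[dict]:
    """Read the named source at request time and return uniform records."""
    return list(catalog[source_name].fetch())
```

A caller asks for query("crm_people") or query("hr_people") and receives records in the same shape either way, without needing to know that one source is CSV and the other JSON.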

How data virtualization works

Data virtualization software is middleware that allows data stored in different types of data models to be integrated virtually. This type of platform gives authorized consumers a single point of access to an organization's entire range of data, without their knowing (or caring) whether the data resides on a mainframe in the data center, on premises in a data warehouse or in a data lake in the cloud.
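The single point of access described above can be sketched in a few lines of Python. This is an illustrative assumption rather than a real platform: an in-memory SQLite database stands in for a data warehouse, a string of JSON lines stands in for a data lake, and the function customer_totals and all sample data are hypothetical. The consumer calls one function and never sees which store each field came from.

```python
import json
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for a warehouse).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# Source 2: semi-structured JSON lines (stands in for files in a data lake).
LAKE_ORDERS = (
    '{"customer_id": 1, "total": 120.0}\n'
    '{"customer_id": 1, "total": 30.0}\n'
    '{"customer_id": 2, "total": 75.5}'
)


def customer_totals() -> list[dict]:
    """Virtual view: join both sources at query time, storing nothing."""
    totals: dict[int, float] = {}
    for line in LAKE_ORDERS.splitlines():
        order = json.loads(line)
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["total"]

    rows = warehouse.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    return [{"customer": name, "total": totals.get(cid, 0.0)} for cid, name in rows]
```

Each call to customer_totals() reads both sources afresh, so the result reflects their current contents. A production platform would typically add query pushdown and caching, which this sketch omits.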

Because data virtualization platforms treat data sources in a source-agnostic manner, they have a wide range of use cases. For example, centralized management can support data governance initiatives or make it easier to test and deploy data-driven business analytics apps.

Data virtualization software can also help control who can and cannot access certain data sources. One of the most important reasons for deploying it is to support business objectives that require stakeholders to view a single source of truth (SSOT) as cost-efficiently as possible.
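Because every request passes through one layer, access rules can be checked in one place. The following Python sketch assumes a simple role-to-source grant table; ROLE_GRANTS, SOURCES and read_source are hypothetical names, and real products usually offer finer-grained controls than this.

```python
# Which roles may read which virtual sources (hypothetical policy).
ROLE_GRANTS = {
    "analyst": {"sales"},
    "admin": {"sales", "payroll"},
}

# Each source is a function that reads on demand (hypothetical sample data).
SOURCES = {
    "sales": lambda: [{"region": "EMEA", "revenue": 1000}],
    "payroll": lambda: [{"employee": "E-1", "salary": 50000}],
}


def read_source(role: str, source: str) -> list[dict]:
    """Return the rows of a source only if the role has been granted access."""
    if source not in ROLE_GRANTS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read {source!r}")
    return SOURCES[source]()
```

With this policy, read_source("analyst", "sales") returns rows, while read_source("analyst", "payroll") raises PermissionError before the payroll source is ever touched.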