Data Management/Data Warehousing Definitions

This glossary explains the meaning of key words and phrases that information technology (IT) and business professionals use when discussing data management and related software products. You can find additional definitions by visiting WhatIs.com or using the search box below.

  • D

    disambiguation

    Disambiguation (also called word sense disambiguation) is the act of interpreting the intended sense or meaning of a word. Disambiguation is a common problem in computer language processing, since it is often difficult for a computer to distinguish a word’s sense when the word has multiple meanings or spellings.

  • E

    entity relationship diagram (ERD)

    An entity relationship diagram (ERD), also known as an entity relationship model, is a graphical representation of an information system that depicts the relationships among people, objects, places, concepts or events within that system.

  • Extract, Load, Transform (ELT)

    Extract, Load, Transform (ELT) is a data integration process for transferring raw data from a source system to a target database and then preparing the information for downstream uses.

  • extract, transform, load (ETL)

    In managing databases, extract, transform, load (ETL) refers to three separate functions combined into a single programming tool.

  • F

    fact table

    A fact table is the central table in a star schema of a data warehouse. A fact table stores quantitative information for analysis and is often denormalized.

  • fixed data (permanent data, reference data, archival data, or fixed-content data)

    Fixed data (sometimes referred to as permanent data) is data that is not, under normal circumstances, subject to change. Any type of historical record is fixed data. For example, meteorological details for a given location on a specific day in the past are not likely to change (unless the original record is found, somehow, to be flawed).

  • G

    Google BigQuery

    Google BigQuery is a cloud-based big data analytics web service for processing very large read-only data sets. BigQuery was designed for analyzing data on the order of billions of rows, using a SQL-like syntax.

  • Google Bigtable

    Google Bigtable is a distributed, column-oriented data store created by Google Inc. to handle very large amounts of structured data associated with the company's Internet search and Web services operations.

  • Google Cloud Dataflow

    Google Cloud Dataflow is a cloud-based data processing service for both batch and real-time data streaming applications.

  • Google Cloud Spanner

    Google Cloud Spanner is a distributed relational database service that runs on Google Cloud.

  • H

    Hadoop

    Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.

  • Hadoop 2

    Apache Hadoop 2 is the second iteration of the Hadoop framework for distributed data processing.  Hadoop 2 adds support for running non-batch applications as well as new features to improve system availability.

  • Hadoop data lake

    A Hadoop data lake is a data management platform comprising one or more Hadoop clusters.

  • Hadoop Distributed File System (HDFS)

    The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications.

  • HP e3000

    The HP e3000 is a line of midrange business servers that carries on the well-known series of 3000 computers from Hewlett-Packard (HP).

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchAWS

SearchContentManagement

SearchOracle

SearchSAP

SearchSQLServer

Close