Data Management/Data Warehousing Definitions

This glossary explains the meaning of key words and phrases that information technology (IT) and business professionals use when discussing data management and related software products. You can find additional definitions by visiting WhatIs.com.

D

  • denormalization

    In a relational database, denormalization is an approach to optimizing performance in which the administrator selectively adds back specific instances of redundant data after the data structure has been normalized, typically to reduce the number of joins a query needs.
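
    The trade-off can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended design: the table names (customers, orders, orders_denorm) and the sample rows are hypothetical. The denormalized table copies the customer name onto every order row so that reads no longer need a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: each customer name is stored exactly once.
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.5)],
)

# Denormalized copy: the customer name is duplicated onto every order row.
cur.execute("""
    CREATE TABLE orders_denorm AS
    SELECT o.order_id, o.customer_id, c.name AS customer_name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
""")

# Reading from the normalized tables requires a join...
normalized_rows = cur.execute("""
    SELECT o.order_id, c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# ...while the denormalized table answers the same question from one table.
denormalized_rows = cur.execute(
    "SELECT order_id, customer_name, total FROM orders_denorm ORDER BY order_id"
).fetchall()
```

    The cost of the faster read is that a customer's name now lives in several rows, so any rename has to update every copy.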

  • dimension

    In data warehousing, a dimension is a collection of reference information about a measurable event (fact).

  • dimension table

    A dimension table is a table in a star schema of a data warehouse. A dimension table stores attributes, or dimensions, that describe the objects in a fact table.

  • dirty data

    In a data warehouse, dirty data is a database record that contains errors.
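
    What counts as an error depends on the data set. The following sketch assumes three hypothetical rules (a required customer_id, an order_date in YYYY-MM-DD form, and a positive integer quantity) and flags any record that breaks one of them. The field names and rules are illustrative assumptions only.

```python
from datetime import datetime


def find_errors(record):
    """Return a list of problems found in one record (an empty list means clean)."""
    errors = []

    # Rule 1 (assumed): customer_id must be present and non-empty.
    if not record.get("customer_id"):
        errors.append("missing customer_id")

    # Rule 2 (assumed): order_date must be a real calendar date in YYYY-MM-DD form.
    try:
        datetime.strptime(str(record.get("order_date") or ""), "%Y-%m-%d")
    except ValueError:
        errors.append("invalid order_date")

    # Rule 3 (assumed): quantity must be a positive integer.
    quantity = record.get("quantity")
    if not isinstance(quantity, int) or quantity <= 0:
        errors.append("invalid quantity")

    return errors


records = [
    {"customer_id": "C001", "order_date": "2023-04-01", "quantity": 3},
    {"customer_id": "", "order_date": "2023-02-30", "quantity": -1},
]

# Keep only the records that break at least one rule.
dirty = [r for r in records if find_errors(r)]
```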

  • disambiguation

    Disambiguation (also called word sense disambiguation) is the act of interpreting the intended sense or meaning of a word. Disambiguation is a common problem in computer language processing, since it is often difficult for a computer to distinguish a word’s sense when the word has multiple meanings or spellings.
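
    A toy illustration of the idea, in the spirit of overlap-based methods such as the Lesk algorithm: pick the sense whose associated words overlap most with the surrounding sentence. The SENSES dictionary and its cue words are made up for this example, and real systems are far more sophisticated.

```python
# Hypothetical sense inventory: each sense of a word maps to a set of cue words.
SENSES = {
    "bank": {
        "financial institution": {"money", "deposit", "loan", "account", "cash"},
        "river edge": {"river", "water", "shore", "fishing", "mud"},
    }
}


def disambiguate(word, sentence):
    """Return the sense of `word` whose cue words overlap most with the sentence."""
    context = set(sentence.lower().split())
    senses = SENSES[word]
    # On a tie, max() keeps the first sense listed.
    return max(senses, key=lambda sense: len(senses[sense] & context))


sense_a = disambiguate("bank", "she opened an account at the bank to deposit money")
sense_b = disambiguate("bank", "he sat on the bank of the river fishing")
```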

E

  • entity relationship diagram (ERD)

    An entity relationship diagram (ERD), also known as an entity relationship model, is a graphical representation of an information system that depicts the relationships among people, objects, places, concepts or events within that system.

  • extract, load, transform (ELT)

    Extract, load, transform (ELT) is a data integration process for transferring raw data from a source system to a target database and then preparing the information for downstream uses. Unlike ETL, the transformation step runs inside the target system after the raw data has been loaded.
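
    The ordering can be sketched with Python's built-in sqlite3 module standing in for the target database. The raw rows, table names (raw_orders, clean_orders) and cleanup rules are hypothetical. The raw data is loaded unchanged first, and the cleanup then runs as SQL inside the target.

```python
import sqlite3

# Extract: rows as they arrive from a hypothetical source system (all strings).
raw_rows = [("1001", " Ada ", "25.00"), ("1002", "grace", "40.50")]

conn = sqlite3.connect(":memory:")

# Load: the raw data lands in the target database unchanged.
conn.execute("CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# Transform: cleanup runs inside the target, using the target's own SQL engine.
conn.execute("""
    CREATE TABLE clean_orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           UPPER(TRIM(customer))     AS customer,
           CAST(amount AS REAL)      AS amount
    FROM raw_orders
""")

clean = conn.execute(
    "SELECT order_id, customer, amount FROM clean_orders ORDER BY order_id"
).fetchall()

# The untouched raw table is still available for other downstream uses.
raw_customers = conn.execute(
    "SELECT customer FROM raw_orders ORDER BY order_id"
).fetchall()
```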

  • extract, transform, load (ETL)

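    In managing databases, extract, transform, load (ETL) refers to three separate functions combined into a single programming tool. The extract function reads data from a source, the transform function converts it into the form the target requires, and the load function writes the result into the target database.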
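
    A minimal sketch of the three functions, using only the Python standard library. The CSV text, column names and cleanup rules are hypothetical, and an in-memory SQLite database stands in for the target. Note that the data is cleaned before it reaches the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read a file or query a system.
SOURCE_CSV = "order_id,customer,amount\n1001, Ada ,25.00\n1002,grace,40.50\n"


def extract(text):
    """Read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Convert raw strings into the types and format the target expects."""
    return [
        (int(r["order_id"]), r["customer"].strip().upper(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Write the already-cleaned rows into the target table."""
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

loaded = conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
```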

F

  • fact table

    A fact table is the central table in a star schema of a data warehouse. A fact table stores quantitative information for analysis and is often denormalized.
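
    A small star schema can be sketched with Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_date) and the sample figures are hypothetical. The fact table holds the numeric measures plus keys that point at the surrounding dimension tables, and a typical query joins out to a dimension to group the measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, year INTEGER);

    -- Fact table: one row per measurable event, keyed to the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        units_sold INTEGER,
        revenue    REAL
    );

    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
    INSERT INTO dim_date VALUES (20230101, '2023-01-01', 2023), (20230102, '2023-01-02', 2023);
    INSERT INTO fact_sales VALUES
        (1, 20230101, 3, 30.0),
        (1, 20230102, 2, 20.0),
        (2, 20230101, 1, 12.5);
""")

# Aggregate the quantitative facts, grouped by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```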

  • fixed data (permanent data, reference data, archival data, or fixed-content data)

    Fixed data (sometimes referred to as permanent data) is data that is not, under normal circumstances, subject to change. Any type of historical record is fixed data. For example, meteorological details for a given location on a specific day in the past are not likely to change (unless the original record is found, somehow, to be flawed).

G

  • Google BigQuery

    Google BigQuery is a cloud-based big data analytics web service for processing very large read-only data sets. BigQuery was designed for analyzing data on the order of billions of rows, using a SQL-like syntax.

  • Google Bigtable

    Google Bigtable is a distributed, column-oriented data store created by Google Inc. to handle very large amounts of structured data associated with the company's Internet search and Web services operations.

  • Google Cloud Dataflow

    Google Cloud Dataflow is a cloud-based data processing service for both batch and real-time data streaming applications.

  • Google Cloud Spanner

    Google Cloud Spanner is a distributed relational database service that runs on Google Cloud.

H

  • Hadoop

    Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.
