Data Management/Data Warehousing Definitions

A

  • Apache Falcon

    Apache Falcon is a data management tool for overseeing data pipelines in Hadoop clusters, with a goal of ensuring consistent and dependable performance on complex processing jobs.

  • Apache Flink

    Apache Flink is an in-memory and disk-based distributed data processing platform for use in big data streaming applications.

  • Apache Giraph

    Apache Giraph is iterative graph processing software that is mostly used to analyze social media data. Giraph was developed by Yahoo! and donated to the Apache Software Foundation for ongoing management.

  • Apache Hadoop YARN (Yet Another Resource Negotiator)

    Apache Hadoop YARN (short, in self-deprecating fashion, for Yet Another Resource Negotiator) is a cluster management technology. It is one of the key features of second-generation Hadoop.

  • Apache HBase

    Apache HBase is a column-oriented key/value data store built to run on top of the Hadoop Distributed File System (HDFS).

  • Apache Hive

    Apache Hive is an open-source data warehouse system for querying and analyzing large datasets stored in Hadoop files. Hadoop is a framework for handling large datasets in a distributed computing environment.

  • Apache Incubator

    Apache Incubator is the starting point for projects and software seeking to become part of the Apache Software Foundation’s efforts. The ASF is a non-profit organization that oversees the development of Apache software.

  • Apache Pig

    Apache Pig is an open-source technology that offers a high-level mechanism for parallel programming of MapReduce jobs to be executed on Hadoop clusters.

  • Apache Spark

    Apache Spark is an open-source parallel processing framework for running large-scale data analytics applications across clustered computers. It can handle both batch and real-time analytics and data processing workloads.

B

  • big data management

    Big data management is the organization, administration and governance of large volumes of both structured and unstructured data.

C

  • column database management system (CDBMS)

    A column database management system (CDBMS) is a database management system that stores data by column (or column family) instead of by row. There are different types of CDBMS offerings, with that storage layout being the common defining feature.

  • columnar database

    A columnar database is a database management system (DBMS) that stores data in columns instead of rows.
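
    The difference between row and column storage can be illustrated with a minimal Python sketch. The records, field names and the `to_columns` helper are hypothetical, and the sketch models only the in-memory layout, not how any particular DBMS stores data on disk.

```python
# Three records with the same fields, stored row by row (hypothetical data).
rows = [
    {"id": 1, "region": "east", "sales": 100},
    {"id": 2, "region": "west", "sales": 250},
    {"id": 3, "region": "east", "sales": 175},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict that maps each field to a list of values."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: the aggregate has to visit every record to pick out one field.
total_from_rows = sum(record["sales"] for record in rows)

# Column layout: the same aggregate reads a single list of values.
total_from_columns = sum(columns["sales"])
```

    Both totals are the same; the column layout simply lets an aggregate over one field skip the other fields entirely, which is the usual argument for columnar storage in analytical workloads.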

  • compliance

    Compliance is the act of being in alignment with guidelines, regulations and/or legislation. Organizations must ensure that they are in compliance with software licensing terms set by vendors, for example, as well as regulatory mandates.

  • conformed dimension

    In data warehousing, a conformed dimension is a dimension that has the same meaning to every fact with which it relates.
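
    As a rough illustration, the Python sketch below shares one date dimension between two fact tables. All table names, keys and values are hypothetical, and plain dictionaries stand in for warehouse tables.

```python
# One shared date dimension, keyed by a surrogate key (hypothetical data).
date_dim = {
    20240101: {"date": "2024-01-01", "quarter": "Q1"},
    20240401: {"date": "2024-04-01", "quarter": "Q2"},
}

# Two fact tables that both reference date_dim through date_key.
sales_facts = [
    {"date_key": 20240101, "amount": 100},
    {"date_key": 20240401, "amount": 250},
]
shipment_facts = [
    {"date_key": 20240101, "units": 4},
    {"date_key": 20240101, "units": 6},
]


def total_by_quarter(facts, measure):
    """Sum one measure per quarter, looking the quarter up in the shared dimension."""
    totals = {}
    for fact in facts:
        quarter = date_dim[fact["date_key"]]["quarter"]
        totals[quarter] = totals.get(quarter, 0) + fact[measure]
    return totals


sales_by_quarter = total_by_quarter(sales_facts, "amount")
units_by_quarter = total_by_quarter(shipment_facts, "units")
```

    Because both fact tables resolve `quarter` through the same dimension, "Q1" means the same thing in the sales totals and in the shipment totals, so the two results can be compared side by side.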

  • consumer privacy (customer privacy)

    Consumer privacy, also known as customer privacy, involves the handling and protection of sensitive personal information that individuals provide in the course of everyday transactions.
