Hadoop Definitions

  • A

    Apache Flink

    Apache Flink is an in-memory and disk-based distributed data processing platform for use in big data streaming applications.

  • Apache Hadoop YARN

    Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework.

  • Apache HBase

    Apache HBase is a column-oriented key/value data store built to run on top of the Hadoop Distributed File System (HDFS).

  • Apache Hive

    Apache Hive is an open source data warehouse system for querying and analyzing large data sets that are principally stored in Hadoop files.

  • Apache Pig

    Apache Pig is an open source technology that offers a high-level mechanism for parallel programming of MapReduce jobs to be executed on Hadoop clusters.

  • Apache Spark

    Apache Spark is an open source parallel processing framework for running large-scale data analytics applications across clustered computers. It can handle both batch and real-time analytics and data processing workloads.

  • G

    Google Cloud Dataflow

    Google Cloud Dataflow is a cloud-based data processing service for both batch and real-time data streaming applications.

  • H

    Hadoop

    Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.

  • Hadoop 2

    Apache Hadoop 2 is the second iteration of the Hadoop framework for distributed data processing. Hadoop 2 adds support for running non-batch applications as well as new features to improve system availability.

  • Hadoop data lake

    A Hadoop data lake is a data management platform comprising one or more Hadoop clusters.

  • Hadoop Distributed File System (HDFS)

    The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications.

  • J

    JAQL (JSON query language)

    JAQL is a query language for the JavaScript Object Notation (JSON) data interchange format. Pronounced "jackal," JAQL is a functional, declarative programming language designed especially for working with large volumes of structured, semi-structured and unstructured data.

  • S

    SQL-on-Hadoop

    SQL-on-Hadoop is a class of analytical application tools that combine established SQL-style querying with newer Hadoop data framework elements.
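
The Apache Pig entry above refers to MapReduce jobs without showing what one looks like. The following is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It is plain Python and does not use Hadoop; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the framework distributes these steps across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence within a single process."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Spark processes data"]
    print(word_count(sample))
```

Tools such as Pig and Hive exist largely so that this kind of job can be expressed in a higher-level language rather than written by hand.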
