Big data management is the organization, administration and governance of large volumes of both structured and unstructured data.
The goal of big data management is to ensure a high level of data quality and accessibility for business purposes, including business intelligence and big data analytics applications. Corporations, government agencies and other organizations employ big data management strategies to help them contend with fast-growing pools of data, typically involving many terabytes or even petabytes of information and a variety of data types.
Most big data environments go beyond relational databases and traditional data warehouse platforms to incorporate technologies, such as Hadoop, MapReduce and NoSQL databases, that are suited to processing and storing nontransactional forms of data. The increasing focus on collecting and analyzing big data is shaping new platforms that combine the traditional data warehouse with big data systems in what Gartner Inc. analysts describe as a logical data warehousing architecture.
Effective big data management helps companies locate valuable information in large sets of unstructured and semi-structured data from a variety of sources, including call detail records, system logs and social media sites. As part of the process, companies must decide how much data to retain and what can be disposed of. That requires careful data classification to categorize information according to factors such as its potential uses and business value.
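As a rough illustration of that kind of classification, the minimal Python sketch below assigns each data set to a retention tier based on its potential uses and business value. The field names, the 0-to-10 scoring scale, the thresholds and the tier labels are all hypothetical assumptions made for this example; an actual policy would be defined by the organization's own governance rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    source: str          # e.g. "call detail records", "system logs"
    potential_uses: int  # hypothetical count of known downstream uses
    business_value: int  # hypothetical score from 0 (none) to 10 (critical)


def classify(ds: Dataset) -> str:
    """Return a hypothetical retention tier: 'retain', 'archive' or 'dispose'."""
    # Thresholds are illustrative assumptions, not an industry standard.
    if ds.business_value >= 7 or ds.potential_uses >= 3:
        return "retain"
    if ds.business_value >= 3 or ds.potential_uses >= 1:
        return "archive"
    return "dispose"


# Made-up example data sets, one per source type named in the text.
datasets = [
    Dataset("cdr_2023", "call detail records", potential_uses=4, business_value=8),
    Dataset("app_debug_logs", "system logs", potential_uses=1, business_value=2),
    Dataset("stale_scrape", "social media sites", potential_uses=0, business_value=1),
]

# Map each data set name to its assigned tier.
tiers = {ds.name: classify(ds) for ds in datasets}

for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

Keeping the rules in a single function makes the retention policy easy to review and adjust as the perceived value of a data source changes.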
Many big data management tools are open source, including Hadoop, related technologies such as Hive and Pig, and NoSQL databases such as Cassandra. That enables organizations to experiment before making a large investment in the infrastructure required for data analysis.