Data management is the practice of organizing and maintaining data processes to meet ongoing information lifecycle needs. Emphasis on data management began with the electronic era of data processing, but data management methods have roots in accounting, statistics, logistical planning and other disciplines that predate the emergence of corporate computing in the mid-20th century.
Evolution and benefits of data management
Beginning in the 1960s, the Association of Data Processing Service Organizations (ADAPSO) became one of a handful of groups that advanced best practices for data management, especially in terms of professional training and data quality assurance metrics. Over time, information became more popular than data as a term to describe the objectives of corporate computing. This shift can be seen, for example, in the renaming of ADAPSO as the Information Technology Association of America (ITAA) and of the National Microfilm Association as the Association for Information and Image Management (AIIM). The practices of data management nevertheless continued to evolve.
In the 1970s, the relational database management system began to emerge at the center of data management efforts. Based on relational logic, the relational database provided improved means for assuring consistent data processing and for reducing or managing duplicated data. These traits were key for transactional applications. With the rise of the relational database, relational data modeling, schema creation, deduplication and other techniques advanced to become bigger parts of common data management practice.
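As an illustration of how relational constraints help assure consistent processing and limit duplicated data, the following is a minimal sketch using Python's built-in sqlite3 module. The customer and orders tables, their columns and the try_insert helper are hypothetical names chosen for this example, not part of any particular system.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a small, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """CREATE TABLE customer (
               customer_id INTEGER PRIMARY KEY,
               email       TEXT NOT NULL UNIQUE  -- blocks duplicate customers
           )"""
    )
    conn.execute(
        """CREATE TABLE orders (
               order_id     INTEGER PRIMARY KEY,
               customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
               amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
           )"""
    )
    return conn


def try_insert(conn, sql, params):
    """Run one insert in a transaction; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo()
add_customer = "INSERT INTO customer (email) VALUES (?)"
add_order = "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)"

first = try_insert(conn, add_customer, ("ann@example.com",))
duplicate = try_insert(conn, add_customer, ("ann@example.com",))  # UNIQUE violation
orphan = try_insert(conn, add_order, (999, 500))  # no customer with id 999
valid = try_insert(conn, add_order, (1, 500))

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

In this sketch the database, rather than application code, rejects the duplicate customer and the order that points at a nonexistent customer, which is the kind of guarantee that transactional applications rely on.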
The 1980s saw the creation of the Data Management Association International, or DAMA International, chartered to improve data-related education. Data arose again as a leading descriptive term when IT professionals began to build data warehouses, which employed relational techniques for offline data analytics. These warehouses gave business managers a better view of their organizations' key trends for decision-making. With the advent of data warehousing, modeling, schema design and change management all called for different treatment.
Types of data management
DAMA International and other groups have worked to advance understanding of various approaches to data management. One such approach, master data management (MDM), is a comprehensive method of enabling an enterprise to link all of its critical data to one file, called a master file, which provides a common point of reference. Data stewardship, data quality management, data governance, MDM and data security management are among the components of many professionals' data management practices. DAMA, one of several groups that oversee certifications in data management skills, has created the DAMA Guide to the Data Management Body of Knowledge, or DAMA DMBOK, which attempts to define a standard industry view of data management functions and methods.
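The master-file idea behind MDM can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the source systems ("crm", "billing"), the record layout and the choice of a normalized email address as the match key are all hypothetical, and real MDM products typically rely on richer matching and merge rules than the exact match shown here.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """One entry in the master file: a common reference point for an entity."""
    master_id: int
    match_key: str
    source_refs: list = field(default_factory=list)  # (system, local_id) pairs


def normalize_email(email: str) -> str:
    """Assumed match rule: emails are equal after trimming and lowercasing."""
    return email.strip().lower()


def build_master_file(source_records):
    """Link records from several systems to one master record per entity.

    source_records is an iterable of (system, local_id, email) tuples.
    Returns a dict mapping the match key to its MasterRecord.
    """
    master = {}
    for system, local_id, email in source_records:
        key = normalize_email(email)
        if key not in master:
            master[key] = MasterRecord(master_id=len(master) + 1, match_key=key)
        master[key].source_refs.append((system, local_id))
    return master


records = [
    ("crm", "C-1", " Ann@Example.com "),
    ("billing", "B-77", "ann@example.com"),
    ("crm", "C-2", "bob@example.com"),
]
master_file = build_master_file(records)
```

Here the two records for Ann, held in different systems under different local identifiers, resolve to a single master record, so downstream applications can refer to one identifier instead of reconciling the sources themselves.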
The view of data as a corporate asset, and concern about data-related responsibilities, have increased over time. Data management professionals are charged with finding ways to monetize corporate data -- whether by process streamlining, enhancing existing products or outright data selling.
The effective management of corporate data has grown in importance as businesses are subject to an increasing number of compliance regulations. At the same time, the sheer volume of data that must be managed by organizations has increased so markedly that it is sometimes referred to as big data.
Data management tasks
Many data managers are held accountable for corporate data security and legal liability. Stricter requirements for financial records and consumer protection are driven by legislation, regulations and industry standards, including Basel III, the Sarbanes-Oxley Act and the Payment Card Industry Data Security Standard (PCI DSS).
Data management responsibilities related to data privacy have expanded in recent years, especially in light of high-profile data breaches such as those disclosed by retailer Target in 2013 and by Equifax in 2017. The European Union's General Data Protection Regulation (GDPR) has also become a focus of data management project planning in Europe and beyond.
As data technologies have expanded, the purview of data management has expanded in turn. Increasing volumes of data and real-time processing of data have ushered in such data frameworks as Hadoop and Spark. The variety of data has grown as well. Unstructured data types have complicated data modeling procedures and ushered in an assortment of databases that do not use SQL (Structured Query Language), the query language closely associated with relational databases. Collectively, the new technologies have come under the banner of big data. Analyst group Gartner has listed in-database analytics, event stream processing, graph databases, key-value stores and distributed ledgers as just some of the data management technologies to watch going forward.
Data management history
The first flowering of the discipline of data management was largely driven by IT professionals who focused on solving the problem of garbage in, garbage out (GIGO). That problem became apparent with the earliest mainframes, when even powerful computers reached false conclusions because they were fed inaccurate or inadequate data.
Among figures notable in the history of data management are E.F. "Ted" Codd, who conceived the relational model for database management; Ralph Kimball, who created dimensional modeling theory for data warehousing; Bill Inmon, author and data warehouse technologist; Jim Gray, who helped pioneer granular database locking; and Michael Stonebraker, who built relational databases for early midrange computers before participating in the development of a number of columnar, stream-oriented and object-oriented databases.
Approaches to data management eventually permeated what came to be known as the data lifecycle, spanning data creation, storage, processing, archiving and, sometimes, data destruction.