Data confidence

Data confidence, a critical management success factor, is not automatic. How do you attain it?

This article originally appeared on the BeyeNETWORK.

Many aspects of data are important, but none is more important than data confidence. When there is true confidence in data, management has the power to make good, informed decisions. There is minimal second-guessing, and there are fewer conflicting opinions based on related but somewhat dissimilar data. The entire organization has the opportunity to become focused and to work in harmony. Without data confidence, the organization struggles to be informed and to make good decisions.

In short, having data is a good thing, but having confidence in the data is an even better thing.

The Elements of Data Confidence
So exactly what are the elements of data confidence? Data is like a many-faceted jewel. There are many aspects to data, and each aspect is a factor in having confidence in the data.

Some of the more important aspects of data confidence for small granular units of data are: 

  • Accuracy. If data is inaccurate, then it is not believable.
  • Timeliness. If data is out of date, then it is not believable.
  • Availability. If data is not available, it is useless and hardly credible.
  • Precision. To be credible, data must be recorded at a precision appropriate to the variable being measured.
  • Source. In many cases, the source of the data is an important factor. If the source is hidden or unknown, then the data becomes less than credible.
  • Definition of data. The definition of data is important so that the audience using the data knows what is being described.
  • Speed of access. If data cannot be accessed quickly, it becomes less than useful, and, by extension, less than credible.
  • Time variancy. Some data is accurate only as of some moment in time. If that is the case, then the data and its relevant point in time are required for credibility.
  • Presentation. Data must be presented in a form its audience can read. Data presented to an American audience in Chinese or Japanese will be useful only to those who read those languages.
  • Granularity. Data must be at the appropriate level of granularity in order to be useful.
  • Security. Some data must be secured, while other data is open to anyone who wants to look at it. For data that needs to be secure, confidence depends on the data actually being secure.
  • Moment of capture of the data. When was the data first captured?
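As a minimal sketch of how several of these aspects might travel with a unit of data, the Python below attaches a definition, a source, a moment of capture and an as-of time to a value, then reports which aspects are missing or stale. The class name DataUnit, the function confidence_gaps, the field names and the seven-day freshness threshold are all hypothetical and chosen only for illustration; the article does not prescribe any particular structure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DataUnit:
    """One granular unit of data plus the context that makes it believable.

    Field names are illustrative assumptions, not a standard.
    """
    value: float
    definition: Optional[str] = None        # what the value describes
    source: Optional[str] = None            # where the value came from
    captured_at: Optional[datetime] = None  # moment of capture
    as_of: Optional[datetime] = None        # point in time the value is valid for


def confidence_gaps(unit: DataUnit, now: datetime, max_age: timedelta) -> List[str]:
    """Return the names of the aspects this unit fails to satisfy."""
    gaps = []
    if not unit.definition:
        gaps.append("definition")
    if not unit.source:
        gaps.append("source")
    if unit.captured_at is None:
        gaps.append("moment of capture")
    if unit.as_of is None:
        gaps.append("time variancy")
    elif now - unit.as_of > max_age:
        # The value is older than the caller is willing to accept.
        gaps.append("timeliness")
    return gaps


now = datetime(2024, 1, 31)
balance = DataUnit(
    value=1250.00,
    definition="Account balance in USD",
    source=None,                       # the source is unknown
    captured_at=datetime(2024, 1, 2),
    as_of=datetime(2024, 1, 1),        # 30 days before `now`
)
gaps = confidence_gaps(balance, now, max_age=timedelta(days=7))
print(gaps)  # ['source', 'timeliness']
```

Aspects such as accuracy, security and speed of access cannot be checked from metadata alone, so this sketch leaves them out.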

For data that is created as a result of a calculation, there are other factors required for data confidence. Some of these factors are: 

  • The calculation used in order to create the unit of data.
  • The date and time the calculation was made.
  • The organization making the calculation.
  • The raw data that was included in the calculation.
  • The raw data that was excluded from the calculation. 
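One way to keep these factors together is to store them alongside the calculated value itself. The sketch below is an assumption about how that could look in Python: CalculatedValue, calculate_total and the sample region names are hypothetical, and a simple sum stands in for whatever calculation an organization actually performs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict


@dataclass(frozen=True)
class CalculatedValue:
    """A calculated value together with the factors listed above."""
    value: float
    calculation: str            # the calculation used
    calculated_at: datetime     # date and time of the calculation
    calculated_by: str          # organization making the calculation
    included: Dict[str, float]  # raw data included
    excluded: Dict[str, float]  # raw data excluded


def calculate_total(
    raw: Dict[str, float],
    keep: Callable[[str, float], bool],
    organization: str,
    calculated_at: datetime,
) -> CalculatedValue:
    """Sum the raw values accepted by `keep`, recording what was left out."""
    included = {k: v for k, v in raw.items() if keep(k, v)}
    excluded = {k: v for k, v in raw.items() if k not in included}
    return CalculatedValue(
        value=sum(included.values()),
        calculation="sum of included raw values",
        calculated_at=calculated_at,
        calculated_by=organization,
        included=included,
        excluded=excluded,
    )


raw_sales = {"north": 120.0, "south": 80.0, "test-region": 999.0}
total = calculate_total(
    raw_sales,
    keep=lambda name, amount: not name.startswith("test-"),
    organization="Finance",
    calculated_at=datetime(2024, 1, 31, 9, 0),
)
print(total.value)     # 200.0
print(total.excluded)  # {'test-region': 999.0}
```

Because the excluded raw data is kept with the result, a reader who disagrees with the total can see exactly which inputs were left out and why the number differs from their own.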

Systemic Data Confidence
There is another, entirely different set of factors that relate to data confidence. These factors are manifest at the system level, not the individual data level. The systemic factors that relate to data confidence include:

  • The definition of an overall system of record. When a system-wide system of record has been defined, the result is integrity of data. There is no overlap and no redundancy of data when the system of record has been implemented properly. With a properly defined and implemented system of record, there is a clear source of data whenever data flows or is used in a calculation.
  • The integration of data where there are no stovepipes of unintegrated data. When the stovepipes have been eliminated, there is the possibility of truly implementing a system of record. As long as there are stovepipes, there will be overlap of data to some extent – either small or large. And where there is overlap of data, there will be discrepancies in the value of data in one place versus another.
  • An architectural plan. The architectural plan for information systems will describe where the organization has been and where it is going. The architectural plan describes how new information will be added to existing information in a fashion that continues the integrity of the data.
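The overlap problem described for stovepipe systems can be illustrated with a small Python sketch. It assumes each stovepipe is represented as a dictionary of records keyed by a shared identifier; find_discrepancies, the system names and the sample records are hypothetical, and the values are assumed to be hashable (strings or numbers).

```python
from collections import defaultdict
from typing import Dict, Hashable


def find_discrepancies(
    systems: Dict[str, Dict[str, Hashable]]
) -> Dict[str, Dict[str, Hashable]]:
    """Return keys whose values differ between the systems that hold them."""
    seen = defaultdict(dict)
    for system_name, records in systems.items():
        for key, value in records.items():
            seen[key][system_name] = value
    # A key is a discrepancy when the systems holding it disagree on its value.
    return {
        key: by_system
        for key, by_system in seen.items()
        if len(set(by_system.values())) > 1
    }


systems = {
    "billing": {"cust-1": "12 Oak St", "cust-2": "9 Elm Ave"},
    "shipping": {"cust-1": "12 Oak Street", "cust-3": "4 Pine Rd"},
}
conflicts = find_discrepancies(systems)
print(conflicts)
# {'cust-1': {'billing': '12 Oak St', 'shipping': '12 Oak Street'}}
```

A check like this only detects disagreement. Deciding which value is correct still requires a defined system of record.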

Data confidence, then, is an extremely desirable state for an organization. With data confidence, the organization is prepared to make the most timely and most accurate decisions – all the way from small decisions involving the smallest amounts of data up to large, strategic decisions involving very large amounts of data.

Achieving Data Confidence
It is one thing to describe what data confidence is; it is quite another thing to achieve it. In order to achieve data confidence, the organization needs a formal statement of architecture that guides the entire information infrastructure. Depending on the organization, this may take a short or a lengthy amount of time. In either case, data confidence is achieved incrementally, one step at a time.

Each aspect of data confidence must be addressed, and in some cases, addressing the aspects may require considerable effort. Data confidence is not like a product that you go to the shopping mall to buy. Instead, data confidence is built like a wall – one brick at a time. Every time you add another brick, the data confidence factor increases. 

Creating data confidence is a long-term proposition. As such, an information systems architecture is required. The architecture allows many people over a lengthy period of time to work in concert with each other and according to a common plan. Without an overall architecture, the result will be a strange-looking wall. Some of the wall will be brick, and some will be cinderblock. Some of the wall will be rock, and other parts will be wood. Part of the wall will run at right angles to other parts, while in some places the wall will be disconnected. Such a disorganized wall, built over a long period of time by different groups of people, is the result of not having a common framework – a common architecture – from which to work.

An Implementation-Oriented Architecture
In addition, the architecture must be implementation-oriented. It does no good to have an architectural plan that cannot be implemented. An architecture that produces only paper results and a case for more planning and more design is an exercise in futility.

In order to be successful, an implementation-oriented architecture must factor in the following requirements:

  • The ability to handle a lot of data,
  • The ability to handle many different kinds of data,
  • The ability to handle a high volume of transactions,
  • The ability to handle many different kinds of transactions,
  • The ability to create and maintain transaction and data integrity throughout processing,
  • The recognition of the life cycle of data as it enters the organization and resides there over time,
  • The need to integrate and store metadata as a central part of the environment,
  • The need to store data at a granular level so that the data can be reshaped to meet future unknown needs,
  • The need for reconcilability of data,
  • The need for allowing end users to customize their usage of data,
  • The need for different kinds of security throughout the architecture and information infrastructure,
  • The need for an evolutionary migration away from stovepipe systems,
  • The need for rational shareability of data among organizations,
  • The need for an evolutionary implementation,
  • An awareness of the cost of the support of the infrastructure,
  • The need for the ability to choose implementation technologies from multiple vendors, and so forth.

Data confidence is achieved one step at a time. An implementation-oriented architecture is needed in order for data confidence to be created. The architecture is needed because data confidence is created at the infrastructure level.
