What is data redundancy?

Bill Inmon

This article originally appeared on the BeyeNETWORK.

Redundancy of data is an inefficient and costly waste of resources. IT technicians, if given the opportunity to use their skills and intelligence, can eliminate or at least minimize this redundancy. 

As long as there were one or two applications, redundancy of data was never much of an issue. But the number of applications continued to grow, and the ones that already existed continued to expand. Then there were many applications and significant overlap between them. The same data appeared in different places, but it had different values. Furthermore, reconciliation of values was impossible. There was not a single place that an organization could rely upon to determine the correct value.

Another problem was that this redundancy of data required maintenance. When it came time to make a change, the change had to be applied in multiple places. Under these circumstances redundancy of data got a “bad name”. But was this justified? Was redundancy of data the real culprit here? According to the collective intelligence of technicians, if we got rid of redundancy, we would get rid of many problems associated with it. 

The truth is that redundancy of data is an absolutely normal part of life. Take time, for example. Time is everywhere. It is on your wristwatch. It is on the television. It is on the clock in your house. It is on the radio. It is in your room at the hotel. In short, almost everywhere you look you find time. If ever there was a piece of information that was redundant and ubiquitous, it would be the time.

And do we have a problem with redundancy of time? Not at all. If we find a clock that has the wrong time, we simply reset it. In fact, twice a year we reset our clocks due to daylight saving time. But if it were up to the technician, there would be no redundancy. If it were up to the technician, there would be one and only one clock in the world and everyone would have to do their business based on that one clock, even though only a few people could ever see the time. This is, of course, an absurd proposition.

Clearly redundancy of data is a good thing. 

So what is the problem with all of these applications and the massive amount of data redundancy that we have? The problem is that there is no single “system of record”. A system of record is the designation of the one place where data will be captured and updated. From that single place, data will be copied. The early application designers didn’t understand architecture. Every application captured, edited and updated its own data. As a result no one knew what the real values should be. It wasn’t the data that was bad. It was the architectural treatment of the data that was to blame.

There it is! It is the architecture that was created to support the early applications that is the problem. Redundancy of data is merely a byproduct of an improper architecture. The reality is that massive amounts of redundant data should reside within the corporation.  As long as there is a proper architecture supporting that redundant data, there is nothing that is wrong or out of place.
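The system-of-record idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a design taken from the article: the class names `SystemOfRecord` and `ReadOnlyCopy`, the method names, and the sample key are all hypothetical. The sketch shows one place that accepts updates and any number of redundant copies that are refreshed from it and can only be read.

```python
from dataclasses import dataclass, field


class ReadOnlyCopy:
    """A redundant copy of the data (hypothetical name).

    It can be read anywhere, but it offers no public way to change a value;
    it is only ever refreshed from the system of record.
    """

    def __init__(self, name):
        self.name = name
        self._snapshot = {}

    def _refresh(self, values):
        # Take a private copy so later changes at the source do not leak in
        # until the next refresh.
        self._snapshot = dict(values)

    def read(self, key):
        return self._snapshot[key]


@dataclass
class SystemOfRecord:
    """The one place where data is captured and updated (hypothetical name)."""

    values: dict = field(default_factory=dict)
    copies: list = field(default_factory=list)

    def register_copy(self, copy):
        # A new copy starts out in sync with the current values.
        self.copies.append(copy)
        copy._refresh(self.values)

    def update(self, key, value):
        # The only write path: change the value here, then push it to every copy.
        self.values[key] = value
        for copy in self.copies:
            copy._refresh(self.values)


# Usage: two redundant copies, one place to make the change.
customers = SystemOfRecord()
reporting = ReadOnlyCopy("reporting")
marketing = ReadOnlyCopy("marketing")
customers.register_copy(reporting)
customers.register_copy(marketing)

customers.update("customer_42_city", "Denver")
assert reporting.read("customer_42_city") == marketing.read("customer_42_city")
```

As with resetting a clock that shows the wrong time, a copy that disagrees with the system of record is simply refreshed from it; the redundancy itself causes no harm because there is never any doubt about which value is correct.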

 

Bill is universally recognized as the father of the data warehouse. He has more than 36 years of database technology management experience and data warehouse design expertise. He has published more than 40 books and 1,000 articles on data warehousing and data management, and his books have been translated into nine languages. He is known globally for his data warehouse development seminars and has been a keynote speaker for many major computing associations. Bill can be reached at 303-681-6772.

Editor's Note: More articles, resources and events are available in Bill's BeyeNETWORK Expert Channel. Be sure to visit today!

