In recent years, leading technologists have begun to take a close look at the blockchain data architecture underlying the at-times controversial bitcoin payment system. Vendors, bankers and government regulators have taken an interest. Finance would appear to be the first target for disruption by a centrally regulated blockchain, one that acts as a distributed ledger for various types of transactions. The world of data management could be a candidate for disruption, too.
Stewart Bond, director of IDC's data integration and integrity software research, has been looking at blockchain and its potential impact on data management. He recently shared his assessments -- a key one being that blockchain innovation could spawn new approaches to achieving data integrity.
Blockchain has gotten a lot of coverage, but it's still worthwhile to ask what it is. How have you come to view it?
Stewart Bond: I heard about blockchain making waves in financial, healthcare and other industries. When I started digging in, I immediately saw the potential disruption this could have in the data world. I had to peel off the use cases and look at the technology itself with a data lens.
The interesting thing from a data management perspective is that blockchain provides a complete, immutable, historical record that everyone in the network agrees with.
But, at its most basic level, blockchain is a data store -- a historical record of changes to an entity. Up to now, it's been related to financial assets. But really, it's a write-once, append-only data store. A block could be almost an equivalent to a record in a database. What is in that block is a combination of transaction data and reference data.
It starts with a genesis block, which is the very first block that is created. As the value of the entity there is changed over time, blocks are added to the chain. If you think about it in terms of lineage, it is instance-level lineage for a particular object or thing. Every time something changes, a new block is time-stamped. The new information about the state of the object is persisted. It adds blocks every time the data changes. So, the most recent block in the chain has the most recent version of the truth for that data or instance. You can go back to previous blocks in the chain to get a historical record of how the entity has changed. That is the block or data side of it.
The chain side refers to the other piece of data in the block. It's a hash value. It's a cryptographic hash of the contents of the previous block. The longer the chain is, the more secure the chain is, and the more difficult it is to break. That is one part of blockchain's immutability. Every block is connected cryptographically to the previous block.
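The block-and-chain structure Bond describes can be illustrated with a minimal Python sketch. It is not how bitcoin or any particular blockchain product is implemented: the field names (`index`, `timestamp`, `state`, `prev_hash`), the helper functions and the use of SHA-256 over JSON are all assumptions chosen for illustration, and there is no network or consensus step.

```python
import hashlib
import json
import time


def block_hash(block):
    """Return a SHA-256 hex digest of the block's full contents."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def genesis_block(state):
    """Create the very first block; it has no real predecessor to hash."""
    return {"index": 0, "timestamp": time.time(), "state": state,
            "prev_hash": "0" * 64}


def append_block(chain, new_state):
    """Append a time-stamped block holding the new state of the entity."""
    prev = chain[-1]
    block = {
        "index": prev["index"] + 1,
        "timestamp": time.time(),
        "state": new_state,
        "prev_hash": block_hash(prev),  # cryptographic link to the previous block
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block still matches the hash stored in its successor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical entity: an asset whose owner changes over time.
chain = [genesis_block({"owner": "alice"})]
append_block(chain, {"owner": "bob"})
append_block(chain, {"owner": "carol"})

latest_state = chain[-1]["state"]                # most recent version of the truth
history = [b["state"]["owner"] for b in chain]   # instance-level lineage
valid_before = is_valid(chain)

# Rewriting history in an earlier block breaks the link held by the next block.
chain[1]["state"]["owner"] = "mallory"
valid_after = is_valid(chain)
```

In this sketch, the last block holds the current state, walking the list gives the history, and editing an earlier block is detectable because the hash stored in the following block no longer matches.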
Another important element of blockchain is proof of work, which is how consensus is reached on the network. When there is a proposed change to the state of an entity, there needs to be consensus among participants. They can see the change. There are different models for this. The familiar one is the public, permissionless model of bitcoin, which is completely disintermediated. What is being discussed now are private, permissioned models that could be controlled by a central authority.
One of the key uses of relational databases has been to underlie general ledgers. How could data professionals encounter this new blockchain data architecture technology?
Bond: Because it provides a historical record that everyone in the network agrees to, it provides provenance -- complete data integrity. This is something anyone who has ever done data management is concerned with. You always look at data and ask if it is right. That is why there is data governance, and why there are data-lineage solutions. Complete data provenance has been elusive in many cases. But the blockchain technology itself provides the data provenance. It tells who did what to the data and when it was done, since the day the data was first born. It's the way the network works.
So, blockchain technology could potentially disrupt [the data integrity and lineage capabilities] that software vendors have in their tools now. There is still room for the tools these vendors have, but this is something very significant that they have to pay attention to.
So far, blockchain has been mostly discussed in terms of business-to-business networks, payments, currency exchanges -- ways of moving value between institutions, between entities and among people. Still, there are some organizations looking at replacing internal systems with a blockchain solution that will eventually be extended to their trading partners. But there will continue to be relational database systems within those institutions.
Let's ask a relational question. How do you query blockchain?
Bond: That is actually one of the issues with blockchain right now. We call it a data store, not a database. It doesn't have indices. You're actually able to enter 'Bitcoin explorer' in a browser, and look at the transactions in the current bitcoin network. You can search for them, but you can't necessarily query them.
Some database schemas have been published that let you set up a database with representative tables. You can then import the data from the blockchain into that database and use it for querying and analytics.
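As a rough illustration of that import-then-query approach, the following Python sketch loads a few transactions into an in-memory SQLite database and runs an aggregate query. The table layout, column names and sample records are hypothetical and are not taken from any published schema; a real import would read the records from a blockchain node or an export file.

```python
import sqlite3

# Hypothetical, simplified transactions as they might be exported from a chain.
exported = [
    {"tx_id": "tx-a", "block_height": 1, "sender": "alice",
     "receiver": "bob", "amount": 5.0},
    {"tx_id": "tx-b", "block_height": 2, "sender": "bob",
     "receiver": "carol", "amount": 2.0},
    {"tx_id": "tx-c", "block_height": 2, "sender": "alice",
     "receiver": "carol", "amount": 1.5},
]

conn = sqlite3.connect(":memory:")

# A representative table; the layout is an assumption for this example.
conn.execute(
    """
    CREATE TABLE transactions (
        tx_id        TEXT PRIMARY KEY,
        block_height INTEGER NOT NULL,
        sender       TEXT NOT NULL,
        receiver     TEXT NOT NULL,
        amount       REAL NOT NULL
    )
    """
)
# The relational copy can carry the indices the chain itself lacks.
conn.execute("CREATE INDEX idx_tx_sender ON transactions (sender)")

conn.executemany(
    "INSERT INTO transactions "
    "VALUES (:tx_id, :block_height, :sender, :receiver, :amount)",
    exported,
)
conn.commit()

# An analytical query that is awkward to run against the chain directly.
totals = conn.execute(
    "SELECT sender, SUM(amount) FROM transactions "
    "GROUP BY sender ORDER BY sender"
).fetchall()
```

The trade-off in this design is that the relational copy is a derived, queryable view; the chain remains the record of truth, and the copy has to be refreshed as new blocks are added.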
IBM just announced IBM Blockchain, which is a commercial version of Hyperledger Fabric, and it adds some utilities around that, including a search-and-query capability. Right now, the technology is still new and still developing. We are still in very early days, and there is a lot of work to be done.