It’s a mistake to think of database use within an organization as static.
It certainly is true that about 30 years ago, we began to broadly replace mainframe and paper-based systems with distributed applications built on top of servers and transactional databases. By the 1990s, relational database management systems ruled the DBMS roost – a position that they continue to have a firm grip on.
And 20-plus years ago, we started pulling the data from relational databases into what are now known as business intelligence (BI) systems. The heart of most of them is a data warehouse that brings together information from disparate transactional systems into a central repository, where the data is cleansed and integrated for analytical uses.
That’s the typical database framework at most organizations, then. But growing data volumes, changing business needs and the emergence of new technologies – cloud and columnar databases, to name two – require IT managers to keep their eyes and minds open as they do database planning and consider possible updates to their database strategies.
For example, relational databases are great for storing and manipulating structured transaction data. And both relational software and multidimensional database engines have their places in BI systems, which again deal with structured data. But now we’re seeing an increasing interest in capturing and analyzing what I like to call semi-structured data: documents, emails, tweets, posts from social networks such as Facebook, videos and so on. Any database and data management strategy with an eye on the future must incorporate thinking about those data types.
Standardization is also worth considering as part of your database planning activities. Today, almost all companies support a range of different database technologies across separate departments and business units. That is usually a consequence of how databases and the applications running on top of them have been acquired over the years – particularly in decentralized organizations that give business users autonomy over technology selection.
The perils of haphazard approaches to database planning
Let’s say, for example, that the human resources department has funding for a new system. It picks an application based on the functionality that the software offers. The application may be available on multiple database engines, but the choice of a database is usually immaterial to the HR department – so the decision might simply be left up to the application vendor. In any event, there’s no coordination with IT or other departments.
Soon, the same thing happens in finance, marketing and so on – and thus the organization finds itself running a motley collection of database engines. Mergers and acquisitions only exacerbate the situation, of course.
This kind of database system software diversity complicates administration and support work. But going forward, the other big reason to think about introducing standardization as part of a database strategy is BI. The job of consolidating data in BI systems becomes much easier if transactional systems are built on the same database engine – even more so if that same technology is used for the data warehouse.
Most companies maintain a list of their current assets: buildings, vehicles, etc. But data rarely appears on such lists, which I find odd. Imagine two disasters for your company: One involves the sudden loss of every single vehicle it owns, the other the loss of every piece of data. For many companies, the former disaster is survivable – but I would argue that the latter is not because at most businesses these days, their data is the company.
So, many companies completely fail to think about data as a vital asset; more importantly, they never properly define their data. Take “advertising spend per customer.” On the face of it, “customer” has an obvious definition: someone who has bought something from your company. But to different business users, depending on the circumstances, it could mean:
- All customers, past and present.
- Current customers.
- Corporate customers.
- Individual customers.
- Profitable customers.
Within a large organization, all of those definitions (and more) will be in use somewhere. When data was isolated within each department, different definitions for the same type of data might not have been a problem. But as the use of data expands out across the entire enterprise, particularly for BI and analytics uses, agreeing on common data definitions should be part of any corporate database and information management strategy.
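To make the point concrete, here is a minimal Python sketch of how one advertising budget produces different "advertising spend per customer" figures depending on which definition of "customer" is applied. Everything in it is hypothetical and exists only for illustration: the `Customer` fields, the six sample records, the `DEFINITIONS` mapping and the 6,000 budget are assumptions, not data from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical fields, chosen only to support the five definitions above.
    name: str
    is_corporate: bool
    is_current: bool
    lifetime_profit: float


# Made-up sample records for illustration.
CUSTOMERS = [
    Customer("Acme Ltd", is_corporate=True, is_current=True, lifetime_profit=12000.0),
    Customer("J. Smith", is_corporate=False, is_current=True, lifetime_profit=300.0),
    Customer("A. Jones", is_corporate=False, is_current=False, lifetime_profit=150.0),
    Customer("R. Patel", is_corporate=False, is_current=True, lifetime_profit=-40.0),
    Customer("M. Chen", is_corporate=False, is_current=True, lifetime_profit=-10.0),
    Customer("L. Garcia", is_corporate=False, is_current=False, lifetime_profit=-75.0),
]

# One predicate per competing definition of "customer".
DEFINITIONS = {
    "all": lambda c: True,
    "current": lambda c: c.is_current,
    "corporate": lambda c: c.is_corporate,
    "individual": lambda c: not c.is_corporate,
    "profitable": lambda c: c.lifetime_profit > 0,
}


def spend_per_customer(total_spend, customers, definition):
    """Divide total_spend by the number of customers matching the definition."""
    matching = [c for c in customers if DEFINITIONS[definition](c)]
    if not matching:
        raise ValueError(f"no customers match definition {definition!r}")
    return total_spend / len(matching)


AD_SPEND = 6000.0  # hypothetical advertising budget

results = {
    name: spend_per_customer(AD_SPEND, CUSTOMERS, name) for name in DEFINITIONS
}

for name, value in results.items():
    print(f"{name:<11} {value:>8.2f}")
```

With this sample data, the same budget yields five different answers to the same question, ranging from 1,000 per customer (all customers) to 6,000 (corporate customers only). A report that does not state which definition it uses cannot be compared with one that uses another.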
Good database planning: it isn’t always about the technology
Of course, that isn’t the kind of technical issue that hardened database professionals are used to dealing with. Nevertheless, it is an issue that must be addressed – or else all of your database planning and development efforts may go awry in the end.
With data volumes growing rapidly at most organizations, it would be easy to talk about disks and processors. But that would be to miss the point. Whatever data volumes we ultimately reach, it’s a fair bet that the available technology will be able to cope with them. What’s more important, in strategic terms, is to be aware of the different types of data in your organization, seize the opportunities to work toward a unified adoption of database technology and make sure that your data is well defined and understood.
While we can’t always predict the future, consideration of those points should improve the success rate of a database strategy and provide an element of future-proofing as companies plan and execute database projects going forward.
About the author: Dr. Mark Whitehorn specializes in the areas of data analysis, data modeling and business intelligence (BI). Based in the U.K., he works as a consultant for a number of national and international companies and is a mentor with Solid Quality Mentors. In addition, he is a well-recognized commentator on the computer world, publishing articles, white papers and books. He is also a senior lecturer in the School of Computing at the University of Dundee, where he teaches the master's course in BI. His academic interests include the application of BI to scientific research.