The late Danish physicist and Nobel laureate Niels Bohr is credited with saying, "Prediction is very difficult, especially about the future." Nonetheless, it's that time of year -- so here are some thoughts on data management issues and trends that seem likely to feature heavily on the agendas of CIOs and IT managers in 2017.
Big data continues to dominate technology headlines, but genuine examples of it in action as part of data management programs, as distinct from proof-of-concept and sandbox projects, remain sparse. Finding big data case studies with real return on investment (ROI) figures -- net present value, internal rate of return, payback period -- is like searching for the proverbial needle in a haystack.
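For readers less familiar with the three ROI measures named above, here is a minimal Python sketch of how each is typically calculated from a series of yearly cash flows. The cash-flow figures are invented purely for illustration and don't come from any real big data project; the function names are also my own, and the internal rate of return is found by simple bisection rather than by any particular finance library.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] is the upfront outlay at year 0."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def payback_period(cash_flows):
    """First year in which cumulative cash flow is no longer negative, else None."""
    running_total = 0.0
    for year, cf in enumerate(cash_flows):
        running_total += cf
        if running_total >= 0:
            return year
    return None


def irr(cash_flows, low=-0.99, high=1.0, tol=1e-7):
    """Internal rate of return by bisection.

    Assumes NPV changes sign exactly once between `low` and `high`;
    returns None if no sign change is found in that range.
    """
    npv_low = npv(low, cash_flows)
    if npv_low * npv(high, cash_flows) > 0:
        return None
    while high - low > tol:
        mid = (low + high) / 2
        npv_mid = npv(mid, cash_flows)
        if npv_low * npv_mid <= 0:
            high = mid
        else:
            low, npv_low = mid, npv_mid
    return (low + high) / 2


# Hypothetical project: 1,000 spent up front, 400 returned in each of four years.
flows = [-1000.0, 400.0, 400.0, 400.0, 400.0]

print("NPV at a 10% discount rate:", round(npv(0.10, flows), 2))
print("Payback period (years):", payback_period(flows))
print("IRR:", round(irr(flows), 4))
```

The point of publishing figures like these in a case study is that they let outsiders compare a big data investment against any other use of the same budget.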
There's no doubt that data volumes are increasing, especially from nontraditional data sources such as internet clickstreams and sensors on industrial equipment and other devices. There's also no doubt that trying to deal with such data using conventional relational databases is often both challenging and expensive. Clearly, adoption of Hadoop and other big data technologies is rising as a result.
Various surveys have found that more than half of large companies now have big data architectures in place. For example, nearly two-thirds of the 44 organizations that took part in a survey released in January 2016 by consulting firm NewVantage Partners said they had at least one big data system in production use, double the share reported in a similar survey conducted three years earlier.
However, my discussions with user companies and vendors alike continue to illustrate a subject that's long on hype and short on tangible ROI. It may be only a matter of time before more demonstrations of big data ROI filter through -- in the latest NewVantage Partners survey, revised to focus on big data's business impact and released this month, just under half of the respondents said they were getting measurable benefits from their deployments.
But given the considerable level of investment in big data environments in recent years, one would expect to see a decent number of compelling case studies becoming public by now. I hope that will indeed happen this year, or else big data may be in danger of being seen as a white elephant -- a very different animal than the fluffy yellow toy one that Hadoop was named after.
Opportunities and challenges in IoT
The internet of things (IoT) is a related technology that's also in the data management spotlight these days. Much of the data used in big data applications is supplied by the plethora of devices connected to IoT networks: sensors, RFID tags, smart meters, etc. As just one example, an Airbus A350 airplane is said to be equipped with nearly 6,000 sensors that collectively generate more than 2 TB of operational data daily.
IoT data enables uses such as predictive equipment maintenance, but it first needs to be collected, stored and prepared for analysis. That isn't an easy task for the teams involved in data management programs. Traditional business intelligence tools are also frequently ill-suited to making sense of such data, so many organizations will need new investments in data science skills and advanced analytics software.
In addition, a dark side of this brave new world is that many IoT-connected devices are even more vulnerable to security threats than PCs are. Hackers have already demonstrated how a botnet of devices like printers, digital cameras and DVR boxes can be hijacked for distributed denial-of-service attacks on internet targets. Coming to grips with IoT will be distinctly difficult overall; it promises to be an area of opportunity, but also one of considerable challenges in 2017.
The gradual move from on-premises systems to the cloud is a more predictable element of data management programs this year. The economic and IT benefits of cloud computing are considerable: By offloading system implementation and management to cloud providers, companies get a simpler, less costly and more flexible technology environment. More and more companies have started relying on cloud services for core applications, and the sky hasn't caved in on them yet. So, despite nagging doubts about security in the cloud, this is a trend that will run and run.
End users out in front on data prep
Increasingly, data analysts and business users are seeking to take control of data in order to gain quicker business insights. This trend is shown in the recent rise of self-service data preparation tools aimed at data scientists, business analysts and other end users. The push toward more rapid decision-making this year means more of those workers will take matters into their own hands, instead of waiting for the IT department to hand down officially sanctioned data in time.
Given the considerable issues of poor data quality and information overlap between different systems, I predict that there will be a lot of disappointment when people try to make use of raw data in such a way. But perhaps it will have a useful side effect in forcing analytics teams and business units to take data quality, master data management and data governance seriously, rather than (incorrectly) regarding them as problems for the IT department to handle.
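To make the data quality and overlap problem concrete, here is a minimal Python sketch of the kind of check that tends to surprise self-service users: the same customer held in two systems with conflicting details. The records, field names and the two "systems" (a CRM extract and a billing extract) are entirely hypothetical, and real matching logic in master data management tools is far more sophisticated than this exact-match-after-normalization approach.

```python
def normalize(value):
    """Lowercase, trim and collapse whitespace so trivial differences don't count."""
    return " ".join(value.strip().lower().split()) if value else ""


def missing_required(records, required_fields):
    """Count records where any required field is absent or blank."""
    return sum(
        1 for rec in records
        if any(not normalize(rec.get(field)) for field in required_fields)
    )


def find_conflicts(system_a, system_b, key, fields):
    """List (key, field, value_a, value_b) where overlapping records disagree."""
    index_b = {normalize(rec.get(key)): rec for rec in system_b}
    conflicts = []
    for rec in system_a:
        record_key = normalize(rec.get(key))
        other = index_b.get(record_key)
        if not record_key or other is None:
            continue  # no usable key, or the record exists in only one system
        for field in fields:
            if normalize(rec.get(field)) != normalize(other.get(field)):
                conflicts.append((record_key, field, rec.get(field), other.get(field)))
    return conflicts


# Hypothetical extracts from two systems that both hold customer data.
crm = [
    {"email": "Ann@Example.com ", "name": "Ann Lee", "city": "Leeds"},
    {"email": "bob@example.com", "name": "Bob Ray", "city": ""},
]
billing = [
    {"email": "ann@example.com", "name": "Ann  Lee", "city": "London"},
    {"email": "cy@example.com", "name": "Cy Orr", "city": "York"},
]

print("CRM records with blank required fields:",
      missing_required(crm, ["email", "name", "city"]))
print("Conflicts between CRM and billing:",
      find_conflicts(crm, billing, "email", ["name", "city"]))
```

Even in this toy case, someone has to decide which city is correct, and that is a governance question rather than a tooling one.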
Among other data management and analytics trends that seem certain to gain more attention in 2017, machine learning is a prime candidate. Machine learning isn't new, in principle, but big data platforms, easier-to-use software and emerging technologies, such as IBM's Watson cognitive computing system, have brought it to a more practical place for widespread use.
In the coming years, more and more aspects of business operations -- and everyday life -- will be decided by automated algorithms. There will be some pushback as people realize that those algorithms are only as accurate as the data they rely on, and that data quality processes have their limits. Still, this is another bandwagon that will be hard to stop, and, in many cases, machine learning applications will expose how unreliable human decision-making is.