
Big data management and analytics weather tumult -- with more in store


Cloud had a big impact on big data management and analytics last year. Machine learning and streaming designs will contribute to change in 2017.

Big data management and analytics saw plenty of commotion last year, as bleeding-edge users dug deeper into machine learning, streaming architectures gained attention and cloud computing exercised greater overall influence. There's little indication things will slow down in 2017.

The emphasis on cloud for data has been both gradual and sudden. Ten years ago, in the early days of cloud computing, more than a few users held that data would probably never move to the cloud in any great proportion. Recent years have dispelled that notion and, in 2016, cloud and data got tangibly closer.

During the year, Microsoft moved forward with its long-brewing analytic data warehouse in the cloud. Oracle announced a cloud-first product release policy. Startup Snowflake improved its cloud offering with adaptive query result caching tuned to speed up frequently repeated queries. Also, IBM continued to enhance its dashDB cloud data service, often with Watson-related plug-ins.

All these players seek to cut the distance that cloud leader Amazon has gained with a host of offerings. Yet another offering arose late in the year, when the company's Amazon Web Services group added Athena, a Presto-like engine that brings interactive SQL queries to Simple Storage Service data sets.

Amazon's influence extends to Hadoop, as discussed by the editors in this edition of the Talking Data podcast. They note that some estimates place the famed bookseller as No. 1 in Hadoop deployments. That these deployments may be invoked, run and then dissolved shows the radical nature of elastic cloud architecture.

That major Hadoop distribution providers came to embrace the same model in 2016 showed that Hadoop would continue to bend to meet the road. For data managers, such models could change some basic tenets of data management.

Assorted components in the Hadoop ecosystem continued to gain attention in 2016 -- none more than Spark, 2015's data word of the year. Yet even as Spark and other new technologies expanded data developers' capabilities, stitching the components together in advanced applications -- some of which may also feature machine learning and data streaming -- seemed to add perceptibly to development complexity. Sometimes, progress appears fitful.

Among the strategies used to tame the complexity were DevOps methods -- sometimes called DataOps -- and microservices architectures. For many data developers, these methods and architectures come with a steep learning curve.

For everyone in big data management and analytics, the roller coaster ride seems bound to continue.




Which among the many big data management and analytics developments in 2016 do you see as most important?
The most important would be the lack of proactive, big data-driven predictive threat analysis ... why would data management or analytics be more important when cybersecurity should be the most talked-about big data and analytics issue? I would add that machine learning is but a tiny component of where investment is moving in the big data industry niches ... the need for such a protective, proactive barrier is evident despite not being obvious ... and it matters that it was not as mainstream as it should have been in 2016.
Granted, PNSincoboca: The most active area for much of so-called 'big data' and 'machine learning' has to do with security. Thanks for being in touch.
In addition, Azure DW, Cooladata, Redshift and Xplenty are also there, but those which enhance will survive. There are many ready-made enhancements, and they just need tweaking.
Often the tweaks can be 'labor intensive' - but there is always room for users to improve on what they have, right? Thanks for being in touch.
Yes, you are right. The problem is that there is no "one-size-fits-all" tool or software product. Workarounds are there, but those features are not as good as the products developed for those specific cases. In the majority of cases now, it is on a case-by-case basis.
Scala is a very good programming language combining FP and object-oriented features. I like it for being concise and logical, especially in HPC. You can have a look at a few of my LinkedIn blogs on how useful it is. Some I have done in C, C++ and Java for reasons. Only when you start hands-on will it answer your questions.
Right - Scala developer roles seem to be expanding, especially in terms of creating frameworks that allow other developers to work with the new data analytics tools, notably Apache Spark. Thanks.