Real-time big data analytics calls for changes in the way data systems are built. Some of the pressing challenges are discussed in this edition of the Talking Data podcast.
The shape of data management continues to change, often driven by companies' embrace of real-time big data analytics. Static reports are not going away, but they are being joined by predictive analytics systems that are hooked into operations, as discussed in this edition of the Talking Data podcast.
Static reporting for analytics will continue, but it could very well take a backseat to newer methods over time. Data preparation was once a much simpler discussion, the podcasters reflect, but data is no longer taking a one-way path ending in a data warehouse. Instead, it is part of a real-time system that is "always on."
The technologies employed are far from simple. The streaming analytics engines that act on predictions and make fast recommendations involve many moving parts, both in terms of data ingestion and processing. These moving parts include messaging systems, data streaming, in-memory analytics and more.
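To make those moving parts concrete, here is a minimal sketch in Python of how ingestion, messaging and in-memory scoring might fit together. It is not taken from the podcast or from any vendor's product: the standard library `Queue` merely stands in for a messaging system, and the names `WINDOW`, `THRESHOLD`, `ingest`, `process` and `run_pipeline`, along with the sample card data, are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque
from queue import Queue

WINDOW = 3       # hypothetical: recent transactions remembered per card
THRESHOLD = 3.0  # hypothetical: flag amounts above 3x the recent average


def ingest(events, bus):
    """Publish raw events to the bus (a stand-in for a messaging system)."""
    for event in events:
        bus.put(event)
    bus.put(None)  # sentinel marking the end of this sample stream


def process(bus):
    """Consume events one at a time and score each against in-memory state."""
    recent = defaultdict(lambda: deque(maxlen=WINDOW))
    flagged = []
    while True:
        event = bus.get()
        if event is None:
            break
        card, amount = event
        history = recent[card]
        # Compare the new amount with the average of the card's recent amounts.
        if history and amount > THRESHOLD * (sum(history) / len(history)):
            flagged.append(event)
        history.append(amount)
    return flagged


def run_pipeline(events):
    """Wire ingestion and processing together through a single queue."""
    bus = Queue()
    ingest(events, bus)
    return process(bus)


if __name__ == "__main__":
    sample = [
        ("card-1", 20.0),
        ("card-1", 25.0),
        ("card-2", 40.0),
        ("card-1", 300.0),
        ("card-2", 45.0),
    ]
    print(run_pipeline(sample))
```

A production system would likely replace the queue with a distributed message broker and the dictionary with a streaming or in-memory analytics engine, but the shape is the same: events arrive continuously and are scored against recent state rather than loaded into a warehouse for later reporting.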
NoSQL databases are also part of these new pipelines, as they work effectively with quickly arriving data. But SQL continues to be a way of pulling data from the operational stores.
The complexity inherent in new component combinations makes data management a dicier undertaking than it may have been in the past. Wisely selecting and combining these components -- many of which are open source at their core -- is a daunting task. Knowing where best to apply the technology is challenging, too.
The immediate uses of these systems run the gamut. They range from health system enrollment documentation updates and hospital readmittance predictions to fraud detection for credit card transactions and risk analysis for insurance underwriting.
Vendors plying the real-time big data analytics waters include established and new players. A snapshot view would include established companies like IBM, Informatica and SAS Institute. Upstarts are numerous, including Alteryx, Attivio, Datameer, Paxata and Pentaho.
Listen to this podcast for a recap of the recent changes in data management methods, and stay tuned for more.