A brief look at recent developments includes news of an update to Oracle's GoldenGate replication platform and an Xplenty-Segment deal to connect Web-based marketing data to Hadoop processing in the cloud. The moves show data architectures and products in the midst of change, as large volumes of data and cloud applications combine.
GoldenGate 12c creates fast links to Hadoop data
Oracle recently enhanced one of its major replication and data integration platforms, releasing Oracle GoldenGate 12c with expanded support for cloud, heterogeneous databases and Hadoop. An objective of the latest GoldenGate software is to reduce performance hits on hardworking operational databases that run in distributed modes and feed analytical efforts in Hadoop.
"The volume is going through the roof and people can't afford the performance hit that ETL processing places on applications," said Jeff Pollock, vice president of product management for Oracle Data Integration. GoldenGate can take transactions off logs or standby systems, while having low impact on performance of source systems, he said.
Behind Oracle's GoldenGate effort, said Pollock, is a realization that diverse data platforms have to work together in real time as structured and unstructured data are combined for analytics.
The GoldenGate update comes with an adapter for Java that supports integration with Oracle NoSQL, Apache Hadoop, HDFS and HBase. Also supported are Apache Storm, Flume and Kafka, each of which is a rising star in the Hadoop software ecosystem.
With this release, extended support is offered for Microsoft SQL Server, MySQL and IBM Informix. At the same time, an Oracle Active Data Guard capability lets customers use GoldenGate 12c on standby systems.
Oracle has also included a migration utility for customers looking to move from Oracle's Streams replication system, a legacy system that is taking a back seat to GoldenGate, which Oracle acquired in 2009.
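To illustrate the log-based capture Pollock describes, here is a minimal Python sketch of the general idea: changes are read from a transaction log starting at a saved offset and replayed on a target, so the source tables themselves are never queried. This is not GoldenGate's API or log format. The JSON record layout and the function names `apply_change` and `replicate` are hypothetical, chosen only for illustration.

```python
import json


def apply_change(target: dict, change: dict) -> None:
    """Replay one logged change on the target store (a plain dict here)."""
    op = change["op"]
    key = change["key"]
    if op in ("insert", "update"):
        target[key] = change["row"]
    elif op == "delete":
        target.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replicate(log_lines: list, target: dict, start_offset: int = 0) -> int:
    """Apply every log record from start_offset onward.

    Returns the offset to resume from on the next call, so each run
    touches only records it has not yet seen. The log format (one JSON
    object per line) is an assumption made for this sketch.
    """
    for line in log_lines[start_offset:]:
        apply_change(target, json.loads(line))
    return len(log_lines)


if __name__ == "__main__":
    log = [
        json.dumps({"op": "insert", "key": "a", "row": {"v": 1}}),
        json.dumps({"op": "update", "key": "a", "row": {"v": 2}}),
    ]
    replica = {}
    resume_at = replicate(log, replica)
    print(replica, resume_at)
```

Because the reader works only from the log and a saved offset, the cost to the source system is limited to writing the log it already writes, which is the kind of low-impact behavior the article attributes to log-based replication.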
Xplenty links up with Segment to churn online data in the cloud
Hadoop-on-cloud provider Xplenty now integrates with the marketing data hub from Segment. As a result, users of Segment SQL can prepare big data quickly in Hadoop. The Segment SQL service runs on Redshift, Amazon's cloud-based data warehousing service.
For its part, Xplenty's Hadoop cloud service shields end users from the more complex elements of Hadoop programming and configuration. Yes, it does ETL with Hadoop, but, no, Xplenty doesn't want to burden the end user with a long conversation about the innards of Hadoop.
"Our focus is on the application, not the technology," said Yaniv Mor, CEO and co-founder of Xplenty. "The end user doesn't care as long as it runs on time."
Mor quipped that the only Hadoop terminology Xplenty uses with customers is "cluster." He said his company has created a point-and-click interface that allows non-experts to design Hadoop data pipelines. When applications run on the Xplenty platform, Xplenty decides whether workloads run, for example, as MapReduce jobs or Spark jobs, Mor said.
So far, most Xplenty customers are, in Mor's words, companies that started out in the cloud, pursuing markets like e-commerce and Internet gaming.
But, he added, Xplenty's services also support integration of enterprise data that does not yet reside in the cloud. "While the trend to cloud is unstoppable, there still is a lot of data on premise," he said.