Big data poses weighty challenges for data integration best practices

Data integration framework needs more horsepower to handle big data

In many organizations, the rubber is meeting the road on efforts to upgrade the data integration framework to incorporate big data platforms. And you could easily hit some potholes on the journey -- or worse, end up in a ditch.

For starters, big data architectures typically combine internal systems and external data sources. They also add various types of unstructured and semi-structured data to the structured transaction data that organizations are accustomed to handling. Hadoop data lakes and NoSQL databases pose different integration challenges than traditional data warehouses do. And the growing adoption of stream processing tools puts pressure on IT teams to rev up the data integration process to real-time speeds.
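To make one of those challenges concrete, here is a minimal, hypothetical Python sketch of blending semi-structured data with structured records: nested JSON events (the kind a NoSQL store or data lake might hold) are flattened and joined with structured transaction rows. All names and data are illustrative assumptions, not drawn from any specific platform.

```python
import json

# Structured transaction rows, as they might come from a data warehouse
# (illustrative data, not from any real system)
transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 75.5},
]

# Semi-structured JSON events with nested fields, as they might come
# from a NoSQL database or a data lake
raw_events = """[
    {"customer_id": 1, "meta": {"channel": "web"}},
    {"customer_id": 2, "meta": {"channel": "mobile"}}
]"""

def flatten(event):
    """Flatten one nested JSON event into a flat record."""
    flat = {"customer_id": event["customer_id"]}
    for key, value in event.get("meta", {}).items():
        flat["meta_" + key] = value
    return flat

events = [flatten(e) for e in json.loads(raw_events)]

# Join the structured and flattened semi-structured records on customer_id
by_customer = {e["customer_id"]: e for e in events}
integrated = [
    {**tx, **by_customer.get(tx["customer_id"], {})}
    for tx in transactions
]
```

Even this toy version hints at the real-world work involved: deciding how to flatten nested structures, picking join keys and handling records that don't match are exactly the kinds of decisions an upgraded integration framework has to standardize.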

That nets out to a lot of added demands -- and new investments. In a 2016 Magic Quadrant report, Gartner said the need to blend existing IT infrastructure with big data systems, cloud platforms and other emerging technologies is ratcheting up the number of data integration initiatives getting the green light from corporate executives.

TDWI analyst Philip Russom made a similar point in a December 2015 report on modernizing a data integration framework. Without broader integration capabilities, "organizations cannot satisfy new and future requirements for big data, analytics and real-time operations," Russom wrote.

But there's still work to be done. Gartner analyst Merv Adrian said in an October 2016 blog post that ingesting data into data lakes was a big discussion topic with user clients at the company's Symposium/ITxpo conference that month. Much of the focus, he added, was on finding data integration tools to help in "managing and documenting the process better."

This handbook offers advice on navigating the new demands to help your organization polish up its data integration framework -- and stay out of the big data integration breakdown lane.