
Cloud data integration tasks require new IT approaches, tools

In a Q&A, consultant Rick Sherman offers tips on integrating cloud and on-premises applications, and discusses potential barriers and the technologies that can help get the job done.

With the corporate use of cloud applications increasing, the integration points that IT and data management teams are responsible for are growing as well. More and more companies are opening up to the idea of going to the cloud, particularly for sales and marketing applications -- "and now they need to get the data integrated with their on-premises applications," consultant Rick Sherman said in an interview with SearchDataManagement.

Sherman, founder of consultancy Athena IT Solutions, added that doing so isn't always a simple task. IT managers often have to pick up the integration pieces after individual business units deploy cloud applications on their own, he said. In the interview, Sherman discussed the hurdles typically faced in cloud data integration, the available technology options for integrating cloud and on-premises applications, and how to get started on a cloud integration project.

What barriers are there to integrating data sources in the cloud and on-premises?

Rick Sherman: Each of these new cloud applications is another data silo, so there's a tendency for [the data] to diverge or not be consistent. As far as technical issues, a lot of the integration that IT is used to doing is in data warehousing and business intelligence. Some of the needs of data integration for the cloud are a little different because we're not only dealing with a one-way transfer from a data source system to a warehouse -- we're also dealing with application-to-application integration, where you're loading the data onto the cloud platform and synchronizing it between applications. There are different technologies you can use to do that: enterprise service buses, enterprise message services. But a lot of times, it presents issues to the IT group because they're not familiar with those other technologies. They're used to data integration tools that are in the ETL category -- extract, transform and load.
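The contrast Sherman draws between a one-way warehouse load and application-to-application synchronization can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout, the `etl_load` and `sync` function names, and the "newest update wins" rule, which is only one of several possible ways to reconcile two applications.

```python
def etl_load(source_rows, warehouse):
    """One-way ETL: extract rows, transform them, load them into the warehouse.

    The source system is only read, never written back to.
    """
    for row in source_rows:
        # Transform step (illustrative): normalize the email field.
        cleaned = {"id": row["id"], "email": row["email"].strip().lower()}
        warehouse[cleaned["id"]] = cleaned


def sync(app_a, app_b):
    """Two-way synchronization between two applications' record stores.

    Assumed rule: for each record id, the copy with the larger 'updated'
    value is written to both sides. Records missing on one side are copied over.
    """
    for key in set(app_a) | set(app_b):
        a, b = app_a.get(key), app_b.get(key)
        if a is None or (b is not None and b["updated"] > a["updated"]):
            newest = b
        else:
            newest = a
        app_a[key] = dict(newest)
        app_b[key] = dict(newest)


# Hypothetical data: a cloud CRM and an on-premises ERP holding customer records.
crm = {1: {"id": 1, "email": "a@x.com", "updated": 2}}
erp = {
    1: {"id": 1, "email": "old@x.com", "updated": 1},
    2: {"id": 2, "email": "b@x.com", "updated": 1},
}
sync(crm, erp)  # both stores now hold the same two records

warehouse = {}
etl_load([{"id": 1, "email": " A@X.com "}], warehouse)  # source list is untouched
```

The ETL function only ever writes to the warehouse, while the sync function writes to both applications, which is roughly why the second case calls for tooling beyond a traditional ETL job.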


Is integration in the cloud something that's still primarily being done with ETL software?

Sherman: ETL tools have still been the primary [choice]. When the first wave of integration came about for the cloud, you also got something called iPaaS -- integration platform as a service. What happened was the data integration vendors that use mainly ETL started incorporating other technologies into their integration techniques -- the ESB [enterprise service bus] variant, the EMS [enterprise message service] one. That's been beneficial to IT groups because, within their existing tools, they can just add the newer ones that their vendors are offering.

How functional are the iPaaS offerings that are available now?

Sherman: I would say a lot of the iPaaS tools are still in the data loading and data synchronization use cases -- so they can do that well. When it comes to integrating and cleansing the data and making it more consistent, they're not as mature or sophisticated. That's not the use case they're used to. We still have a lot of [cloud users] who are just synchronizing data. Whether you're a big or small company, the first wave that happens is you get these applications and you want to load data into them and synchronize the data between applications. But at some point, you need to go beyond that.

What are some of the factors that IT and data management teams should consider in determining which cloud data integration option is right for a company?

Sherman: They should look to see what integration vendors and products are in use now. Sometimes, integration technologies are embedded in or bundled with a cloud application to get you started, so they should do a quick assessment. If they're using [ETL] tools and have expertise in them, they should look at those vendors' iPaaS capabilities and see if they can expand that way. If they don't have a big need for ETL, or at the current time they're just trying to synchronize applications, they should probably look at the iPaaS vendors and stick with those [platforms] because they'll be on a subscription basis and won't be overwhelmed with the sophisticated data integration [capabilities] that the other tools have.

And what steps should they take to get started on a cloud and on-premises integration project?

Sherman: First, they should take an inventory of what on-premises applications and cloud applications they have. You'd think it would be simple, but the business may or may not have kept them abreast of all the cloud applications. [Second], they should do an assessment of what kind of data volumes and data updates each of those applications has, to [determine] the demand on the integrations they need to do. Third is to figure out, with these cloud applications, what types of integrations are needed. Do they need to move data between cloud applications? Between cloud and on-premises applications? Do they need to bring the data to a data warehouse or comparable database to do analytics? Based on that, they can start to look at what technologies make the most sense in order to be able to complete those integration tasks.
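As a rough illustration of those three steps, the short Python sketch below records an application inventory, attaches an update-volume estimate to each application, and labels each required data flow by integration type. The application names, volumes, the `App` class and the `classify_flow` and `plan` helpers are all hypothetical; they are not taken from the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    location: str            # "cloud" or "on_premises"
    daily_updates: int       # rough estimate of records changed per day
    is_warehouse: bool = False


def classify_flow(source: App, target: App) -> str:
    """Label a data flow with one of the integration types Sherman lists."""
    if target.is_warehouse:
        return "analytics load"
    if source.location == "cloud" and target.location == "cloud":
        return "cloud-to-cloud"
    if source.location != target.location:
        return "cloud-to-on-premises"
    return "on-premises-to-on-premises"


def plan(inventory, flows):
    """Return (source, target, type, daily_updates) rows, heaviest flows first."""
    rows = []
    for src, dst in flows:
        s, t = inventory[src], inventory[dst]
        rows.append((src, dst, classify_flow(s, t), s.daily_updates))
    return sorted(rows, key=lambda r: r[3], reverse=True)


# Steps 1 and 2: inventory with volume estimates (made-up figures).
inventory = {
    "crm": App("crm", "cloud", 50_000),
    "marketing": App("marketing", "cloud", 200_000),
    "erp": App("erp", "on_premises", 20_000),
    "warehouse": App("warehouse", "on_premises", 0, is_warehouse=True),
}

# Step 3: the data flows the business needs.
flows = [("crm", "marketing"), ("crm", "erp"), ("erp", "warehouse")]

for row in plan(inventory, flows):
    print(row)
```

A table like this gives a starting point for matching each flow to a technology, for example a synchronization-oriented tool for the application-to-application flows and an ETL tool for the analytics load.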

Corlyn Voorhees is an editorial assistant for SearchDataManagement. Follow us on Twitter: @sDataManagement.
