The next stop for middleware-based data services may be shared cloud computing environments, but analysts and IT professionals say organizations taking this route will want to keep a
For starters, the task of connecting internal data services to external clouds shouldn’t be undertaken unless those data services were deployed and maintained properly in the first place, according to industry experts. Then there are data quality issues, industry standards and security concerns to consider. But if they’re done right, data services and cloud computing can be a highly lucrative combination.
“I think the next step for data services is going to be cloud integration, which some organizations have started to do. But there is more work to be done there,” said Noel Yuhanna, a database management systems analyst at Cambridge, Mass.-based Forrester Research Inc. “I think the cloud really takes it to the next level. It’s an integration level, which is becoming important because data is [generally] very siloed right now.”
Data services -- also known as data-as-a-service, information-as-a-service and information services -- take advantage of an organization’s existing service-oriented architecture (SOA) to offer users an ideally safe middleware layer for integrating information from database management systems and other structured and non-structured data sources. The newly integrated information can then be consumed by key business applications, which use data services to create rich mash-ups, add depth to content, and create new sales and marketing opportunities. Experts say other key benefits of data services include a reduction in error-prone data duplication across the enterprise and quick access to information that can help business users do their jobs.
While data services certainly aren’t being used everywhere, the technology has grown considerably in popularity in recent years. Yuhanna estimates that the number of organizations currently using a data services software platform has doubled over the past two-and-a-half years to between 20% and 22%. He adds that it’s mainly advanced data services users -- the early adopters -- who have linked or are planning to link data services to cloud computing environments.
Currently, most data services users create mash-ups that integrate data primarily from internal data sources. Banks use data services to quickly integrate customer addresses with account information for improved call center service, for example. But experts say organizations that extend data services to external clouds can take this process further by enriching the overall presentation of information with highly relevant but previously unusable data from outside sources. Hospitals, for instance, could use data services and cloud computing to compare a patient’s symptoms with global epidemiological information, perhaps uncovering a continuing trend.
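As a rough illustration of the bank example above, a data service can be thought of as a function that joins records from separate internal sources into one view for a consuming application. The sketch below is minimal and hypothetical: the `CustomerView` type, the `customer_view` function and the two in-memory dictionaries are assumptions made for illustration, standing in for real address and account systems that the article does not describe.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CustomerView:
    """Integrated record handed to a call center application (hypothetical)."""
    customer_id: str
    address: str
    balance: float


def customer_view(
    customer_id: str,
    addresses: Dict[str, str],
    accounts: Dict[str, float],
) -> Optional[CustomerView]:
    """Join address and account data for one customer.

    Returns None when either source has no record, so the caller never
    receives a half-populated view.
    """
    if customer_id not in addresses or customer_id not in accounts:
        return None
    return CustomerView(customer_id, addresses[customer_id], accounts[customer_id])


# Stand-ins for two siloed internal sources (assumed data).
addresses = {"c-100": "12 Main St, Springfield"}
accounts = {"c-100": 2500.0}

view = customer_view("c-100", addresses, accounts)
```

The consuming application sees only the integrated view, not the two sources behind it, which is the property that would let the same service later draw on an external source as well.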
“The NYPD integrates data from hospital records, from credit card information, from driving records and employment records,” Yuhanna said. “The whole idea is to provide this integration point so that they can type a name and be able to search in all relevant information. All this was initially done all manually, and it was slow. I think that’s why you’ll see a lot of improvements in some of these cases -- because … they can now deliver this data integration in real time.”
Software vendors like IBM, Informatica, Microsoft and Oracle all offer tools and infrastructure for creating a data services middleware layer. Microsoft, in a new plan codenamed ‘Dallas,’ intends to offer customers a cloud-based environment where organizations can come together to find, purchase and manage relevant datasets. Microsoft, which has said that it wants Dallas to become an “iTunes for data,” has been quiet about the new service’s release date.
Data services in the cloud depend on data quality, security and standards
Companies mulling the possibility of linking data services to cloud computing environments in order to share information with outsiders need to be careful, experts say, regardless of whether they’re supplying the information or taking it from the cloud and using it. Organizations that supply outsiders with erroneous information could earn a reputation for unreliability, and that could be bad for business. Likewise, organizations that unknowingly use bad information for business purposes could make some costly mistakes.
Yuhanna said it’s always a good idea to know the reputation of any cloud-based source of information. And when sending data into a cloud environment, it’s important to conduct regular data quality checks on databases and other data sources.
“Data quality is a critical component in data services,” he said. “[It’s] a key concern with data services, since data comes from many sources and therefore quality could [easily] be impacted.”
Data quality is a topic of great importance to Eric Williams, the CIO of St. Petersburg, Fla.-based Catalina Marketing Corp., because his company’s success depends on it.
Catalina specializes in making sure that individual consumers receive coupons at retail points of sale – coupons tailored to each consumer’s preferences and buying habits. The company doesn’t collect personal information about shoppers, such as names and addresses, but Williams said the shopper-driven marketing business is extremely data intensive nevertheless.
Shopper-driven marketing involves registering shopper and product information from grocery stores and other retail outlets around the country – and then ensuring the quality of that data.
“We actually check every single record because there are lots of data quality issues that come up,” Williams said. “We’re collecting data directly from the point-of-sale system so that information set can be incorrect, and so [sometimes] you’ll get funny data that gets passed down to the systems.”
Catalina uses rules-based systems to run its data quality checks, which can cover “600 million rolls of information” in about two hours. Williams’ advice to those who want to ensure data quality? Start at the beginning.
“You need to fix it at the inception,” he said. “Wherever people are entering it is where you need to put your processes.”
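A rules-based check of the kind Williams describes might look something like the sketch below, in which every incoming record is tested against a set of named rules and anything that fails is set aside along with the reasons. This is only an illustration under stated assumptions: the field names (`product_code`, `quantity`, `price`), the rules themselves and the `partition` helper are hypothetical and are not drawn from Catalina’s systems.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rules for a point-of-sale record; each returns True when the record passes.
RULES: Dict[str, Rule] = {
    "has_product_code": lambda r: bool(r.get("product_code")),
    "positive_quantity": lambda r: isinstance(r.get("quantity"), int) and r["quantity"] > 0,
    "non_negative_price": lambda r: isinstance(r.get("price"), (int, float)) and r["price"] >= 0,
}


def failed_rules(record: Record) -> List[str]:
    """Return the names of every rule the record violates."""
    return [name for name, rule in RULES.items() if not rule(record)]


def partition(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Check every record and split the batch into clean and rejected records."""
    clean: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for record in records:
        failures = failed_rules(record)
        if failures:
            rejected.append((record, failures))
        else:
            clean.append(record)
    return clean, rejected


batch = [
    {"product_code": "0412", "quantity": 2, "price": 3.49},
    {"product_code": "", "quantity": -1, "price": 1.00},  # "funny data" from the register
]
clean, rejected = partition(batch)
```

Keeping the names of the failed rules with each rejected record makes it easier to trace a problem back to the point of entry, which is where Williams says the fix belongs.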
Companies linking data services software to the cloud should also be cognizant of the industry standards they employ. Yuhanna suggested that organizations use a combination of classic, proven standards like XML, SOAP and SQL, as well as the emerging Representational State Transfer (REST) architectural style, which is getting more popular all the time.
“The protocol for accessing a cloud should be standard so that your internal applications, or your data services, are easily integrated with external clouds,” Yuhanna said.
Security is also a major consideration, and putting highly sensitive data into a cloud computing environment should either be avoided or done with extreme caution. But it’s also important to remember that clouds – such as semi-private clouds where business partners come together to do business – can actually help to shore up overall IT security. Yuhanna said this was the case for one of his clients, a financial services organization.
“Their take is that they don’t want the partners to come into the data center, for security reasons,” Yuhanna explained. “They want their partners to go to the cloud, collaborate in the cloud, share data in the cloud and be done with it.”
Linking data services to the cloud is “the easy part”
Data services software might be heading for the clouds, but at least one enterprise data consultant thinks that really isn’t such a big deal. The more important thing to remember, according to Jill Dyche, co-founder and partner with Baseline Consulting, a Sherman Oaks, Calif.-based firm that focuses on data management issues, is that when data services are designed, deployed and maintained correctly, linking to the cloud should be a simple task.
Dyche said the really “hard work” associated with data services includes establishing data requirements, identifying systems of record, data modeling, data integration and service-enabling applications. Meanwhile, she said, companies deploying data services need to formalize and document business processes, then create both the “fine- and large-grained” services needed to support those processes.
Once that work is completed and properly maintained, data services users should have no problem reaching into the clouds. After all, she said, the whole premise of a “service” -- and of SOA-based technologies in general -- is that users are not bound to a specific processing platform.
“Almost everything is cloud-able these days, and it’s certainly de rigueur to proclaim any emerging technology as cloud-worthy,” Dyche said. “But whether or not a company’s data is in the cloud is almost beside the point. The hard work … still has to be done.”