A quest to find the right data quality tool set for a major database consolidation and IT upgrade effort has forced one Chicago-based media research firm to make a tough decision between DataFlux and Informatica Corp.
Cision Inc. is a nearly 80-year-old media research and monitoring company that got its start by providing public relations professionals with large catalogs full of media contact information and journalists with news clips tailored to their individual beats.
According to Brett Safron, Cision’s senior vice president of product management, the company has spent the last decade working to update hardware and software, consolidate databases from around the globe, and seamlessly combine services with its European counterpart, Cision Europe.
“We now have a single product called CisionPoint, which pulls all of our services into one online SaaS model application,” Safron explained. “We currently have about 6,500 customers on that globally.”
Cision’s IT operations today consist mainly of Intel-based HP blade servers running Microsoft Windows. But getting to this point has been a gradual progression, said Greg Stam, Cision’s senior vice president of development -- and a significant portion of the business continues to rely on older, Unix-based servers.
“We have legacy systems with customers still on them,” Stam said. “We’re still migrating them into CisionPoint, and to do that, you’ve got to have people that actually know the data and know what it means.”
Discovering the need for a data quality tool
When Stam joined Cision about two years ago as leader of the firm’s IT development organization, he was immediately faced with some big data quality challenges.
The company had already begun cobbling together disparate servers, databases and media monitoring systems located in the U.S. when a decision was made to centralize global operations on the North American version of CisionPoint. This meant that many more systems based in Europe and throughout the world would have to be integrated as well.
“CisionPoint became the portal that would pull the data from the research systems, the monitoring systems and whatnot,” Stam said. “And so we found ourselves with multiple copies of data from various databases across the enterprise.”
While CisionPoint may look seamless to users, Safron said, making the system work on a global scale required “hundreds of hours” of manual data de-duplication, data cleansing and data mapping efforts.
“We didn’t know the alternatives that were out there and the technology solutions that could be placed in here that would help,” Safron said. “We thought these are just a bunch of legacy systems and legacy issues and that [we would] continue to do a lot of these things manually. [But Stam pointed out] that there was a better way to do that.”
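To give a sense of the kind of de-duplication work described above, here is a minimal Python sketch. It is not Cision's process or DataFlux's method: the record fields (`name`, `email`, `source`), the sample contacts and the matching rule (exact match on a normalized name and email) are all assumptions made for illustration. Commercial data quality tools typically use far more sophisticated fuzzy matching.

```python
import re


def normalize(record):
    """Build a match key: lowercase the name, drop punctuation, collapse spaces."""
    name = re.sub(r"[^a-z0-9 ]", "", record["name"].lower())
    name = " ".join(name.split())
    email = record["email"].strip().lower()
    return (name, email)


def deduplicate(records):
    """Keep the first record seen for each match key, preserving input order."""
    seen = {}
    for record in records:
        seen.setdefault(normalize(record), record)
    return list(seen.values())


# Hypothetical contacts pulled from two separate source systems.
contacts = [
    {"name": "Jane  Doe", "email": "JDoe@example.com", "source": "us_research"},
    {"name": "jane doe.", "email": "jdoe@example.com ", "source": "eu_monitoring"},
    {"name": "John Roe", "email": "jroe@example.com", "source": "us_research"},
]

unique = deduplicate(contacts)
```

In this sketch the two "Jane Doe" rows collapse into one because they share a match key once case, punctuation and spacing are normalized. Doing the same comparison by hand across many databases is the sort of effort that can add up to hundreds of hours.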
DataFlux vs. Informatica: A data quality tool set evaluation
After consulting with his team and IT industry analysts, Stam was certain that his company needed a solid data quality tool set. He quickly narrowed the choices to three software vendors: Informatica, DataFlux and Pervasive Software Inc.
Cision’s IT group already had Pervasive Software’s extract, transform and load (ETL) software running in-house, Stam said. But the team didn’t use Pervasive much and eliminated the company from the data quality tool evaluation.
“Pervasive is largely an ETL tool, and it could do some of the job, but it’s a little difficult to use,” he said. “You need to be a little more technical to use it.”
Stam was impressed with the range of Informatica’s functions but found it too complex for Cision’s relatively small group of 80 to 100 developers worldwide.
“[Informatica has] a bunch of different modules, and it’s very expensive,” he said. “It wasn’t like we were going to be able to leverage or use all that functionality that came with Informatica anytime soon. We were really trying to focus in on data quality and the ability to kind of grow into [master data management] as we matured.”
Ultimately, Cision decided to go with DataFlux and went live with the data quality tool last September.
Stam said DataFlux offered a decent range of functionality in its base price, including address correction capabilities and a data analysis module. He said the latter module makes it easier for IT workers -- especially those unfamiliar with older coding techniques -- to work with legacy systems.
“We had some postal correction software that was non-compliant so we replaced that right away [with DataFlux],” he explained. “We just immediately did it real quickly with no problem.”
DataFlux also came with useful geocoding capabilities, Stam added, but that functionality may require the purchase of an additional data set for reference purposes.
After taking care of the “low-hanging fruit,” Cision’s IT team began focusing on data cleansing, data migration processes and creating Web services -- an area where Stam believes that DataFlux technology has room for improvement.
Creating Web services in DataFlux requires users to build their own custom containers, he said -- a process that demands significant programming knowledge.
“I’ve got my DataFlux person who kind of works the data end of it, and then we have to hand it off to [a programmer] to put the shell around it,” Stam said. “It would be ideal if we didn’t have to do that and it was a little better in the way that it handled Web services.”
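The "shell" Stam describes can be pictured with a short, generic Python sketch. Nothing here is DataFlux code or a DataFlux API: `standardize_name` is a hypothetical stand-in for a data quality routine, and the JSON request format, the `app` and `serve` names and the port number are assumptions. The example uses only Python's standard library WSGI support to show how a programmer might put an HTTP wrapper around a routine written by a data specialist.

```python
import json
from wsgiref.simple_server import make_server


def standardize_name(raw):
    """Hypothetical data quality routine: trim, collapse spaces, title-case."""
    return " ".join(raw.split()).title()


def app(environ, start_response):
    """Minimal WSGI shell: read a JSON body, apply the routine, return JSON."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        result = {"name": standardize_name(payload["name"])}
        status = "200 OK"
    except (ValueError, KeyError, TypeError, AttributeError):
        # Malformed JSON, a missing field or a non-string value all land here.
        result = {"error": "expected a JSON object with a string 'name' field"}
        status = "400 Bad Request"
    body = json.dumps(result).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the service locally; call serve() by hand to try it out."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

The split between `standardize_name` and `app` mirrors the handoff in Stam's description: one person owns the data logic, and another owns the request handling, error responses and deployment around it.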
Overall, Stam and Safron agreed that DataFlux data quality tools have helped Cision take a great deal of the manual labor out of the database consolidation effort. Looking ahead, they said the company plans to work on moving the last of its customers from legacy systems to CisionPoint while continuing to integrate monitoring systems, enrich data and create more Web services.
“I think we’re just kind of going to wash, rinse and repeat with these Web services,” Stam said. “We’re trying to exploit that functionality and are just scratching the surface of it with the data right now.”