The newest incarnation of DataFlux Inc.’s flagship data management software -- the DataFlux Data Management Platform 2.2 -- offers significantly improved data matching capabilities over previous releases, according to one user who beta-tested the new technology.
Data matching technology helps organizations identify erroneous or nearly identical information residing in one or more databases, explained James Brousseau, a longtime DataFlux user and an enterprise data architect with
For example, data matching tools might help a user realize that “John Smith” in the customer relationship management (CRM) system is the same person as “John Smyth” in the enterprise resource planning (ERP) system. Data matching tools also make it easier for users to delete or aggregate files in an effort to ensure consistency across business units.
Brousseau, who experimented with just a handful of the platform’s new features, said he was particularly impressed with the software’s improved “intuitive matching” capabilities. Intuitive matching helps users seek out similar files and assigns them a score indicating the likelihood of a duplicate. For example, if the files for “John Smith” and “John Smythe” get a 90% score, chances are very high that they refer to the same person.
“[DataFlux’s] prior searching technology would sometimes eliminate some entries because they didn’t match close enough,” Brousseau said. “But with the intuitive matching, you can adjust your sensitivity to increase or decrease how it scores.”
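The scoring-and-sensitivity idea Brousseau describes can be illustrated with a minimal Python sketch. It is not DataFlux's algorithm, which the article does not describe; it assumes a generic string-similarity ratio from Python's standard `difflib` module, and the names `match_score`, `find_likely_duplicates` and `threshold` are hypothetical.

```python
from difflib import SequenceMatcher


def match_score(a: str, b: str) -> float:
    """Return a 0-100 similarity score for two names, ignoring case and outer spaces."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return 100 * ratio


def find_likely_duplicates(names, threshold=85.0):
    """Return (name_a, name_b, score) for every pair scoring at or above threshold.

    `threshold` plays the role of the adjustable sensitivity: lowering it
    flags more pairs as possible duplicates, raising it flags fewer.
    """
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = match_score(a, b)
            if score >= threshold:
                pairs.append((a, b, round(score, 1)))
    return pairs


if __name__ == "__main__":
    names = ["John Smith", "John Smyth", "Jane Doe"]
    print(find_likely_duplicates(names, threshold=85.0))
    print(find_likely_duplicates(names, threshold=95.0))
```

Comparing every pair is fine for a short list but grows quadratically, so a real matching tool would likely group candidate records first and score only within each group.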
Unveiled in late November, the DataFlux Data Management Platform 2.2 is designed to provide users a centralized platform for all data integration, data quality and master data management (MDM) initiatives. According to DataFlux, the software offers many improvements including enhanced MDM capabilities and new tools to help business analysts and data stewards launch and manage enterprise data governance initiatives.
In addition to testing the intuitive matching technology, Brousseau ran some of his company’s existing data quality and data integration workflows through the new version and found that it performed well. He also suggested ways DataFlux could improve the product in the future. One suggestion centered on improving data integration processes related to data warehousing initiatives.
“I suggested that they put together what is known as a merge node,” Brousseau said. “This node would have the intelligence to know whether or not the data coming through your work stream has changed from the last time you processed it.”
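One common way to build the kind of change detection Brousseau has in mind is to store a fingerprint of each record after every run and compare it on the next run. The Python sketch below illustrates that general technique only; it does not represent any DataFlux feature, and the names `row_fingerprint`, `detect_changes` and `previous_fingerprints` are hypothetical.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Hash a record's contents so two runs can be compared cheaply."""
    # sort_keys makes the hash independent of field order in the dict.
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(rows, key_field, previous_fingerprints):
    """Classify incoming rows as new, changed or unchanged since the last run.

    `previous_fingerprints` maps each record key to the fingerprint saved
    after the previous run. The returned `current` dict should be saved
    for use in the next run.
    """
    new, changed, unchanged = [], [], []
    current = {}
    for row in rows:
        key = row[key_field]
        fingerprint = row_fingerprint(row)
        current[key] = fingerprint
        if key not in previous_fingerprints:
            new.append(row)
        elif previous_fingerprints[key] != fingerprint:
            changed.append(row)
        else:
            unchanged.append(row)
    return new, changed, unchanged, current


if __name__ == "__main__":
    first_run = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]
    _, _, _, saved = detect_changes(first_run, "id", {})
    second_run = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex Corp"}]
    new, changed, unchanged, _ = detect_changes(second_run, "id", saved)
    print(len(new), len(changed), len(unchanged))
```

Only the new and changed rows would then need to be loaded into the warehouse, while unchanged rows can be skipped.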
Cleanup effort prompts data management software buy
When Brousseau first joined SonoSite about three years ago, he quickly realized that the organization had a great deal of work to do to integrate and aggregate customer and contact information.
The company -- which uses the Oracle E-Business Suite as its back-end ERP system and Salesforce.com for sales operations and lead management -- was struggling with loads of duplicate information, and the system had to be cleaned up, Brousseau said.
“They would purchase contact lists or lead lists and they would import them into Salesforce without any real checking to make sure that the contact isn’t already a customer or something like that,” he explained. “Over the years, they ended up having just a real mess.”
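The missing check Brousseau describes, screening a purchased list against existing contacts before import, can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a description of Salesforce or DataFlux functionality: it matches on a normalized email address only, and the names `normalize_email`, `split_leads` and the `email` field are hypothetical.

```python
def normalize_email(email: str) -> str:
    """Lowercase and trim an email address so trivial variants compare equal."""
    return email.strip().lower()


def split_leads(leads, existing_contacts):
    """Separate purchased leads into those safe to import and those already known."""
    known = {normalize_email(c["email"]) for c in existing_contacts}
    to_import, already_known = [], []
    for lead in leads:
        email = normalize_email(lead["email"])
        if email in known:
            already_known.append(lead)
        else:
            to_import.append(lead)
            # Remember this address so repeats within the same list are caught too.
            known.add(email)
    return to_import, already_known


if __name__ == "__main__":
    contacts = [{"email": "jsmith@example.com"}]
    leads = [
        {"email": "JSmith@Example.com "},
        {"email": "new.lead@example.com"},
        {"email": "new.lead@example.com"},
    ]
    to_import, already_known = split_leads(leads, contacts)
    print(len(to_import), len(already_known))
```

Exact matching on one field misses near-duplicates such as misspelled names, which is where score-based matching becomes useful.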
At the time, Brousseau said, Salesforce didn’t offer a great deal of user-friendly de-duplication functionality and instead relied mostly on partnerships with third-party data quality vendors.
“Salesforce kind of oversells their abilities,” he said. “You can probably purchase third-party products or services that will perform intricate de-duplication, but with just Salesforce out of the box, it’s not very intuitive.”
Brousseau ultimately solved the problem by implementing DataFlux data management software. The choice made sense, he said, because he had already been using the software for five years at Washington Mutual Bank, his previous employer.
“We primarily used DataFlux [at Washington Mutual] for profiling data sources for our enterprise data warehouse,” he said. “But it also could perform complex matching and de-duplication for housekeeping type activities.”
Brousseau said one of the keys to ensuring success with any data management software platform is remembering the tight relationship between data integration, data quality, MDM and enterprise data governance initiatives. He said it’s important to look at the four areas holistically and to avoid taking a “siloed” approach to data management.
More new features in DataFlux 2.2
Other new features in the DataFlux Data Management Platform 2.2 include the MDM Foundations feature and Web-based tools such as the Business Data Network and the Reference Data Manager.
DataFlux says the new MDM Foundations feature gives users the ability to take a phased approach to managing MDM “entities” such as household, organization, person, policy, product and supplier.
The new Web-based Business Data Network serves as a centralized repository to help manage how critical terms are used within a business. For example, a data governance council can decide on the definition of a particular term and enter it into the system for future reference. The Web-based Reference Data Manager lets users create and manage hierarchies and the relationships between them.
DataFlux says the new platform also offers improved “contextual extraction” features, which allow users to access unstructured data -- such as old business records or invoices -- and incorporate it into existing data management processes.