A sweeping data quality software project is helping one Chattanooga, Tenn.-based trucking outfit save “millions” in annual fuel costs and prepare for a turnaround in the U.S. economy, company officials said.
When Tim Leonard took over as CTO and Vice President of Information Technology at U.S. Xpress Enterprises Inc. last June, he quickly realized that data related to customers, suppliers and the company’s fleet of about 7,000 trucks was in very rough shape.
“I looked at the data itself and Walmart was spelled 176 times incorrectly,” said Leonard, who added that address information was also “absolutely horrible.”
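To make the misspelling problem concrete, here is a minimal sketch of one common way to fold variant spellings of a customer name onto a single canonical entry. It is not how U.S. Xpress or Informatica does it; the function name `canonicalize`, the sample canonical list and the 0.8 similarity cutoff are all assumptions chosen for illustration, and it uses only Python's standard `difflib` module.

```python
import difflib
import re


def _normalize(name):
    """Lowercase a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def canonicalize(raw_name, canonical_names, cutoff=0.8):
    """Return the closest canonical name for raw_name, or None if nothing is close.

    canonical_names and cutoff are hypothetical inputs for this sketch;
    a real program would tune the cutoff and send non-matches for review.
    """
    lookup = {_normalize(c): c for c in canonical_names}
    matches = difflib.get_close_matches(
        _normalize(raw_name), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    customers = ["Walmart", "Target"]  # hypothetical master list
    for raw in ["Wal-Mart", "Wallmart", "WALMRT", "Acme Freight"]:
        print(raw, "->", canonicalize(raw, customers))
```

Names that fall below the cutoff come back as `None` rather than being forced onto a customer, which leaves room for a person to review them.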
Compounding the problem was the fact that U.S. Xpress’s various business units didn’t have a standard, centralized way to share data. And Leonard found some potentially useful historical information that wasn’t exposed to business users at all.
The company CEO knew there was a data quality problem, and he wanted the newly hired Leonard to do something about it. He also wanted Leonard to identify significant money-saving opportunities in the process.
“He wanted to be prepared so that when we came out of this [slow economy we would have] a centralized area for information and could react quickly to demand for moving freight across the United States,” Leonard recalled.
Engine idle data presents an opportunity for savings
Speaking with company officials, Leonard quickly realized that fuel is always one of the firm’s top three expenses. He had also noticed that some of the disparately housed data he was dealing with had to do with the amount of time that each truck runs idle on a given day. That information is fed back to headquarters via monitoring devices in each truck. But over time, Leonard said, the idle data had been spread out among many databases.
“There was never a completely standard monitoring capability for looking at all the idle information,” he said.
The CTO believed that improving the quality of that data and analyzing it properly would reveal opportunities to save money. But first he had to pick out the data quality software that would help get the job done.
A data quality software showdown
Leonard and his team began analyzing data quality software vendors through research and interviews and quickly narrowed the choices to Trillium Software and Informatica Corporation.
The CTO liked Trillium but felt it wasn’t a great match for the trucking industry -- and specifically, for a trucking company like U.S. Xpress, with multiple business units.
“The quality of data at different hierarchies within trucking is a lot different than any companies that I’ve worked with,” Leonard said. “I needed a software package that really dynamically could build the hierarchical data structures for the enterprise, and I just didn’t feel Trillium had that.”
The company ended up choosing Informatica Data Quality (IDQ) and Informatica Data Explorer (IDE). IDQ offers data cleansing and roles-based data management capabilities, while IDE is used mainly for data profiling.
Leonard liked the fact that IDQ integrated well with Informatica PowerCenter, a data integration hub that U.S. Xpress had recently implemented. He was also drawn to IDQ’s graphical user interface (GUI).
“IDQ was so GUI-driven that one of my business units actually worked with one of my senior data architects [and] had it up and running and created a first business rules mapping in about six days,” Leonard said. “We had it integrated into PowerCenter in about two days and were actually cleaning and cleansing customer and idle data in less than three weeks.”
Leonard said that he would like Informatica to improve the product’s ability to do “near real time” data cleansing for Web services.
“I’d just like to see a better near-real-time cleansing capability in a messaging layer that helps me cleanse four to five data elements within sub-seconds,” he explained.
Gartner weighs in on data quality software
Informatica and Trillium were both ranked as leaders -- along with IBM, SAP BusinessObjects and DataFlux -- in the recent Gartner Magic Quadrant for Data Quality Tools vendor comparison.
“Informatica has been putting a very strong focus on data quality and doing a very good job at cross-selling the data quality technology into its existing customer base for data integration tools,” Gartner analyst Ted Friedman said in a June interview.
Gartner reports that users of Informatica Data Quality 9, which was released last fall, have complained about some of the product’s complexities related to workflow integration, security handling and rule management.
Also, “the lack of robust data quality reporting is mentioned regularly as a weakness, and some customers are struggling with the efforts associated with the upgrade to version 9.0,” the Magic Quadrant report read.
Informatica’s June update to the product, Informatica Data Quality 9.01, set out to address those two issues by providing an automated upgrade utility and additional reporting capabilities.
Data quality software pays off
Leonard said IDQ has been very beneficial in terms of creating business rules for defining and properly managing data.
“Instead of having 32 different business rules on the definition of a customer, we consolidate it down to one, and now we’ve got one definition that the entire enterprise can use,” he said.
And by using the software to cleanse the engine idle data, U.S. Xpress was able to come up with some new rules that amount to major savings on fuel. Chief among them is a new rule requiring that engines be turned off for longer periods of time during truckers’ required sleeping breaks. Leonard said that change in policy is helping the company save about 3,000 gallons of gas per day.
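As a rough back-of-the-envelope check on how a daily figure like that scales to a year, the arithmetic can be sketched as follows. Only the 3,000-gallons-per-day number comes from Leonard. The 365-day operating year and the price per gallon are assumptions for illustration, not figures reported by the company.

```python
GALLONS_SAVED_PER_DAY = 3_000  # figure quoted by Leonard


def annual_savings(price_per_gallon,
                   gallons_per_day=GALLONS_SAVED_PER_DAY,
                   days_per_year=365):
    """Return (gallons saved per year, dollars saved per year).

    price_per_gallon and days_per_year are assumed inputs, not reported data.
    """
    gallons = gallons_per_day * days_per_year
    return gallons, gallons * price_per_gallon


if __name__ == "__main__":
    # $3.00 per gallon is a hypothetical placeholder price.
    gallons, dollars = annual_savings(price_per_gallon=3.00)
    print(f"{gallons:,} gallons/year, about ${dollars:,.0f}")
    # prints: 1,095,000 gallons/year, about $3,285,000
```

At roughly 1.1 million gallons a year, even modest fuel prices put the savings in the seven-figure range.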
The CTO had some simple advice for anyone facing similar data quality problems in the future.
“Keep it simple,” Leonard said. “Don’t use a sledgehammer to hammer that nail.”