In a recently published report on extract, transform and load (ETL) tools, Philip Howard, research director with Northamptonshire, U.K.-based Bloor Research International Ltd., evaluated 40 products. The report looked at tools -- with or without the ETL label -- concerned with data movement, migration, synchronization and transformation, Howard said.
"Some of those were not really what you'd call ETL tools, but products you could use for ETL in a pinch," Howard said. "I wasn't limiting myself. I covered all sorts of data movement."
The term ETL once referred mainly to batch-based data warehouse processes, he said, but today these tools are used to manage data within a variety of systems and applications. The report classified the tools by scale, sorting them into five groups that range from quick departmental jobs to enterprise-level deployments. The evaluation criteria included stability, support, fitness for purpose, architecture, performance, and ease of implementation and use. Howard was surprised at some of the trends he discovered.
Open source tools were more plentiful than expected, and Howard evaluated four in the report: Enhydra Octopus, CloverETL, and two different tools with similar names -- Kinetic Networks Inc.'s KETL tools and the Kettle project, which was recently acquired by open source business intelligence maker Pentaho Corp., based in Orlando, Fla.
There were also more tools aimed at developers, Howard said. In recent history, many ETL tool vendors focused on visual interfaces intended for less technical users. Developers have typically hand-coded data movement functions for small projects, he said, but with compliance and data quality concerns prevalent, the pendulum is swinging back toward the use of ETL-type tools for code generation.
"In the old days, when the ETL tools first appeared, a lot of them were code generators. Then things really moved to a black box environment. There seems to be a resurgence in code-generating solutions. There's a move away from hand coding and, outside of the Microsoft community, a big interest in Java and generating data movement which you can build into your own application," Howard said.
Other trends were less surprising, such as more support for moving and transforming unstructured data found in documents or Web pages, Howard said. ETL vendors are also incorporating data-quality functions, through development and through acquisitions such as Redwood City, Calif.-based Informatica Corp.'s purchase of Dublin data-quality vendor Similarity Systems and San Jose, Calif., and Paris-based Business Objects SA's purchase of La Crosse, Wis.-based FirstLogic.
"[The ETL market] is changing. Historically, it was just ETL, but the leading vendors have all merged in data-quality capabilities and are also extending out to include EII [enterprise information integration] as well," Howard said.
Of the 40 tools evaluated, the overall winner was Informatica, according to Howard. Other leaders were Microsoft's and Oracle's ETL tools, which are "much better than they used to be," he added. Also of note were IBM, with its Ascential-based tools, and Burlington, Mass.-based Sunopsis Inc., both of which fared well in Howard's evaluations. Business intelligence vendors such as Cary, N.C.-based SAS Institute and Business Objects also offer ETL tools, Howard said, though the latter is having more success with adoption outside its existing customer base.
ETL vendors will continue to play a leading role in data integration, Howard believes.
"I suspect that it's likely that some of the ETL vendors will start to play in the MDM [master data management] space as well," Howard predicted. "The bigger players are really broadening out their platforms."