ETL tools: What you do and don't want
What should I look for in extract, transform and load (ETL) tools? What should I avoid?

    Requires Free Membership to View

    When you register, you'll begin receiving targeted emails from my team of award-winning writers. Our goal is to keep you informed on the hottest data and information management trends today.

    Hannah Smalltree, Editorial Director

    By submitting your registration information to SearchDataManagement.com you agree to receive email communications from TechTarget and TechTarget partners. We encourage you to read our Privacy Policy which contains important disclosures about how we collect and use your registration and other information. If you reside outside of the United States, by submitting this registration information you consent to having your personal data transferred to and processed in the United States. Your use of SearchDataManagement.com is governed by our Terms of Use. You may contact us at webmaster@TechTarget.com.

Ahh. ETL tools. Can't you just do it all with COBOL? (Insert laugh track here.)

Seriously, there are some core functions that most ETL tool vendors offer today. These include:

  • The ability to deal with multiple input data formats (such as flat files, SQL databases, SAS data sets, etc.)
  • Support of a GUI interface that allows the "drag and drop" of data elements from source to destination
  • Built-in data profiling and analysis functionality allowing the examination of source data
  • Built-in data transformation functions (such as data type conversion, date reformatting, etc.)
  • Support for data quality/cleansing functionality (either built-in, or via interfaces to popular data quality tools like Group 1, DataFlux, or Business Objects)
  • Metadata support, allowing for the capture and identification of data transformation and business rules
  • Job scheduling functionality
  • Version control of the ETL maps
  • The ability to manipulate data either inside or outside the target database (aka: ETL versus ELT)

What you should avoid

My ETL guys suggest staying away from code-generating ETL tools, which are difficult to troubleshoot and de-bug. They also recommend steering clear of ETL tools that don't provide error tracking and logging functionality.

And a final thought: ETL isn't just for breakfast anymore. What I mean by this is that ETL isn't exclusive to data warehouse projects, but can solve a host of data movement and transformation problems. Best of luck with your tool selection!

For more information on ETL tools

This was first published in June 2007