EXPERT RESPONSE
Ahh. ETL tools. Can't you just do it all with COBOL? (Insert laugh track here.)
Seriously, there are some core functions that most ETL tool vendors offer today. These include:
- The ability to deal with multiple input data formats (such as flat files, SQL databases, SAS data sets, etc.)
- Support of a GUI interface that allows the "drag and drop" of data elements from source to destination
- Built-in data profiling and analysis functionality allowing the examination of source data
- Built-in data transformation functions (such as data type conversion, date reformatting, etc.)
- Support for data quality/cleansing functionality (either built-in, or via interfaces to popular data quality tools like Group 1, DataFlux, or Business Objects)
- Metadata support, allowing for the capture and identification of data transformation and business rules
- Job scheduling functionality
- Version control of the ETL maps
- The ability to manipulate data either inside or outside the target database (aka: ETL versus ELT)
What you should avoid
My ETL guys suggest staying away from code-generating ETL tools, which are difficult to troubleshoot and de-bug. They also recommend steering clear of ETL tools that don't provide error tracking and logging functionality.
And a final thought: ETL isn't just for breakfast anymore. What I mean by this is that ETL isn't exclusive to data warehouse projects, but can solve a host of data movement and transformation problems. Best of luck with your tool selection!
For more information on ETL tools
|