A slow ETL process: Is there a 'best' ETL tool for a SYBASE IQ database?
Are there other ETL tools (for example: Informatica, DATAStage) that can be integrated with SYBASE IQ to produce results comparable to working with ORACLE or DB2? Which ETL tool might best support SYBASE IQ database?
There may not be a cut-and-dry answer to this one. First, one needs to figure out the bottleneck—the slowest step—in their processing. What's causing the long processing times? The actual ETL steps, if not optimized from all angles (including tool specific behavior), can cause the slowdown irrespective of what tool is chosen. In order to make a realistic recommendation, we'd need more in-depth details about the processing, benchmarks at each processing step, specific platforms, and network info- among other things.
Having said that, all major ETL tools (including non-SAS tools such as Informatica and Ascential DataStage) have "native" drivers to Sybase (similar to all other major databases) as compared to say ODBC, JDBC, etc. The native drivers are the best possible interfaces to the database from ETL tool. Therefore, it's not the ETL-database driver interface that may be slowing the process down, which is really the heart of the question: Which ETL tool performs better with Sybase?
In conclusion, we need to look at the ETL process, source-target platforms, and complexity and then actually and systematically gather benchmarks at each step to identify the root cause of the performance degradation. Until then, it might be premature to examine alternative toolsets.
Jill Dyche, partner and co-founder of Baseline Consulting, adds her expertise to this answer:
Following up on the question about Sybase data loading, I reached out to my friends at SAS. Gary Mehler, Manager of Enterprise Data Integration, suggests a paper from the SAS Users Group International (SUGI) meeting. that covers the issue of database access throughput. Gary directs the reader to the section entitled "Multi-Row Writes" on Page 3. Configuring an "INSERTBUFF" option to a number much larger than the default (1) based on transaction-ality requirements should greatly improve throughput in cases where a million rows are being written at a time. It might be worth doing some research and tuning before turning toward alternative approaches.