Q

A slow ETL process: Is there a 'best' ETL tool for a SYBASE IQ database?

We are trying to build our extract, transform, load (ETL) process using SAS Data Integration Studio and SYBASE IQ 12.7 as the DBMS. However, the ETL process is unacceptably slow (around nine to 10 hours per 1,000,000 records).

Are there other ETL tools (for example: Informatica, DATAStage) that can be integrated with SYBASE IQ to produce results comparable to working with ORACLE or DB2? Which ETL tool might best support SYBASE IQ database?

Shravan Miriyala is a senior consultant for Baseline Consulting. His specialty is data quality and master data management. Here's what Shravan says about the Sybase loading conundrum:

There may not be a cut-and-dry answer to this one. First, one needs to figure out the bottleneck—the slowest step—in their processing. What's causing the long processing times? The actual ETL steps, if not optimized from all angles (including tool specific behavior), can cause the slowdown irrespective of what tool is chosen. In order to make a realistic recommendation, we'd need more in-depth details about the processing, benchmarks at each processing step, specific platforms, and network info- among other things.

Having said that, all major ETL tools (including non-SAS tools such as Informatica and Ascential DataStage) have "native" drivers to Sybase (similar to all other major databases) as compared to say ODBC, JDBC, etc. The native drivers are the best possible interfaces to the database from ETL tool. Therefore, it's not the ETL-database driver interface that may be slowing the process down, which is really the heart of the question: Which ETL tool performs better with Sybase?

In conclusion, we need to look at the ETL process, source-target platforms, and complexity and then actually and systematically gather benchmarks at each step to identify the root cause of the performance degradation. Until then, it might be premature to examine alternative toolsets.

Jill Dyche, partner and co-founder of Baseline Consulting, adds her expertise to this answer:

Following up on the question about Sybase data loading, I reached out to my friends at SAS. Gary Mehler, Manager of Enterprise Data Integration, suggests a paper from the SAS Users Group International (SUGI) meeting. that covers the issue of database access throughput. Gary directs the reader to the section entitled "Multi-Row Writes" on Page 3. Configuring an "INSERTBUFF" option to a number much larger than the default (1) based on transaction-ality requirements should greatly improve throughput in cases where a million rows are being written at a time. It might be worth doing some research and tuning before turning toward alternative approaches.

This was first published in January 2007

Dig deeper on Extract transform load tools

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchAWS

SearchContentManagement

SearchOracle

SearchSAP

SearchSOA

SearchSQLServer

Close