Workload management tools key to running busy data warehouses

Jeff Kelly

In this section of the Data Warehouse Buyer's Guide, readers will learn how to use workload management (WLM) tools to get their data warehouses to run more efficiently.

Data Warehouse Buyer's Guide Table of Contents:

    Requires Free Membership to View

Selecting the right data warehouse platform for your organization
How to evaluate data warehouse software in five steps
Commodity hardware aiding data warehouse appliance performance, costs
Why you should build a data warehouse
Workload management tools key to running busy data warehouses

Part one of two.

Nobody ever said a data warehouse's job was an easy one.

Between loading data in real time, producing multiple daily reports and returning answers to deep-dive ad hoc analytic queries, today's data warehouses are being asked to handle more tasks from more users against more data than ever before.

But there's only so much disk space and CPU to go around, meaning that bottlenecks can occur and performance can slow to a halt.

"That's one of those things the user feels out on their desktop and it is obviously very important," said Chris Stewart, a senior architect with Premier, a healthcare firm in San Diego. Stewart said Premier has been working hard to keep up with the increasing demands put on its data warehouse.

"People expect a very quick response [from the data warehouse] every time," Stewart said.

So how can database administrators and architects like Stewart hope to keep enterprise data warehouses firing on all cylinders? That's where workload management (WLM) comes in, and it's more critical now than ever.

"There're a lot of different workloads that an enterprise data warehouse has to handle these days," said Jim Kobielus, an analyst with Cambridge, Mass.-based Forrester Research. "Workload management is what DBAs use day-in and day-out to keep this thing humming."

Workload management is what DBAs use day-in and day-out to keep this thing humming.

Jim Kobielus, analyst, Forrester Research

WLM refers to a set of capabilities built into the data warehouse that allow administrators to prioritize particular types of tasks -- taking into account who is making the requests and how time-sensitive the jobs are -- and allocate the finite amount of computing power available to complete them.

WLM is practically carried out by DBAs through monitoring tools that let them track how a data warehouse is performing and make adjustments accordingly. If a query from the CEO is being held up due to heavy usage, for example, a DBA could postpone other jobs and allow the CEO query to pass through.

"Think of workload management as a traffic cop," Kobielus said. Without it, there'd be chaos.

More data and more users tax data warehouses
Ten years ago, most data warehouses were only expected to produce a handful of reports, load data in batch form just daily or even weekly, and handle the occasional ad hoc query from a small pool of users.

Today, sophisticated data integration tools have made it possible for data warehouses to integrate data in near real-time in addition to the more common batch loading. More users can access data warehouses thanks to easy-to-use front-end business intelligence tools. And the volume of data entering and exiting data warehouses has skyrocketed and shows no sign of letting up.

The result is that WLM has grown in importance, according to Donald Feinberg, an analyst with Stamford, Conn.-based Gartner. In fact, in its latest Magic Quadrant report for data warehousing, Gartner declared that mixed workload performance is and "will remain the single most important performance issue in data warehousing."

The good news for customers is that almost all of the data warehouse vendors have some form of WLM capabilities, though some are better than others, Feinberg said. He singled out Teradata as having the most sophisticated, comprehensive WLM tools among the data warehouse vendors, though he said Oracle, IBM and Microsoft are not that far behind.

Large data warehouse vendors offer sophisticated workload management capabilities
It is not surprising, though, that the vendors with the most sophisticated WLM capabilities are also the ones with among the largest enterprise customers. WLM is most critical in environments where one large, central enterprise data warehouse is accessed by users throughout a large company.

In this scenario, the warehouse is expected to deal with far more traffic and queries than multiple departmental data warehouses and data marts would handle in a more distributed environment.

"The more centralized you are, the more WLM is critical to your success," said Randy Lea, head of products and services marketing at Teradata.

Teradata takes a two-pronged approach to WLM, according to Lea. First, the vendor's Teradata Active Systems Management (TASM) application allows customers to implement workload prioritizations based on predetermined business rules.

For instance, many Teradata customers give a higher priority to queries from senior-level executives than to queries from other staff. Others allocate more bandwidth to specific departments at specific times, depending on usage patterns. If the marketing department, for example, makes a lot of complex queries between 8 am and 10 am, it could be given higher priority during that two-hour period.

Alternatively, data warehouse jobs can be given priority based on size and complexity. Larger queries that need to run against large data sets can be set aside during peak usage, for example, to avoid slowing down the entire system.

"Not every query is of equal importance," Lea said.

TASM also allows DBAs to quickly make changes to the prioritization scheme so that bottlenecks, should they occur, can be mitigated.

But even smaller data warehouse vendors offer WLM. Aster Data, for one, "has exceptionally strong workload management for managing MapReduce jobs," Kobielus said.

Netezza also offers fully functional WLM capabilities, according to Phil Francisco, vice president of marketing at the Marlborough, Mass.-based data warehouse appliance vendor.

In addition to preset and on-the-fly workload optimization capabilities, Netezza gives customers the option of carving out a small slice of CPU that can be used to quickly handle short ad hoc queries when they occur.

Short query bias, as Netezza calls it, allows for shorter queries "to get in and get out of the system rapidly even though I may have all of these other things going on the machine," Francisco explained. "It's sort of like the express checkout lane at the grocery store."

Read why some users don't think WLM tools are doing a good enough job here.

Data Warehouse Buyer's Guide Table of Contents:
Selecting the right data warehouse platform for your organization
How to evaluate data warehouse software in five steps
Commodity hardware aiding data warehouse appliance performance, costs
Why you should build a data warehouse
Workload management tools key to running busy data warehouses

There are Comments. Add yours.

TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: