Greenplum brings data warehousing in the cloud indoors

By Jeff Kelly, News Editor
24 Jun 2009 | SearchDataManagement.com

You don't have to explain data mart proliferation to Clint Johnson. As vice president of data warehousing and business intelligence at Zion Bancorporation, Johnson is all too familiar with the concept.

When so-called power users at Zion wanted to do complex data analysis that wasn't supported by the bank's business intelligence (BI) applications, such as predictive modeling for customer retention, Johnson would extract the data from source systems and databases and let power users create their own data marts.

The result was departmental data marts everywhere, largely out of the control of IT, Johnson said. Not only were the data marts and the hardware required to run them difficult to manage, but the marts were disconnected from one another, meaning different groups and departments were often working with different sets of data.

But rather than trying to stop or even slow data mart sprawl, a task he conceded was almost impossible, Johnson decided to take a different approach with a new private cloud computing initiative. Zion is now in the process of implementing Greenplum's Enterprise Data Cloud (EDC), essentially a virtual infrastructure of commodity hardware upon which the bank will deploy and run Greenplum's data warehouse platform.

Once the system is operational, Zion workers will be able to create and dismantle data marts on an internal or private cloud, as it's called, via a self-service portal, as demand dictates. Irrespective of which department they work in, they will be able to tap into the same, consistent data sources on the internal cloud, where IT will also have a one-stop shop for managing the platform. No more data marts living in the shadows, as Johnson puts it.

The idea behind EDC, which was announced earlier this month, is to bring the elasticity of public cloud computing inside the corporate firewall, allowing easy self-service provision of data warehouses and data marts, as well as centralized manageability, said Ben Werther, director of product management at Greenplum.

The more conventional method of data warehouse deployment and management – deploying one large data warehouse backed up by weeks and months of data modeling and cleansing – is too rigid for today's workers, who want fast and flexible access to data for BI and other analytics, according to Werther. In that environment, rather than waiting for IT to create a data mart, workers often take the initiative and create their own, he said. That leads to data mart proliferation, an unmanageable situation for IT.

With EDC, customers get the best of the public cloud model (easy provisioning of data marts, unfettered access to data sources, and centralized management of the cloud itself) without the risk of letting sensitive corporate data outside the firewall, Werther said. And it is economically possible, he said, because, unlike many competitors' offerings, Greenplum's database runs on cheaper commodity hardware rather than proprietary hardware.

"Hardware is very cheap now. You can buy 1,000 cores of servers for under $1 million, much less than buying a Teradata machine," Werther said. Eventually, data warehousing on internal clouds "is going to become, we think, the way of doing data warehousing."

James Kobielus, an analyst with Cambridge, Mass.-based Forrester Research, tends to agree with that assessment. He thinks data warehousing in the cloud, or virtual data warehousing, as he also calls it, is the wave of the future.

"Data warehousing is increasingly moving away from being a discipline with a focus on centralized analytic databases or a single physical node [or enterprise data warehouse] to a more virtualized data warehousing ecosystem, or cloud," Kobielus said. "It's a highly, massively parallel grid of nodes that collectively manage multiple data analytics instances."

Some nodes can focus on data integration functions like extract, transform and load (ETL), others on data cleansing, others on provisioning new data marts, he added. "The idea is that it is very flexible."

Greenplum is in a particularly good position to push the cloud deployment model, Kobielus said. In addition to running on commodity hardware (unlike competitors such as HP's NeoView and Vertica's Analytic Database, for example), Greenplum's database uses massively parallel processing to query large data sets simultaneously, a prerequisite for a virtualized, distributed environment like the cloud.

But EDC will not have the immediate effect of giving IT one internal cloud to manage, he cautioned, because customers will have to standardize on Greenplum. Most organizations operate in heterogeneous environments, with data warehouses and data marts from multiple vendors. "That's just a reality," Kobielus said.

And, while Greenplum may have a head start on the field, other data warehouse vendors are likely to join it in the market for data warehousing in the cloud. Just this week, for example, IBM announced its Smart Business cloud portfolio, which will let customers run integrated software and applications in either a public or private cloud, both supported by Big Blue.

While IBM's new cloud portfolio does not currently support data warehouse deployments, "we are definitely looking at the opportunity to deliver those types of services in the cloud" as customer demand increases, said Dennis Quan, director of autonomic computing at IBM.

Microsoft may also be a candidate to eventually begin offering data warehousing on internal clouds. It already offers, for example, data warehousing in the public cloud, built on SQL Data Services and on its own cloud platform, Azure, which debuted last year.

Nevertheless, "the cloud model is still in an embryonic stage" when it comes to data warehousing, Forrester's Kobielus said, and it could be a year or more before it truly begins to mature. "Few vendors have put together a coherent story going forward to help the industry and help users jump to the next plateau of development of data warehousing in a purely cloud environment."

But that assessment hasn't stopped Zion Bancorporation, where Johnson is counting on EDC to reduce maintenance and support costs caused by proliferating data marts and to break down barriers between groups and departments, giving all workers a unified way to find, access and analyze corporate data.

With implementation under way, the bank hopes to have EDC fully deployed by the end of the year. Then, Johnson said, as many as 50 of Zion's "most seasoned analysts" will begin accessing around 4 terabytes of data in the internal cloud and creating manageable data marts of their own.

"We're going to give direct database access to end users and the ability to upload their own data and create their own data warehouses," Johnson said. "It's to give them a place [the private cloud] to work where they can do complicated things without having to remove the data."


