A buyer's guide to selecting the best data warehouse product
A collection of articles that takes you from defining technology needs to purchasing options
Amazon Redshift is a hosted, large-scale data warehouse-as-a-service offering. Designed for analytic workloads, the cloud data warehouse works with an organization's existing business intelligence tools for data analysis.
Amazon Redshift delivers database features geared for analytic queries. This sets it apart from the company's other database management system (DBMS) technologies: the relational Amazon RDS and the nonrelational SimpleDB and DynamoDB.
Amazon Redshift cloud data warehouse features
Redshift offers massively parallel processing (MPP) that's built on a column-oriented DBMS. The infrastructure enables fast query processing using parallelized queries across multiple nodes. In addition to columnar storage, Amazon Redshift deploys data compression and zone maps to reduce the amount of I/O needed to perform queries.
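To see why zone maps reduce I/O, consider the minimal sketch below. It keeps the minimum and maximum value for each fixed-size block of a single column and skips any block whose range cannot satisfy the filter. The block size, function names and sample data are assumptions made for illustration; this is not Redshift's internal implementation.

```python
from typing import List, Tuple

BLOCK_SIZE = 4  # assumed tiny block size, for illustration only


def build_zone_map(column: List[int], block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Record the (min, max) value of each block of a column."""
    zone_map = []
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        zone_map.append((min(block), max(block)))
    return zone_map


def scan_equal(column: List[int], zone_map: List[Tuple[int, int]], target: int,
               block_size: int = BLOCK_SIZE) -> Tuple[List[int], int]:
    """Return (matching row indexes, number of blocks actually read)."""
    matches = []
    blocks_read = 0
    for block_no, (lo, hi) in enumerate(zone_map):
        if target < lo or target > hi:
            continue  # the block cannot contain the target, so its I/O is skipped
        blocks_read += 1
        start = block_no * block_size
        for offset, value in enumerate(column[start:start + block_size]):
            if value == target:
                matches.append(start + offset)
    return matches, blocks_read


# Hypothetical column of order years, already sorted
order_year = [2011, 2011, 2012, 2012, 2013, 2013, 2013, 2014, 2015, 2015, 2016, 2016]
zone_map = build_zone_map(order_year)
rows, blocks_read = scan_equal(order_year, zone_map, 2013)
print(rows, blocks_read, len(zone_map))
```

In this example only one of the three blocks has to be read. The saving depends on the data being sorted or clustered on the filtered column; on randomly ordered data most block ranges would overlap the target and few blocks could be skipped.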
Amazon Redshift is built on hardware designed for high-performance data processing. Locally attached storage is used to maximize throughput between the CPUs and drives, and a 10 Gigabit Ethernet mesh network provides high-speed throughput between nodes.
The architecture of Amazon Redshift enables automation of many common administrative tasks, including provisioning, configuring and monitoring a data warehouse in the cloud. Data backup to Amazon Simple Storage Service (S3) is continuous, incremental and automatic. Data can be restored from those S3 backups, and disaster recovery across regions is supported.
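The idea behind an incremental backup can be sketched in a few lines: only blocks that are new or have changed since the previous run are copied. The block identifiers, the digest index and the function names below are hypothetical, and the sketch does not describe how Redshift itself tracks changes.

```python
import hashlib
from typing import Dict


def block_digest(data: bytes) -> str:
    """Fingerprint a block so that changes can be detected."""
    return hashlib.sha256(data).hexdigest()


def incremental_backup(blocks: Dict[str, bytes], backed_up: Dict[str, str]) -> Dict[str, bytes]:
    """Return only new or changed blocks and update the digest index in place."""
    to_upload = {}
    for block_id, data in blocks.items():
        digest = block_digest(data)
        if backed_up.get(block_id) != digest:
            to_upload[block_id] = data      # would be copied to backup storage
            backed_up[block_id] = digest    # remember what has been backed up
    return to_upload


index: Dict[str, str] = {}
first = incremental_backup({"b1": b"alpha", "b2": b"beta"}, index)
second = incremental_backup({"b1": b"alpha", "b2": b"beta v2", "b3": b"gamma"}, index)
print(sorted(first), sorted(second))
```

The first run copies every block; the second copies only the changed block and the new one, which is what keeps repeated backups cheap.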
Amazon Redshift offers built-in enterprise security, with features such as data encryption at rest and in transit using hardware-accelerated Advanced Encryption Standard (AES)-256 and the Secure Sockets Layer (SSL) protocol, key management using AWS Key Management Service, and cluster isolation using Amazon Virtual Private Cloud.
The AWS Management Console or an API call is used to scale the Amazon Redshift cloud data warehouse. Users can readily modify the type and number of nodes in the data warehouse to adapt to their changing requirements. Dense Storage nodes allow users to create very large data warehouses using hard disk drives at a low price point. Dense Compute nodes can be configured to create high-performance data warehouses with high-speed CPUs, large amounts of RAM and solid-state disks.
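As a hedged illustration of scaling through an API call, the sketch below builds the arguments for the `modify_cluster` operation of the boto3 Redshift client. The cluster identifier and node choices are hypothetical, and the function that actually sends the request assumes boto3 is installed and AWS credentials are configured; it is not called here.

```python
from typing import Any, Dict


def resize_request(cluster_id: str, node_type: str, number_of_nodes: int) -> Dict[str, Any]:
    """Build the keyword arguments for a resize request."""
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    params: Dict[str, Any] = {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,
        "ClusterType": "single-node" if number_of_nodes == 1 else "multi-node",
    }
    if number_of_nodes > 1:
        # NumberOfNodes only applies to multi-node clusters
        params["NumberOfNodes"] = number_of_nodes
    return params


def apply_resize(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request; needs boto3 and valid AWS credentials (assumed)."""
    import boto3  # imported here so the rest of the sketch runs without it

    client = boto3.client("redshift")
    return client.modify_cluster(**params)


# "example-cluster" is a hypothetical cluster identifier
params = resize_request("example-cluster", "ds2.8xlarge", 4)
print(params)
```

Keeping the request builder separate from the network call makes it possible to review or log a planned change before anything is sent to AWS.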
Because Redshift is delivered and managed in the cloud, users must have an Amazon Web Services account. Amazon Redshift handles connections from other applications using standard Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) drivers. After setting up your AWS account and drivers, configure your firewall rules and you're ready to launch an Amazon Redshift cluster.
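Once a cluster is running, a JDBC client needs a connection URL. The sketch below assembles one in the `jdbc:redshift://endpoint:port/database` form used by Amazon's Redshift JDBC driver, with 5439 as the default port. The endpoint host name and database name are placeholders, not real values.

```python
DEFAULT_REDSHIFT_PORT = 5439  # Redshift's default listening port


def jdbc_url(endpoint: str, database: str, port: int = DEFAULT_REDSHIFT_PORT) -> str:
    """Assemble a JDBC connection URL for a Redshift cluster."""
    if not endpoint or not database:
        raise ValueError("endpoint and database are both required")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:redshift://{endpoint}:{port}/{database}"


# Hypothetical endpoint and database name
url = jdbc_url("example-cluster.abc123xyz789.us-west-2.redshift.amazonaws.com", "dev")
print(url)
```

The firewall rule mentioned above has to allow inbound traffic on whichever port the URL uses.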
Amazon documents new releases of Amazon Redshift by date on its website.
Amazon Redshift licensing, pricing and support
A free two-month trial of Amazon Redshift is available, after which pricing starts as low as $0.25 per hour with no long-term commitments. Users can scale up to petabyte-sized data warehouses, at a cost of $1,000 per terabyte per year. Your raw data will be compressed on Amazon Redshift, typically by about three times, which can cut storage costs to roughly one-third of what the same data would cost uncompressed.
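The arithmetic behind these figures is easy to check. The sketch below uses only the prices quoted above and assumes that the $1,000 figure applies to compressed (stored) data, that a year has 365 days and that a node runs continuously; actual bills depend on node type, region and purchase terms.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year of continuous use


def yearly_on_demand_cost(hourly_rate: float, nodes: int = 1) -> float:
    """Cost of running a number of nodes around the clock for one year."""
    return hourly_rate * HOURS_PER_YEAR * nodes


def cost_per_raw_terabyte(price_per_stored_tb: float, compression_ratio: float) -> float:
    """Effective yearly price of one terabyte of raw data after compression."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return price_per_stored_tb / compression_ratio


entry_level = yearly_on_demand_cost(0.25)          # one node at $0.25 per hour
raw_tb_price = cost_per_raw_terabyte(1000.0, 3.0)  # threefold compression
print(f"${entry_level:,.2f} per year, ${raw_tb_price:,.2f} per raw TB per year")
```

Under these assumptions the entry-level node costs $2,190 for a full year, and threefold compression brings the effective price of a raw terabyte down to about $333 per year.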
Clients can start with as little as a single 160 GB DC1.Large node and scale up to a petabyte or more of compressed user data using 16 TB DS2.8XLarge nodes. While resizing, Amazon Redshift places your existing cluster into read-only mode, provisions a new cluster of your chosen size and then copies data from your old cluster to your new one. You can continue running queries against your old cluster while the new one is being provisioned.
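For a rough sense of cluster size, the sketch below estimates how many nodes a given volume of compressed data needs, using the per-node capacities quoted above. It assumes 1 PB equals 1,000 TB and ignores headroom for temporary space, replication and any service limits on cluster size, so treat the result as a lower bound.

```python
import math


def nodes_needed(compressed_tb: float, tb_per_node: float) -> int:
    """Smallest node count whose combined capacity holds the compressed data."""
    if compressed_tb < 0 or tb_per_node <= 0:
        raise ValueError("compressed_tb must be >= 0 and tb_per_node must be > 0")
    return max(1, math.ceil(compressed_tb / tb_per_node))


petabyte_cluster = nodes_needed(1000, 16)   # 1 PB on 16 TB nodes
starter_cluster = nodes_needed(0.1, 0.16)   # 100 GB on a 160 GB node
print(petabyte_cluster, starter_cluster)
```

Under these assumptions a petabyte of compressed data needs at least 63 of the 16 TB nodes, while 100 GB fits on a single 160 GB node.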
There are four tiers of support for Amazon Redshift. Each tier provides an unlimited number of support cases with pay-by-the-month pricing and no long-term contracts:
- Basic provides 24/7 customer service and includes access to the resource center, service health dashboard, product FAQs, discussion forums and support for health checks at no additional charge.
- Developer adds best-practice guidance and a guaranteed response time to incidents of less than 12 hours.
- Business adds API support and a guaranteed response time to incidents of less than one hour.
- Enterprise adds direct access to a technical account manager, infrastructure event management and a guaranteed response time to incidents of less than 15 minutes.