
Snowflake cloud data warehouse targets gaps big clouds overlook

Cloud data warehouse offerings from smaller vendors seek to address functionality gaps that bigger players may miss. Newcomer Snowflake Computing targets concurrent queries, for example.

While big cloud players like Amazon Web Services and Microsoft gain attention for their cloud data warehouse offerings, smaller players are looking to compete as well. As in the past, their aim is to find gaps in data services: user needs that the bigger players may overlook.

Along with Cazena, Treasure Data Inc. and others, Snowflake Computing Inc. wants to tailor software services for companies moving at least some on-premises data warehouses to the cloud. For its part, Snowflake wants to address querying issues that may hamper more general-purpose cloud data warehouse offerings.

With a recent release of its Snowflake Elastic Data Warehouse, the company seeks to outdo the bigger players' offerings on the speed and scalability of queries against the cloud data warehouse. According to Snowflake, that means providing multi-cluster warehousing that better supports users' queries during periods of heavy use, and adaptive query result caching that tunes often-repeated queries to ensure high performance for reports and the like.
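
The general idea behind query result caching can be sketched in a few lines of Python. This is only an illustration of the concept, not Snowflake's implementation: the `ResultCache` class and the `execute` callback are hypothetical names, and a real warehouse would also have to invalidate cached results when the underlying data changes.

```python
class ResultCache:
    """Toy result cache: repeated queries are answered without re-execution."""

    def __init__(self, execute):
        # `execute` is a hypothetical callback that runs SQL and returns rows.
        self._execute = execute
        self._results = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql):
        key = sql.strip()  # identical query text maps to the same cache entry
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        result = self._execute(sql)
        self._results[key] = result
        return result

    def invalidate(self):
        # Call when the underlying tables change, so stale rows are not served.
        self._results.clear()


# Stand-in for a slow warehouse query; records every real execution.
executed = []

def slow_execute(sql):
    executed.append(sql)
    return [("page_views", 42)]

cache = ResultCache(slow_execute)
first = cache.query("SELECT metric, value FROM report")
second = cache.query("SELECT metric, value FROM report")
```

In this sketch the second call is served from memory, which is the effect a report that reruns the same query many times a day would benefit from.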

Focus on concurrent queries

Snowflake has worked out how to not just be fast in the cloud, but fast across a group of users, according to John Myers, managing research director at Enterprise Management Associates Inc. Myers pointed to that as a potential selling point for Snowflake versus Amazon's market-leading Redshift cloud data warehouse.

"What they are trying to do is address some of the issues people face with Redshift," he said. "There are some issues there when you get into concurrent use and lots of people are hitting on the platform at the same time."

Myers said users who are new to the Amazon cloud often encounter obstacles as they move from a prototype or sandbox implementation of a cloud data warehouse to a fully utilized one.

"It works in the sandbox, but then when it starts to act on a lot of queries for a wider breadth of people, the performance of Redshift may slow down," he said. "It gets grumpy." Companies like Snowflake want to fill that gap, he said.

Myers cited pricing as another aspect of Amazon Redshift that can catch users by surprise. As they put more data on the cloud, and scale up from prototype to full-fledged implementation, "they find, if not hidden costs, then unanticipated ones," he said.

Amazon's bundled pricing for compute and storage can create issues as data warehouses scale up, according to Myers and others. Snowflake has tried to address that with a pricing model that, in the company's words, is aligned toward "pay-for-what-you-use." An approach based on unbundling compute and storage costs was also seen in the formal release of Microsoft's Azure SQL Data Warehouse last month.
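
A rough cost sketch shows why bundling can matter as a warehouse grows. All of the prices, node sizes and function names below are hypothetical, chosen only for illustration; they are not Amazon's, Microsoft's or Snowflake's actual rates.

```python
import math

# Hypothetical figures for illustration only; not actual vendor pricing.
NODE_STORAGE_TB = 2.0             # storage that comes with each bundled node
NODE_PRICE_PER_HOUR = 1.00        # price of one bundled node (compute + storage)
STORAGE_PRICE_PER_TB_HOUR = 0.05  # unbundled storage price
COMPUTE_PRICE_PER_HOUR = 0.80     # unbundled price of one compute unit


def bundled_cost(data_tb, compute_units_needed, hours):
    """Nodes must cover both storage and compute, so the larger need wins."""
    nodes_for_storage = math.ceil(data_tb / NODE_STORAGE_TB)
    nodes = max(nodes_for_storage, compute_units_needed)
    return nodes * NODE_PRICE_PER_HOUR * hours


def unbundled_cost(data_tb, compute_units_needed, storage_hours, compute_hours):
    """Storage is billed around the clock; compute only while queries run."""
    storage = data_tb * STORAGE_PRICE_PER_TB_HOUR * storage_hours
    compute = compute_units_needed * COMPUTE_PRICE_PER_HOUR * compute_hours
    return storage + compute


# 20 TB of data, but only 2 compute units needed 8 hours a day for 30 days.
bundled = bundled_cost(20, 2, hours=24 * 30)
unbundled = unbundled_cost(20, 2, storage_hours=24 * 30, compute_hours=8 * 30)
print(f"bundled: ${bundled:,.2f}  unbundled: ${unbundled:,.2f}")
```

Under these assumed numbers, the bundled bill is several times larger because storage growth forces the purchase of nodes whose compute sits idle. With different assumptions, such as heavy round-the-clock querying, the gap would narrow.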

Managed data services in the cloud

Snowflake's ability to run as a managed service was an important point for Matt Solnit, CTO at SOASTA. The Mountain View, Calif., company, which provides cloud and web-based load testing and performance monitoring software and services, also uses Redshift but was looking for something more turnkey to handle some of its data warehousing processes.

"We wanted something we could use as a managed solution -- to be able to turn it on and go," Solnit said.

SOASTA's first use of the Snowflake Elastic Data Warehouse was as a special store for reports on unstructured data representing all the elements involved with customers' website page performance. A next step will be a near-real-time implementation.

"Our core competency is not about managing data. We've built scalable systems with PostgreSQL, Redshift and other tools. But we want to relieve ourselves of dealing with day-to-day managing of database systems," he said. "Just because you can do it doesn't mean you should."

He said his group is focusing its efforts on achieving world-class analytics and answering people's questions, rather than on building and managing a data warehouse.

Solnit said he had previously encountered issues with Amazon's compute-storage price policy. "You can end up paying for storage and for horsepower you don't need," he said. "Amazon's great in a lot of ways. But Redshift didn't map with some of our needs."

The search for products that map to users' needs will continue as more options become available. When Microsoft opened up its Azure SQL Data Warehouse for business last month, it was hailed for bringing competition to an area -- data warehouse on the cloud -- that seemed to have become very much the bailiwick of Amazon and Redshift.

While the players at the top of the cloud can be expected to continually update their warehouse tools, smaller players like Snowflake also bear watching.
