In this podcast Q&A, Rick Sherman, founder of consulting firm Athena IT Solutions and an adjunct professor at Northeastern University's Graduate School of Engineering, gives his take on the capabilities and limitations of cloud data warehouse services and cloud-based big data management platforms.
Sherman said cloud data warehousing technologies are relatively new, but vendors both large and small have now made them available. "It reminds me a lot of when we had the first wave of data warehouse appliances, where we had a wide variety of vendors and a wide variety of architectures," he said. "We have a varied lot as far as underlying architectures go and an evolving set of capabilities in these [services] right now." Most of the vendors aren't new to data warehousing software, "but the cloud is a newer type of offering for them," he added.
Building a data warehouse can be "a very time-consuming and extensive implementation," Sherman noted. It can also be a difficult process to manage, especially for smaller businesses with limited resources. He pointed out that cloud-based data warehouse offerings bundle a set of data management services, and companies don't have to deploy and manage their own systems and storage infrastructure. "So, you're probably up and going a lot quicker," he said.
Sherman sees the cloud options as particularly useful for companies that are just starting out in data warehousing or looking to expand their warehousing architectures into new areas. For example, he said, one of the good opportunities for using a cloud data warehouse is to support special-purpose data marts and new types of business intelligence and analytics applications. For companies with an existing enterprise data warehouse infrastructure, transferring data into a cloud service might require more effort than is worthwhile, "especially if you have an extensive number of data sources and if there's a large volume [of data] on it."
In the interview with SearchDataManagement Executive Editor Craig Stedman, Sherman also said he thinks cloud-based Hadoop repositories and NoSQL database services will be the primary way that big data systems are implemented in the future. In-house big data implementations require a significant investment of skills, time and infrastructure, he said. Whether big data cloud services are ready now for enterprise uses depends on what users are looking to do with them, according to Sherman. Many of the companies he works with are still in the process of understanding their potential uses for Hadoop clusters and NoSQL software. But in the long run, he said, the cloud "is clearly the direction of where Hadoop and NoSQL databases are going to be for the vast majority" of organizations.