For companies like SumAll Inc., the cloud is the obvious place to put the large, varied and rapidly updated data sets they collect. But even cloud-based big data systems can be complex to set up.
SumAll, which was founded in 2011, is a marketing analytics services provider that helps small businesses identify Web trends, track website traffic and measure the success of their social media advertising and outreach efforts. The New York company processes data on about 30 million discrete online events a day, according to CIO Korey Lee -- and the volume is growing.
Lee said that eventually, he expects to build an internal systems architecture. But for now, he added, the kind of data traffic that SumAll has to handle makes it "almost prohibitively expensive over time to grow your number of servers as necessary. You want to focus on your business, not your infrastructure."
That's a common consideration for startups and emerging businesses looking at cloud-based big data implementations. "If you're a small company, the cost of putting in your own data center is something you have to think about," said Ray Ullmer, a senior consultant at data warehousing and analytics consultancy Waterloo Data in Austin, Texas.
SumAll decided to go to the public cloud. But it hasn't been a simple, straightforward path. Lee said the company shifted a deployment of NoSQL database MongoDB from Amazon Web Services to Rackspace US Inc.'s ObjectRocket platform "because Amazon's cost was too high." For querying and analyzing data, it also switched from a self-managed MySQL database running on AWS to BitYota's data-warehouse-as-a-service technology. BitYota's service runs on AWS as well, leaving SumAll with a multi-cloud environment, though relying on the managed service mitigates some of the complexities for Lee's team.