In building the big data future, architectural issues add up

Building a big data architecture is more complicated than setting up a data warehouse to support BI applications. But that creates new opportunities for data architects to shine.

Building the back-end systems that support business intelligence and analytics applications used to be relatively simple -- or at least straightforward. You'd set up a data warehouse and consolidate transaction data in it, then maybe spin out some data marts with subsets of the information for individual departments or groups of users. But as we move toward the big data future, things aren't so simple anymore. New technologies, such as Hadoop, stream processing systems and NoSQL databases, have entered the picture. Older ones -- columnar databases, in-memory processing tools -- have also become more prevalent in recent years, spurred partly by big data uses.

And there's no easy recipe for mixing all those technologies together with mainstream relational databases to create a big data architecture. William McKnight, president of McKnight Consulting Group in Plano, Texas, uses the term no-reference architecture to describe the current state of affairs. "Every company is different," he said in a video interview with SearchDataManagement in February 2014. "Gone are the days when a vendor or a consultant could walk into a shop with a laminated sheet of paper and say, 'This is what everybody needs to do.'"

Perhaps that's one reason why Gartner Inc. analyst Svetlana Sicular found twice as many data architect job listings as data scientist ones in a search for Hadoop-related positions in the New York area on the jobs site Dice.com, as detailed in an April 2014 blog post. Sicular added that inquiries from her clients had recently shifted to questions about "no-nonsense big data architecture, management and real-time use cases."

SearchDataManagement and its companion site, SearchBusinessAnalytics, have published a variety of content offering insight and advice to help organizations figure out the way forward on architecting a big data infrastructure. In his video interview, McKnight expands on the lack of uniformity in big data ecosystems. In another video Q&A, John Myers, an analyst at Enterprise Management Associates, discusses the mix of data management technologies being tapped to support big data applications. A case study looks at the deployment of a cloud-based big data platform at supermarket co-op Allegiance Retail Services, while another story delves into the clear-eyed thinking that's needed in evaluating and selecting big data technologies.

Writing as part of our BI Experts Panel, consultant Rick van der Lans examines the competitive importance of big data -- and of corporate execs who understand the technologies that can be used to exploit it. Also as part of the panel, consultants Claudia Imhoff and Colin White detail a proposed method for extending traditional data warehouse architectures to handle today's expanded data needs. But another panelist, Wayne Eckerson, says it's time to stop dissing the data warehouse -- according to Eckerson, it still has a key role to play in IT architectures, even in the bright, shiny big data future.

Craig Stedman is executive editor of SearchDataManagement. Email him at cstedman@techtarget.com and follow us on Twitter: @sDataManagement.

What's the most important thing a big data architect should do in designing a systems infrastructure to support big data applications?
The only way forward is to develop big data reference architectures that are customized to a sector, say public sector agencies, based on its use cases during the data processing cycle.