What constitutes big data is often a matter of individual choice. Different people have different ideas about what it is -- and what it isn't. And the lack of a common big data definition renders the term almost meaningless, according to Rick Sherman, founder of Athena IT Solutions, a BI, data warehousing and data management consultancy in Maynard, Mass.
In a video interview with SearchDataManagement, recorded at the 2014 TDWI Executive Summit in Boston, Sherman said that discussions about big data projects can quickly become confusing because of the lack of clarity about what the phrase really entails. "I cringe when I hear the term big data, because it's like we all are speaking a different language," he said.
To some people, big data refers specifically to Hadoop, the open source distributed processing framework developed by the Apache Software Foundation. Some big data definitions tie the phrase solely to semi-structured and unstructured data that doesn't fit well into mainstream relational databases -- for example, text from posts on social networks and sensor data captured from equipment connected to the Internet of Things. Others make room for large amounts of structured transaction data in the big data tent.
There are also the 3Vs of big data: volume, variety and velocity. That framework was an early stab at a standard big data definition, sometimes augmented by a fourth V -- variability, for example. But Sherman said it "also doesn't clearly define the term" in an understandable way. Watch the one-minute video to hear more of what he had to say about big data terminology and why it's hard to be sure of exactly what people mean when they use it.