Big data is an incredibly popular topic. Plenty of articles, books, and blogs have been written about it, and countless sessions have been presented that discuss some aspect of it. I am a big fan of big data, because organizations have been able to improve and extend their analytical capabilities through their big data systems, resulting in various business benefits.
But I am also aware that big data is sometimes overhyped. Take, for example, these statements: “Big data: A revolution that will transform how we live, work and think” and “Without big data, you are blind and deaf in the middle of a freeway.” These and many others are hyperbole. The danger is that big data is being oversold, which creates a false or skewed picture of big data and leads to false expectations.
Therefore, I decided to write a number of blogs on big data myths. Each myth is based on statements that repeatedly pop up in articles and sessions. I call them the big myths on big data.
The first big myth I want to address is that big data is the goal of a project or system, which is how it is sometimes presented. Wrong: the goal is never to develop a big, fat database. Can you imagine a meeting in an IT department that ends with a statement like: “OK guys, let’s see if we can develop the biggest database on the planet”? This is not how it works in real life. That can never be a goal.
Nor is big data ever a question. Users never call the IT department with a question such as: “Please, can you develop a big database for me?”
If big data is anything, it’s an answer or, in fact, part of an answer. But if big data is part of an answer, what’s the question? In almost all big data systems, the question is analytics. Organizations want to increase, enrich, or extend their analytical capabilities, and quite often this requires more data: big data. So, it’s almost always analytics that’s driving big data systems.
Do not confuse analytics with reporting. Reporting is primarily about presenting what has happened, whereas analytics is mainly aimed at showing what may happen and at influencing what’s going to happen. Typical examples of analytics that may require big data are: improving product development, optimizing business processes, optimizing the operations of machines, improving the level of customer care and customer delight, and personalizing products.
So, the goal is not big data; the goal is to improve and extend the analytical capabilities of an organization. However, with all the technical discussions on Hadoop, NoSQL, and so on, we sometimes tend to forget this. The following somewhat crude saying applies to some big data projects: “When you are up to your ass in alligators, it’s difficult to remember that your initial objective was to drain the swamp.” The goal is not big data itself; it’s analytics.