BACKGROUND IMAGE: iSTOCK/GETTY IMAGES
We're looking at the issue of NoSQL vs. SQL databases. When does it make sense to consider using a NoSQL DB rather than a relational database?
Put simply, NoSQL databases are a better choice when you have data that doesn't fit well into tables. We have worked on SQL-based relational databases for about 40 years now; the result of all that work is that they are very good at handling transactions involving tabular data stored in rows and columns. We can also analyze such data very effectively in dimensional databases.
The kind of data that fits well into relational tables is known as atomic data, which simply means that we split the data up into the smallest components we want to manipulate. For example, we don't usually store the complete name of a customer in one field. If we're adding data about a customer named "Mr. James Mason" to a relational database --but we want to be able to find all of the people with the title "Mr." in the database and sort customers by both first and last name -- we would store the name data in three distinct columns in a table.
More on NoSQL databases
Learn about the wide variety -- and fit-for-purpose nature -- of NoSQL DB software
Read how a mix of MongoDB and SQL technologies are helping a marketing analytics firm
Watch this video Q&A to see why relational vs. NoSQL databases isn't always a black-and-white choice
However, a great deal of the data we are now collecting doesn't tabularize well. We're talking about images, sensor data, Word documents, Twitter feeds and so on – or what is often called big data. Even when we can put such data in tables, it may not be efficient to do so. For example, you could store every pixel of an image as a row in a relational table. But then you have to ask yourself, "What SQL code could I write to determine if the image includes a person?" I can't even begin to imagine what that would look like.
The good news is we have specific database engines that are built to hold and manage big data: NoSQL databases. Relational databases require us to impose what is called a schema in the data. Think of the schema as a way of organizing the data: In relational databases, we have to split up data into atomic units and then organize it as columns and rows in tables.
NoSQL database engines come in a variety of different types, so too much generalization can be misleading. But in general, they require only a very simple schema and sometimes no schema at all. For example, in some NoSQL database systems, we could put image files straight into the database without altering their structure. We could also put audio files into the same database. Getting data into the database becomes much simpler, and there's more flexibility on how that data is structured.
So if you have data that can't be put into tabular form in an elegant way, or that needs queries that can't be comfortably expressed in SQL, think about looking at the range of NoSQL database engines that are available.