Book author says big data NoSQL databases need proper application

Consultant Dan Sullivan, author of "NoSQL for Mere Mortals," discusses some of the pertinent points for IT shops to consider before entering the new world of NoSQL databases.

NoSQL databases crept into enterprises on the backs of Web-scale applications that typically involved large amounts of data or needed to serve large user populations. Even for data management veterans who are steeped in the processes of working with mainstream relational databases, there is much to learn about big data NoSQL technology, which is still something of a moving target.

"Basically, people were confronting problems they hadn't really confronted before," said Dan Sullivan, an independent database consultant and author of NoSQL for Mere Mortals, a book published last month. "The NoSQL databases solved problems that couldn't be solved with existing tools."

But while the plethora of NoSQL products now available across the distinct key-value, document, graph and column-family categories adds new technologies to the developer's toolbox, those products can be a challenge to implement and maintain, according to Sullivan.

"With all the varieties of NoSQL, there is a higher potential for getting yourself in trouble," he said in an interview. "Relational databases have safeguards built in over many years. With NoSQL databases, that's not here -- yet."

Working with less of a safety net

That could cause problems for data professionals who come to the NoSQL party expecting create, read, update and delete (CRUD) features that come standard with SQL databases. Moreover, because there are a lot of different types of NoSQL databases, there are also a lot of ways they can be misapplied, added Sullivan, whose book covers NoSQL's roots, as well as its varieties and their technical innards.

Factors Sullivan pointed to that can influence technology selection between the different styles of NoSQL software include the expected volume of reads and writes, acceptable latency levels and the tolerance applications have for inconsistent data that can temporarily materialize in database replicas.
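
To make the last of those factors concrete, here is a minimal Python sketch of how a read from a replica can briefly return stale data. It is a toy model, not the behavior of any particular NoSQL product: the ReplicatedStore class, its method names and the explicit replicate() step are all assumptions made for illustration.

```python
class ReplicatedStore:
    """Toy model of a primary with one lagging replica (hypothetical, for illustration)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes accepted by the primary but not yet copied to the replica

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, key):
        # Reads served by the replica may lag behind the primary.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes to the replica, in order, then clear the queue.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()


store = ReplicatedStore()
store.write("cart:42", ["book"])
stale = store.read_replica("cart:42")   # replica has not caught up yet
store.replicate()
fresh = store.read_replica("cart:42")   # replica now matches the primary
```

Whether the window between the write and the replicate step matters depends on the application: a shopping cart may tolerate it, while an account balance may not.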

According to Sullivan, queries, which in effect describe how data is meant to be used, are a good place to begin planning for a NoSQL implementation. But one big change is that with NoSQL, the data modeling role needs to become more collaborative than it usually is with SQL-based relational databases. Enterprise data architects and database administrators should expect to work with application developers to create solid data models that can support database querying, Sullivan said.

Software makers are working to create tools that can inspect code and unravel the data models inherent in NoSQL application designs. While some tools have emerged to provide views of MongoDB, Cassandra and other NoSQL technologies, there is still much ground to cover. For now, data management teams are often left to comb through NoSQL source code, or to query applications by trial and error, to gain an understanding of data structures and elements.

NoSQL leaves fixed schema behind

In many cases, NoSQL databases are implemented with no fixed schema. That's especially true with document databases, which developers tend to update repeatedly as time goes by, Sullivan said during a presentation at the Enterprise Data World 2015 conference in Washington, D.C.

The lack of a need to define everything up front is a favorable development for NoSQL systems in the bustling world of Web applications, where data sets often have diverse structures and fields. But, Sullivan pointed out, the apparent lack of any structure can be deceiving, because ultimately, there is some degree of structure there.

For that reason, Sullivan prefers the term flexible schema to schema-less. Thought does need to go into a NoSQL database's creation, he said, but the purpose can be hard to discern if data teams and developers don't work together.
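
A short Python sketch can show what a flexible schema looks like in practice. Plain dictionaries stand in for documents here; the product records, field names and helper functions are hypothetical and not drawn from Sullivan's book or any specific document database. The point is that even without a declared schema, a structure can be read back out of the data.

```python
# Hypothetical documents in one collection; fields vary from record to record.
products = [
    {"_id": 1, "name": "laptop", "price": 899.0, "ram_gb": 16},
    {"_id": 2, "name": "t-shirt", "price": 15.0, "size": "M", "color": "blue"},
    {"_id": 3, "name": "e-book", "price": 9.99},
]


def implied_schema(docs):
    """Map each field name to the set of value type names seen across the documents."""
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def optional_fields(docs):
    """Return the fields that appear in some documents but not in all of them."""
    all_fields = set().union(*(doc.keys() for doc in docs))
    return {field for field in all_fields if any(field not in doc for doc in docs)}


schema = implied_schema(products)
optional = optional_fields(products)
```

In this sketch, every document shares _id, name and price, while ram_gb, size and color turn up only where they apply. That shared core is the kind of implicit structure a data team would need to discover if it were not documented.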

Sullivan emphasized that some things that are true in relational database design aren't necessarily the case in big data NoSQL environments. As examples, he cited data joins and data normalization. The two worlds tend to take opposite approaches to both: joins and normalization are part and parcel of the process in relational databases, but they are not always available in NoSQL ones.
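
The contrast can be sketched in a few lines of Python. The first half mimics a normalized, relational-style layout that needs a join at read time; the second half shows one common document-style alternative, in which related data is embedded in a single record. The customer and order data, and both function names, are invented for illustration, and embedding is only one of several modeling options.

```python
# Normalized, relational style: customers and orders are stored separately
# and linked by customer_id.
customers = {101: {"name": "Ada", "city": "Boston"}}
orders = [
    {"order_id": 1, "customer_id": 101, "total": 40.0},
    {"order_id": 2, "customer_id": 101, "total": 15.5},
]


def orders_with_customer(orders, customers):
    """Emulate a join by looking up each order's customer at read time."""
    return [{**order, "customer": customers[order["customer_id"]]} for order in orders]


# Denormalized, document style: the orders are embedded in the customer record,
# so one lookup returns everything and no join is needed.
customer_doc = {
    "_id": 101,
    "name": "Ada",
    "city": "Boston",
    "orders": [
        {"order_id": 1, "total": 40.0},
        {"order_id": 2, "total": 15.5},
    ],
}


def lifetime_total(doc):
    """Sum the order totals embedded in a single customer document."""
    return sum(order["total"] for order in doc["orders"])


joined = orders_with_customer(orders, customers)
total = lifetime_total(customer_doc)
```

The embedded form makes the "all orders for a customer" query cheap, but a query that cuts across customers would be harder, which is one reason the expected queries are a useful starting point for the model.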

Building database systems that can handle what's being thrown at them in the big data era requires a good understanding of both the old and the new technologies, along with a proper appreciation of the use cases each is best suited to, Sullivan said.

Jack Vaughan is SearchDataManagement's news and site editor. Follow us on Twitter: @sDataManagement.
