Skip to content

Solving the SQL or NoSQL Conundrum

JustCLI edited this page Jan 17, 2021 · 2 revisions

The SQL or NoSQL conundrum seems to be universal. It is a common puzzle among the software designer community. In this write-up, I try to revisit some of the aspects of this conundrum. In doing so, I may be able to give you an idea on how to solve this conundrum.

To understand the puzzle better, let's understand this ACID thing that everyone talks about

Atomic

Atomic means all or no changes. Say you have requested changes to two values but you want them to be done as one unit. An atomic database allows you to do it. For example, if you want to read the bank balance and then deduct some amount, you would need atomic change to be supported by the database. Otherwise, you may read the bank balance as 10000.00. Then you issue a command to deduct 9000.00 from it. But in the meanwhile, another process came and updated the balance to be 2000.00. To avoid it, you need support for atomicity. SQL or NoSQL, all databases support atomicity.

Consistent

In the SQL world, it means that the database verifies every column of a row before accepting the change. So, if you have defined the second column to be number, the database will not allow you to write a name in it. NoSQL databases do not believe in such rules. They allow users to write anything the user wishes. In the NoSQL world, the meaning of consistency is different. In this world, consistency refers to a single view of data. If you have a database cluster, consistency would ensure that a change in one server is reflected in another instantly. But this is not feasible. Every NoSQL database faces the consistent view hurdle. We will discuss this later in this write-up. The consistent word causes a lot of confusion while comparing SQL with NoSQL.

Isolated

Simply speaking, it means that till all the changes in a bundle (termed transaction) are completed, no one will even know that changes are going on. For others, the changes only become visible when all are completed. Till that time others see the existing data (before any change). This feature is well supported by both SQL and NoSQL databases.

Durable

This means once you all your bundled changes are reported complete by the database, you will never lose it. Even if the database crashes, these changes will remain once the database is up again. It is just like saving to a file. In fact, it is in reality nothing but flushing the data to files. This is where NoSQL databases struggle. This is another hurdle for NoSQL databases for reasons we will discuss later in this write-up. Note that many NoSQL databases claim durability.

Now that we have revisited the ACID behaviour, let us look at the issues. On the face of it, NoSQL provides every feature that SQL already gives you. Besides, it is faster and scalable. It also supports data in any form. That means NoSQL is SQL + some more. The reality is a little different. As long as you have a single NoSQL server, you can (almost) consider it as SQL + some more. Well almost, unless you need the advanced power of query language. But in real life, you do not run a standalone NoSQL server (for good reasons). The moment you have a bunch of NoSQL server, the problem of consistency starts. A change on one NoSQL server can not be instantly reflected on another. So, everyone can not have the same view all the time. Fashionably this weakness is called eventual consistency. Amazingly, weakness is marketed as a feature. Another issue with NoSQL is the in-memory nature of NoSQL servers. This is what makes a NoSQL server blazing fast. This is what allows NoSQL servers to scale upon demand. This is what makes a NoSQL server even worth considering an option. But the same virtue of NoSQL comes with an inherent weakness. It is the loss of durability. When you do a transaction on a NoSQL server, the changes are done in memory. These changes are written in files later. If the server crashed before the file-write, the changes will be lost. To overcome this problem, NoSQL servers use what is called a commit logs or journals. They dump the changes in a file before updating the in-memory data. If the server dies, the write-ahead file still allows it to recover the change. But this also comes at a cost. There are many reasons for this which are too boring to discuss. In short, durability is another weakness of NoSQL.

How To Decide

The solution to this SQL or NoSQL puzzle is not universal. For big systems, it is generally not either of them. It is mostly both of them. For other systems, NoSQL can be your default choice unless

— you need sacrosanct data writes. E.g. in core banking or ticket booking etc.

— or, you need advanced power of queries (yes you can do it NoSQL, but no thank you).

— or, your data write rate is not high and you do not expect it to rise exponentially and you have SQL expertise in your team.

In real life, NoSQL rarely replaces SQL completely. Rather, the scope of SQL-managed data is limited and NoSQL is used for most of the data.

It is worth noting that with the selection of SQL database, you must be ready with necessary resources like a DBA, SQL programmers etc. These skillsets are increasingly becoming rarer.

P.S.

If you think NoSQL has clearly won over SQL, here is a food for thought for you.

The Facebook, Twitter and Uber of the world, all use SQL databases as part of their core system.

Clone this wiki locally