System Design, Chapter 9: SQL vs. NoSQL

Relational databases accessed with SQL (Structured Query Language) were developed in the 1970s with a focus on reducing data duplication as…

System Design, Chapter 9: SQL vs. NoSQL

Relational databases accessed with SQL (Structured Query Language) were developed in the 1970s with a focus on reducing data duplication as storage was much more costly than developer time. SQL databases tend to have rigid, complex, tabular schemas and typically require expensive vertical scaling.

Examples: Oracle, MySQL, Microsoft SQL Server, and PostgreSQL

NoSQL databases were developed in the late 2000s with a focus on scaling, fast queries, allowing for frequent application changes, and making programming simpler for developers.

Examples: Document: MongoDB and CouchDB, Key-value: Redis and DynamoDB, Wide-column: Cassandra and HBase, Graph: Neo4j and Amazon Neptune

Reference: guru99

What are the Differences between SQL and NoSQL?

  1. Data Storage Model
  • SQL Databases: Tables with fixed rows and columns
  • NoSQL Databases: Document: JSON documents, Key-value: key-value pairs, Wide-column: tables with rows and dynamic columns, Graph: nodes and edges

2. Schemas

  • SQL Databases: Rigid
  • NoSQL Databases:Flexible

3. Scaling

  • SQL Databases: Vertical (scale-up with a larger server)
  • NoSQL Databases: Horizontal (scale-out across commodity servers)

4. Querying

  • SQL databases: It use SQL (structured query language) for defining and manipulating the data, which is very powerful.
  • NoSQL database: Queries are focused on a collection of documents. Sometimes it is also called UnQL (Unstructured Query Language). Different databases have different syntax for using UnQL.

5. Reliability or ACID Compliancy (Atomicity, Consistency, Isolation, Durability)

  • SQL Databases: Multi-Record ACID Transactions supported
  • NoSQL Databases: Most do not support multi-record ACID transactions. However, some — like MongoDB — do.

SQL VS. NoSQL — Which one to use?

When to use SQL

When you have relational data, this is the natural fit, of course. But you may be asking yourself how you identify the “mythical” natural fit. Well, when you look at your data, do you see distinct entities with well-defined relationships with one another that must be strictly enforced and/or navigable? If so, we have match!

When your focus is on data integrity, relying on a tried and true relational databases is a good bet. When you want flexible access to your data, the relational model and SQL allow for much greater support of ad-hoc queries. Additionally, databases like PostgreSQL have added excellent support for NoSQL-style workloads with features like native JSON data types. If you don’t need the scale-out capabilities of NoSQL data stores, they can be a good fit for some non-relational workloads as well.

When to use NoSQL

NoSQL is appealing when you have highly flexible data models or very specific needs that don’t fit into the relational model. If you are taking in a lot of unstructured data, a document database like MongoDB or CouchDB can be a nice fit. If you need very fast access to key-value data but can live without strong integrity guarantees, Redis is a great fit. Complex or flexible search across a lot of data? Elasticsearch is a great fit.

NoSQL data stores tend to be highly scalable, and scaling out is a core tenet of many of these systems. Built-in sharding makes scaling reads and writes out much easier than doing so with a relational database. Relatedly, NoSQL systems can often meet very high availability requirements. Databases like Cassandra have no single points of failure and your applications can trivially react to underlying failures of individual members.

[Bonus]What are the Drawbacks of NoSQL Databases?

One of the most frequently cited drawbacks of NoSQL databases is that they don’t support ACID (atomicity, consistency, isolation, durability) transactions across multiple documents. With appropriate schema design, single record atomicity is acceptable for lots of applications. However, there are still many applications that require ACID across multiple records.

To address these use cases MongoDB added support for multi-document ACID transactions in the 4.0 release, and extended them in 4.2 to span sharded clusters.

Since data models in NoSQL databases are typically optimized for queries and not for reducing data duplication, NoSQL databases can be larger than SQL databases. Storage is currently so cheap that most consider this a minor drawback, and some NoSQL databases also support compression to reduce the storage footprint.

Depending on the NoSQL database type you select, you may not be able to achieve all of your use cases in a single database. For example, graph databases are excellent for analyzing relationships in your data but may not provide what you need for everyday retrieval of the data such as range queries. When selecting a NoSQL database, consider what your use cases will be and if a general purpose database like MongoDB would be a better option

Thanks for reading!!