Hey there! I’m Maciej from an outsource python development company that works on speech recognition software development, shareholder management software and more. I recently performed many database operations which took hours to complete. Waiting for too long in this way may force you to search for a better alternative to your current database engine, so I started looking for something new. To help me in my search, I tried to find a comparison based on database performance, but couldn’t find anything which suited me. That’s when I decided to create my own small comparison.
For my analysis I used 4 databases:
Here we have both SQL and NoSQL databases. The main difference between these two is that SQL databases, also called Relational Databases (RDBMS), have relational structure and NoSQL doesn’t use relations. SQL databases are vertically scalable, which means one ultimate machine will do the work for you. On the other hand, NoSQL databases are horizontally scalable, which means multiple smaller machines will do the work for you.
Comparing NoSQL and SQL databases is hard to do, because of differences in build. User Euphoric made a good comment about this on StackExchange:
The thing you are missing about NoSQL is that NoSQl cannot be compared to SQL in any way. NoSQL is name of all persistence technologies that are not SQL. Document DBs, Key-Value DBs, Event DBs are all NoSQL. They are all different in almost all aspects, be it structure of saved data, querying, performance and available tools.
But for this comparison I’m going to test only insert, select, update and remove operations which can be performed both in NoSQL and SQL.
Google trends shows that MySQL is a true champ in searches, but does it come with great performance? Let’s check it out!
MySQL is a full-featured, relational database management system sponsored by the company MYSQL AB, but still the source code is open source. It’s written in C and C++ and works with most current operating systems. In this comparison I used version 5.7 which will be supported to October 2023.
PostgreSQL is also an open-source, relational database management system. It is not controlled by a single company but is developed through community effort. PostgreSQL provides support for advanced data types and optimization.
Redis is an open source (BSD), in-memory key-value data structure store, which can be used as a database, cache or message broker. It’s a NoSQL database used in GitHub, Pinterest and Snapchat. Redis performance and atomic manipulation of data structures solves problems which can often be found with relational databases.
MongoDB is a general purpose, document-based, distributed database. It is another example of a NoSQL database. Record in MongoDB is a document, which is a data structure composed of field and value pairs.
After gathering some databases for comparison, I needed to use a language which would have good drivers for them. I didn’t have to look much into that because my main language, Python, provides a number of good drivers for these databases.
The drivers I used for the specified database are:
To manage all the databases in a single application I created Docker-compose, which handled the databases as services. Using Docker to manage the databases allowed me to omit installing every database locally.
Here is an example which I used to generate sample data:
In order to create similar execution of every operation, I created database dispatcher, which executed a specified operation for every database.
Here you can see the part for the insert operation:
For the comparison I used all CRUD operations, so we will be looking at create, read, update, and delete.
In the create category we can observe that Mongo and Redis are definitely better. At about 400–500 operations MySQL also looks good, but it decreases in performance after 600 records, with results similar to Postgres.
The select operation shows that SQL also has a bad time here. Performing it in MySQL and Postgres, time grows rapidly. For NoSQL it doesn’t matter how many records it needs to select, time is basically constant.
Time consumption for update is also much bigger for SQL databases, but the increase in time is similar for both databases. Time of operation for Redis and Mongo and MySQL is under 1 second for 5000 records. Only Postgres gives results over 10 seconds.
Here we have the biggest differences in results. The delete operation for Redis shows that having all records in memory allows for almost instant data removal. Time of removing records for SQL databases is stable. For Mongo we can see that the time of removal depends on the number of records.
This comparison shows the time for a specific type of operation and single use case of database. In most cases, Redis had the best performance because it is just key/value storage, so when I performed operations to find records with a specified value it was logical that it should win.
NoSQL relies on demoralization and creates optimization for the deformalized case. If we take a blog post for example, everything connected with single (text, comments, likes etc.) will be stored in a single document, so there won’t be a need to perform any join operations.
In this comparison we can see that single CRUD operations are much faster in NoSQL databases, but we still need to remember that SQL can perform many more operations. Besides this, the speed of the database depends on the application you are creating.
For other articles on programming you may refer to our software development blog from which you'll learn how to use Django Rest Framework more effectively, how to improve your coding skills, and more.