2019 Open Source Database Report: Top Databases, Public Cloud vs. On-Premise, Polyglot Persistence

Ready to transition from a commercial database to open source and want to know which databases are most popular in 2019? Wondering whether an on-premise, public cloud, or hybrid cloud infrastructure is best for your database strategy? Or are you considering adding a new database to your application and want to see which combinations are most popular? We found the answers at the Percona Live event last month and broke down the insights into the following free trends reports:

2019 Top Databases Used

So, which databases are most popular in 2019? We broke down the data by open source databases vs. commercial databases:

Open Source Databases

Open source databases are free community databases whose source code is available to the general public, and which may be used in their original design or modified. Popular examples of open source databases include MySQL, PostgreSQL, and MongoDB.

Commercial Databases

Commercial databases are developed and maintained by a commercial business, are available for use through a licensing subscription fee, and may not be modified. Popular examples of commercial databases include Oracle, SQL Server, and Db2.

Top Open Source Databases

MySQL remains on top as the #1 free and open source database, representing over 30% of open source database use. This comes as no surprise, as MySQL has held this position consistently for many years according to DB-Engines.

2019 Most Popular Open Source Databases Used Report Pie Chart - ScaleGrid

PostgreSQL came in 2nd place with 13.4% representation from open source database users, closely followed by MongoDB at 12.2% in 3rd place. This again could be expected based on the DB-Engines Trend Popularity Ranking, but we saw MongoDB in 2nd place at 24.6% just three months ago in our 2019 Database Trends – SQL vs. NoSQL, Top Databases, Single vs. Multiple Database Use report.

While over 50% of open source database use is represented by the top 3, we also saw good representation for #4 Redis, #5 MariaDB, #6 Elasticsearch, #7 Cassandra, and #8 SQLite. The last 2% of databases represented include ClickHouse, Galera, Memcached, and HBase.

Top Commercial Databases

In this next graph, we’re looking at a unique report which represents both polyglot persistence and migration trends: top commercial databases used with open source databases.

We’ve been seeing a growing trend of leveraging multiple database types to meet application needs, and wanted to compare how organizations are using both commercial and open source databases within a single application. This report also includes commercial database users who are in the process of migrating to an open source database. For example, PostgreSQL, the fastest growing database by popularity for 2 years in a row, has 11.5% of its user base represented by organizations currently in the process of migrating to PostgreSQL.

So, now that we’ve explained what this report represents, let’s take a look at the top commercial databases used with open source.

2019 Most Popular Commercial Databases Used with Open Source Report Pie Chart - ScaleGrid

Oracle, the #1 database in the world, holds its position here as well, representing over 2/3 of commercial and open source database combinations. What is shocking in this report is the large gap between Oracle and 2nd place Microsoft SQL Server, as the gap between the two is much smaller according to DB-Engines. IBM Db2 came in 3rd place, representing 11.1% of commercial database use combined with open source.

Cloud Infrastructure Breakdown by Database

Now, let’s take a look at the cloud infrastructure setup breakdown by database management systems.

Public Cloud vs. On-Premise vs. Hybrid Cloud

We asked our open source database users how they’re hosting their database deployments to identify the current trends between on-premise vs. public cloud vs. hybrid cloud deployments.

On-premise came in at #1, with 49.5% of open source database deployments. While we anticipated that on-premise would lead, we were surprised at the size of its share. In our recent 2019 PostgreSQL Trends Report, on-premise private cloud deployments represented 59.6%, just over 10 percentage points higher than in this report.

Public cloud came in 2nd place with 36.7% of open source database deployments, consistent with the 34.8% of deployments from the PostgreSQL report. Hybrid cloud, however, was significantly higher than in that report, with 13.8% of open source database deployments vs. 5.6% of PostgreSQL deployments.

2019 Open Source Databases Report: Public Cloud vs Private Cloud vs On-Premise Pie Chart - ScaleGrid
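The comparison with the PostgreSQL report can be checked with a little arithmetic. The sketch below is only an illustration: the percentages are the ones quoted in the text, while the dictionary and function names are made up for this example. It computes the difference in percentage points for each infrastructure type.

```python
# Shares (percent of deployments) quoted in the text above.
open_source_report = {"on-premise": 49.5, "public cloud": 36.7, "hybrid cloud": 13.8}
postgresql_report = {"on-premise": 59.6, "public cloud": 34.8, "hybrid cloud": 5.6}


def point_differences(current, previous):
    """Return current minus previous for each category, in percentage points."""
    return {name: round(current[name] - previous[name], 1) for name in current}


differences = point_differences(open_source_report, postgresql_report)
for name, delta in differences.items():
    print(f"{name}: {delta:+.1f} percentage points")
```

A negative value means the share in this report is lower than in the PostgreSQL report, which is the case for on-premise.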

So, which cloud infrastructure is right for you? Here’s a quick intro to public cloud vs. on-premise vs. hybrid cloud:

Public Cloud

Public cloud is a cloud computing model where IT services are delivered across the internet. Typically purchased through a subscription usage model, public cloud is very easy to set up with no large upfront investment requirements, and can be quickly scaled as your application needs change.

On-Premise

On-premise, or private cloud, deployments are cloud solutions dedicated to a single organization and run in its own datacenter (or with a third-party vendor off-site). There are many more opportunities to customize your infrastructure with an on-premise setup, but it requires a significant upfront investment in hardware and software computing resources, as well as ongoing maintenance responsibilities. These deployment types are best suited for organizations with advanced security needs, regulated industries, or large organizations.

Hybrid Cloud

A hybrid cloud is a mixture of both public cloud and private cloud solutions, integrated into a single infrastructure environment. This allows organizations to share resources between public and private clouds to improve their efficiency, security, and performance. These are best suited for deployments that require the advanced security of an on-premise infrastructure, as well as the flexibility of the public cloud.

Now, let’s take a look at which cloud infrastructures are most popular by each open source database type.

Open Source Database Deployments: On-Premise

In this graph, as well as the public cloud and hybrid cloud graphs below, we break down each individual open source database by the percentage of deployments that leverage this type of cloud infrastructure.

So, which open source databases are most frequently deployed on-premise? PostgreSQL came in 1st place with 55.8% of deployments on-premise, closely followed by MongoDB at 52.2%, Cassandra at 51.9%, and MySQL at 50% on-premise.

2019 Percent of Open Source Databases Using an On-Premise Infrastructure Report - ScaleGrid

The open source databases that reported less than half of deployments on-premise include MariaDB at 47.2%, SQLite at 43.8%, and Redis at 42.9%. The database that is least often deployed on-premise is Elasticsearch at only 34.5%.

Open Source Database Deployments: Public Cloud

Now, let’s look at the breakdown of open source databases in the public cloud.

SQLite is the most frequently deployed open source database in a public cloud infrastructure at 43.8% of its deployments, closely followed by Redis at 42.9%. MariaDB public cloud deployments came in at 38.9%, then 36.7% for MySQL, and 34.5% for Elasticsearch.

2019 Percent of Open Source Databases Using a Public Cloud Infrastructure Report - ScaleGrid

Three databases came in with less than 1/3rd of their deployments in the public cloud, including MongoDB at 30.4%, PostgreSQL at 27.9%, and Cassandra with the fewest public cloud deployments at only 25.9%.

Open Source Database Deployments: Hybrid Cloud

Now that we know how the open source databases break down between on-premise vs. public cloud, let’s take a look at the deployments leveraging both computing environments.

The #1 open source database to leverage hybrid clouds is Elasticsearch, which came in at 31%. The next closest database for hybrid cloud is Cassandra at just 22.2%.

2019 Percent of Open Source Databases Using a Hybrid Cloud Infrastructure Report - ScaleGrid

MongoDB was in 3rd for percentage of deployments in a hybrid cloud at 17.4%, then PostgreSQL at 16.3%, Redis at 14.3%, MariaDB at 13.9%, MySQL at 13.3%, and lastly SQLite at only 12.5% of deployments in a hybrid cloud.
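To see the three breakdowns side by side, the figures quoted above can be collected into one small table. This is a minimal sketch: the numbers come from the text, but the `deployments` structure and the `leader` helper are illustrative names, not part of the survey tooling.

```python
# Percent of each database's deployments by infrastructure type, as quoted in the text.
deployments = {
    "PostgreSQL":    {"on_premise": 55.8, "public": 27.9, "hybrid": 16.3},
    "MongoDB":       {"on_premise": 52.2, "public": 30.4, "hybrid": 17.4},
    "Cassandra":     {"on_premise": 51.9, "public": 25.9, "hybrid": 22.2},
    "MySQL":         {"on_premise": 50.0, "public": 36.7, "hybrid": 13.3},
    "MariaDB":       {"on_premise": 47.2, "public": 38.9, "hybrid": 13.9},
    "SQLite":        {"on_premise": 43.8, "public": 43.8, "hybrid": 12.5},
    "Redis":         {"on_premise": 42.9, "public": 42.9, "hybrid": 14.3},
    "Elasticsearch": {"on_premise": 34.5, "public": 34.5, "hybrid": 31.0},
}


def leader(infrastructure):
    """Return the database with the largest share for one infrastructure type."""
    return max(deployments, key=lambda db: deployments[db][infrastructure])


# Each row should add up to roughly 100%, allowing for rounding in the published figures.
totals = {db: round(sum(shares.values()), 1) for db, shares in deployments.items()}

for infrastructure in ("on_premise", "public", "hybrid"):
    print(infrastructure, "->", leader(infrastructure))
```

Each database's three shares add up to 100% within a rounding error of 0.1, so the on-premise, public cloud, and hybrid cloud figures read as a complete split of that database's deployments.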

Open Source Database Deployments: Multi-Cloud

On average, 20% of public cloud and hybrid cloud deployments are leveraging a multi-cloud strategy. Multi-cloud is the use of two or more cloud computing services. We also took a look at the number of clouds used, and found that some deployments leverage up to 5 different cloud providers within a single organization:

Average Number of Clouds Used for Open Source Database Multi-Cloud Deployments - ScaleGrid Report

In our last analysis under the Cloud Infrastructure breakdown, we analyze which cloud providers are most popular for open source database hosting:

2019 Most Popular Cloud Providers for Open Source Database Hosting Pie Chart - ScaleGrid

AWS is the #1 cloud provider for open source database hosting, representing 56.9% of all cloud deployments from this survey. Google Cloud Platform (GCP) came in 2nd at 26.2%, with a surprising lead over Azure at 10.8%. Rackspace followed in 4th, representing 3.1% of deployments, and DigitalOcean and SoftLayer came last, representing the remaining 3% of open source deployments in the cloud.

Polyglot Persistence

Polyglot persistence is the concept of using different databases to handle different needs within a single software application, using each for what it is best at. This is a great way to ensure your application is handling your data correctly, vs. trying to satisfy all of your requirements with a single database type. An obvious example is SQL, which is good at handling structured data, vs. NoSQL, which is best used for unstructured data.
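As a rough illustration of the idea, the sketch below keeps structured order data in a SQL database and loosely structured profile data in a separate store, then combines them in one application function. Everything here is an assumption made for the example: SQLite (bundled with Python) stands in for the SQL side, a plain dictionary of JSON strings stands in for a NoSQL document or key-value store, and the table, key, and function names are hypothetical.

```python
import json
import sqlite3

# Relational store for structured data (in-memory SQLite, for illustration only).
sql = sqlite3.connect(":memory:")
sql.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
sql.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("alice", 42.5))
sql.commit()

# Hypothetical stand-in for a NoSQL store: schema-free JSON documents keyed by string.
document_store = {}
document_store["profile:alice"] = json.dumps({"tags": ["beta"], "notes": {"theme": "dark"}})


def customer_view(customer):
    """Combine data from both stores into one view for the application."""
    row = sql.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    profile = json.loads(document_store.get(f"profile:{customer}", "{}"))
    return {"customer": customer, "order_total": row[0], "profile": profile}


view = customer_view("alice")
print(view)
```

In a real deployment the dictionary would be replaced by an actual document or key-value database; the point is only that each kind of data lives in the store suited to it.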

Let’s take a look at a couple polyglot persistence analyses:

Average Number of Database Types Used

On average, we found that companies leverage 3.1 database types for their applications within a single organization. Just over 1/4 of organizations leverage a single database type, with some reporting up to 9 different database types used:

Average Number of Database Types Used in an Organization - ScaleGrid Report

Average Number of Database Types Used by Infrastructure

So, how does this number break down across infrastructure types? We found that hybrid cloud deployments are most likely to leverage multiple database types, and average 4.33 database types at a time.

On-premise deployments typically leverage 3.26 different database types, and public cloud came in lowest, at 3.05 database types leveraged on average within an organization.

Average Number of Database Used On-Premise vs Public Cloud vs Hybrid Cloud - ScaleGrid Report

Database Types Most Commonly Used Together

Let’s now take a closer look at the database types most commonly leveraged together within a single application.

In the chart below, the databases in the left column represent the sample size for that database type, and the databases listed on top represent the percentage combined with that database type. The blue highlighted cells represent 100% of deployment combinations, while yellow represents 0% of combinations.

So, as we can see below in our database combinations heatmap, MySQL is the database most frequently combined with other database types. But, while other database types are frequently leveraged in conjunction with MySQL, that doesn’t mean that MySQL deployments are always leveraging another database type. This can be seen in the first row for MySQL, where the cells range from lighter blue to yellow, compared to the first column for MySQL, which shows colors much closer to the blue representing 100% combinations.

The cells highlighted with a black border represent the deployments leveraging only that one database type, where again MySQL takes #1, with 23% of its deployments using MySQL alone.

Percent of Database Deployments Used With Another Database Type - ScaleGrid Report

We can also see a similar trend with Db2, where the bottom row for Db2 shows that it is highly leveraged with MySQL, PostgreSQL, Cassandra, Oracle, and SQL Server, but a very low percentage of other database deployments also leverage Db2. The exception is SQL Server, where 50% of deployments also use Db2.

SQL vs. NoSQL Open Source Database Popularity

Last but not least, we compare SQL vs. NoSQL for our open source database report. SQL represents over 3/5 of open source database use at 60.6%, compared to NoSQL at 39.4%.

SQL vs NoSQL Open Source Database Popularity - ScaleGrid Report

We hope these database trends were insightful and sparked some new ideas or validated your current database strategy! Tell us what you think below in the comments, and let us know if there’s a specific analysis you’d like to see in our next database trends report!