How to Assess MySQL Performance

Predicting application performance is a difficult art, but an important one when choosing the target deployment environment. Among the different components of modern software solutions, the database is one of the most critical. Regardless of whether the computing platform to be evaluated is on-prem, containerized, virtualized, or in the cloud, it is crucial to consider several essential factors. In this blog, we aim to call out some key considerations when trying to assess MySQL performance for your application. We will do this by conducting experiments using HammerDB and sysbench — two popular workloads with different approach and focus — to evaluate MySQL performance. We will not concern ourselves with the raw throughput of workload. Instead, focus on understanding what the workloads exercise to help us determine how to best use them to aid our performance assessment.

Know your target platform

Everything starts with knowing the target. There are many times we get asked why some cloud instance performed poorly for their database application and almost always turned out to be some configuration error. Therefore, before we attempt to measure our database performance, we should know the system or cloud instance to be tested in detail. This allows us to know our operating environment and its capability. As database performance is heavily influenced by the performance of storage, network, memory, and processors, we must understand the upper limit of these key components. There are several ways to find out this information with the easiest way being by referring to the documentation. For example, if you are buying the latest Amazon memory-optimized EC2 instance (R7iz), the AWS page (https://aws.amazon.com/ec2/instance-types/) tells us the following:

  • Up to 3.9 GHz 4th Generation Intel Xeon Scalable processors (code-named Sapphire Rapids)
  • Up to 20% higher compute performance than z1d instances
  • Up to 50 Gbps of networking speed
  • Up to 40 Gbps of bandwidth to the Amazon Elastic Block Store (EBS)

We can also verify these capabilities by running some simple benchmarks on the different subsystems. For storage, FIO is generally used. For the network, we can use Iperf to assess the network bandwidth between the client and the database server to ensure it will be enough to meet our peak requirement.

Benchmarking the target

Two of the more popular database benchmarks for MySQL are HammerDB and sysbench. Since we are posting this blog on the HammerDB blog site, we will not describe the HammerDB workload in this blog. Instead, readers should refer to the HammerDB documentation.

sysbench is a widely used open-source benchmarking tool that is designed to evaluate the performance of CPU, memory, disk I/O, and database systems. In the context of MySQL performance evaluation, it simulates a typical online transaction processing (OLTP) workload on your MySQL database. The OLTP workload simulates a typical online transaction processing workload, where multiple clients execute transactions on a database concurrently. This all sounded very similar to HammerDB TPROC-C workload (we will look at HammerDB TPROC-H (OLAP) another time), so it is easy for any reader to think both workloads are about the same. Let’s look at how the workloads behave when running on an identical system

We ran both sysbench and HammerDB on a system with:

  • Processors: Two Intel Xeon 8360Y processor sockets (36 Core/72 Threads per socket). We used the first processor socket for the MySQL database and the second socket for the benchmark (sysbench or HammerDB).
  • Operating System: Ubuntu 22.04
  • Storage: The system has a SATA drive for the operating system and one NVMe (Intel SSD D7-P5510 (3.84 TB)) for storage of database tablespaces and logging.
  • Database: MySQL 8.0.31

We applied the following settings for both benchmarks to make it easier to stress the system. MySQL has thousands of other settings and is beyond the scope of this blog.

  • skip-log-bin
  • innodb_flush_log_at_trx_commit=0
  • innodb_flush_method=O_DIRECT_NO_FSYNC
  • innodb_doublewrite=0
  • innodb_buffer_pool_size=64000M
  • HammerDB Settings:
    • diset tpcc mysql_dbase tpcc
    • diset tpcc mysql_count_ware 1000
    • diset tpcc mysql_rampup 2
    • diset tpcc mysql_duration 5
    • vuset vu 256
  • Sysbench Settings:
    • Create sysbench: sysbench –mysql-user=user –mysql_password=password –mysql-host=dbserver –tables=214 –table-size=2000000 /usr/share/sysbench/oltp_read_write.lua prepare
    • –rand-type=uniform –report-interval=1 –time=900 –mysql-db=sbtest oltp_read_write run

Both workloads also support many, probably hundreds if not thousands, of tunable. We are using the most basic, default, configurations for a database of about 100GB. Please refer to this tuning guide to tune the system for HammerDB: Open Source Database Tuning Guide on 3rd Generation Intel® Xeon® Scalable Processors Based Platform. For the experiments in this blog, we did not tune the system. Instead, use the default settings. The goal of this blog is not to get the best performance but to show the differences between the benchmarks and how they can give us insight from different angles.

The table below shows the performance characteristics of both benchmarks captured using the System Activity Reporter (SAR) tool in Linux:

sysbench HammerDB sysbench/
HammerDB
CPU METRICS
cpu_utilization_% 66.83 97.81 0.68
cpu_utilization_%_in_kernel_mode 29.70 7.04 4.22
%usr 38.40 91.72 0.42
%sys 9.52 3.41 2.79
%iowait 36.51 0.04 835.79
%idle 12.01 2.89 4.16
DISK METRICS
sysstat_io_nvme0n1_avg_wait_ms 8.32 0.21 39.23
sysstat_io_nvme0n1_read_kB/s 2,535,38 14,289 177.44
sysstat_io_nvme0n1_write_kB/s 904,422 248,075 3.65
NETWORK METRICS
sysstat_net_ens260f1_rx_kB/s 38,241 22,700 1.68
sysstat_net_ens260f1_tx_kB/s 675,910 34,712 19.47
OTHER METRICS
average context switches/sec 1,610,78 320,699 5.02
avg interrupts/sec 888,675 256,589 3.46

From the above table, we noticed the following:

  • HammerDB saturates the CPU and spends a larger portion of time in user mode (91.72% for HammerDB vs. 38.40% for sysbench). sysbench spends a higher portion of time in system mode as compared to HammerDB (9.52% for sysbench vs. 3.41% for HammerDB). From this data, it can be argued that HammerDB is a better candidate to test MySQL database engine and CPU performance.
  • sysbench exerts greater pressure on the storage and network subsystem as compared to HammerDB. The %iowait for sysbench was very high even though we used a fast NVMe drive for the database and log files, ~37% iowait for sysbench vs. almost 0% for HammerDB. This suggested that sysbench is a better candidate for testing storage and network performance.

By default, HammerDB is designed to take advantage of database system caching mechanisms such as buffer caches, query caches, or statement caches. This means, HammerDB prioritizes caching the data as much as possible and pushing the CPU utilization. HammerDB does this using the concept of “home warehouses” where approximately 90% of the workload takes place. Each HammerDB virtual user is assigned a “home warehouse” at the start of the workload and which remains fixed. Hence if you configure a schema with 1000 warehouses and run a test with 10 virtual users by default most of the workload will be concentrated upon 10 warehouses. In the case of sysbench, if the database tables are too large to fit entirely in the main memory, it accesses data from the disk more frequently leading to more disk I/O and wait times. Also, sysbench makes a network roundtrip for every SQL Statement whereas HammerDB wraps a more complex workload in stored procedures resulting in only one network call per stored procedure.

We will have a follow-up blog looking into what the database is doing and how the CPUs are being used. Also, a separate blog looking into the detail of how each of these workloads used the storage. Be sure to check back. Or if you already conducted studies that answered these questions, please share. We can also collaborate.

Based on user requirements and priorities, HammerDB and sysbench are both great workloads to evaluate database performance. Also, instead of viewing them as mutually exclusive, they can also be used in tandem to get a more holistic view of the system being studied. In addition, both these workloads are highly configurable and can be configured as per the user’s requirements.

Size does matter!

When testing, we need to make sure that we tested with the configuration that matters most to our application. For example, if we want to evaluate moving our OLTP applications from MySQL 8.0.27 to MySQL 8.0.32, different test scenarios can give very different conclusions. The table below showed the result of running HammerDB on MySQL 8.0.27 and 8.0.32 on identical hardware, with identical settings, but at different load levels. We tested concurrency at 80 VU’s and concurrency at 256 VU’s.

HammerDB 4.5 TPROC-C running on

Intel Xeon 8360Y processor with 36-core (72-threads)

MySQL 8.0.27 MySQL 8.0.32
Concurrency at 80 VU’s 1.00 0.93
Concurrency at 256 VU’s 1.00 1.03

At 80 VU concurrency, we observed a 7% performance regression moving from MySQL 8.0.27 to 8.0.32. But at 256 VU concurrency, we would expect to get a 3% performance gain by moving from MySQL 8.0.27 to 8.0.32.

Summary

For any benchmark, it is critical that we use the configuration that is most representative of our deployment environment to get the most accurate assessment. Overall, HammerDB and sysbench are both valuable tools for evaluating MySQL performance.

  • HammerDB aimed to make it easier to stress test the database engine scaling to handle more connections, locking, and a high number of CPU cores.
  • sysbench has a higher demand for storage and medium stress on the CPU and network making it a good candidate for stress testing the platform.

Authors