In Percona Managed Services, one key aspect of managing the databases is configuring backups. Percona XtraBackup is one of the best tools for performing physical database backups.

It is good practice to compress backups to save on storage costs, and to encrypt them so that they can’t be used if the files are compromised, as long as you keep your encryption keys safe!

Percona XtraBackup supports both compression and encryption of backups. When the data set is very large (tens of TB or more), a backup can take several hours or even days to complete. To speed up the backup, including the compression and encryption steps, we can use multiple threads. The number of threads for each operation (copy, compression, encryption) can be specified in the XtraBackup command.

For example, we can use:

--parallel: the number of threads used for copying data.
--compress-threads: the number of threads used for compressing data.
--encrypt-threads: the number of threads used for encrypting data.

Given that the usual workflow is:

copy -> compress -> encrypt

We should use more threads to copy than to compress or encrypt. I won’t get into the details of how those should be configured, as that depends on each scenario, but generally, we could use as many threads as there are CPUs on the server.

If you have multiple CPUs on the server, you can increase the number of threads for each operation.

Note that multithreaded backups perform better on databases with many big tables than on databases with many small tables and only a few big or huge ones. Each table is processed by a single thread, so the bottleneck is still the time one thread needs to process each of those big tables, as in the example I will present here.
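To make that bottleneck concrete, here is a rough back-of-the-envelope sketch. It assumes every thread processes data at the same fixed rate and that a table is never split across threads. The function name, the rate, and the table sizes are made up for illustration; they are not XtraBackup measurements.

```shell
# Rough lower bound on backup time, in hours.
# Assumptions (illustrative only): each thread handles RATE GB per hour,
# and one table is always processed by exactly one thread.
# Usage: min_backup_hours THREADS RATE_GB_PER_HOUR SIZE_GB [SIZE_GB ...]
min_backup_hours() {
    threads=$1
    rate=$2
    shift 2
    printf '%s\n' "$@" | awk -v threads="$threads" -v rate="$rate" '
        { total += $1; if ($1 > largest) largest = $1 }
        END {
            spread = total / (threads * rate)   # work spread perfectly over all threads
            single = largest / rate             # largest table handled by one thread
            printf "%.1f\n", (single > spread ? single : spread)
        }'
}
```

With a hypothetical 5000 GB table next to three 200 GB tables, `min_backup_hours 36 100 5000 200 200 200` prints 50.0, the same result as `min_backup_hours 1 100 5000`. Under these assumptions, adding threads does not shorten the time spent on the largest table.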

OK, now we have configured XtraBackup with multiple threads. What should we do if the backup is still taking too long and the log doesn’t seem to show any progress?

Reviewing the log can be a bit hard as there is a lot of information. But let’s try to understand it with the following example.

Consider the following XtraBackup command, run on Apr 20th:

NOTE: The example server has 48 CPUs. Even though the sum of the configured threads (parallel 36 + compress 24 + encrypt 16) is more than the actual CPU count, not all threads are working at the same time: a copy thread passes its work to a compress thread and then to an encrypt thread, so in the meantime some threads might be idle.

We’re using parallelism with 36 threads to copy, 24 to compress, and 16 to encrypt.
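An invocation with those thread counts might look like the sketch below. Only the three thread counts and the use of --lock-ddl come from this post. The wrapper function name, the target directory, the key file path, and the AES256 algorithm are assumptions for illustration, so adjust them to your environment.

```shell
# Hypothetical locations; override them from the environment as needed.
BACKUP_DIR="${BACKUP_DIR:-/backups/full}"
KEY_FILE="${KEY_FILE:-/etc/xtrabackup/backup.key}"

# Wrapper around a compressed, encrypted, multithreaded backup.
# Note: the accepted form of --lock-ddl differs between XtraBackup
# versions, so check the documentation for the version you run.
run_backup() {
    xtrabackup --backup \
        --target-dir="$BACKUP_DIR" \
        --lock-ddl \
        --parallel=36 \
        --compress --compress-threads=24 \
        --encrypt=AES256 --encrypt-key-file="$KEY_FILE" --encrypt-threads=16
}
```

XtraBackup writes its progress messages to standard error, so something like `run_backup 2> backup.log` keeps a log that can be inspected while the backup runs.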

The backup takes several hours to complete, so we need to review the status. For that, we check the latest entries from the log:

Is it stuck? Is it progressing at all?

Let’s take a look. 

First, we connected to MySQL and found that replication is blocked by the backup (caused by the --lock-ddl flag), so there are no new writes to MySQL. That’s why the log sequence number is not moving (437325895905570).

From the processlist:

The blocked thread shown above is the replication SQL thread.

Because of the above, it is expected that the log sequence number is not moving.

Then, we can ignore the lines of “log scanned” to remove “clutter” from the log:
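One way to do that filtering is sketched below. The function name and the log path are placeholders; the only thing taken from the log itself is the “log scanned” text.

```shell
# Print the backup log without the repetitive "log scanned" lines.
# Usage: hide_log_scanned /path/to/backup.log
hide_log_scanned() {
    grep -v 'log scanned' "$1"
}
```

Piping the result through tail, for example `hide_log_scanned backup.log | tail -n 50`, keeps the output to the most recent activity.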

We can see something like this:

From the above output, we can see that the thread number appears just after the date and time, followed by the table that the thread was processing.

NOTE: In Percona XtraBackup 2.4, the thread number is printed between brackets [], for example:

In our example, we ran it with 36 threads, so it is expected to see numbers up to 36.

After each thread finishes what it was doing, it prints the label “Done:” followed by the table it has just finished processing, as we can see here:

NOTE: In XtraBackup 2.4, only the label “…done” was printed, without further information, as follows:

So, if we search for the “Done:” text in the log and print the last ones, we can see the following:
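A minimal sketch of that search, assuming the log was saved to a file. The function name and the default count of 40 are arbitrary choices, not part of XtraBackup.

```shell
# Show the most recent "Done:" entries from the backup log.
# Usage: last_done /path/to/backup.log [COUNT]
last_done() {
    grep 'Done:' "$1" | tail -n "${2:-40}"
}
```

Asking for a few more entries than the number of copy threads makes it likely that every recently active thread shows up at least once.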

NOTE: For XtraBackup 2.4, the command we can use is this one: 

We can see that there are recent entries (assuming that we are checking on Apr 22nd). So we can assume that the backup is still running.

From the above list of threads, it seems that only thread 12 is missing; thus, it may still be processing some work. So let’s search for what thread number 12 is doing:
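The lookup for a single thread can be sketched as follows. It assumes the timestamp is a single whitespace-free token, so that the thread number is the second field of each line. If your log differs (for example the bracketed format of XtraBackup 2.4), pass a different field or adapt the match. The function name is hypothetical.

```shell
# Print the last COUNT log lines written by one thread.
# Assumption: the thread number is the 2nd whitespace-separated field;
# pass FIELD as the 4th argument if your log layout differs.
# Usage: thread_tail /path/to/backup.log THREAD [COUNT] [FIELD]
thread_tail() {
    awk -v t="$2" -v f="${4:-2}" '$f == t' "$1" | tail -n "${3:-5}"
}
```

For the case above, `thread_tail backup.log 12` would show the latest lines from thread 12, including the table it last started working on.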

It has been compressing and encrypting a large table for two days. So we check whether the file size keeps growing:
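A simple way to check this is to sample the size of the output file twice, a few seconds apart. The sketch below does that; the function name and the default interval are arbitrary, and the path to the file being written depends on your target directory.

```shell
# Report whether a file grew during a short interval.
# Usage: is_growing /path/to/file [SECONDS]
# Exit status 0 means the size increased, 1 means it did not.
is_growing() {
    before=$(ls -ln "$1" | awk '{print $5}')   # 5th field of ls -l is the size in bytes
    sleep "${2:-10}"
    after=$(ls -ln "$1" | awk '{print $5}')
    echo "before=$before after=$after bytes"
    [ "$after" -gt "$before" ]
}
```

A single sample that shows no growth is not proof that the backup is stuck, so it is worth repeating the check with a longer interval before drawing conclusions.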

We can see that the size of the file keeps increasing, so the backup is still running and is not stuck. Note that the table is 5 TB in size and is being processed by a single thread, which is the reason for the long duration of the backup. Enabling InnoDB compression and encryption at rest could help to avoid needing compression and encryption in the backups.

Conclusion

We can use multiple threads to speed up the backup process with Percona XtraBackup, but even with multiple threads, the backups can take several hours or days to complete if the data is large.

It may be easy to assume that the backup is stuck if the log is not printing any different message for a few hours, but we need to make sure that it is running by reviewing the log.

Reviewing the log can be a bit overwhelming, as it prints a lot of information. In this blog post, we covered how to review what the backup process is doing with basic Linux text-parsing commands such as grep and tail.

Percona XtraBackup is a free, open source, complete online backup solution for all versions of Percona Server for MySQL and MySQL. It performs online non-blocking, tightly compressed, highly secure backups on transactional systems so that applications remain fully available during planned maintenance windows.

