Hello friends, at Percona Managed Services, we usually do different types of things every day, including routine tasks, monitoring, and, very frequently, answering questions that are not always easy to answer.

A few days ago, a client asked us the following question: “Hey Percona, I have a question and a problem simultaneously: I want to delete rows from a table from a specific date back because the data is not necessary. I tried to run the DELETE command, which gave me this error: Cannot delete or update a parent row: a foreign key constraint fails. Could you please help me?”

At first glance, the error message was obviously clear: the table from which rows were to be deleted had a child table, which prevented the execution of the DELETE directly.

“Don’t worry, we’ll take a look at the issue, and accordingly, we’ll proceed to suggest an execution plan,” we replied. And this is how this blog originates so that if you find yourself in the same situation, proceed similarly or at least consider some of the practices used to meet the objective.

First of all, let’s review the structure of the table, its size, and its constraints.

MySQL table

Let’s check the table size (the numbers of the table sizes are real, they don’t match the structure of the table, as I have removed the relevant columns to keep the whole thing anonymous).

The constraints:

Finally, we show you below the topology presented by the client in question:

This means the following:

  • A primary server, with Percona Server for MySQL installed (in Read/Write mode) version 8.0.31-23.
  • One replica with the same software version (in Read-Only and Super-Read-Only mode).

What can we say or respond to this situation?

  • First, the table has two child tables, not just one.
  • Second, the size of the tables in play is important.
  • The person who tried to run this in production did not do the following:
    • Run some previous backups of the rows to delete.
    • Check if the table had referential integrity, child tables.
    • Check the size of the table.
    • Check the database’s topology, whether one or more servers were replicating, if it was a Percona XtraDB Cluster, etc.

What does all this imply?

  • If he were able to execute the DELETE (let’s assume he googled something and found a way to do it by executing SET FOREIGN_KEY_CHECKS=0), not having made a previous backup, no backup, bad idea.
  • Having executed such a DELETE and considering the massive number of rows to be deleted, the replication in the replica/s would be affected.
  • Having executed the DELETE of only the parent table using SET FOREIGN_KEY_CHECKS=0, leaves the child tables with orphaned rows, with which we are violating the referential integrity of the model.
  • Depending on how much the application uses the table, such a high-volume transaction would have generated some lock, which in one way or another, would have caused downtime in the application.

So what should we do in these cases?  

When you need to do this type of operation in which a large number of rows are involved, with constraints, it is always advisable to carry out certain checks, make a backup of the information before it is deleted, delete the data from the child tables using the criteria of the parent table (with pt-archiver).

Very well. We are going to do the necessary tests for this to be carried out successfully, for which we will use the following test environment:

Below is the test topology (as you can see, we’re using the same version of Percona Server for MySQL).

Once created and loaded the tables (you can use this link to do so), we have something like this:

Now, for the test, let’s remove a significant number of rows from the table. For this, we will execute this query in test.orders and use the CreatedDate field as a reference:

Let’s try doing the same thing the client did to check if he was right:

The procedure

  1. Backup: the most crucial step in this process. In this way, we make sure that if for some reason, the data is needed again, it can be reinserted in the same table or another table in another schema.

Let’s create the /home/percona/Backup directory and check if there is enough disk space for such a backup (based on the table size, it should be at least the same or less). For this, we will use the replica so as no impact the operation of the primary instance.

For the backup, we will use mydumper, which allows us to use the following:

  • WHERE clauses filter the rows.
  • Use several threads in parallel to read the data (the default is 4).
  • Compress the backup file.

Next, these are the commands to execute for the backups.  And we verify that each step finished OK:

Let’s verify that the backup files exist and have data:

  1. Data purging: here, we will use the pt-archiver utility. Why?
  • It allows us to eliminate rows in chunks in a controlled way.
  • We can use WHERE clause (since we are going to need it).
  • It allows you to control the lag in the replica or replicas (very important).
  • We can interrupt (but not undo) the purging using the –sentinel flag.

Let’s start with test.exceptions table first. We will run all the archivers on the primary server as follows:

We verify that there are no rows that meet the purging criteria, and we verify how many rows are left in the table.

Let’s continue with test.personalizations table:

We verify that there are no rows that meet the purging criteria:

To finally finish with the test.orders table:

We verify that there are no rows that meet the purging criteria:

Conclusion

When you need to do this type of operation in which a large number of rows are involved, with the constraints fence, it is always advisable to carry out all the previously mentioned checks, make a backup of the information before it is deleted, delete the data from the child tables using the criteria of the parent table (with pt-archiver) and constantly control the replication process during the process (in fact, the tool will not allow the lag to be greater than the one stipulated in the command –max-lag=30).

Finally, remove the data from the originally required table.

This process may take longer than a simple and traditional DELETE. Still, my friend, the most important thing is the availability of the data, integrity, and good response times. Always.

Bonus track

If necessary, you can restore the deleted rows using the backup as follows:

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!

Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments