Indexes are crucial for optimizing query execution times in databases, but having an excessive number of indexes, or redundant ones, can negatively impact performance. While pt-duplicate-key-checker is the go-to tool for identifying duplicate or redundant indexes in MySQL, it may not catch all duplicates.

In this blog post, we’ll put ourselves to the test and see if we can identify duplicate and redundant indexes in MySQL. Toward the end, we will identify what the pt-duplicate-key-checker doesn’t.

The unique quiz

Consider the following MySQL table definition. Let’s put our brains to work and note any of the duplicate or redundant indexes (play fair, don’t cheat):

While you work on noting down the duplicate indexes in that MySQL table, let me also add some descriptions for duplicate and redundant indexes.

Duplicate index

Duplicate indexes occur when two or more indexes have the same set of columns in the same order. These can occur accidentally due to poor database design or through the use of database management tools that automatically create indexes without checking for duplicates.
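To make the definition concrete, here is a minimal Python sketch that groups indexes by their exact, ordered column list. The function name, the index names, and the columns are hypothetical and are not taken from the quiz table.

```python
from collections import defaultdict

def find_exact_duplicates(indexes):
    """Group index names that share the same columns in the same order.

    `indexes` maps an index name to its ordered list of column names.
    """
    by_columns = defaultdict(list)
    for name, columns in indexes.items():
        by_columns[tuple(columns)].append(name)
    # Keep only the column lists that more than one index covers.
    return [sorted(names) for names in by_columns.values() if len(names) > 1]

# Hypothetical index definitions, for illustration only.
example_indexes = {
    "idx_a_b": ["a", "b"],
    "idx_a_b_copy": ["a", "b"],
    "idx_b_a": ["b", "a"],  # different column order, so not an exact duplicate
}

print(find_exact_duplicates(example_indexes))
```

Here `idx_a_b` and `idx_a_b_copy` are reported together, while `idx_b_a` is left alone because its column order differs.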

Redundant index

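Redundant indexes occur when two or more indexes have overlapping columns, most commonly when one index is a leftmost prefix of another, such as an index on (a) alongside an index on (a, b). While these are not exact duplicates, they can still negatively impact database performance.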
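One common and narrow reading of "overlapping" is the leftmost-prefix case, where a shorter index lists the first columns of a longer one in the same order. The sketch below checks only that case; the helper names and the sample indexes are assumptions made for illustration.

```python
def is_left_prefix(shorter, longer):
    """True if `shorter` lists the first columns of `longer`, in the same order."""
    return len(shorter) < len(longer) and longer[:len(shorter)] == shorter

def find_left_prefix_redundancy(indexes):
    """Return sorted (redundant_index, covering_index) name pairs.

    `indexes` maps an index name to its ordered list of column names.
    """
    pairs = []
    for short_name, short_cols in indexes.items():
        for long_name, long_cols in indexes.items():
            if is_left_prefix(short_cols, long_cols):
                pairs.append((short_name, long_name))
    return sorted(pairs)

# Hypothetical secondary (non-unique) indexes, for illustration only.
example_secondary = {
    "idx_a": ["a"],
    "idx_a_b": ["a", "b"],
    "idx_b": ["b"],
}

print(find_left_prefix_redundancy(example_secondary))
```

Only `idx_a` is flagged, because `idx_b` does not start `idx_a_b`. The sketch should be fed plain secondary indexes only: a unique index that is a prefix of a longer index still enforces a constraint that the longer index does not.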

Both duplicate and redundant indexes waste disk space and slow down write operations. Each additional index requires additional disk space, and every insert, update, and delete has to maintain all of the affected indexes. Additionally, such indexes can make it harder for the query optimizer to choose the most efficient index, as it has more options to consider.

Test results

Now, I believe you have your list of duplicate keys ready. Let us see what our favorite pt-duplicate-key-checker tells us about the indexes of the table, along with the reasons why they are considered duplicate or redundant.

The pt-duplicate-key-checker notes nine duplicate indexes. Could you identify all nine of them? If so, you surely have a good command of database schema design. But I wouldn’t write a blog post just to test your compatibility with pt-duplicate-key-checker.

There is one more duplicate key that pt-duplicate-key-checker is missing; could you identify it? If so, I encourage you to apply at Percona and give me an opportunity to work with smarter brains.

The duplicate unique keys

For those who couldn’t identify the duplicate index, the unidentified duplicate keys are… (drum roll)…

It follows logically that if a tuple {a, b} is unique, then {b, a} will also be unique. Similar to how Peter Parker is to Spiderman and Gangadhar is to Shaktiman, the set {a, b} is equivalent to the set {b, a}. Defining unique keys on both column orders therefore makes MySQL enforce the same uniqueness check twice.

Therefore, having an additional duplicate constraint defined on the same set of columns becomes unnecessary regardless of order. This is specifically true for two-column unique keys only. To optimize your database, you should consider dropping the second unique key or converting it to a secondary index if it is required.
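The reasoning can be checked with a small Python sketch: a set of rows that is unique on (a, b) is necessarily unique on (b, a) too, and a violation of one is a violation of the other. The helper name and the sample rows are hypothetical.

```python
def is_unique_on(rows, columns):
    """True if no two rows share the same values for `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical rows: no (a, b) pair repeats.
unique_rows = [
    {"a": 1, "b": 2},
    {"a": 2, "b": 1},
    {"a": 1, "b": 3},
]
# Appending a repeated (a, b) pair breaks uniqueness in both column orders.
clashing_rows = unique_rows + [{"a": 1, "b": 2}]

print(is_unique_on(unique_rows, ["a", "b"]), is_unique_on(unique_rows, ["b", "a"]))
print(is_unique_on(clashing_rows, ["a", "b"]), is_unique_on(clashing_rows, ["b", "a"]))
```

Both checks agree in every case, which is why the second unique constraint adds no protection. The column order still matters for lookups, which is the reason to keep the second key as a plain secondary index if queries rely on it.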

Since you cannot go on and read all table definitions, I wrote a query for you to identify duplicate unique indexes:

Also, don’t forget to provide your opinion in the comments section: Should the non-identification issue with pt-duplicate-key-checker be considered a bug report or a feature request?

Conclusion

Percona’s pt-duplicate-key-checker is an amazing tool, but like every other tool, it is not “fool-proof.” When you create indexes, check them for duplication yourself.



1 Comment
Fernando Mario

Very interesting and useful blog, Kedar, keep going!