The ‘E11000 duplicate key error’ is an error you may have encountered during a restore. In this blog, we look at a scenario in which it appears while replaying the Oplog for PITR (point-in-time recovery).

You might wonder why this error occurs during PITR, since the operations in the Oplog are idempotent, meaning they always result in the same change to the database no matter how many times they’re performed. Let’s look at a scenario in which you can still hit this error while applying the Oplog.

I created a collection, book, with four documents and a unique compound index. Per the application logic, a document is first inserted, then updated, and then deleted. When a new document is inserted afterwards, it carries the same values in the keys covered by the unique index.

Index:

The application logic inserts, updates, deletes, and then re-inserts a document with the same values in the number and author keys, on which the unique index is created. We have already inserted four docs below, and next we will update one of them.

  1. First insert:


    Corresponding ops in Oplog:

  2. After update:


    Corresponding op in Oplog:

  3. After remove:


    Corresponding op in Oplog:

  4. Below we will insert a new document:


    Corresponding op in Oplog:

  5. Take a mongodump of the london database:

  6. We will again insert a new doc:


    Corresponding op in Oplog:

  7. Take incremental Oplog backup.

    After first Oplog dump:


    Above is the last document of the first Oplog backup, i.e., up to {"$timestamp":{"t":1679561151,"i":2}}.

    After second Oplog dump (incremental):

    Above is the first document of the second (incremental) Oplog backup, i.e., {"$timestamp":{"t":1679561151,"i":3}}.

     

  8. Drop database:

    Corresponding op in Oplog:

  9. First, we will restore the database from the dump we took in step five:

    The restored documents above match the state at step five, when the dump of the london database was taken:

    Corresponding ops in Oplog after restore from dump:

  10. Now we will check the Oplog backups to see up to which document the data has been recovered and from which Oplog file we need to apply ops. The documents in the first Oplog backup were already restored in step nine. To verify, below is the last op entry in the first Oplog backup:

    For PITR, we now need to replay the second Oplog backup up to just before the drop database op (we already have its time from step eight). We can see that the op below is already available, but we cannot split the BSON file by time or by op, so we need to apply the full Oplog slice:

    The Oplog replay failed on the unique index constraint, because a document with { number: 4.0, author: "Graham" } is already present in the database:

    Below are the ops from the incremental Oplog backup slice associated with { number: 4.0, author: "Graham" }. The first op below is an insert with a different _id ("o":{"_id":{"$oid":"641c11c1d0495f80ac5e610f"}), which was inserted at the beginning. When the replay tries to apply this op, it finds that a document with a different _id already holds { number: 4.0, author: "Graham" }, so it cannot apply the op without violating the unique index. The Oplog replay fails, and the PITR with it.
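The failure mode described above can be reproduced without a database. The following is a minimal sketch in plain Python, not MongoDB code: `ToyCollection`, `DuplicateKeyError`, the integer `ts` values, and the `old_id`/`new_id` labels are all hypothetical stand-ins, and the update step is left out for brevity. It shows that replaying the insert of the old document onto a restored state that already holds the re-inserted document trips the unique index.

```python
class DuplicateKeyError(Exception):
    """Stand-in for MongoDB's E11000 error in this toy model."""


class ToyCollection:
    """In-memory collection with a unique index on (number, author)."""

    def __init__(self):
        self.docs = {}  # maps _id -> document

    def _check_unique(self, doc):
        key = (doc["number"], doc["author"])
        for other_id, other in self.docs.items():
            same_key = (other["number"], other["author"]) == key
            if other_id != doc["_id"] and same_key:
                raise DuplicateKeyError(f"E11000 duplicate key: {key}")

    def apply(self, op):
        """Apply one simplified oplog entry: insert ('i') or delete ('d')."""
        if op["op"] == "i":
            self._check_unique(op["o"])
            self.docs[op["o"]["_id"]] = dict(op["o"])  # insert replayed as upsert by _id
        elif op["op"] == "d":
            self.docs.pop(op["o"]["_id"], None)


def replay(collection, ops):
    for op in ops:
        collection.apply(op)


# Insert, delete, then re-insert with the same indexed values but a new _id.
oplog = [
    {"ts": 1, "op": "i", "o": {"_id": "old_id", "number": 4.0, "author": "Graham"}},
    {"ts": 2, "op": "d", "o": {"_id": "old_id"}},
    {"ts": 3, "op": "i", "o": {"_id": "new_id", "number": 4.0, "author": "Graham"}},
]


def restored_from_dump():
    """State captured by a dump taken after ts 3: only new_id exists."""
    collection = ToyCollection()
    replay(collection, oplog)
    return collection


def replay_outcome(ops):
    """Replay ops onto the restored state and report what happened."""
    collection = restored_from_dump()
    try:
        replay(collection, ops)
        return "ok"
    except DuplicateKeyError as exc:
        return str(exc)


full_slice = replay_outcome(oplog)
only_newer = replay_outcome([op for op in oplog if op["ts"] > 3])
```

In this model, `full_slice` holds the duplicate key message raised by the very first op, while `only_newer` is `"ok"` because no op older than the dump is replayed.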

     

There are two solutions for the above issue:

    1. Make sure the incremental Oplog backup contains only ops that come after the last op captured in the database backup.
    2. Configure Percona Backup for MongoDB (PBM) and let it handle the manual steps above (restoring the dump and applying the Oplog for PITR) automatically.
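To illustrate the first option, here is a small sketch of selecting only the ops that follow the last op in the base backup, using the (t, i) pair of each entry's timestamp for ordering. It assumes the Oplog entries have already been decoded into Python dicts in the extended JSON shape shown earlier. The helper names `ts_key` and `select_ops` are hypothetical, and only the timestamps with i 2 and i 3 come from this post; the other two entries are made up for the example.

```python
def ts_key(entry):
    """Return the (t, i) pair of an entry's timestamp, usable for ordering."""
    stamp = entry["ts"]["$timestamp"]
    return (stamp["t"], stamp["i"])


def select_ops(entries, last_in_backup, stop_before=None):
    """Keep entries strictly after last_in_backup and, if given, strictly before stop_before."""
    selected = []
    for entry in entries:
        key = ts_key(entry)
        if key <= last_in_backup:
            continue  # already contained in the base backup
        if stop_before is not None and key >= stop_before:
            continue  # at or past the PITR target, e.g. the drop
        selected.append(entry)
    return selected


entries = [
    {"ts": {"$timestamp": {"t": 1679561151, "i": 1}}, "op": "i"},  # hypothetical
    {"ts": {"$timestamp": {"t": 1679561151, "i": 2}}, "op": "i"},  # last op in first backup
    {"ts": {"$timestamp": {"t": 1679561151, "i": 3}}, "op": "i"},  # first op of incremental slice
    {"ts": {"$timestamp": {"t": 1679561200, "i": 1}}, "op": "c"},  # hypothetical drop command
]

after_backup = select_ops(entries, (1679561151, 2))
for_pitr = select_ops(entries, (1679561151, 2), stop_before=(1679561200, 1))
```

Comparing tuples keeps the ordering correct when several ops share the same second, as the two entries from this post do. For the upper bound, mongorestore's `--oplogLimit` option can stop the replay at a given timestamp; the lower bound is the part that has to be arranged by hand.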

To overcome the above issue, I configured PBM on the same replica set and took backups (both full and incremental Oplog). Here’s how to install, set up, and configure PBM.

I repeated steps one to six for PBM; below are the corresponding ops in the Oplog:

Below are the two full backups taken, plus the incremental Oplog backup:

Above, you can see that the latest backup covers data up to 2023-03-23T12:14:08, and the incremental Oplog backup extends to 2023-03-23T13:06:45.

Now we will drop the database:

Now we will restore the database and perform PITR using PBM:

Below are the logs for restore + PITR:

Below are the documents after restore + PITR via PBM:

Below are the Oplog entries after PBM restore, and we can see that PBM restored the relevant base backup first and started applying Oplog after the last op in the base backup.


Above, you can see that PBM has applied the latest backup and performed the PITR automatically. We didn’t face the ‘E11000 duplicate key error’ during PITR with PBM because PBM works out automatically from which Oplog entry it needs to start applying ops after the restore from a full backup. PBM ensures consistency when restoring a full backup plus an incremental Oplog backup.
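The idea of picking a base backup and then replaying only the Oplog that follows it can be sketched as below. This is an illustration of the behavior described in this post, not PBM's actual implementation: `plan_restore` is a hypothetical helper, and the earlier backup time and the PITR target time are made-up values. Only 12:14:08 and 13:06:45 come from the output above.

```python
from datetime import datetime


def plan_restore(base_backups, oplog_end, target):
    """Choose the newest full backup at or before target and the Oplog window to replay.

    base_backups: datetimes marking the last write covered by each full backup.
    oplog_end: datetime of the last op available in the Oplog backup.
    """
    if target > oplog_end:
        raise ValueError("target is beyond the available Oplog backup")
    candidates = [b for b in base_backups if b <= target]
    if not candidates:
        raise ValueError("no full backup precedes the target time")
    base = max(candidates)
    # Replay only ops after the base backup's last write, up to the target.
    return {"base": base, "replay_after": base, "replay_until": target}


backups = [
    datetime(2023, 3, 23, 11, 0, 0),   # hypothetical time of the earlier full backup
    datetime(2023, 3, 23, 12, 14, 8),  # latest full backup, from the output above
]
oplog_end = datetime(2023, 3, 23, 13, 6, 45)
target = datetime(2023, 3, 23, 13, 0, 0)  # hypothetical PITR target before the drop

plan = plan_restore(backups, oplog_end, target)
```

Because the replay window starts after the base backup's last write, ops that are already reflected in the restored data are never applied a second time, which is what avoids the duplicate key conflict in the manual procedure.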

Here’s how Percona Backup for MongoDB works.

Conclusion

As shown above, PBM avoids the ‘E11000 duplicate key error’ automatically. The other approach described above also works, but it requires a manual process. Why go with a manual process when PBM is open source, does not require any license, and can handle it automatically?

Please check out our products for Percona Server for MongoDB, Percona Backup for MongoDB, and Percona Operator for MongoDB. We also recommend checking out our blog MongoDB: Why Pay for Enterprise When Open Source Has You Covered?

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

 


2 Comments
Richard

You had me really hoping — but I’m using PBM to restore automatically and I’m seeing that exact duplicate key error when restoring my database, which had no collection or database drops 🙁 I’m running out of ideas on how to fix it.

Gautam

Could you please elaborate on your situation? This error depends entirely on the scenario.

  1. When you say you are using PBM to restore automatically, is that a logical or a physical backup?
  2. Are you trying to restore on the same replica set? If yes, does that replica set contain other databases or collections as well?
  3. Are you trying to restore a database or a single collection?