Backups are crucial for every database system, and next-generation database systems demand backups that are reliable, fast, and hot. Percona Backup for MongoDB (PBM) is a backup management tool that extends MongoDB’s existing backup capabilities with several backup types, including physical, logical, incremental, and point-in-time recovery (PITR).

In this blog post, we are going to see how to use this backup tool in MongoDB topologies such as replica sets and sharded clusters. For this demo, I have used a single machine and the mlaunch tool to build the required topologies.

PBM setup/usage for replica set

1) So, let’s assume we have already set up a three-node replica set.

2) Next, we will download the PBM tool from the official repo.

We can verify the installation and the PBM version with “pbm version”.

3) Here, we will enable/configure authentication in MongoDB for PBM usage by creating a dedicated role and user. The commands need to be executed on the primary node (localhost:27017).

 

Note: Most of the roles are built-in; however, we created one additional role (pbmAnyAction). The commands “db.getUsers()” and “db.getRoles()” can be used to verify that the user and the role were created.
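As a rough sketch of this step, the snippet here writes the role and user definitions to a script file and prints the mongosh command that would load it on the primary. The user name “pbmuser”, the password “secretpwd”, and the APPLY switch are placeholders assumed for illustration; the command is executed only when APPLY=1 is set.

```shell
# Sketch only: pbmuser/secretpwd are placeholder credentials, not values from this setup.
JS_FILE="$(mktemp)"
cat > "$JS_FILE" <<'EOF'
// Custom role that allows any action on any resource.
db.getSiblingDB("admin").createRole({
  role: "pbmAnyAction",
  privileges: [ { resource: { anyResource: true }, actions: [ "anyAction" ] } ],
  roles: []
});
// PBM user combining built-in roles with the custom role.
db.getSiblingDB("admin").createUser({
  user: "pbmuser",
  pwd: "secretpwd",
  roles: [
    { db: "admin", role: "readWrite" },
    { db: "admin", role: "backup" },
    { db: "admin", role: "clusterMonitor" },
    { db: "admin", role: "restore" },
    { db: "admin", role: "pbmAnyAction" }
  ]
});
EOF

# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }
run mongosh --port 27017 --file "$JS_FILE"
```

Running it as “APPLY=1 sh script.sh” would load the file on the node listening on port 27017.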

4) Now, we will configure the MongoDB connection URL for the pbm-agent process. We need to add the entries in the file [“/etc/sysconfig/pbm-agent”] for the local Mongo node.

Note: A pbm-agent process connects to its localhost mongod node with a standalone type of connection. Do not set up the agent to connect to the replica set URI.
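A minimal sketch of such an entry, assuming the placeholder credentials “pbmuser”/“secretpwd” and a local node on port 27017, could be built like this. The URI names a single host and carries no replica set parameter, which is what makes it a standalone-style connection.

```shell
# Placeholder credentials and port; adjust them to the local mongod node.
PBM_USER="pbmuser"
PBM_PASS="secretpwd"
NODE_PORT=27017

# Standalone-style URI: a single host and no replicaSet parameter.
AGENT_URI="mongodb://${PBM_USER}:${PBM_PASS}@localhost:${NODE_PORT}/?authSource=admin"

# Line to place in /etc/sysconfig/pbm-agent for this node.
echo "PBM_MONGODB_URI=\"${AGENT_URI}\""
```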

Further, we can persist the connection settings for the PBM client by defining them in the user’s [“~/.bashrc”] profile. Unlike the agent, the PBM client should connect to the replica set instead of a single local node.

Let’s apply this to the current session as well:
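The client-side setting might look like the following sketch. The replica set name “replset”, the member ports 27017 to 27019, and the credentials are assumptions for this single-machine demo, not values confirmed by the setup.

```shell
# Assumed values: replica set "replset", members on ports 27017-27019,
# and the placeholder credentials pbmuser/secretpwd.
PBM_CLIENT_URI="mongodb://pbmuser:secretpwd@localhost:27017,localhost:27018,localhost:27019/?authSource=admin&replicaSet=replset"

# Line to append to ~/.bashrc so that new sessions pick it up.
echo "export PBM_MONGODB_URI=\"${PBM_CLIENT_URI}\""

# Apply it to the current session.
export PBM_MONGODB_URI="${PBM_CLIENT_URI}"
```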

5) Next, we can define the PBM configuration and storage-related details in the file [“/etc/pbm_config.yaml”]. Here we are storing the backups on the local filesystem; however, we could also define cloud storage such as AWS S3 or Google Cloud Storage.

Note: Make sure the same directory is mounted at the same local path [“/home/backups”] on all servers.
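For filesystem storage, the configuration file can be as small as the sketch here. It writes to a scratch file unless PBM_CONFIG is set (for example, to /etc/pbm_config.yaml on a real host); the PBM_CONFIG variable is only a convenience assumed for this example.

```shell
# Write to a scratch file by default; set PBM_CONFIG=/etc/pbm_config.yaml on a real host.
PBM_CONFIG="${PBM_CONFIG:-$(mktemp)}"
cat > "$PBM_CONFIG" <<'EOF'
storage:
  type: filesystem
  filesystem:
    path: /home/backups
EOF

# Show the resulting configuration.
cat "$PBM_CONFIG"
```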

Then we can apply the changes with “pbm config --file /etc/pbm_config.yaml”.


6) Now, we will run the PBM agent process separately for all the Mongo nodes.
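On a single machine, one way to do this is to start one agent per mongod port in the background, as sketched here. The ports, credentials, and log file paths are assumptions; nothing is started unless APPLY=1 is set.

```shell
# Placeholder credentials; ports 27017-27019 are assumed for the three local nodes.
PBM_USER="pbmuser"
PBM_PASS="secretpwd"

for port in 27017 27018 27019; do
  uri="mongodb://${PBM_USER}:${PBM_PASS}@localhost:${port}/?authSource=admin"
  log="/tmp/pbm-agent.${port}.log"
  if [ "${APPLY:-0}" = "1" ]; then
    # One agent per node, detached from the terminal, with its own log file.
    nohup pbm-agent --mongodb-uri "$uri" > "$log" 2>&1 &
  else
    echo "would start: pbm-agent --mongodb-uri \"$uri\" > $log"
  fi
done
```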

Note: Since we are running the entire setup on a single server, we have started the PBM agents directly from the command line. However, in a real-world or production environment, we should use the proper service [“systemctl start pbm-agent”] to manage the agents.

Finally, we can verify that all our configurations look good and that the pbm-agents are connected by running “pbm status”. In a healthy setup, every node is listed along with its pbm-agent.


7) Now we are ready to take our first backup with a single PBM CLI command, “pbm backup”.

Let’s verify that the backup completed successfully by checking the backup storage path [“/home/backups”].

After the backup, the folder and files for it are generated there. By default, PBM performs a logical backup unless we specify the backup type with the “--type” option.

E.g.,
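A hedged sketch of choosing the type explicitly follows. The BACKUP_TYPE variable and the dry-run helper are assumptions added for illustration, and physical backups additionally require Percona Server for MongoDB.

```shell
# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

# PBM uses "logical" when --type is omitted; here we ask for a physical backup.
BACKUP_TYPE="${BACKUP_TYPE:-physical}"
run pbm backup --type="$BACKUP_TYPE"
```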

8) Now, if we want to restore any of the available backups, we can simply pass its name to the restore command: “pbm restore <backup_name>”.
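As a sketch, listing the backups and restoring one of them could look like this. The backup name is hypothetical (PBM names backups by their start timestamp), and the commands are only printed unless APPLY=1 is set.

```shell
# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

# Hypothetical backup name; take a real one from the "pbm list" output.
BACKUP_NAME="${BACKUP_NAME:-2023-01-01T00:00:00Z}"

run pbm list
run pbm restore "$BACKUP_NAME"
```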


Again, we can validate from the PBM CLI whether the restore completed successfully.


Next, we will see how we can perform similar activities in the sharded/distributed environment.

PBM setup/usage for sharding

1) So, here we have a sharded setup consisting of a config server replica set, two shards (shardA and shardB), and a router/mongos node.

2) Next, we will enable the authentication and create a user for PBM in each replica set (primary) instance, including the config servers. So, the user will be created in [config, shardA, and shardB] primary nodes only.

 

 

3) Here, we will configure the MongoDB connection URL for the pbm-agent process. We need to add the entries in the file [“/etc/sysconfig/pbm-agent“] for the local Mongo node, including the config node.

Further, we can persist those settings by defining them in the user’s [“~/.bashrc”] profile. In the case of sharded deployments, the PBM client should connect to the config server replica set.

4) Let’s define the PBM configuration and storage-related details in the file [“/etc/pbm_config.yaml”]. Here, too, we are storing the backups on local storage.

Then we can apply the changes with “pbm config --file /etc/pbm_config.yaml”.

5)  Now, we will run the PBM agent process separately for all the Mongo nodes (data and config).

Note: Since we are running the entire setup on a single server, we have started the PBM agents directly from the command line. However, in a real-world or production environment, we should use the proper service [“systemctl start pbm-agent”] to manage the agents.

6) Finally, we can verify that all our configurations look good and that the pbm-agents are connected by running “pbm status”.

7) Next, we can take the backup, but before that, let’s fill our sharded environment with some data so that we can verify the data distribution after the restore.
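One possible way to seed such data through the router is sketched here. The namespace “testdb.users”, the document count, and the mongos address are all hypothetical, and credentials would have to be added to MONGOS_URI once authentication is enabled. The script is only loaded when APPLY=1 is set.

```shell
# Assumed mongos address; include credentials here if authentication is enabled.
MONGOS_URI="${MONGOS_URI:-mongodb://localhost:27017}"

JS_FILE="$(mktemp)"
cat > "$JS_FILE" <<'EOF'
// testdb.users is a hypothetical namespace for this demo.
sh.enableSharding("testdb");
sh.shardCollection("testdb.users", { _id: "hashed" });

// Insert sample documents; the hashed key spreads them across the shards.
const docs = [];
for (let i = 0; i < 10000; i++) { docs.push({ _id: i, name: "user" + i }); }
db.getSiblingDB("testdb").users.insertMany(docs);

// Print how the documents are distributed per shard.
db.getSiblingDB("testdb").users.getShardDistribution();
EOF

# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }
run mongosh "$MONGOS_URI" --file "$JS_FILE"
```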

So, we now have some data on both shard01 and shard02.

8) Finally, let’s take a backup.

 

9) Again, if we want to restore the backup, we can just execute the simple command (pbm restore …). Let’s first clean the existing data so that we can later verify the fresh restore.

Let’s restore the backup now.

So, if we again connect to the router/mongos node, we can see the database is successfully restored now.

Monitoring/investigating PBM

There are a few ways by which we can investigate/monitor the PBM activity or logs for the backup/restore process.
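A few commands that can help here are sketched in this snippet, under the assumption that the PBM client is already configured. The journalctl line applies only when the agents run as a systemd service, and the dry-run helper is an addition for illustration.

```shell
# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

# Overall view: agents, running operations, and backup snapshots.
run pbm status

# Last 20 log entries related to backup events.
run pbm logs --tail=20 --event=backup

# Only error-level entries.
run pbm logs --severity=E

# Agent logs when pbm-agent is managed by systemd.
run journalctl -u pbm-agent
```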

Physical vs. logical backup

Physical backup is the copying of physical/disk files from the Percona Server for MongoDB (PSMDB) data directory to the backup storage. During a restore, the pbm-agents shut down the mongod nodes, clean up the data directory, and copy the physical files back from the storage.

Logical backup denotes copying of the database data via a logical dump tool (mongodump). A pbm-agent connects to the database, retrieves the data, and writes it to the storage. During a restore, the pbm-agent retrieves the data from the storage location and inserts it on every primary node in the cluster. The remaining nodes receive the data through replication.

Unfortunately, the MongoDB Community edition does not support hot/physical backups, so only logical backups are possible there.

After a physical restore in particular, we might have to perform the following additional steps:

  • Restart all mongod nodes and pbm-agents.
  • Resync the backup list from the storage using “pbm config --force-resync --file /etc/pbm_config.yaml”.
  • Start the balancer and the mongos node.
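The steps above can be sketched as follows. The systemd service names, the mongos address, and the dry-run helper are assumptions; on the single-machine demo the processes would be restarted from the command line instead, and starting the mongos process itself depends on how it is managed.

```shell
# Print the command; execute it only when APPLY=1 (dry run by default).
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

PBM_CONFIG="${PBM_CONFIG:-/etc/pbm_config.yaml}"
MONGOS_URI="${MONGOS_URI:-mongodb://localhost:27017}"   # assumed mongos address

# 1. Restart the mongod nodes and the pbm-agents (repeat on every server).
run sudo systemctl restart mongod
run sudo systemctl restart pbm-agent

# 2. Resync the backup list from the storage.
run pbm config --force-resync --file "$PBM_CONFIG"

# 3. Once mongos is running again, start the balancer.
run mongosh "$MONGOS_URI" --eval 'sh.startBalancer()'
```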

Note: By default, PBM picks the node that runs the backup through an election that prefers secondary nodes; if no secondary responds, the backup is initiated on the primary. We can also control the election behaviour by defining a priority for the Mongo nodes.
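As an illustration of such priorities, the fragment written here could be merged into the PBM configuration file. The host:port entries are hypothetical and must match the node names of the actual deployment; a node with a higher value is more likely to be chosen for the backup.

```shell
# Write the fragment to a scratch file; merge it into the PBM configuration by hand.
PRIORITY_FRAGMENT="$(mktemp)"
cat > "$PRIORITY_FRAGMENT" <<'EOF'
backup:
  priority:
    "localhost:27018": 2.0
    "localhost:27019": 1.0
    "localhost:27017": 0.5
EOF

# Show the fragment.
cat "$PRIORITY_FRAGMENT"
```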

Conclusion

In this blog post, we explored how simple and convenient it is to perform backup and restore tasks using PBM in replica set and sharding topologies. PBM simplifies the whole process in such complex topologies, where other logical options such as mongodump might not be ideal. Check out part two, covering more backup options and some other areas of PBM.

Percona Distribution for MongoDB is a source-available alternative for enterprise MongoDB. A bundling of Percona Server for MongoDB and Percona Backup for MongoDB, Percona Distribution for MongoDB combines the best and most critical enterprise components from the open source community into a single feature-rich and freely available solution.

 

