Using Percona Backup for MongoDB in Replica Set and Sharding Environments: Part One

Backups are crucial for every database system, and a reliable, fast, hot backup capability is a core requirement for modern database deployments. Percona Backup for MongoDB (PBM) is a backup management tool that extends the existing backup capability of MongoDB by providing several kinds of backups, such as physical, logical, incremental, and point-in-time recovery (PITR).

In this blog post, we are going to see how to use this backup tool in MongoDB topologies such as replica sets and sharded clusters. For this demo, I used a single machine and the mlaunch tool to build the required topologies.

PBM setup/usage for replica set

1) Let’s assume we have already set up a three-node replica set.

localhost:27017
localhost:27018
localhost:27019

2) Next, we will install PBM from the official Percona repository.

shell> sudo yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm
shell> sudo percona-release enable pbm release
shell> sudo yum install percona-backup-mongodb

We can verify the installation and the PBM version as below.

shell> pbm version
Version:   2.3.1
Platform:  linux/amd64
GitCommit: 8c4265cfb2d9a7581b782a829246d8fcb6c7d655
GitBranch: release-2.3.1
BuildTime: 2023-11-29_13:30_UTC
GoVersion: go1.19

3) Here, we will configure authentication in MongoDB for PBM usage. The commands below need to be executed on the primary node (localhost:27017).

replset:PRIMARY> use admin;
switched to db admin
replset:PRIMARY>
replset:PRIMARY> db.getSiblingDB("admin").createRole({ "role": "pbmAnyAction",
...       "privileges": [
...          { "resource": { "anyResource": true },
...            "actions": [ "anyAction" ]
...          }
...       ],
...       "roles": []
...    });
{
"role" : "pbmAnyAction",
"privileges" : [
{
"resource" : {
"anyResource" : true
},
"actions" : [
"anyAction"
]
}
],
"roles" : [ ]
}
replset:PRIMARY>

replset:PRIMARY> db.getSiblingDB("admin").createUser({user: "pbmuser",
...        "pwd": "pbmuser",
...        "roles" : [
...           { "db" : "admin", "role" : "readWrite", "collection": "" },
...           { "db" : "admin", "role" : "backup" },
...           { "db" : "admin", "role" : "clusterMonitor" },
...           { "db" : "admin", "role" : "restore" },
...           { "db" : "admin", "role" : "pbmAnyAction" }
...        ]
...     });
Successfully added user: {
"user" : "pbmuser",
"roles" : [
{
"db" : "admin",
"role" : "readWrite",
"collection" : ""
},
{
"db" : "admin",
"role" : "backup"
},
{
"db" : "admin",
"role" : "clusterMonitor"
},
{
"db" : "admin",
"role" : "restore"
},
{
"db" : "admin",
"role" : "pbmAnyAction"
}
]
}

Note: Some of these roles are built in; however, we created one additional role (pbmAnyAction). The commands “db.getUsers()” and “db.getRoles()” can be used to verify that the user and the role were created.

4) Now, we will configure the MongoDB connection URI for the pbm-agent process. On each node, we add the entry for that node’s local mongod to the file [“/etc/sysconfig/pbm-agent”]. The three entries below correspond to the three nodes, one per agent.

PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27017/?authSource=admin"
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27018/?authSource=admin"
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27019/?authSource=admin"

Note: A pbm-agent process connects to its localhost mongod node with a standalone type of connection. Do not set up the agent to connect to the replica set URI.

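Further, we can persist the connection setting for the PBM client (CLI) by exporting it in the user’s [“~/.bashrc”] profile. Unlike the agents, the PBM client should connect to the replica set rather than to a single local node. Note that each export of the same variable overrides the previous one, so only the last line takes effect in a given shell.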

export PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27017/?authSource=admin&replSetName=replset"
export PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27018/?authSource=admin&replSetName=replset"
export PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27019/?authSource=admin&replSetName=replset"

Let’s apply this to the current session as well:

source ~/.bashrc

5) Next, we can define the PBM configuration and storage-related details in the file [“/etc/pbm_config.yaml”]. Here we are storing the backup on the local filesystem; however, we could instead define cloud storage such as AWS S3 or Google Cloud Storage.

storage:
  type: filesystem
  filesystem:
    path: /home/backups

Note: Please make sure the same directory is mounted at the same local path [“/home/backups”] on all servers.
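
Before applying the configuration, it can help to confirm on every server that the backup path exists and is writable. The following is a minimal sketch, assuming a POSIX-like shell; check_backup_dir is a hypothetical helper written for this post, not a PBM command.

```shell
# Sketch: verify that a backup directory exists and is writable.
# check_backup_dir is a hypothetical helper, not part of PBM.
check_backup_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # Create and remove a marker file to prove the directory is writable.
  local probe="$dir/.pbm_write_probe.$$"
  if ! touch "$probe" 2>/dev/null; then
    echo "not writable: $dir"
    return 1
  fi
  rm -f "$probe"
  echo "ok: $dir"
}
```

Running check_backup_dir /home/backups as the user that runs the pbm-agent on each server gives a quick sanity check before the first backup.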

Then we can apply the configuration as below.

shell> pbm config --file /etc/pbm_config.yaml

Output

pitr:
  enabled: false
  oplogSpanMin: 0
  compression: s2
storage:
  type: filesystem
  filesystem:
    path: /home/backups
backup:
  compression: s2

6) Now, we will run the PBM agent process separately for all the Mongo nodes.

shell> nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27017/" > /tmp/pbm-agent.27017.log 2>&1 &
shell> nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27018/" > /tmp/pbm-agent.27018.log 2>&1 &
shell> nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27019/" > /tmp/pbm-agent.27019.log 2>&1 &

Note: Since we are running the entire setup on a single server, we used the above command-line option to run the PBM agents. In a real-world or production deployment, however, we should use the proper service [“systemctl start pbm-agent”] to manage the agents.
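
For a single-machine demo like this one, the three agent commands can be wrapped in a small loop. This is only a sketch: start_agents and the DRY_RUN switch are hypothetical names, while the URI, credentials, and log paths are the ones used above.

```shell
# Sketch: start one pbm-agent per local mongod port (single-machine demo only).
# start_agents and DRY_RUN are hypothetical names used for illustration.
start_agents() {
  local port uri log
  for port in "$@"; do
    uri="mongodb://pbmuser:pbmuser@localhost:${port}/"
    log="/tmp/pbm-agent.${port}.log"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Only print the command that would be executed.
      echo "nohup pbm-agent --mongodb-uri \"${uri}\" > ${log} 2>&1 &"
    else
      nohup pbm-agent --mongodb-uri "${uri}" > "${log}" 2>&1 &
    fi
  done
}
```

For example, DRY_RUN=1 start_agents 27017 27018 27019 prints the three commands without starting anything, and dropping DRY_RUN launches the agents in the background.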

6) Finally, we can verify that all our configurations look good and that the pbm-agents connected fine. The output below looks healthy.

shell> pbm status

Output

Cluster:
========
replset:
  - replset/localhost:27017 [P]: pbm-agent v2.3.1 OK
  - replset/localhost:27018 [S]: pbm-agent v2.3.1 OK
  - replset/localhost:27019 [S]: pbm-agent v2.3.1 OK


PITR incremental backup:
========================
Status [OFF]

Currently running:
==================
(none)

Backups:
========
FS  /home/backups
  (none)
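
The health check can also be scripted. The sketch below reads the text of “pbm status” on standard input and lists any agent whose line does not end in OK; agents_not_ok is a hypothetical helper, and it relies only on the line layout shown in the output above.

```shell
# Sketch: read `pbm status` text on stdin and print agents that are not OK.
# agents_not_ok is a hypothetical helper based on the layout shown above.
agents_not_ok() {
  # Agent lines look like: "- replset/localhost:27017 [P]: pbm-agent v2.3.1 OK"
  awk '/pbm-agent/ && $NF != "OK" { print $2; bad = 1 } END { exit bad ? 1 : 0 }'
}
```

A typical use would be pbm status | agents_not_ok && echo "all agents OK", which prints the message only when every agent line ends in OK.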

7) We are now ready to take our first backup by running a single command via the PBM CLI.

shell> pbm backup
Starting backup '2024-02-10T04:50:18Z'....Backup '2024-02-10T04:50:18Z' to remote store '/home/backups' has started

Let’s verify if the backup was completed successfully.

shell> pbm list

Backup snapshots:
  2024-02-10T04:50:18Z <logical> [restore_to_time: 2024-02-10T04:50:22Z]

shell> ls -lh /home/backups/
total 20K
drwxr-xr-x. 3 vagrant vagrant   21 Feb 10 04:50 2024-02-10T04:50:18Z
-rw-r--r--. 1 vagrant vagrant 1.7K Feb 10 04:50 2024-02-10T04:50:18Z.pbm.json
drwxr-xr-x. 3 vagrant vagrant   21 Feb 10 04:52 2024-02-10T04:52:35Z
-rw-r--r--. 1 vagrant vagrant  16K Feb 10 04:52 2024-02-10T04:52:35Z.pbm.json

We can see the folders/files generated for the backups (the listing also includes the physical backup taken in the next example). By default, PBM performs a logical backup unless we specify the “--type” of backup.

E.g.,

shell> pbm backup --type=physical

shell> pbm list
Backup snapshots:
 2024-02-10T04:50:18Z <logical> [restore_to_time: 2024-02-10T04:50:22Z]
 2024-02-10T04:52:35Z <physical> [restore_to_time: 2024-02-10T04:52:37Z]
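
When scripting restores, the most recent snapshot name can be pulled out of this listing. The sketch below assumes the snapshots are listed in chronological order, as in the output above; latest_backup is a hypothetical helper.

```shell
# Sketch: read `pbm list` text on stdin and print the last snapshot name.
# latest_backup is a hypothetical helper; it assumes chronological ordering.
latest_backup() {
  # Snapshot lines start with a UTC timestamp such as 2024-02-10T04:50:18Z.
  awk '$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9:]+Z$/ { name = $1 }
       END { if (name != "") print name }'
}
```

For instance, pbm list | latest_backup would print 2024-02-10T04:52:35Z for the listing above.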

8) Now, if we want to restore any of the backups from that list, we can simply execute the command below.

shell> pbm restore 2024-02-10T04:50:18Z
Starting restore 2024-02-10T04:56:13.099066243Z from '2024-02-10T04:50:18Z'...Restore of the snapshot from '2024-02-10T04:50:18Z' has started

We can then check whether the restore completed successfully with the command below.

shell> pbm logs --event=restore

Output

2024-02-10T04:56:19Z I [replset/localhost:27017] [restore/2024-02-10T04:56:13.099066243Z] restoring indexes for admin.system.roles: role_1_db_1
2024-02-10T04:56:19Z I [replset/localhost:27017] [restore/2024-02-10T04:56:13.099066243Z] restoring indexes for admin.pbmOpLog: opid_1_replset_1
2024-02-10T04:56:19Z I [replset/localhost:27017] [restore/2024-02-10T04:56:13.099066243Z] restoring indexes for admin.pbmPITRChunks: rs_1_start_ts_1_end_ts_1, start_ts_1_end_ts_1
2024-02-10T04:56:19Z I [replset/localhost:27017] [restore/2024-02-10T04:56:13.099066243Z] restoring indexes for admin.pbmBackups: name_1, start_ts_1_status_1
2024-02-10T04:56:20Z I [replset/localhost:27017] [restore/2024-02-10T04:56:13.099066243Z] recovery successfully finished
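
The final “recovery successfully finished” line is the one to look for. As a rough sketch, the check can be automated by scanning the log text for that message under the restore name printed by “pbm restore”; restore_succeeded is a hypothetical helper and depends on the log wording shown above.

```shell
# Sketch: read `pbm logs --event=restore` text on stdin and succeed only if the
# given restore name logged "recovery successfully finished".
# restore_succeeded is a hypothetical helper based on the log lines shown above.
restore_succeeded() {
  awk -v tag="[restore/$1]" '
    index($0, tag) && /recovery successfully finished/ { ok = 1 }
    END { exit ok ? 0 : 1 }
  '
}
```

An example call would be pbm logs --event=restore | restore_succeeded 2024-02-10T04:56:13.099066243Z, whose exit status can then drive a script or an alert.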

Next, we will see how we can perform similar activities in the sharded/distributed environment.

PBM setup/usage for sharding

1) Here we have a sharded setup with the nodes below.

localhost:27017 mongos
localhost:27022 config (configRepl)
localhost:27018 & localhost:27019  shardA (shard01)
localhost:27020 & localhost:27021  shardB (shard02)

2) Next, we will enable authentication and create a user for PBM on the primary of each replica set, including the config server replica set. So, the user will be created on the [config, shardA, and shardB] primary nodes only.

configRepl:PRIMARY> db.getSiblingDB("admin").createRole({ "role": "pbmAnyAction",
...       "privileges": [
...          { "resource": { "anyResource": true },
...            "actions": [ "anyAction" ]
...          }
...       ],
...       "roles": []
...    });
{
"role" : "pbmAnyAction",
"privileges" : [
{
"resource" : {
"anyResource" : true
},
"actions" : [
"anyAction"
]
}
],
"roles" : [ ]
}
configRepl:PRIMARY>
configRepl:PRIMARY>
configRepl:PRIMARY> db.getSiblingDB("admin").createUser({user: "pbmuser",
...        "pwd": "pbmuser",
...        "roles" : [
...           { "db" : "admin", "role" : "readWrite", "collection": "" },
...           { "db" : "admin", "role" : "backup" },
...           { "db" : "admin", "role" : "clusterMonitor" },
...           { "db" : "admin", "role" : "restore" },
...           { "db" : "admin", "role" : "pbmAnyAction" }
...        ]
...     });
Successfully added user: {
"user" : "pbmuser",
"roles" : [
{
"db" : "admin",
"role" : "readWrite",
"collection" : ""
},
{
"db" : "admin",
"role" : "backup"
},
{
"db" : "admin",
"role" : "clusterMonitor"
},
{
"db" : "admin",
"role" : "restore"
},
{
"db" : "admin",
"role" : "pbmAnyAction"
}
]
}

shard01:PRIMARY> db.getSiblingDB("admin").createRole({ "role": "pbmAnyAction",
...       "privileges": [
...          { "resource": { "anyResource": true },
...            "actions": [ "anyAction" ]
...          }
...       ],
...       "roles": []
...    });
{
"role" : "pbmAnyAction",
"privileges" : [
{
"resource" : {
"anyResource" : true
},
"actions" : [
"anyAction"
]
}
],
"roles" : [ ]
}
shard01:PRIMARY>
shard01:PRIMARY> db.getSiblingDB("admin").createUser({user: "pbmuser",
...        "pwd": "pbmuser",
...        "roles" : [
...           { "db" : "admin", "role" : "readWrite", "collection": "" },
...           { "db" : "admin", "role" : "backup" },
...           { "db" : "admin", "role" : "clusterMonitor" },
...           { "db" : "admin", "role" : "restore" },
...           { "db" : "admin", "role" : "pbmAnyAction" }
...        ]
...     });
Successfully added user: {
"user" : "pbmuser",
"roles" : [
{
"db" : "admin",
"role" : "readWrite",
"collection" : ""
},
{
"db" : "admin",
"role" : "backup"
},
{
"db" : "admin",
"role" : "clusterMonitor"
},
{
"db" : "admin",
"role" : "restore"
},
{
"db" : "admin",
"role" : "pbmAnyAction"
}
]
}

shard02:PRIMARY> db.getSiblingDB("admin").createRole({ "role": "pbmAnyAction",
...       "privileges": [
...          { "resource": { "anyResource": true },
...            "actions": [ "anyAction" ]
...          }
...       ],
...       "roles": []
...    });
{
"role" : "pbmAnyAction",
"privileges" : [
{
"resource" : {
"anyResource" : true
},
"actions" : [
"anyAction"
]
}
],
"roles" : [ ]
}
shard02:PRIMARY>
shard02:PRIMARY>
shard02:PRIMARY>
shard02:PRIMARY> db.getSiblingDB("admin").createUser({user: "pbmuser",
...        "pwd": "pbmuser",
...        "roles" : [
...           { "db" : "admin", "role" : "readWrite", "collection": "" },
...           { "db" : "admin", "role" : "backup" },
...           { "db" : "admin", "role" : "clusterMonitor" },
...           { "db" : "admin", "role" : "restore" },
...           { "db" : "admin", "role" : "pbmAnyAction" }
...        ]
...     });
Successfully added user: {
"user" : "pbmuser",
"roles" : [
{
"db" : "admin",
"role" : "readWrite",
"collection" : ""
},
{
"db" : "admin",
"role" : "backup"
},
{
"db" : "admin",
"role" : "clusterMonitor"
},
{
"db" : "admin",
"role" : "restore"
},
{
"db" : "admin",
"role" : "pbmAnyAction"
}
]
}

3) Here, we will configure the MongoDB connection URI for the pbm-agent process. On each node, including the config node, we add the entry for the local Mongo node to the file [“/etc/sysconfig/pbm-agent”].

#config
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27022/?authSource=admin"

#shardA
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27018/?authSource=admin"
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27019/?authSource=admin"

#shardB
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27020/?authSource=admin"
PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27021/?authSource=admin"

Further, we can persist the PBM client setting by defining it in the user’s [“~/.bashrc”] profile. In the case of sharded deployments, the PBM client should connect to the config server replica set.

#config
export PBM_MONGODB_URI="mongodb://pbmuser:pbmuser@localhost:27022/?authSource=admin&replicaSet=configRepl"

source ~/.bashrc

4) Let’s define the PBM configuration and storage-related details in the file [“/etc/pbm_config.yaml”]. Here we are storing the backup on the local filesystem.

storage:
  type: filesystem
  filesystem:
    path: /home/backups

Then we can apply the configuration as below.

shell> pbm config --file /etc/pbm_config.yaml

pitr:
  enabled: false
  oplogSpanMin: 0
  compression: s2
storage:
  type: filesystem
  filesystem:
    path: /home/backups
backup:
  compression: s2

5) Now, we will run the PBM agent process separately for all the Mongo nodes (data and config).

#config
nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27022/" > /tmp/pbm-agent.27022.log 2>&1 &

#shardA
nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27018/" > /tmp/pbm-agent.27018.log 2>&1 &
nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27019/" > /tmp/pbm-agent.27019.log 2>&1 &

#shardB
nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27020/" > /tmp/pbm-agent.27020.log 2>&1 &
nohup pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27021/" > /tmp/pbm-agent.27021.log 2>&1 &

Note: Since we are running the entire setup on a single server, we used the above command-line option to run the PBM agents. In a real-world or production deployment, however, we should use the proper service [“systemctl start pbm-agent”] to manage the agents.

6) Finally, we can verify that all our configurations look good and that the pbm-agents connected fine.

shell> pbm status

Cluster:
========
configRepl:
  - configRepl/localhost:27022 [P]: pbm-agent v2.3.1 OK
shard02:
  - shard02/localhost:27020 [P]: pbm-agent v2.3.1 OK
  - shard02/localhost:27021 [S]: pbm-agent v2.3.1 OK
shard01:
  - shard01/localhost:27018 [P]: pbm-agent v2.3.1 OK
  - shard01/localhost:27019 [S]: pbm-agent v2.3.1 OK


PITR incremental backup:
========================
Status [OFF]

Currently running:
==================
(none)

Backups:
========
FS  /home/backups
  (none)

7) Next, we can take the backup, but before that, let’s fill our sharded environment with some data so that we can verify the data distribution after the restore.

mongos> sh.enableSharding("test")
mongos> sh.shardCollection("test.users", { "user_id": "hashed" } )
mongos> use test
mongos> for (var i = 1; i <= 30000; i++) db.users.insert( { user_id: "user" + i, created_at: new Date() } )

mongos> db.users.getShardDistribution()

Shard shard01 at shard01/localhost:27018,localhost:27019
data : 946KiB docs : 15001 chunks : 2
estimated data per chunk : 473KiB
estimated docs per chunk : 7500

Shard shard02 at shard02/localhost:27020,localhost:27021
data : 946KiB docs : 14999 chunks : 2
estimated data per chunk : 473KiB
estimated docs per chunk : 7499

Totals
data : 1.84MiB docs : 30000 chunks : 4
Shard shard01 contains 50% data, 50% docs in cluster, avg obj size on shard : 64B
Shard shard02 contains 49.99% data, 49.99% docs in cluster, avg obj size on shard : 64B
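
To compare the distribution before and after a restore, the per-shard document counts can be reduced to one line per shard. This is a sketch that parses the text layout shown above; shard_doc_counts is a hypothetical helper, and it assumes the output has been captured or piped as plain text.

```shell
# Sketch: read getShardDistribution() text on stdin and print "<shard> <docs>".
# shard_doc_counts is a hypothetical helper based on the layout shown above.
shard_doc_counts() {
  awk '
    # Header line: "Shard shard01 at shard01/localhost:27018,localhost:27019"
    $1 == "Shard" && $3 == "at" { shard = $2 }
    # Detail line: "data : 946KiB docs : 15001 chunks : 2"
    $1 == "data" && shard != "" {
      for (i = 1; i < NF; i++) if ($i == "docs") print shard, $(i + 2)
      shard = ""
    }
  '
}
```

Saving the result before the backup and again after the restore, then comparing the two with diff, gives a simple check that each shard holds the same number of documents.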

So, we now have some data on both shard01 and shard02.

8) Finally, let’s take a backup.

shell> pbm backup
Starting backup '2024-02-10T06:08:15Z'....Backup '2024-02-10T06:08:15Z' to remote store '/home/backups' has started

shell> pbm list

Backup snapshots:
  2024-02-10T06:08:15Z <logical> [restore_to_time: 2024-02-10T06:08:20Z]

9) Again, if we want to restore the backup, we can just execute the simple command (pbm restore …). Let’s first clean the existing data so that we can later verify the fresh restore.

shell> mongo --port 27017

mongos> use test
switched to db test
mongos> db.dropDatabase()
{
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1707545529, 41),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1707545529, 39)
}

mongos> show dbs
admin   0.001GB
config  0.004GB

Let’s restore the backup now.

shell> pbm restore 2024-02-10T06:08:15Z
Starting restore 2024-02-10T06:14:09.284538667Z from '2024-02-10T06:08:15Z'...Restore of the snapshot from '2024-02-10T06:08:15Z' has started

So, if we again connect to the router/mongos node, we can see the database is successfully restored now.

mongos> show dbs
admin   0.001GB
config  0.003GB
test    0.002GB

mongos> use test
switched to db test

mongos> show collections
users

mongos> db.users.count()
30000

Monitoring/investigating PBM

There are a few ways to investigate/monitor the PBM activity or logs for the backup/restore process.

shell> pbm logs ### show all log details.
shell> pbm logs --event=backup ### show log details specific to backup
shell> pbm logs --event=restore ### show log details specific to restore
shell> journalctl -u pbm-agent.service ### to check the agent related events
shell> pbm describe-backup backup_name ### to check particular backup related details.
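
Alongside these commands, a small polling helper can tell when a backup or restore has finished. The sketch below treats PBM as idle when the “Currently running” section of “pbm status” shows “(none)”, as in the status output earlier in this post; pbm_is_idle and wait_for_idle are hypothetical helpers, and the retry count and delay are arbitrary defaults.

```shell
# Sketch: detect whether PBM is idle from `pbm status` text, and poll until it is.
# pbm_is_idle and wait_for_idle are hypothetical helpers, not PBM commands.
pbm_is_idle() {
  # Succeeds when the first entry under "Currently running:" is "(none)".
  awk '
    /^Currently running:/ { in_section = 1; next }
    in_section && /^=+$/  { next }
    in_section && NF > 0  { idle = ($1 == "(none)"); exit }
    END { exit idle ? 0 : 1 }
  '
}

wait_for_idle() {
  # Usage: wait_for_idle [tries] [delay_seconds]
  local tries="${1:-30}" delay="${2:-2}"
  while [ "$tries" -gt 0 ]; do
    if pbm status | pbm_is_idle; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

For example, running pbm backup followed by wait_for_idle 60 5 would block for up to five minutes until no operation is reported as running.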

Physical vs. logical backup

A physical backup is a copy of the physical/disk files from Percona Server for MongoDB (PSMDB). When performing a restore, the pbm-agents shut down the mongod nodes, clean up the data directory, and copy the physical files from the storage.

A logical backup is a copy of the database data taken via a logical dump tool (mongodump). A pbm-agent connects to the database, retrieves the data, and writes it to the storage. During restoration, the pbm-agent retrieves the data from the storage location and inserts it on every primary node in the cluster. The remaining nodes receive the data through replication.

E.g.,

shell> pbm backup --type=physical

Unfortunately, MongoDB Community Edition does not support hot/physical backups, so only logical backups are possible there.

Especially in the case of physical backup restorations, we might have to perform some additional steps mentioned below.

Restart all mongod nodes and pbm-agents.
Resync the backup list from the storage using “pbm config --force-resync --file /etc/pbm_config.yaml”.
Start the balancer and the mongos node.
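
The steps above could be captured as a printed checklist. This is only a sketch: the mongod service name, the use of systemctl on every node, and the mongos port 27017 are assumptions taken from this demo or from common defaults, and post_physical_restore_plan is a hypothetical helper that prints commands without running anything.

```shell
# Sketch: print the post-restore steps as commands to review and run by hand.
# post_physical_restore_plan is a hypothetical helper; service names and the
# mongos port are assumptions and will differ between deployments.
post_physical_restore_plan() {
  local cfg="${1:-/etc/pbm_config.yaml}"
  cat <<EOF
sudo systemctl restart mongod
sudo systemctl restart pbm-agent
pbm config --force-resync --file ${cfg}
# start the mongos process here (deployment-specific), then:
mongo --port 27017 --eval 'sh.startBalancer()'
EOF
}
```

Printing the plan rather than executing it keeps the operator in control, since the restart commands have to be run on every node and the order matters.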

Note: By default, PBM elects a secondary node to run the backup; if no secondary responds, the backup is initiated on the primary. We can also influence this election by defining a priority for the Mongo nodes.

Conclusion

In this blog post, we explored how simple and convenient it is to perform backup and restore tasks using PBM in replica set and sharded topologies. PBM simplifies the whole process in such complex topologies, where other logical options (mongodump) might not be ideal. Check out part two, covering more backup options and some other areas of PBM.

Percona Distribution for MongoDB is a source-available alternative for enterprise MongoDB. A bundling of Percona Server for MongoDB and Percona Backup for MongoDB, Percona Distribution for MongoDB combines the best and most critical enterprise components from the open source community into a single feature-rich and freely available solution.
