5 Changes You Should Know in MongoDB 7.0

As almost a tradition, a newer major release of MongoDB is born every year, and this year, it’s no different. Many changes and new features are brought to the system, and as part of keeping in tune with the changes and how they can impact us, we go through the changes to better understand them. From that, this article is born.

Our goal here is to review five new changes or implementations you should know in MongoDB 7.0. Not only that, the idea is to present you with changes flying under the radar, which, in my understanding, can influence your daily operation. Without further ado, let’s get started.

1. Dynamic WiredTiger tickets

As a brief recap, WiredTiger tickets in MongoDB serve as the concurrency control mechanism within the WiredTiger storage engine. These tickets are categorized into Read tickets and Write tickets.

When multiple operations, such as reads and writes, attempt to access the database concurrently, WiredTiger uses tickets to ensure these operations do not conflict in a way that would compromise data integrity or performance.

Each transaction obtains a ticket at its onset and releases it back into the ticket pool upon completion. Since the early days of the WiredTiger engine, the number of tickets are governed by:

Those variables control the number of concurrent read/write transactions (tickets) allowed into the WiredTiger storage engine, which by default before MongoDB 7.0 was 128 tickets:

db.runCommand({ getParameter: 1, wiredTigerConcurrentReadTransactions: 1 });

{

"wiredTigerConcurrentReadTransactions" : 128,

[...]}

db.runCommand({ getParameter: 1, wiredTigerConcurrentWriteTransactions: 1 });

{

"wiredTigerConcurrentWriteTransactions" : 128,

[...]}

From MongoDB 7.0:

db.runCommand({ getParameter: 1, wiredTigerConcurrentReadTransactions: 1 });

{

wiredTigerConcurrentReadTransactions: 0,

[...]}

db.runCommand({ getParameter: 1, wiredTigerConcurrentWriteTransactions: 1 });

{

wiredTigerConcurrentWriteTransactions: 0,

[...]}

Tickets are now dynamically set:

1 2	db.serverStatus().wiredTiger.concurrentTransactions.read.totalTickets 12

1 2	db.serverStatus().wiredTiger.concurrentTransactions.write.totalTickets 12

A whole new algorithm to manage the tickets dynamically was implemented, which can be checked here. With that new change, it is worth mentioning to check the WiredTiger metrics in their entirety and not as a separate piece, as it can lead to wrong conclusions.

My personal take on that change: It’s an interesting and valid approach, mainly because it’s common to find a system with limited resources struggling with performance due to an overestimated threshold (very common on many parameters by default in MongoDB), adding unnecessary pressure on it. The dynamic allocation might have flaws, but it’s definitely a good addition.

2. Shard Key Analyzer helper – .analyzeShardKey()

The shard key is a critical component of your cluster, as it dictates the distribution of your data through the shards, and that’s where the problem lies.

A considerable part of shard cluster issues are related to bad shard key choices; we are not going through each component as it’s not the scope of this article, but for a good shard key, you must watch for:

The cardinality of the shard key
The frequency with which shard key values occur
Whether a potential shard key grows monotonically
Sharding Query Patterns

It was immutable in the past, but now (+4.4), it has new features to improve or even change your shard key. Like refine or reshard it.

But still, you had to manually query your collection to answer those questions, if your key has a good cardinality, frequency, and so on.

MongoDB 7.0 adds two new methods that return metrics that are very useful in evaluating either existing or potential new shard keys. The methods are:

db.collection.analyzeShardKey()

db.adminCommand(

{

analyzeShardKey: <string>,

key: <shardKey>,

keyCharacteristics: <bool>,

readWriteDistribution: <bool>,

sampleRate: <double>,

sampleSize: <int>

}

)

The output provides three documents that hold their statistics:

keyCharacteristics: provides metrics about the cardinality, frequency, and monotonicity of the shard key.
readDistribution/writeDistribution: provides metrics about query routing patterns and the hotness of shard key ranges.

Here, we have an example of analyzeShardKey() on two candidates’ shard keys.

Since the collection has not been sharded yet, we set readWriteDistribution: false. You can use the new method to help you evaluate one shard key that has been in use or a future one as follows:

db.adminCommand({analyzeShardKey: "percona.employee", key: {badgeId: "hashed"},readWriteDistribution: false })
{
  keyCharacteristics: {
    numDocsTotal: Long("13591000"),
    avgDocSizeBytes: Long("207"),
    numDocsSampled: Long("10000000"),
    isUnique: false,
    numDistinctValues: Long("9988503"),
    mostCommonValues: [
      {
        value: { badgeId: Long("8472676555298607") },
        frequency: Long("3")
      },
      {
        value: { badgeId: Long("4284333888163027") },
        frequency: Long("3")
      },
      {
        value: { badgeId: Long("3463302535508078") },
        frequency: Long("3")
      },
      {
        value: { badgeId: Long("5807207987841736") },
        frequency: Long("3")
      },
      {
        value: { badgeId: Long("88360867693259") },
        frequency: Long("3")
      }
    ],
    monotonicity: { recordIdCorrelationCoefficient: 0, type: 'not monotonic' }
  },
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1696960527, i: 1 }),
    signature: {
      hash: Binary.createFromBase64("i+DjCSA7WpV3KJAT5HwXhTLb7uw=", 0),
      keyId: Long("7288022888098037764")
    }
  },
  operationTime: Timestamp({ t: 1696960524, i: 1 })
}

db.adminCommand({analyzeShardKey: "percona.employee", key: {badgeId: "hashed"},readWriteDistribution: false })

{

keyCharacteristics: {

numDocsTotal: Long("13591000"),

avgDocSizeBytes: Long("207"),

numDocsSampled: Long("10000000"),

isUnique: false,

numDistinctValues: Long("9988503"),

mostCommonValues: [

{

value: { badgeId: Long("8472676555298607") },

frequency: Long("3")

{

value: { badgeId: Long("4284333888163027") },

frequency: Long("3")

{

value: { badgeId: Long("3463302535508078") },

frequency: Long("3")

{

value: { badgeId: Long("5807207987841736") },

frequency: Long("3")

{

value: { badgeId: Long("88360867693259") },

frequency: Long("3")

}

monotonicity: { recordIdCorrelationCoefficient: 0, type: 'not monotonic' }

ok: 1,

'$clusterTime': {

clusterTime: Timestamp({ t: 1696960527, i: 1 }),

signature: {

hash: Binary.createFromBase64("i+DjCSA7WpV3KJAT5HwXhTLb7uw=", 0),

keyId: Long("7288022888098037764")

}

operationTime: Timestamp({ t: 1696960524, i: 1 })

}

> db.adminCommand({analyzeShardKey: "percona.employee", key: {created: 1},readWriteDistribution: false })
{
  keyCharacteristics: {
    numDocsTotal: Long("13591000"),
    avgDocSizeBytes: Long("207"),
    numDocsSampled: Long("9998918"),
    isUnique: false,
    numDistinctValues: Long("955449"),
    mostCommonValues: [
      {
        value: { created: ISODate("2023-10-10T17:01:02.732Z") },
        frequency: Long("21")
      },
      {
        value: { created: ISODate("2023-10-10T17:14:32.273Z") },
        frequency: Long("20")
      },
      {
        value: { created: ISODate("2023-10-10T17:01:16.426Z") },
        frequency: Long("20")
      },
      {
        value: { created: ISODate("2023-10-10T17:01:03.824Z") },
        frequency: Long("20")
      },
      {
        value: { created: ISODate("2023-10-10T17:02:06.022Z") },
        frequency: Long("20")
      }
    ],
    monotonicity: { recordIdCorrelationCoefficient: 0.9999999911, type: 'monotonic' }
  },
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1696960030, i: 1 }),
    signature: {
      hash: Binary.createFromBase64("LHsIUx9Au7axEpq7eLYcTa0mkmM=", 0),
      keyId: Long("7288022888098037764")
    }
  },
  operationTime: Timestamp({ t: 1696960024, i: 1 })
}

> db.adminCommand({analyzeShardKey: "percona.employee", key: {created: 1},readWriteDistribution: false })

{

keyCharacteristics: {

numDocsTotal: Long("13591000"),

avgDocSizeBytes: Long("207"),

numDocsSampled: Long("9998918"),

isUnique: false,

numDistinctValues: Long("955449"),

mostCommonValues: [

{

value: { created: ISODate("2023-10-10T17:01:02.732Z") },

frequency: Long("21")

{

value: { created: ISODate("2023-10-10T17:14:32.273Z") },

frequency: Long("20")

{

value: { created: ISODate("2023-10-10T17:01:16.426Z") },

frequency: Long("20")

{

value: { created: ISODate("2023-10-10T17:01:03.824Z") },

frequency: Long("20")

{

value: { created: ISODate("2023-10-10T17:02:06.022Z") },

frequency: Long("20")

}

monotonicity: { recordIdCorrelationCoefficient: 0.9999999911, type: 'monotonic' }

ok: 1,

'$clusterTime': {

clusterTime: Timestamp({ t: 1696960030, i: 1 }),

signature: {

hash: Binary.createFromBase64("LHsIUx9Au7axEpq7eLYcTa0mkmM=", 0),

keyId: Long("7288022888098037764")

}

operationTime: Timestamp({ t: 1696960024, i: 1 })

}

With that output, you can understand if your key has high cardinality by checking the total docs divided by the number of distinct values.

The mostCommonValues field adds the understanding of the frequency of a value; if that’s high, it might indicate possible jumbo in the future as the values often repeat, which increases the chunk, thus becoming indivisible.

Last but not least, analyzeShardKey() uses a query sampling configuration to calculate the metrics; as part of new features, you have the ability to configure query sampling via configureQueryAnalyzer().

As we can see, the analyzeShardKey() method is a handy implementation, especially because it supports a more assertive evaluation of the shard key, a critical component of the sharded cluster, and often a source of issues.

3. Cluster metadata checker – checkMetadataConsistency()

Alongside the shard key analyzer, the metadata checker is another step forward regarding additional instrumentation on the DBA’s life.

The sharded cluster is great but, at the same time, a complex feature, especially when we start to understand how It works behind the scenes to comply with things like consistency.

For those who have to administrate a sharded cluster, you likely have faced problems with metadata inconsistency:

Collections with different UUIDS, routing tables with overlapping ranges, and many other unpleasant issues.

The problem here is that dealing with metadata is a challenging task, mainly because modifying it can lead to a series of issues and put the cluster in an undesired state, something that we run from.

Also, metadata inconsistency might be living silently in your cluster, and you will learn that only when a particular operation/scenario happens.

The checkMetadataConsistency() helper is a great addition to help you track that:

db.checkMetadataConsistency() // Performs consistency checks on sharding metadata for the cluster or database.
db.collection.checkMetadataConsistency() // Performs consistency checks on sharding metadata for the collection.
sh.checkMetadataConsistency() // Performs consistency checks on sharding metadata for the cluster.

It performs a series of consistency checks (13 checks) on sharding metadata and indexes(not default), looking for inconsistencies.

That’s very welcome after upgrade/downgrade processes when metadata is touched after FCV change. It’s also helpful when performing regular checks, avoiding getting caught off guard by providing a thorough scan of your cluster on such delicate components.

4. Linux users can’t change taskExecutorPoolSize from 1

If you are a Mac or Windows user, this change doesn’t affect you, but for Linux users, you can not modify your connection pool from mongos:

$ /opt/mongo/7.0.1/bin/mongos --logpath /mongo_data/701_1shard/mongos.log --port 7001 --configdb configRepl/localhost:7005 --keyFile /mongo_data/701_1shard/keyfile --setParameter taskExecutorPoolSize=6 --fork -vvv

{"t":{"$date":"2023-09-27T15:06:29.476Z"},"s":"F", "c":"-", "id":22865, "ctx":"main","msg":"Error during global initialization","attr":{"error":{"code":2,"codeName":"BadValue","errmsg":"Unknown --setParameter 'taskExecutorPoolSize'"}}} {"t":{"$date":"2023-09-27T15:06:29.476Z"},"s":"I", "c":"CONTROL", "id":23138, "ctx":"main","msg":"Shutting down","attr":{"exitCode":14}}

The taskExecutorPoolSize has been using default:1 since MongoDB 4.0. After the rapid release 6.2 (available only for Atlas users), that can not be modified anymore; That change was then introduced to 7.0.

Although taskExecutorPoolSize has been using 1 for a long time, it was possible to “tweak,” which usually brings more bad than good because that parameter itself does not work alone.

The concept and problem behind that is simple:

To find the maximum number of outbound connections each TaskExecutor connection pool can open to any given mongod instance, we do the following calculation:

1 2	shardingTaskExecutorPoolMaxSize * taskExecutorPoolSize 32767 * 1

So, 1 Task pool can open up to 32767 connections to mongod.

Now, let’s assume you have a four-core CPU and you have set taskExecutorPoolSize=4.

Then 4 * 32767=131068

Not only that, but you also have more than one mongoS; let’s also assume four distinct mongoS, which is a good practice:

4 * 131068 = 524272

At the end of the day, your cluster can open up to half a million connections between mongoS and mongod.

That’s a common source of issue because if your application also does not enforce any connection constraint and starts to request connections to fulfill an operation, you can easily end up with a storm of connections performing thousands of operations in a short span of time, which can definitely put pressure on the cluster and slow down the performance.

Not being able to modify taskExecutorPoolSize is a good action to protect against overestimated parametrization.

5. Stricter downgrade policy

The upstream documentation highlights the following:

Starting in MongoDB 7.0, binary downgrades are no longer supported for MongoDB Community Edition.

In previous releases, the downgrade process was relatively simple:

Have a valid backup.
Remove backward-incompatible features.
Adjust the FCV to the previous version.
Bounce the database, replacing the binaries with the older release.

Now, you might be thinking – “What if I perform a downgrade like we used to do?”

At the moment of writing this article, there was no blocker on the MongoDB side, such as a warning or an impossibility of moving forward with the process:

replset [direct: primary] admin> db.adminCommand( { setFeatureCompatibilityVersion: "6.0" , confirm: true} ) { ok: 1, '$clusterTime': { clusterTime: Timestamp({ t: 1695819202, i: 3 }), signature: { hash: Binary.createFromBase64("y9oQXuzpYR0Zedzu6EP2nXSE290=", 0), keyId: Long("7283486964546797575") } }, operationTime: Timestamp({ t: 1695819202, i: 3 }) }

replset [direct: primary] admin>

replset [direct: primary] admin> db.shutdownServer() MongoNetworkError: connection 2 to 127.0.0.1:7001 closed

percona> exit

/opt/mongo/6.0.6-5/bin/mongod --replSet replset --dbpath /mongo_data/701_rs1/replset/rs1/db --logpath /mongo_data/701_rs1/replset/rs1/mongod.log --port 7001 --fork --keyFile /mongo_data/701_rs1/keyfile --wiredTigerCacheSizeGB 1

about to fork child process, waiting until server is ready for connections. forked process: 1609222

child process started successfully, parent exiting

However, even though you have ensured that all backward incompatibilities are cleared, which may also require possible data removal due to incompatibilities, keep in mind this action is at your own risk.

Moving on, MongoDB 7.0 also brought another unsupported action.

Let’s assume you are catching up with the releases and want to be at the latest release available, and then you upgraded from 4.4.x -> 5.0.x -> 6.0.x -> 7.0.x.

After that, you notice a problem and want to Downgrade to 4.4.x or even 5.0x. For MongoDB 7.0, that’s not supported anymore, from the upstream documentation:

MongoDB only supports single-version downgrades. You cannot downgrade to a release that is multiple versions behind your current release.

Although not optimal, it’s not uncommon to see companies trying to catch up with the newest releases due to EOL dates and performing several upgrades in a very short period of time, which can now be a problem if rollback is needed.

You might have a similar question, and the answer is similar.

At the moment of writing this article, there was no blocker on the MongoDB side, such as a warning or an impossibility of moving forward with the process.

However, even though you have ensured that all backward incompatibilities are cleared, and you should be good to go, keep in mind this action is at your own risk.

As a final note, either upgrading or downgrading, It is always recommended to test the procedure in a lower/QA environment that mirrors the production environment, allowing you to get used to the process and clearing any doubts you may have.

Conclusion

MongoDB 7.0 brought several changes in this new release, and the above are the ones we believe might directly impact your daily use, in a good or bad way, such as the new policies about the downgrade. Speaking about that, if you are not comfortable performing upgrades/downgrades or are looking for specialized support for your MongoDB environment, contact us, and our team will assist you with the best service for you.

If you have any questions or found any other interesting change on MongoDB 7.0, feel free to share in the comment session or also open a question in our Community Forum; why not a part two of this article?

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

0 Comments

Inline Feedbacks

View all comments

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

5 Changes You Should Know in MongoDB 7.0

1. Dynamic WiredTiger tickets

2. Shard Key Analyzer helper – .analyzeShardKey()

3. Cluster metadata checker – checkMetadataConsistency()

4. Linux users can’t change taskExecutorPoolSize from 1

5. Stricter downgrade policy

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

Benchmarking MongoDB Performance on Kubernetes

Why MariaDB Is “Better” Than MySQL

Did MyDumper LIKE Triggers?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

5 Changes You Should Know in MongoDB 7.0

1. Dynamic WiredTiger tickets

2. Shard Key Analyzer helper – .analyzeShardKey()

3. Cluster metadata checker – checkMetadataConsistency()

4. Linux users can’t change taskExecutorPoolSize from 1

5. Stricter downgrade policy

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Benchmarking MongoDB Performance on Kubernetes

Why MariaDB Is “Better” Than MySQL

Did MyDumper LIKE Triggers?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation