One of the most overlooked parameters when configuring MySQL is server_id, an integer that uniquely identifies a server within a replication topology. Note that no two servers within the same replication setup can have the same server_id value.

It is generally set to a “random” number, just different from the ones configured on the other replicas, and once set, it is rarely reviewed or changed later. That is usually fine, but if overlooked, it can lead to an unwanted skip of transactions in a recovery scenario like the one described below.

Suppose we have the following topology:
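(The original topology diagram is not reproduced here. Summarizing from the rest of the post: db1, with server_id=81, is the primary replicating to db2, with server_id=82; a planned failover promotes db2 to primary so that db1 can be taken down for maintenance.)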

Once the traffic is moved away from the previous primary (db1), we can start the maintenance on it. During the maintenance, a disk issue left the data in MySQL inconsistent.

Don’t worry, we have the backup taken from db2 earlier that same day, let’s use that!!

  • We copy the backup from db2 to db1 using rsync.
  • We follow the steps on how to restore a compressed backup from here.
  • We start MySQL with the same configuration that db1 had before the crash (server_id=81).
  • When setting up replication, we use the replication positions from xtrabackup_binlog_info, as we need the coordinates from db2 at the time the consistent backup was taken (see the sketch after this list).
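As a rough sketch of that last step (the backup path and the coordinates shown are made up for illustration), the xtrabackup_binlog_info file in the backup directory records the binary log file and position of the source at the moment the consistent backup was taken:

  # Coordinates recorded by Percona XtraBackup at backup time (values are illustrative)
  $ cat /backups/db2/xtrabackup_binlog_info
  mysql-bin.000213	4017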

Between the time the backup was taken and the time of the restore, many binlogs were generated that needed to be applied, so we expected replication to take some time to fully catch up with the primary.

Once the replication was configured, we noticed that the binlogs were advancing too fast.

Suppose we configured replication using the following positions:
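The exact positions from the original incident are not shown here; a hypothetical CHANGE MASTER TO built from the xtrabackup_binlog_info coordinates above would look roughly like this (host, credentials, file, and position are placeholders):

  mysql> CHANGE MASTER TO
      ->   MASTER_HOST='db2',
      ->   MASTER_USER='repl',
      ->   MASTER_PASSWORD='...',
      ->   MASTER_LOG_FILE='mysql-bin.000213',
      ->   MASTER_LOG_POS=4017;
  mysql> START SLAVE;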

We review the replication status:
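The original output is not reproduced here; a trimmed, illustrative SHOW SLAVE STATUS (all values are made up, keeping only the fields relevant to the story) would look something like this:

  mysql> SHOW SLAVE STATUS\G
              Master_Log_File: mysql-bin.000256
          Read_Master_Log_Pos: 1098
        Relay_Master_Log_File: mysql-bin.000215
          Exec_Master_Log_Pos: 50934
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0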

Notice something odd? Yes, Seconds_behind_master: 0, even though there are around 40+ binlogs left to apply.

It kept running at a “fast pace”; it advanced 40 binlogs in a matter of minutes, until it failed:

Why? Was our backup broken?

Upon checking the binlogs, we found that the binlog events that were processed “too fast” were generated by server_id=81, which is the same as the current replica’s (those records were generated on db1, which was the primary when the backup was taken). When a replica tries to apply events that carry its own server_id, it skips them by default.
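One way to confirm this is to inspect the server id recorded in each binlog event header with mysqlbinlog (the file name and the output below are illustrative):

  $ mysqlbinlog mysql-bin.000214 | grep 'server id' | head -3
  #230407 10:12:01 server id 81  end_log_pos 126 ...
  #230407 10:12:01 server id 81  end_log_pos 157 ...
  #230407 10:12:03 server id 81  end_log_pos 431 ...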

Then, once the failover was done, the binlog events started being generated with server_id=82 (db2). This is when the replica actually started applying events, but since it had skipped the many events carrying its own server_id, it missed many updates and inserts to the DB, leaving it in an inconsistent state.

In this case, there’s no other option but to retry the restoration, this time changing the server_id value on the restored server before starting replication.
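As a sketch, that simply means giving the restored server a server_id that is unique in the topology before configuring replication (the value 83 is just an example):

  # /etc/my.cnf on the restored server -- any value not used elsewhere in the topology
  [mysqld]
  server_id = 83

  # server_id is also a dynamic variable, so it can be changed without a restart:
  mysql> SET GLOBAL server_id = 83;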

Another approach that could have been applied, other than changing the server_id, was to enable the option replicate-same-server-id.

As per the documentation:

“This option is for use on replicas. The default is 0 (FALSE). With this option set to 1 (TRUE), the replica does not skip events that have its own server ID. This setting is normally useful only in rare configurations.”
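If that route were taken, it is just a replica-side option (a minimal sketch; as the documentation warns, it is rarely appropriate):

  # /etc/my.cnf on the replica -- only when you fully understand the implications
  [mysqld]
  replicate-same-server-id = 1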

Conclusion

The scenario described in this blog may not apply to all restorations, but it is a good reminder to review the server_id when performing restores in a replication environment. It can avoid headaches!

As was also briefly mentioned, the parameter replicate-same-server-id is not recommended unless you are sure of what you are doing.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!
