Exploring MySQL™ Binlog Server – Ripple

6 min read
Exploring MySQL™ Binlog Server – Ripple

SHARE THIS ARTICLE

MySQL does not limit the number of slaves that you can connect to the master server in a replication topology. However, as the number of slaves increases, they will have a toll on the master resources because the binary logs will need to be served to different slaves working at different speeds. If the data churn on the master is high, the serving of binary logs alone could saturate the network interface of the master.

A classic solution for this problem is to deploy a binlog server – an intermediate proxy server that sits between the master and its slaves. The binlog server is set up as a slave to the master, and in turn, acts as a master to the original set of slaves. It receives binary log events from the master, does not apply these events, but serves them to all the other slaves. This way, the load on the master is tremendously reduced, and at the same time, the binlog server serves the binlogs more efficiently to slaves since it does not have to do any other database server processing.

MySQL Binlog Server Deployment Diagram - ScaleGrid Blog

Ripple is an open source binlog server developed by Pavel Ivanov. A blog post from Percona, titled MySQL Ripple: The First Impression of a MySQL Binlog Server, gives a very good introduction to deploying and using Ripple. I had an opportunity to explore Ripple in some more detail and wanted to share my observations through this post.

1. Support for GTID based replication

Ripple supports only GTID mode, and not file and position-based replication. If your master is running in non-GTID mode, you will get this error from Ripple:

Failed to read packet: Got error reading packet from server: The replication sender thread cannot start in AUTO_POSITION mode: this server has GTID_MODE = OFF instead of ON.

You can specify Server_id and UUID for the ripple server using the cmd line options:  -ripple_server_id and  -ripple_server_uuid

Both are optional parameters, and if not specified, Ripple will use the default server_id=112211 and uuid will be auto generated.

2. Connecting to the master using replication user and password

While connecting to the master, you can specify the replication user and password using the command line options:

 -ripple_master_user and  -ripple_master_password

3. Connection endpoint for the Ripple server

You can use the command line options -ripple_server_ports and -ripple_server_address to specify the connection end points for the Ripple server. Ensure to specify the network accessible hostname or IP address of your Ripple server as the  -rippple_server_address. Otherwise, by default, Ripple will bind to localhost and hence you will not be able to connect to it remotely.

4. Setting up slaves to the Ripple server

You can use the CHANGE MASTER TO command to connect your slaves to replicate from the Ripple server.

To ensure that Ripple can authenticate the password that you use to connect to it, you need to start Ripple by specifying the option -ripple_server_password_hash

For example, if you start the ripple server with the command:

rippled -ripple_datadir=./binlog_server -ripple_master_address= <master ip>  -ripple_master_port=3306 -ripple_master_user=repl -ripple_master_password='password' -ripple_server_ports=15000 -ripple_server_address='172.31.23.201' -ripple_server_password_hash='EF8C75CB6E99A0732D2DE207DAEF65D555BDFB8E'

you can use the following CHANGE MASTER TO command to connect from the slave:

CHANGE MASTER TO master_host='172.31.23.201', master_port=15000, master_password=’XpKWeZRNH5#satCI’, master_user=’rep’

Note that the password hash specified for the Ripple server corresponds to the text password used in the CHANGE MASTER TO command. Currently, Ripple does not authenticate based on the usernames and accepts any non-empty username as long as the password matches.

5. Ripple server management

It’s possible to monitor and manage the Ripple server using the MySQL protocol from any standard MySQL client. There are a limited set of commands that are supported which you can see directly in the source code on the mysql-ripple GitHub page.

Some of the useful commands are:

  • SELECT @@global.gtid_executed; – To see the GTID SET of the Ripple server based on its downloaded binary logs.
  • STOP SLAVE; – To disconnect the Ripple server from the master.
  • START SLAVE; – To connect the Ripple server to the master.

Known Issues & Suggestions for Improvement

1. I did not see an option to set up an SSL replication channel from a Ripple server to the master

As a result of this, Ripple server will not be able to connect to a master that mandates encrypted connections. Attempting to connect will result in the error:

0322 09:01:36.555124 14942 mysql_master_session.cc:164] Failed to connected to host: <Hosname>, port: 3306, err: Failed to connect: Connections using insecure transport are prohibited while --require_secure_transport=ON.

2. I was not able to get Ripple server working with the semi-sync option

I started the Ripple server using the option -ripple_semi_sync_slave_enabled=true

On connecting it, the master was able to detect the Ripple server as a semi-sync enabled slave.

mysql> show status like 'rpl%';
------------------------------------------------------
| Variable_name                              | Value |
------------------------------------------------------
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_slave_status                 | OFF   |
------------------------------------------------------

However, trying to execute a transaction in semi-sync mode waited for rpl_semi_sync_master_timeout which was 180000

mysql> create database d12;
Query OK, 1 row affected (3 min 0.01 sec)

I could see that semi-sync got turned off at the master:

mysql> show status like 'rpl%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_status                | OFF   |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+

Corresponding snippet from the mysql error logs:

2020-03-21T10:05:56.000612Z 52 [Note] Start binlog_dump to master_thread_id(52) slave_server(112211), pos(, 4)
2020-03-21T10:05:56.000627Z 52 [Note] Start semi-sync binlog_dump to slave (server_id: 112211), pos(, 4)
20020-03-21T10:08:55.873990Z 2 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000010, pos: 350), semi-sync up to file , position 4.
2020-03-21T10:08:55.874044Z 2 [Note] Semi-sync replication switched OFF.

There is an issue reported along similar lines here on the MySQL Ripple Github page.

3. Issue when using parallel replication for the slaves of Ripple server

I saw that SQL thread on the slave would often stop with the error:

Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name /mysql_data/relaylogs/relay-log.000005, position 27023962 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly.

Analyzing the relay log and position above revealed that the ‘sequence number’ of the transaction at this point was reset to 1. I tracked down the cause to a binlog rotation happening on the original master. Typically, for direct slaves, there is a rotate event due to which relay logs would also rotate based on master binary log rotation. My assessment is that such conditions can be detected and sequence number reset can be handled by parallel threads. But when the sequence number changes without the rotation of the relay logs, we see the parallel threads failing.

This observation is reported as the issue: slave parallel thread failure while syncing from binlog server #26

4. mysqlbinlog utility does not work on the binary logs produced by Ripple server

Trying to run the mysqlbinlog utility on the binary log resulted in the error:

ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 43, event_type: -106

This issue is raised here: Not able to open the binary log files using mysqlbinlog utility. #25

It’s acknowledged by the author as a known issue. I feel that it would be useful to support this utility for debugging purposes.

That’s the report for now from my quick testing. I plan to update this blog post as and when I come across more findings on Ripple. Overall, I found it to be simple and straightforward to use and has the potential to become a standard for binlog servers in MySQL environments.

More tips for you

MySQL Server Health Checks

In a MySQL master-slave high availability (HA) setup, it is important to continuously monitor the health of the master and slave servers so you can detect potential issues and take corrective actions. Learn more


MySQL Rolling Index Builds

How to optimize the MySQL index creation process in such a way that your regular workload is not impacted. If you have a MySQL master-slave replica set, you can create the index one node at a time in a rolling fashion.Learn more


MySQL High Availability

The availability of a system is the percentage of time its services are up during a period of time. It’s generally expressed as a series of 9′s. See the availability and the corresponding downtime measured over one year. Learn more

For more information, please visit www.scalegrid.io. Connect with ScaleGrid on LinkedIn, X, Facebook, and YouTube.
Table of Contents

Stay Ahead with ScaleGrid Insights

Dive into the world of database management with our monthly newsletter. Get expert tips, in-depth articles, and the latest news, directly to your inbox.

Related Posts

pitr mysql

Master MySQL Point in Time Recovery

Data loss or corruption can be daunting. With MySQL point-in-time recovery, you can restore your database to the moment before...

Setting Up MongoDB SSL Encryption

In a world where data security is essential, enabling MongoDB SSL is critical in fortifying your database. This guide walks...

distributed storage system

What is a Distributed Storage System

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and...

NEWS

Add Headline Here