Percona Monitoring and Management (PMM) has become a valuable tool for database professionals, providing comprehensive insights into database health and performance. A recent update (version 2.41.0) introduced a significant enhancement: the ability to run PMM in high availability (HA) mode. This feature, currently in technical preview, offers exciting possibilities for ensuring the reliability and robustness of your database monitoring system.

This guide will walk you through the process of setting up PMM in an HA environment. By leveraging HA for PMM, you can elevate the reliability of your database operations and gain peace of mind knowing your monitoring system remains resilient even during failures.

PMM 2.41.0: HA mode in technical preview

HA mode in PMM 2.41.0 is a technical preview, so it is meant for evaluation and may change in later releases. We encourage users to explore this new feature and share feedback, helping us refine and improve its functionality.

Why PMM in HA mode?

PMM in HA mode keeps monitoring available when an individual PMM instance fails. This reduces the risk of losing monitoring data and avoids gaps in visibility, which is critical when you depend on PMM to track database performance continuously.

Architectural overview:

The HA setup for PMM consists of several key components:

  • PMM instances: These are the core of the monitoring system, actively collecting and processing monitoring data. Multiple instances provide redundancy, ensuring continuous monitoring even in the event of individual instance failures.
  • PostgreSQL cluster: This cluster primarily stores metadata and configuration data for PMM. It includes information about the monitored systems, user settings, and other essential configuration details necessary for PMM’s operation.
  • ClickHouse cluster: A column-oriented analytical database, ClickHouse stores query performance data, notably the Query Analytics (QAN) data.
  • VictoriaMetrics cluster: This component is focused on storing and managing operational metrics from the monitored databases and hosts. It includes a wide range of metrics such as resource usage, response times, and various performance indicators, providing a comprehensive view of the health and performance of the monitored systems.
  • HAProxy: HAProxy plays a pivotal role in the HA architecture by managing and directing network traffic. It is specifically responsible for routing all traffic to the current leader instance among the PMM instances, ensuring that the active leader always handles requests. This mechanism is crucial for maintaining the high availability and reliability of the PMM system.
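The routing rule described for HAProxy can be illustrated with a minimal Python sketch. This is not HAProxy configuration and not PMM code: the `PMMInstance` type, the `pick_backend` function, and the instance names are hypothetical, chosen only to show the idea that requests go to the healthy leader and to no other instance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PMMInstance:
    name: str        # hypothetical instance label
    healthy: bool    # instance responds to health checks
    is_leader: bool  # instance currently holds the leader role


def pick_backend(instances: list[PMMInstance]) -> Optional[str]:
    """Return the name of the instance that should receive traffic.

    Only a healthy leader qualifies; followers never receive requests.
    """
    for inst in instances:
        if inst.healthy and inst.is_leader:
            return inst.name
    return None  # no leader right now, e.g. while an election is running


cluster = [
    PMMInstance("pmm-1", healthy=False, is_leader=False),  # failed former leader
    PMMInstance("pmm-2", healthy=True, is_leader=True),    # newly elected leader
    PMMInstance("pmm-3", healthy=True, is_leader=False),   # healthy follower
]
print(pick_backend(cluster))  # pmm-2
```

Returning `None` when no leader exists models the short window during a failover in which no instance should receive requests.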

[Figure: PMM High Availability]

Leader election in PMM high availability architecture:

Leader election is a critical feature in PMM’s HA setup, ensuring the system remains reliable and effective. In this process, PMM instances determine which one is the leader at any given time. For effective leader election, it’s recommended to have at least three PMM instances. With three instances, a majority (two of three) can still be reached when one instance fails, so a leader can still be elected.
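The reasoning behind the three-instance recommendation is simple majority arithmetic, sketched below. The function names are illustrative and not part of PMM.

```python
def quorum(n: int) -> int:
    """Smallest number of votes that forms a strict majority of n instances."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many instances can fail while a majority is still reachable."""
    return n - quorum(n)


for n in (1, 2, 3, 5):
    print(f"instances={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
# instances=1 quorum=1 tolerated_failures=0
# instances=2 quorum=2 tolerated_failures=0
# instances=3 quorum=2 tolerated_failures=1
# instances=5 quorum=3 tolerated_failures=2
```

Two instances tolerate no failures at all, which is why three is the smallest cluster size that survives the loss of one instance.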

The leader instance coordinates operations and maintains data consistency across the HA environment. PMM employs the Raft consensus algorithm for this purpose, which is known for its robustness in distributed systems. This algorithm prevents conflicting leadership scenarios and ensures a smooth transition of the leader role if the current leader fails.

The election procedure involves continuous communication among PMM instances to affirm the leader’s status. Raft initiates a new election to select a successor in case of leader failure, thereby maintaining uninterrupted monitoring and management. This dynamic leader election mechanism is essential for PMM’s adaptability and continuous operation, particularly crucial for monitoring critical database systems.
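To make the election step more concrete, here is a deliberately simplified Python sketch of a Raft-style vote. It is not PMM’s implementation: it omits heartbeats, timeouts, and Raft’s log comparison, and the `Node` class and `run_election` function are hypothetical names. It only shows two rules: a node grants at most one vote per term, and a candidate needs a majority of the whole cluster to become leader.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    alive: bool = True

    def request_vote(self, candidate_id: str, term: int) -> bool:
        """Grant the vote if this node has not already voted in this term."""
        if not self.alive:
            return False  # a failed node cannot answer
        if term > self.current_term:
            # A newer term resets the vote.
            self.current_term = term
            self.voted_for = None
        if term == self.current_term and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, peers: list[Node]) -> bool:
    """Candidate starts a new term, votes for itself, and asks each peer."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate's own vote
    for peer in peers:
        if peer.request_vote(candidate.node_id, candidate.current_term):
            votes += 1
    cluster_size = len(peers) + 1
    return votes >= cluster_size // 2 + 1


a, b, c = Node("pmm-1"), Node("pmm-2"), Node("pmm-3")
a.alive = False                       # the current leader fails
print(run_election(b, [a, c]))        # True: 2 of 3 votes
print(c.request_vote("pmm-1", 1))     # False: pmm-3 already voted in term 1
```

The one-vote-per-term rule is what prevents two instances from both winning the same term.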

Setting up PMM in HA mode:

  1. Create a HA PostgreSQL cluster for data storage.
  2. Deploy clustered versions of ClickHouse and VictoriaMetrics to store Query Analytics data and metrics.
  3. Configure HAProxy to route all traffic to the current leader PMM instance.
  4. Set up multiple PMM instances for redundancy.

Please refer to our comprehensive guide for detailed instructions on setting up and managing your HA PMM environment.

Best practices and considerations:

  • Version compatibility: Use service versions specified in our documentation for optimal compatibility with PMM in HA mode.
  • Utilize clustered versions of services: Opt for clustered versions of ClickHouse, PostgreSQL, and VictoriaMetrics to boost system resilience and efficiency.
  • Distribute instances: Run PMM instances on different servers to prevent single points of failure.
  • Regular updates and testing: Maintain your system with regular updates and test failover mechanisms to ensure robustness.
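The placement advice above can be checked mechanically before deployment. The following is a minimal sketch under stated assumptions: the `placement` mapping (PMM instance name to host name) and the `check_topology` function are hypothetical, and PMM itself does not provide this check.

```python
from collections import Counter


def check_topology(placement: dict[str, str]) -> list[str]:
    """Return warnings for a planned layout of PMM instances.

    placement maps a PMM instance name to the host it runs on.
    """
    warnings = []
    if len(placement) < 3:
        warnings.append("fewer than three PMM instances")
    host_counts = Counter(placement.values())
    for host in sorted(h for h, count in host_counts.items() if count > 1):
        warnings.append(f"multiple PMM instances share host {host}")
    return warnings


plan = {"pmm-1": "host-a", "pmm-2": "host-a", "pmm-3": "host-c"}
print(check_topology(plan))
# ['multiple PMM instances share host host-a']
```

An empty list means the plan has at least three instances and no two of them share a host.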

Future plans for PMM:

  1. Scalable PMM with multiple active instances: We are working towards a scalable PMM architecture that supports multiple active instances and load balancing, enhancing performance and reliability.
  2. PMM setup in HA mode on Kubernetes: Development is underway for a Helm chart to facilitate easy PMM setup in HA mode on Kubernetes, simplifying deployment in containerized environments.

Your feedback matters:

As PMM 2.41.0’s HA mode is in technical preview, your feedback is crucial. Share your experiences and suggestions to help us enhance PMM’s HA functionality.

Conclusion:

The ability to run PMM in high availability mode (introduced in version 2.41.0) is a major step forward for database monitoring. This feature helps ensure continuous monitoring and empowers informed decision-making.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution for MySQL, PostgreSQL, MongoDB, and the servers on which they run. Monitor, manage, and improve the performance of your databases no matter where they are located or deployed.

 

