Coroot, an open source observability tool powered by eBPF, became generally available with version 1.0 last week. Since the tool is cloud-native, we were curious how it can help troubleshoot databases on Kubernetes.

In this blog post, we will see how to quickly debug PostgreSQL with Coroot and Percona Operator for PostgreSQL.

Prepare

Install Coroot

The easiest way is to use the Helm chart: add the Coroot chart repository, then install the chart.

This will install Coroot, Prometheus with kube-state-metrics, coroot-node-agent, and ClickHouse. ClickHouse is required for profiling, logs, and tracing.

Check out the detailed installation instructions in the official documentation.

Once the pods are up and running, use port forwarding to reach the Coroot UI, then open http://localhost:8080/.

Deploy PostgreSQL cluster with Operator

For consistency, we will use Helm here as well: add the Percona chart repository, then deploy the Operator and the cluster.

Verify that the database is up with kubectl get pg. It should be in a ready state.
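
If you would rather script this check than read the table by eye, the sketch below parses the JSON form of the same command. It assumes the input is the output of kubectl get pg -o json and that each cluster reports its state under status.state with the value ready. The function name and the sample data are made up for illustration and are not part of any Percona tooling.

```python
import json


def clusters_not_ready(kubectl_json: str) -> list[str]:
    """Return the names of clusters whose reported state is not 'ready'.

    Assumption: the input is the output of `kubectl get pg -o json`, and
    each item exposes its state under status.state.
    """
    items = json.loads(kubectl_json).get("items", [])
    return [
        item["metadata"]["name"]
        for item in items
        if item.get("status", {}).get("state") != "ready"
    ]


# Hand-written sample that mimics the assumed shape of the kubectl output.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "cluster1"}, "status": {"state": "ready"}},
        {"metadata": {"name": "cluster2"}, "status": {"state": "initializing"}},
    ]
})

print(clusters_not_ready(sample))
```

An empty list means every cluster in the input reported ready. In a real script you would feed the function the captured command output instead of the sample.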

More insights with PostgreSQL agent

Coroot provides agents for various applications, including PostgreSQL. With agents, users get more insights tailored to their application. To install the agent with Percona Operator, we need to add it as a sidecar and configure the user on the PostgreSQL side.

The agent's container definition goes under the instances.[].sidecars section.

You can read more about user management in our Operator in the documentation.

Coroot in action

Discovery

Coroot automatically discovers applications in Kubernetes (as many other eBPF tools do). For a specific application, it shows all the components that the application interacts with. For example, for my PostgreSQL cluster it looked like this:

[Screenshot: Coroot in action]

I don’t have a lot going on, so there are just a few other containers that interact with my cluster. 

SLO

A Service Level Objective (SLO) is a target, such as an availability or latency threshold, that quickly tells whether the application is meeting its service-level agreement. I was restarting my cluster and nodes a lot, and that immediately showed up as SLO issues:

[Screenshot: Service Level Objectives]

Once Coroot detects an SLO budget breach, it highlights it on the graphs. The incident is marked across all the graphs, which makes debugging easier.
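
To make the idea of an SLO budget concrete, here is a minimal sketch of the usual error-budget arithmetic for an availability objective. It is not Coroot's actual calculation. The function name, the 99.9% target, and the request counts are assumptions chosen for illustration.

```python
def error_budget_remaining(total_requests: int, failed_requests: int,
                           slo_target: float) -> float:
    """Return the fraction of the error budget still left.

    1.0 means no budget has been spent, 0.0 means it is exactly used up,
    and a negative value means the SLO has been breached.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    # The budget is the number of failures the objective tolerates.
    allowed_failures = total_requests * (1.0 - slo_target)
    return 1.0 - failed_requests / allowed_failures


# Hypothetical numbers: a 99.9% availability target over 100,000 requests
# tolerates about 100 failures, so 250 failures overspend the budget.
remaining = error_budget_remaining(100_000, 250, 0.999)
print("breached" if remaining < 0 else "within budget")
```

Repeated restarts like the ones described above burn through such a budget quickly, because every failed request during a restart counts against the same small allowance.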

Logs

Built-in centralized logging is a must for a modern observability system. It definitely helps to debug complex applications with multiple components. A highly available PostgreSQL cluster has PgBouncer pods, primary and replica nodes, and backup containers. All these components are hard to debug without proper logging.

Profiling

There is nothing better for debugging than a nice flame graph. CPU profiling can tell a lot about what PostgreSQL and its components are doing and where resources are going. Profiling your own applications in the same way can be even more valuable.

PostgreSQL-specific metrics

For PostgreSQL-specific metrics, Coroot relies on pg_stat_statements and pg_stat_activity. It shows basic information about queries and connections. From an infrastructure perspective, I liked how it quickly pointed out when specific instances were restarted and when the switchover between primary and replica happened. For example, around the time the incident was detected, it is clearly visible that some of the nodes were restarted.

[Screenshot: PostgreSQL-specific metrics]

You will see the queries that consume the most resources, as well as how the load affects replication lag. I ran sysbench to generate some load and watched how it affected my database.
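
As a rough illustration of what "queries that consume the most resources" means, the sketch below ranks rows shaped like pg_stat_statements output by their share of total execution time. It is not how Coroot computes its view. The function name and the sample rows are invented, and the total_exec_time column name assumes PostgreSQL 13 or later (older versions call it total_time).

```python
def top_queries_by_time(rows, limit=3):
    """Rank queries by their share of the total execution time.

    Each row is a dict with the keys query, calls, and total_exec_time,
    mirroring columns of the pg_stat_statements view.
    Returns a list of (query, calls, share_of_total_time) tuples.
    """
    rows = list(rows)  # allow any iterable, since we pass over it twice
    total = sum(row["total_exec_time"] for row in rows)
    ranked = sorted(rows, key=lambda row: row["total_exec_time"], reverse=True)
    return [
        (row["query"], row["calls"],
         row["total_exec_time"] / total if total else 0.0)
        for row in ranked[:limit]
    ]


# Invented sample data; execution times are in milliseconds.
sample_rows = [
    {"query": "SELECT c FROM sbtest1 WHERE id = $1", "calls": 9000, "total_exec_time": 300.0},
    {"query": "COMMIT", "calls": 1000, "total_exec_time": 100.0},
    {"query": "UPDATE sbtest1 SET k = k + 1 WHERE id = $1", "calls": 2000, "total_exec_time": 600.0},
]

for query, calls, share in top_queries_by_time(sample_rows, limit=2):
    print(f"{share:.0%} of time, {calls} calls: {query}")
```

Ranking by share of total time rather than by call count surfaces the queries that actually dominate the load, even when they run less often.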

Coroot’s use of eBPF technology for observability offers a powerful, user-friendly solution that enhances visibility and simplifies troubleshooting. The addition of PostgreSQL-specific agents further tailors this tool to the needs of database administrators by providing precise metrics and logs. Whether you are managing service-level objectives, tracking application interactions, or maintaining database health, Coroot equips you with the necessary insights to efficiently diagnose and resolve issues, ensuring your database operations run smoothly and reliably. This synergy between advanced monitoring tools and Kubernetes operations paves the way for more resilient and performant database environments in cloud-native ecosystems.

 

Try out Percona Operator for PostgreSQL  Try out Coroot
