There are more than 300 Operators on OperatorHub, and the number is growing. Percona Operators allow users to easily manage complex database systems in a Kubernetes environment. With Percona Operators, users can deploy, monitor, and manage databases orchestrated by Kubernetes, making it easier and more efficient to run databases at scale.

Our Operators come with Custom Resources that have their own statuses and fields to ease monitoring and troubleshooting. For example, the PerconaServerMongoDBBackup resource carries information about the backup, such as whether it succeeded or failed. There are, of course, other ways to monitor a backup, such as storage monitoring or Pod status, but why bother if the Operator already provides this information?

In this article, we will see how to monitor Custom Resources created by Operators with kube-state-metrics (KSM), a standard and widely adopted service that listens to the Kubernetes API server and generates metrics. The same methods can be applied to any Custom Resource.

Please find the code and recipes from this blog post in this GitHub repository.

The problem

Kube-state-metrics talks to the Kubernetes API and captures information about various resources: Pods, Deployments, Services, etc. Once captured, the metrics are exposed, and a tool like Prometheus in the monitoring pipeline scrapes them.

kube-state-metrics

The problem is that the Custom Resource manifest structure varies from Operator to Operator, so KSM does not know what to look for in the Kubernetes API. Our goal, then, is to tell kube-state-metrics which fields in the Custom Resource to capture and expose.

The solution

Kube-state-metrics is designed to be extendable for capturing custom resource metrics. Through a custom configuration, you can specify the resources you need to capture and expose.

Details

Install Kube-state-metrics

To start, install kube-state-metrics if you haven't already. We observed issues scraping custom resource metrics with version 2.5.0; from version 2.8.2 onwards, we were able to scrape them without any issues.

Identify the metrics you want to expose along with the path

Custom resources have a lot of fields. You need to choose the fields that need to be exposed.

For example, the Custom Resource PerconaXtraDBCluster has plenty of fields: spec.crVersion indicates the CR version, and spec.pxc.size shows the number of Percona XtraDB Cluster nodes set by the user (we will later look at a better way to monitor the number of nodes in a PXC cluster).
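Assuming a cluster named cluster1 (the default name in the Percona examples), these fields can be inspected with kubectl; the jsonpath expressions are illustrative:

```shell
# Inspect the fields we plan to expose (cluster name "cluster1" is an example)
kubectl get pxc cluster1 -o jsonpath='{.spec.crVersion}{"\n"}'
kubectl get pxc cluster1 -o jsonpath='{.spec.pxc.size}{"\n"}'
```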

Metrics can be captured from the status field of the Custom Resources if present. For example:

For example, the following is the status fetched for a PerconaXtraDBCluster Custom Resource.
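A trimmed sketch of what that status section may look like (field values are illustrative):

```yaml
# kubectl get pxc cluster1 -o yaml (status section, trimmed)
status:
  haproxy:
    ready: 3
    size: 3
    status: ready
  pxc:
    ready: 3
    size: 3
    status: ready
  state: ready
```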

status.state indicates the state of the Custom Resource, which is very handy information.

Decide the type of metrics for the fields identified

As of today, kube-state-metrics supports three of the metric types available in the OpenMetrics specification:

  1. Gauge
  2. StateSet
  3. Info

Based on the fields selected, map each field to the way you want to expose it. For example:

  1. spec.crVersion remains constant throughout the lifecycle of the custom resource until it is upgraded. The metric type “Info” is a good fit for this.
  2. spec.pxc.size is a number that changes based on what the user desires and the operator configuration. Even though the number is mostly constant in the later phases of the custom resource's lifecycle, it can change. “Gauge” is a great fit for this type of metric.
  3. status.state can take one of a fixed set of possible values. “StateSet” is the better fit for this type of metric.

Derive the configuration to capture custom resource metrics

As per the documentation, configuration needs to be added to kube-state-metrics deployment to define your custom resources and the fields to turn into metrics.

Configuration derived for the three metrics discussed above can be found here.
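A sketch of such a configuration, following the CustomResourceStateMetrics format from the kube-state-metrics documentation (the help strings and the list of possible states are assumptions):

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: "v1"
        kind: PerconaXtraDBCluster
      metrics:
        # Info metric: exposes spec.crVersion as a label
        - name: "pxc_info"
          help: "Information about the PerconaXtraDBCluster custom resource"
          each:
            type: Info
            info:
              labelsFromPath:
                version: [spec, crVersion]
        # Gauge metric: exposes the desired number of PXC nodes
        - name: "pxc_size"
          help: "Desired number of PXC nodes"
          each:
            type: Gauge
            gauge:
              path: [spec, pxc, size]
        # StateSet metric: one time series per possible state, value 1 for the active one
        - name: "pxc_status_state"
          help: "State of the PerconaXtraDBCluster custom resource"
          each:
            type: StateSet
            stateSet:
              labelName: state
              path: [status, state]
              list: [initializing, ready, error, paused]  # assumed set of states
```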

Consume the configuration in kube-state-metrics deployment

As per the official documentation, there are two ways to apply custom configurations:

  1. Inline: By using --custom-resource-state-config "inline yaml" 
  2. Refer a file: By using --custom-resource-state-config-file /path/to/config.yaml

Inline is not handy if the configuration is big. Referring to a file is better and gives more flexibility.

It is important to note that the path to the file is a path in the container file system of kube-state-metrics. There are several ways to get a file into the container file system; one option is to mount the data of a ConfigMap into the container.

Steps:

1. Create a ConfigMap with the configuration derived above

2. Add the ConfigMap as a volume to the kube-state-metrics Pod

3. Mount the volume into the container. As per the Dockerfile of kube-state-metrics, the path “/go/src/k8s.io/kube-state-metrics/” can be used for the mount.
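Put together, the steps can look like this (ConfigMap and volume names are examples; only the additions to the Deployment are shown):

```yaml
# Step 1: ConfigMap holding the custom resource state configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: customresource-config-ksm
  namespace: kube-system
data:
  cr-config.yaml: |
    kind: CustomResourceStateMetrics
    spec:
      resources: []   # put the configuration derived above here
---
# Steps 2 and 3: relevant parts of the kube-state-metrics Deployment spec
spec:
  template:
    spec:
      containers:
        - name: kube-state-metrics
          args:
            - --custom-resource-state-config-file=/go/src/k8s.io/kube-state-metrics/cr-config.yaml
          volumeMounts:
            - name: customresource-config
              mountPath: /go/src/k8s.io/kube-state-metrics/
      volumes:
        - name: customresource-config
          configMap:
            name: customresource-config-ksm
```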

Provide permission to access the custom resources

By default, kube-state-metrics has permission to access only the standard resources listed in its ClusterRole. If it is deployed without additional privileges, the required metrics won't be scraped.

Add additional privileges based on the custom resources you want to monitor. In this example, we will add privileges to monitor PerconaXtraDBCluster, PerconaXtraDBClusterBackup, and PerconaXtraDBClusterRestore.
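For example, a rule along these lines can be appended to the kube-state-metrics ClusterRole (the resource names follow the Percona CRDs):

```yaml
# Additional rule for the kube-state-metrics ClusterRole
- apiGroups: ["pxc.percona.com"]
  resources:
    - perconaxtradbclusters
    - perconaxtradbclusterbackups
    - perconaxtradbclusterrestores
  verbs: ["list", "watch"]
```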

Apply the ClusterRole and check the logs to see if custom resource metrics are being captured.

Validate the metrics being captured

Check the logs of kube-state-metrics
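Assuming kube-state-metrics runs as a Deployment in the kube-system namespace:

```shell
kubectl -n kube-system logs deploy/kube-state-metrics
```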

Check the kube-state-metrics service to list the metrics scraped. 

Open a terminal and keep the port-forward command running:
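The namespace and service name below are assumptions matching a default installation:

```shell
# Forward local port 8080 to the kube-state-metrics service
kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080
```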

In a browser, check the metrics captured at “127.0.0.1:8080” (keep the terminal with the port-forward command running).

Observe that the metrics kube_customresource_pxc_info, kube_customresource_pxc_status_state, and kube_customresource_pxc_size are being captured.
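Illustrative output (label values will differ per cluster):

```
kube_customresource_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",version="1.12.0"} 1
kube_customresource_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1"} 3
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="ready"} 1
```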

Customize the metric name, add default labels

As seen above, the metrics captured had the prefix kube_customresource. What if we want to customize it?

There are some standard labels, like the name and namespace of the custom resource, that might need to be captured as labels on all metrics related to a custom resource. It's not practical to add these to every single metric, hence the identifiers labelsFromPath and metricNamePrefix.

In the snippet below, all metrics captured for the group pxc.percona.com, version v1, kind PerconaXtraDBCluster will have the metric prefix kube_pxc, and all of them will carry the following labels:

  • name – derived from the path metadata.name of the custom resource
  • namespace – derived from the path metadata.namespace of the custom resource
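A sketch of the resource-level configuration, per the kube-state-metrics custom resource state documentation:

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: "v1"
        kind: PerconaXtraDBCluster
      # Replaces the default kube_customresource prefix
      metricNamePrefix: kube_pxc
      # Labels added to every metric generated for this resource
      labelsFromPath:
        name: [metadata, name]
        namespace: [metadata, namespace]
      metrics: []   # same metrics as derived earlier
```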

Change the configuration in the ConfigMap and apply the new ConfigMap.

When the new configmap is applied, kube-state-metrics should automatically pick up the configuration changes; you can also do a “kubectl rollout restart deploy kube-state-metrics” to expedite the pod restart.

Once the changes are applied, check the metrics by port-forwarding to kube-state-metrics service.

In a browser, check the metrics captured at “127.0.0.1:8080” (keep the terminal with the port-forward command running).

Observe the metrics with the new prefix and labels.
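Illustrative output after the change (values will differ per cluster):

```
kube_pxc_info{name="cluster1",namespace="default",version="1.12.0"} 1
kube_pxc_size{name="cluster1",namespace="default"} 3
kube_pxc_status_state{name="cluster1",namespace="default",state="ready"} 1
```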

Labels customization

By default, kube-state-metrics doesn’t capture all the labels of the resources. However, labels can be handy for deriving correlations between custom resources and the Kubernetes objects. To add additional labels, use the flag --metric-labels-allowlist as mentioned in the documentation.

To demonstrate, we make this change to the kube-state-metrics deployment and apply it.
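For example, the flag can be added to the container args; the label list here is an assumption chosen to match the labels discussed below:

```yaml
# kube-state-metrics Deployment: container args (addition only)
args:
  - --metric-labels-allowlist=pods=[app.kubernetes.io/component,app.kubernetes.io/instance,app.kubernetes.io/name]
```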

Check the metrics by doing a port-forward to the service as instructed earlier.

Check the labels captured for pod cluster1-pxc-0:
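The allowed labels appear on the kube_pod_labels metric, roughly like this (label values are illustrative):

```
kube_pod_labels{namespace="default",pod="cluster1-pxc-0",label_app_kubernetes_io_component="pxc",label_app_kubernetes_io_instance="cluster1",label_app_kubernetes_io_name="percona-xtradb-cluster"} 1
```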

Labels of the pod can be checked in the cluster:
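For example:

```shell
kubectl get pod cluster1-pxc-0 --show-labels
```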

Adhering to Prometheus conventions, the character . (dot) is replaced with _ (underscore). Only labels mentioned in --metric-labels-allowlist are captured in the labels info.

Checking for the other pod:

Following are the labels captured in the kube-state-metrics service:

As can be seen above, the label app.kubernetes.io/version is not captured because it was not mentioned in the --metric-labels-allowlist flag of kube-state-metrics.

Conclusion

  1. Custom Resource metrics can be captured by modifying the kube-state-metrics deployment, without writing any code.
  2. Alternatively, a custom exporter can be written to expose the metrics, which gives a lot of flexibility but requires coding and maintenance.
  3. The metrics can be scraped by Prometheus and combined with other metrics to derive useful insights.

If you want to extend the same process to other custom resources related to Percona Operators, use the following ClusterRole to provide permission to read the relevant custom resources. Configurations for some of the important metrics related to the custom resources are captured in this Configmap for you to explore.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

 

Learn More About Percona Kubernetes Operators
