
Simplify Kubernetes complexity with advanced AIOps and cloud observability

Kubernetes boosts software development speed and reliability, but complexity is a common issue. See how Dynatrace eliminates Kubernetes complexity with cloud observability backed by AIOps.

As more organizations turn to application containerization, managing the tasks and processes that come with containers becomes critical. Kubernetes helps organizations better manage containerized workloads and services. In turn, it sets the stage for fast, functional, and reliable software development. However, at scale, these deployments can encounter a common roadblock: complexity. That’s where AIOps comes in.

To combat Kubernetes complexity and capitalize on the full benefits of the open-source container orchestration platform, organizations need advanced AIOps that can intelligently manage the environment. Cloud-native observability and artificial intelligence (AI) can help organizations do just that with improved analysis and targeted insight.

At Dynatrace Perform 2022, Dynatrace Product Manager Florian Geigl and Senior Product Manager Matt Reider discuss the key DevOps challenges of Kubernetes complexity and explore how Dynatrace streamlines operations. Additionally, they discuss the need for cloud-native observability with GitOps that provides continuous operational insight across the Kubernetes value stream.

Cloud-native observability delivers enhanced Kubernetes insight

Cloud-native technologies enable organizations to build and run scalable applications in environments like public, private, and hybrid clouds. With these environments, organizations can take advantage of increased flexibility and scalability.

There are many options for conducting cloud-native operations, such as containers, service meshes, microservices, immutable infrastructure, and declarative APIs. Therefore, Reider says, it’s not about the specific technologies companies use.

Enterprises can use any set of cloud-based platforms, tools, and solutions in a cloud-native approach. What matters more than components is the ability to understand what’s happening across the operational stack anytime, anywhere. Dynatrace offers organizations robust, cloud-native observability that provides ongoing insight into operations across the Kubernetes value stream — throughout planning, commits, testing, observation, and analysis.

While Kubernetes excels at returning current compute states to desired states, problems can occur when business conditions change. Kubernetes’ efforts to restore compute conditions can lead to process evictions, out-of-memory kills, and workload throttling, reducing overall performance. “That’s really what Kubernetes’ job is, which is to try to run declared state in production — always trying to return it back to that declared state,” Reider says.

Dynatrace’s cloud-native observability is essential to monitor and manage these performance issues as they occur.

“The desired state of our compute resources [is] suddenly out of step with the current state of these resources. So, Kubernetes is going to try to turn this desired state back to its current state by … evicting less important work that’s affecting our compute resources, or killing work that’s exceeding our memory limit, or throttling work that’s exceeding our CPU limits,” Reider explains. “This cycle leads us to make changes. Those changes can only be made by using Dynatrace for analyzing what those workloads should look like and what the limits should be for memory and CPU based on these new business conditions.”
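The three reactions Reider describes can be pictured with a small decision function. This is only an illustrative sketch, not how Kubernetes or Dynatrace is implemented: the function name, parameters, and action labels are hypothetical, and the real behavior depends on the kubelet, the kernel, and cluster configuration.

```python
from typing import List


def enforcement_actions(
    cpu_demand_m: int,         # CPU the workload wants, in millicores
    cpu_limit_m: int,          # configured CPU limit, in millicores
    mem_usage_mib: int,        # memory the workload is using, in MiB
    mem_limit_mib: int,        # configured memory limit, in MiB
    node_under_pressure: bool, # assumed flag: node is short on resources
    low_priority: bool,        # assumed flag: workload counts as less important
) -> List[str]:
    """Return the reactions this simplified model would expect."""
    actions: List[str] = []
    if mem_usage_mib > mem_limit_mib:
        # Work that exceeds its memory limit is killed.
        actions.append("oom_kill")
    if cpu_demand_m > cpu_limit_m:
        # Work that exceeds its CPU limit is slowed down, not killed.
        actions.append("throttle")
    if node_under_pressure and low_priority:
        # Less important work is evicted to free node resources.
        actions.append("evict")
    return actions
```

A workload that stays inside both limits on a healthy node yields an empty list, which corresponds to the state where current and desired conditions agree.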

Solving key Kubernetes compute issues

While observability is the starting point, enterprises also benefit from agile AIOps to manage Kubernetes. AIOps can help to address three key Kubernetes complexities:

  • Requests and limits. Kubernetes makes it easy to set requests and limits. For example, a container can request 500 millicores (mCores) of CPU, which Kubernetes reserves for it, and set a limit of one core, beyond which the process is throttled. This leaves room for flexibility: a process can use more than 500 mCores as long as it stays under the hard limit and the node still has spare CPU. However, as environments evolve, processes can be unexpectedly throttled or evicted.
  • Overprovisioned resource waste. Kubernetes deployments can also lead to overprovisioned resource waste, in part due to the “IKEA effect.” This speaks to the higher value users place on services or processes they build themselves. The result is higher requests that lead to significantly overprovisioned resources, which collectively cost organizations up to $6 billion annually worldwide.
  • Safe resource resizing. Another challenge organizations often run into is undersizing resources to address overprovisioning issues. While the goal is safe resource resizing, Geigl highlights the impact of the “butterfly effect,” which occurs when seemingly unrelated processes have a significant effect on overall performance. To avoid this, organizations need an AI solution that reveals not only a performance issue but also how that issue affects an entire system.
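The request-and-limit example in the list above (500 mCores requested, one core as the limit) can be made concrete with a short sketch. It is a deliberately simplified model: the class and function names are hypothetical, and real CPU throttling is enforced by the kernel scheduler over time slices rather than by a single calculation like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    request_m: int  # reserved CPU, in millicores
    limit_m: int    # hard cap, in millicores


def granted_cpu(spec: CpuSpec, demand_m: int, node_spare_m: int) -> int:
    """Millicores the container receives in this simplified model."""
    # Anything up to the request is reserved for the container.
    guaranteed = min(demand_m, spec.request_m)
    # Bursting above the request is capped by the limit...
    burst_wanted = max(0, min(demand_m, spec.limit_m) - spec.request_m)
    # ...and by whatever CPU the node still has to spare.
    burst = min(burst_wanted, max(0, node_spare_m))
    return guaranteed + burst


def is_throttled(spec: CpuSpec, demand_m: int, node_spare_m: int) -> bool:
    """True when the container gets less CPU than it asked for."""
    return granted_cpu(spec, demand_m, node_spare_m) < demand_m


spec = CpuSpec(request_m=500, limit_m=1000)
print(granted_cpu(spec, 800, node_spare_m=1000))   # bursts above the request
print(granted_cpu(spec, 1500, node_spare_m=2000))  # capped at the limit
print(granted_cpu(spec, 800, node_spare_m=100))    # capped by a busy node
```

The last two calls show the two ways a process can end up throttled even though its request was honored: it hits its own limit, or the node runs out of spare CPU.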

Managing Kubernetes complexity with Davis intelligence

So, how does Dynatrace help to intelligently manage Kubernetes?

First, the dashboard provides an overview of the entire Kubernetes workload, showing current properties, container utilization, and existing Kubernetes pods. Next, the dashboard dynamically evaluates the current Kubernetes workload sizing, flagging workloads that are undercommitted as well as those that are overcommitted.
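To give a feel for what a sizing evaluation involves, here is a minimal sketch that compares observed CPU usage with a workload's request. It does not describe how the Dynatrace dashboard computes its results: the function names, the 95th-percentile choice, the thresholds, and the "oversized"/"undersized" labels are all assumptions made for illustration.

```python
import math
from typing import List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def classify_sizing(
    request_m: int,
    usage_samples_m: List[float],
    low: float = 0.5,   # assumed: below 50% of the request looks wasteful
    high: float = 0.9,  # assumed: above 90% of the request looks risky
) -> str:
    """Label a workload by how its peak-ish usage compares to its request."""
    p95 = percentile(usage_samples_m, 95)
    ratio = p95 / request_m
    if ratio < low:
        return "oversized"   # request is much higher than observed usage
    if ratio > high:
        return "undersized"  # usage is at or above what was requested
    return "ok"
```

Using a high percentile rather than the average is one possible design choice here: it keeps short usage peaks in view, so shrinking a request is less likely to cause the kind of slowdown the "butterfly effect" describes.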

Additionally, Dynatrace’s deterministic, causation-based AIOps engine, Davis, details front-end response time and throughput. From there, Davis pinpoints specific issues — such as response time degradation. Then, it ties these issues to specific business effects and metric anomalies.

Most importantly, Davis identifies root causes so teams can act immediately. “Davis provides the safety net here,” Geigl explains. “It alerts on the service that was impacted due to undercommitment.”

For example, Davis can show users how resource undercommitment slowed down a service. Additionally, the deterministic AI solution highlights how that single-service slowdown had a huge impact on the entire system. This can lead to a poor user experience. Davis is the only AI engine that connects the dots between services and understands how small changes can have a major effect on complex systems.


Keeping Kubernetes simple with Dynatrace

Kubernetes remains a critical part of DevOps and GitOps functionality. But simply deploying Kubernetes isn’t enough. To make the most of this container orchestration system, organizations need cloud-native observability backed by AIOps to automatically discover what’s happening across their development stack, identify key issues, and take action to improve overall output.

To learn more about how Dynatrace simplifies cloud-native Kubernetes management with AIOps, check out the session, “Simplify Kubernetes complexity with advanced AIOps.”

Kubernetes in the wild report 2023

This Kubernetes survey shows how organizations actually use Kubernetes in production. The study analyzes factual Kubernetes production data from thousands of organizations worldwide that are using the Dynatrace Software Intelligence Platform to keep their Kubernetes clusters secure, healthy, and high performing.
