SLOs done right: how DevOps teams can build better service-level objectives

Dynatrace

So how do development and operations (DevOps) teams and site reliability engineers (SREs) distinguish among good, great, and suboptimal SLOs?

The state of service-level objectives

While SLOs play a critical role in helping DevOps and SRE teams align technical objectives with business goals, they’re not always easy to define.

DevOps 208

Allegro Reduces Kafka Producer Latency Outliers by 82% After Switching to XFS

InfoQ

Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.

Latency 113

Implementing service-level objectives to improve software quality

Dynatrace

SLOs enable DevOps teams to predict problems before they occur, and especially before they affect customer experience. According to the best practices in Google’s SRE handbook, there are “Four Golden Signals” we can convert into four SLOs for services: reliability, latency, availability, and saturation.

Software 256

What are quality gates? How to use quality gates to deliver better software at speed and scale

Dynatrace

This approach supports innovation, ambitious SLOs, DevOps scalability, and competitiveness. These metrics are latency, traffic, errors, and saturation, all of which must be key considerations when curating user experience. In this example, unlike latency, the remaining three signals did not receive a “pass.”

Speed 200

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

That’s why good communication between SREs and DevOps teams is important. At the lowest level, SLIs provide a view of service availability, latency, performance, and capacity across systems. The result is safer, more secure releases for DevOps teams and less overhead for SREs.

Dynatrace supports SnapStart for Lambda as an AWS launch partner

Dynatrace

The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.

Lambda 218

How Dynatrace boosts production resilience with Site Reliability Guardian

Dynatrace

These examples can help you define your starting point for establishing DevOps and SRE best practices in your organization. In this case, the four golden signals (latency, traffic, errors, and saturation) are derived from span attributes and DQL metric queries via Dynatrace Grail™.

DevOps 180