Bending pause times to your will with Generational ZGC

The Netflix TechBlog

If you’re interested in how we use Java at Netflix, Paul Bakker’s talk, How Netflix Really Uses Java, is a great place to start. Reduced tail latencies: In both our gRPC and DGS Framework services, GC pauses are a significant source of tail latencies. No explicit tuning has been required to achieve these results.

Latency 228

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

If we had an ID for each streaming session, then distributed tracing could easily reconstruct session failure by providing service topology, retry and error tags, and latency measurements for all service calls. We chose Open-Zipkin because it had better integrations with our Spring Boot-based Java runtime environment.

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

A single API team maintained both the Java implementation of the Falcor framework and the API Server. The control group’s traffic utilized the legacy Falcor stack, while the experiment population leveraged the new GraphQL client and was directed to the GraphQL Shim. The Replay Tester tool samples raw traffic streams from Mantis.

Traffic 353

Achieving observability in async workflows

The Netflix TechBlog

Prodicle Distribution: Our service is required to be elastic and handle bursty traffic. We are expected to process 1,000 watermarks for a single distribution in a minute, with non-linear latency growth as the number of watermarks increases. We wanted a scalable service that was near real-time, 2.

Traffic 160

The road to observability demo part 3: Collect, instrument, and analyze telemetry data automatically with Dynatrace

Dynatrace

The beauty of OneAgent is that it’s a drop-in solution and monitors every supported technology (for example, .NET, Java, PHP, Node.js) with little to no manual work required from your side. Garbage collection count: Garbage collection is JVM-related and indicates how often the Java GC ran.
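
As a rough illustration of the garbage collection count metric mentioned in this excerpt, the sketch below reads the same kind of counter directly from a running JVM using the standard `java.lang.management` API. It is not how OneAgent collects the metric; the class name `GcCountSample` and the helper `totalGcCount` are hypothetical names chosen for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical example class: reads GC run counts from the JVM's own management beans.
public class GcCountSample {

    // Sums the collection counts reported by every garbage collector in this JVM.
    // getCollectionCount() returns -1 when a collector does not report a count,
    // so non-positive values are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcCount();
        System.gc(); // Only a hint; the JVM may or may not run a collection.
        long after = totalGcCount();
        System.out.println("GC runs before: " + before + ", after: " + after);
    }
}
```

Sampling this counter at a fixed interval and taking the difference between samples gives a collections-per-interval figure, which is roughly what a "GC count" chart displays.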

Metrics 168

The Speed of Time

Brendan Gregg

A Cassandra database cluster had switched to Ubuntu and noticed write latency increased by over 30%. Since instances of both CentOS and Ubuntu were running in parallel, I could collect flame graphs at the same time (same time-of-day traffic mix) and compare them side by side. This is how Java flame graphs looked at the time.

Speed 126

Zero Configuration Service Mesh with On-Demand Cluster Discovery

The Netflix TechBlog

In order for a service to talk to another, it needs to know two things: the name of the destination service, and whether or not the traffic should be secure. The ability to run in a degraded but available state during an outage is still a marked improvement over completely stopping traffic flow.

Traffic 220