Migrating Critical Traffic At Scale with No Downtime - Part 1

The Netflix TechBlog

Replay Traffic Testing Replay traffic refers to production traffic that is cloned and forked over to a different path in the service call graph, allowing us to exercise new/updated systems in a manner that simulates actual production conditions. This approach has a handful of benefits.

Traffic 339
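
As a rough illustration of the replay idea in the excerpt above, the sketch below serves each request from the existing path, clones it, and forks the clone to a candidate path in the background so responses can be compared. It is a minimal sketch, not Netflix's implementation: `handle_with_replay`, `old_service`, `new_service`, and the in-memory `mismatches` list are all hypothetical names chosen for this example.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def handle_with_replay(request, primary, candidate, executor, mismatches):
    """Serve from `primary`; replay a cloned request to `candidate` off the hot path."""
    response = primary(request)
    # Clone the request so the candidate path cannot mutate the live one.
    cloned = copy.deepcopy(request)

    def replay():
        try:
            shadow = candidate(cloned)
            if shadow != response:
                mismatches.append((cloned, response, shadow))
        except Exception as exc:
            # A failure on the candidate path is recorded, never surfaced to the caller.
            mismatches.append((cloned, response, exc))

    executor.submit(replay)
    # The caller only ever sees the primary response.
    return response


# Hypothetical services: the new one returns wrong data for id "b".
def old_service(req):
    return {"title": req["id"].upper()}


def new_service(req):
    return {"title": None if req["id"] == "b" else req["id"].upper()}


mismatches = []
with ThreadPoolExecutor(max_workers=2) as pool:
    results = [
        handle_with_replay({"id": i}, old_service, new_service, pool, mismatches)
        for i in ("a", "b")
    ]
# Leaving the `with` block waits for all replay tasks to finish.
```

After the pool shuts down, `mismatches` holds one entry per request whose candidate response differed, which is the signal a migration team would inspect before shifting real traffic.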

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. Being able to canary a new route let us verify latency and error rates were within acceptable limits.

Latency 233
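
The canary check described in the excerpt above can be sketched as a comparison of latency and error rate between the existing route and the canaried one. The thresholds, the `RouteStats` type, and the choice of p99 with a nearest-rank percentile are assumptions made for this example; the excerpt does not say which metrics or limits were actually used.

```python
import math
from dataclasses import dataclass


@dataclass
class RouteStats:
    latencies_ms: list  # one sample per request
    errors: int
    requests: int

    @property
    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def canary_passes(baseline, canary, max_latency_ratio=1.10, max_error_rate_delta=0.001):
    """Return True if the canary's p99 latency and error rate stay within the assumed limits."""
    latency_ok = percentile(canary.latencies_ms, 99) <= (
        percentile(baseline.latencies_ms, 99) * max_latency_ratio
    )
    error_ok = canary.error_rate - baseline.error_rate <= max_error_rate_delta
    return latency_ok and error_ok


baseline = RouteStats(latencies_ms=[100] * 100, errors=1, requests=1000)
healthy = RouteStats(latencies_ms=[105] * 100, errors=1, requests=1000)
slow = RouteStats(latencies_ms=[150] * 100, errors=1, requests=1000)
failing = RouteStats(latencies_ms=[100] * 100, errors=10, requests=1000)
```

Here `canary_passes(baseline, healthy)` holds, while the `slow` and `failing` canaries are rejected on latency and error rate respectively.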

Trending Sources

Real user monitoring vs. synthetic monitoring: Understanding best practices

Dynatrace

Data collected on page load events, for example, can include navigation start (when performance begins to be measured), request start (right before the user makes a request from the server), and speed index metrics (which measure page load speed). [...] connectivity, access, user count, latency) of geographic regions. [...] The bottom line?

Bring Your Own Cloud (BYOC) vs. Dedicated Hosting at ScaleGrid

Scalegrid

Deploying your application and database on the same VPC also provides the lowest possible latency path. AWS Security Groups and Azure Network Security Groups allow you to lock down access to your servers through advanced virtual firewalls. This becomes really important for cache solutions like Redis™. Security Groups. No problem.

Cloud 242

Automating chaos experiments in production

The Morning Paper

They use a combination of timeouts, retries, and fallbacks to try to mitigate the effects of these failures, but these don’t get exercised as often as the happy path, so how can we be confident they’ll work as intended when called upon? RPCs at Netflix are wrapped as Hystrix commands.

Latency 77
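
The combination of timeouts, retries, and fallbacks mentioned in the excerpt above can be sketched in a few lines. This is not Hystrix and not the approach from the paper; `call_with_resilience` and the helper functions are hypothetical, and the thread-pool timeout only stops waiting for a slow call rather than cancelling it.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool for this example; a real system would size and isolate pools per dependency.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_resilience(rpc, fallback, timeout_s=0.2, retries=2):
    """Try `rpc` up to 1 + retries times with a per-attempt timeout, then use `fallback`."""
    for _ in range(1 + retries):
        future = _pool.submit(rpc)
        try:
            return future.result(timeout=timeout_s)
        except Exception:
            # Covers both a timeout and an error raised by the call itself.
            # On timeout the worker thread keeps running; we just stop waiting for it.
            continue
    return fallback()


# Injected failures, so the rarely exercised paths actually run.
attempts = {"flaky": 0, "broken": 0}


def flaky_rpc():
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise ConnectionError("injected failure")
    return "fresh"


def broken_rpc():
    attempts["broken"] += 1
    raise ConnectionError("injected failure")


def slow_rpc():
    time.sleep(0.05)
    return "late"


def cached_fallback():
    return "cached"
```

Calling `call_with_resilience(flaky_rpc, cached_fallback)` succeeds on the third attempt, while `broken_rpc` and a `slow_rpc` given a short timeout both end up on the fallback. Injecting failures like this is one way to exercise the non-happy path deliberately.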

Evaluating the Evaluation: A Benchmarking Checklist

Brendan Gregg

[...] sounds like a homework exercise of purely academic value. If you develop a habit of reading only the operation rate and latency numbers from a lengthy benchmark report (or you have a shell script to do this that feeds a GUI), it's easy to miss other details in the report such as the error rate. [...] 4. What's the limiter? [...] No packets.
