Migrating Critical Traffic At Scale with No Downtime, Part 1

The Netflix TechBlog

This blog series will examine the tools, techniques, and strategies we have utilized to achieve this goal. In this testing strategy, we execute a copy (replay) of production traffic against a system’s existing and new versions to perform relevant validations. This approach has a handful of benefits.
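The replay idea in the excerpt above can be sketched as follows. This is a minimal illustration, not the tooling described in the Netflix post: `replay_and_compare`, `Mismatch`, and both handlers are hypothetical names, and the recorded requests are made-up data standing in for captured production traffic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class Mismatch:
    """One recorded request whose old and new responses differ."""
    request: dict
    old_response: Any
    new_response: Any


def replay_and_compare(
    requests: Iterable[dict],
    old_handler: Callable[[dict], Any],
    new_handler: Callable[[dict], Any],
) -> List[Mismatch]:
    """Send each recorded request to both versions and collect differences."""
    mismatches = []
    for request in requests:
        old_response = old_handler(request)
        new_response = new_handler(request)
        if old_response != new_response:
            mismatches.append(Mismatch(request, old_response, new_response))
    return mismatches


# Hypothetical stand-ins for the existing and the new system versions.
def old_handler(request: dict) -> dict:
    return {"total": request["price"] * request["qty"]}


def new_handler(request: dict) -> dict:
    # Deliberately differs when qty == 0 so the comparison has something to find.
    if request["qty"] == 0:
        return {"total": None}
    return {"total": request["price"] * request["qty"]}


# Made-up recorded traffic; real replay would read captured production requests.
recorded = [{"price": 5, "qty": 2}, {"price": 3, "qty": 0}]
mismatches = replay_and_compare(recorded, old_handler, new_handler)
for m in mismatches:
    print(m.request, m.old_response, m.new_response)
```

Comparing with plain equality is the simplest possible validation; a real comparison would likely need to ignore fields that legitimately differ between versions, such as timestamps.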

Traffic 339

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

Over the course of this post, we will talk about our approach to this migration, the strategies that we employed, and the tools we built to support this. Functional Testing: functional testing was the most straightforward of them all; a set of tests alongside each path exercised it against the old and new endpoints.

Latency 233
Trending Sources


Real user monitoring vs. synthetic monitoring: Understanding best practices

Dynatrace

This includes development, user acceptance testing, beta testing, and general availability. connectivity, access, user count, latency) of geographic regions. The result is a more comprehensive and robust monitoring strategy that will have a longer-lasting impact on user performance and experience. The bottom line?


Bring Your Own Cloud (BYOC) vs. Dedicated Hosting at ScaleGrid

Scalegrid

In this post, we compare ScaleGrid’s Bring Your Own Cloud (BYOC) plan vs. the standard Dedicated Hosting model to help you determine the best strategy for your MySQL, PostgreSQL, Redis™ and MongoDB® database deployment. Both AWS EC2 instances and Azure VM instances are available as Reserved Instances, and can be used through the BYOC plan.

Cloud 242

Taiji: managing global user traffic for large-scale Internet services at the edge

The Morning Paper

Taiji’s routing table is a materialized representation of how user traffic at various edge nodes ought to be distributed over available data centers to balance data center utilization and minimize latency. For example, balance utilisation across all data centers, or optimise for network latency. Sharing is caring caching.
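To make the notion of a routing table concrete, here is a minimal sketch of a table that maps each edge node to traffic fractions per data center, plus a deterministic lookup. It is an illustration only and is not Taiji's actual algorithm: the table contents, the `pick_data_center` function, and the hash-based assignment are all assumptions made for this example.

```python
import hashlib

# Hypothetical routing table: edge node -> fraction of traffic per data center.
# The fractions for each edge node are assumed to sum to 1.0.
ROUTING_TABLE = {
    "edge-a": {"dc-1": 0.7, "dc-2": 0.3},
    "edge-b": {"dc-2": 1.0},
}


def pick_data_center(edge: str, user_id: str, table: dict = ROUTING_TABLE) -> str:
    """Map a user to a data center according to the edge node's fractions.

    The user id is hashed to a point in [0, 1), and the point is located in
    the cumulative distribution of the fractions, so the same user always
    lands on the same data center while the table is unchanged.
    """
    weights = table[edge]
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for data_center, fraction in weights.items():
        cumulative += fraction
        if point < cumulative:
            return data_center
    # Guard against floating-point rounding in the cumulative sum.
    return list(weights)[-1]


print(pick_data_center("edge-b", "user-123"))
print(pick_data_center("edge-a", "user-123"))
```

Changing the fractions in the table shifts traffic between data centers without touching the lookup code, which is the sense in which the table is a materialized routing decision.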

Traffic 42

Failure Modes and Continuous Resilience

Adrian Cockcroft

There are many possible failure modes, and each exercises a different aspect of resilience. Collecting some critical metrics at one-second intervals, with a total observability latency of ten seconds or less, matches the human attention span much better. This is why most AWS regions have three availability zones.

Latency 52

Trade-offs under pressure: heuristics and observations of teams resolving internet service outages (Part II)

The Morning Paper

1:18pm a key observation was made that an API call to populate the homepage sidebar saw a huge jump in latency. The shop had been closed so no data was available. The process tracing exercise included: Examining IRC transcripts from multiple channels. Gathering timestamps of changes made to application code during the outage.