Exercise, Latency, Strategy and Testing - Technology Performance Pulse

Migrating Critical Traffic At Scale with No Downtime - Part 1

The Netflix TechBlog

MAY 4, 2023

This blog series will examine the tools, techniques, and strategies we have utilized to achieve this goal. This blog post will provide a detailed analysis of replay traffic testing, a versatile technique we have applied in the preliminary validation phase for multiple migration initiatives. This approach has a handful of benefits.

Traffic

Traffic Latency Tuning Systems

Interpreting A/B test results: false negatives and power

The Netflix TechBlog

OCTOBER 26, 2021

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This is the fourth post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up? Have a look at Part 1 (Decision Making at Netflix) and Part 2 (What is an A/B Test?).

Testing

Testing Latency Metrics Innovation

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

Over the course of this post, we will talk about our approach to this migration, the strategies that we employed, and the tools we built to support this. For the migration, testing was a first-class citizen. Replay Testing: Enter replay testing.

Latency

Latency Cache Java Traffic

Bring Your Own Cloud (BYOC) vs. Dedicated Hosting at ScaleGrid

Scalegrid

APRIL 16, 2020

In this post, we compare ScaleGrid’s Bring Your Own Cloud (BYOC) plan vs. the standard Dedicated Hosting model to help you determine the best strategy for your MySQL, PostgreSQL, Redis™ and MongoDB® database deployment. Deploying your application and database on the same VPC also provides the lowest possible latency path.

Cloud

Cloud Azure AWS Database

Real user monitoring vs. synthetic monitoring: Understanding best practices

Dynatrace

JUNE 27, 2022

These development and testing practices ensure the performance of critical applications and resources to deliver loyalty-building user experiences. Because pre-production environments are used for testing before an application is released to end users, teams have no access to real-user data. What is synthetic monitoring?

Best Practices

Best Practices Monitoring Wireless Traffic

Trade-offs under pressure: heuristics and observations of teams resolving internet service outages (Part II)

The Morning Paper

JANUARY 23, 2020

At 1:18pm, a key observation was made that an API call to populate the homepage sidebar saw a huge jump in latency. The process tracing exercise included examining IRC transcripts from multiple channels. During incident management, prefer peer review of any code changes to gain confidence as opposed to automated tests or other procedures.

Internet

Internet Internet Cache Engineering

Failure Modes and Continuous Resilience

Adrian Cockcroft

NOVEMBER 11, 2019

There are many possible failure modes, and each exercises a different aspect of resilience. In the same way that we have moved from a few big software releases a year to continuous delivery of many small changes, we need to move from annual disaster recovery tests or suffering when things actually break, to continuously tested resilience.

Latency

Latency Engineering Systems Hardware

Transforming enterprise integration with reactive streams

O'Reilly Software

MARCH 7, 2018

This is mixing concerns and leads to code that becomes strongly coupled, monolithic, hard to write, hard to read, hard to evolve, hard to test, and hard to reuse. Its strategies for flow control are either stop-and-wait (i.e., …

Transportation

Transportation Java Programming Architecture

MezzFS - Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

MARCH 6, 2019

Rerun a batch of replays: We collect replays from actual MezzFS mounts in production, and we rerun large batches of replays for regression and performance tests. We parallelize rerun jobs with Titus, Netflix’s container management platform, which allows us to exercise many hundreds of replay files in minutes.

Media

Media Storage Processing Cache
