Remove Exercise Remove Latency Remove Processing Remove Strategy
article thumbnail

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

This blog series will examine the tools, techniques, and strategies we have utilized to achieve this goal. In this testing strategy, we execute a copy (replay) of production traffic against a system’s existing and new versions to perform relevant validations. This approach has a handful of benefits.

Traffic 339
article thumbnail

MezzFS?—?Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

—?Mounting object storage in Netflix’s media processing platform By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team) MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Encoding is not a one-time process?—?large We have one file?—?the

Media 214
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Real user monitoring vs. synthetic monitoring: Understanding best practices

Dynatrace

Real user monitoring (RUM) is a performance monitoring process that collects detailed data about users’ interactions with an application. Customized tests based on specific business processes and transactions — for example, a user that is leveraging services when accessing an application. What is real user monitoring?

article thumbnail

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

Over the course of this post, we will talk about our approach to this migration, the strategies that we employed, and the tools we built to support this. Functional Testing Functional testing was the most straightforward of them all: a set of tests alongside each path exercised it against the old and new endpoints.

Latency 233
article thumbnail

A Decade of Dynamo: Powering the next wave of high-performance, internet-scale applications

All Things Distributed

Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases. DynamoDB automatically re-distributes your data to healthy servers to ensure there are always multiple replicas of your data without you needing to intervene. Auto Scaling is on by default for all new tables and indexes.

Internet 128
article thumbnail

Trade-offs under pressure: heuristics and observations of teams resolving internet service outages (Part II)

The Morning Paper

1:18pm a key observation was made that an API call to populate the homepage sidebar saw a huge jump in latency. A technique called process tracing was employed to try and recover "a record of participant data acquisition, situation assessment, knowledge activation, expectations, intentions, and actions as the case unfolds over time."

article thumbnail

Failure Modes and Continuous Resilience

Adrian Cockcroft

There are many possible failure modes, and each exercises a different aspect of resilience. Staff should be familiar with recovery processes and the behavior of the system when it’s working hard to mitigate failures. A resilient system continues to operate successfully in the presence of failures.

Latency 52