Design, Exercise, Latency and Traffic - Technology Performance Pulse

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. If we pare down the problem to absolute basics, we essentially have two services returning JSON. Enter replay testing.
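To make the "two services returning JSON" framing concrete, here is a minimal sketch of the comparison step such a test needs. It is an illustration only, not the approach described in the Netflix post: the function name `diff_json` and the sample payloads are hypothetical, and a real setup would also have to record requests, replay them against both backends, and ignore fields that legitimately differ.

```python
import json


def diff_json(expected, actual, path="$"):
    """Return a list of differences between two parsed JSON values.

    `expected` is the response from the existing service, `actual` the
    response from the replacement. Both names are illustrative.
    """
    diffs = []
    if isinstance(expected, dict) and isinstance(actual, dict):
        # Walk the union of keys so missing and extra fields are both reported.
        for key in sorted(expected.keys() | actual.keys()):
            child = f"{path}.{key}"
            if key not in actual:
                diffs.append(f"{child}: missing in new response")
            elif key not in expected:
                diffs.append(f"{child}: unexpected in new response")
            else:
                diffs.extend(diff_json(expected[key], actual[key], child))
    elif isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            diffs.append(f"{path}: length {len(expected)} != {len(actual)}")
        # Compare the overlapping prefix element by element.
        for i, (e, a) in enumerate(zip(expected, actual)):
            diffs.extend(diff_json(e, a, f"{path}[{i}]"))
    elif expected != actual or type(expected) is not type(actual):
        # Scalars (or mismatched container types): report the wrong value.
        diffs.append(f"{path}: {expected!r} != {actual!r}")
    return diffs


# Hypothetical payloads standing in for the two services' responses.
legacy = json.loads('{"id": 1, "title": "A", "tags": ["x", "y"]}')
new = json.loads('{"id": 1, "title": "B", "tags": ["x"]}')

for line in diff_json(legacy, new):
    print(line)
```

An empty result means the two responses agree. Latency regressions would need a separate measurement, which this sketch does not attempt.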

Latency Cache Java Traffic

Automating chaos experiments in production

The Morning Paper

JULY 4, 2019

This is a fascinating paper from members of Netflix’s Resilience Engineering team describing their chaos engineering initiatives: automated controlled experiments designed to verify hypotheses about how the system should behave under gray failure conditions, and to probe for and flush out any weaknesses.

Latency Engineering Metrics Traffic

A Decade of Dynamo: Powering the next wave of high-performance, internet-scale applications

All Things Distributed

OCTOBER 2, 2017

With these requirements in mind, and a willingness to question the status quo, a small group of distributed systems experts came together and designed a horizontally scalable distributed database that would scale out for both reads and writes to meet the long-term needs of our business. This was the genesis of the Amazon Dynamo database.

Internet AWS Performance

Scaling Amazon ElastiCache for Redis with Online Cluster Resizing

All Things Distributed

NOVEMBER 21, 2017

Redis's microsecond latency has made it a de facto choice for caching. Four years ago, as part of our AWS fast data journey, we introduced Amazon ElastiCache for Redis, a fully managed, in-memory data store that operates at microsecond latency. TB of in-memory capacity in a single cluster.

Games Retail Latency Education

Failure Modes and Continuous Resilience

Adrian Cockcroft

NOVEMBER 11, 2019

There are many possible failure modes, and each exercises a different aspect of resilience. Another problem is that a design control, intended to mitigate a failure mode, may not work as intended. STPA is based on a functional control diagram of the system, and the safety constraints and requirements for each component in the design.

Latency Engineering Systems Hardware

Why I hate MPI (from a performance analysis perspective)

John McCalpin

AUGUST 1, 2018

This is an intellectually challenging and labor-intensive exercise, requiring detailed review of the published details of each of the components of the system, and usually requiring significant “detective work” (using customized microbenchmarks, hardware performance counter analysis, and creative thinking) to fill in the gaps.

Hardware Transportation Performance Latency
