Remove Design Remove Exercise Remove Latency Remove Systems
article thumbnail

Interpreting A/B test results: false negatives and power

The Netflix TechBlog

We then used simple thought exercises based on flipping coins to build intuition around false positives and related concepts such as statistical significance, p-values, and confidence intervals. As a result, if the test treatment results in a small reduction in the latency metric, it’s hard to successfully identify?

Testing 202
article thumbnail

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. Being able to canary a new route let us verify latency and error rates were within acceptable limits.

Latency 233
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Automating chaos experiments in production

The Morning Paper

Are you ready to take your system assurance programme to the next level? In all cases we need to be able to carefully monitor the impact on the system, and back out if things start going badly wrong. Netflix’s system is deployed on the public cloud as complex set of interacting microservices.

Latency 77
article thumbnail

COVID-19 Hazard Analysis using STPA

Adrian Cockcroft

Picture taken by Adrian March 17, 2020 A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. Hence, one way to reduce risk is to make systems more observable. The first technique is the most generally useful.

article thumbnail

Failure Modes and Continuous Resilience

Adrian Cockcroft

A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. Hence, one way to reduce risk is to make systems more observable. This discussion focuses on hardware, software and operational failure modes.

Latency 52
article thumbnail

Failure Modes and Continuous Resilience

Adrian Cockcroft

A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. Hence, one way to reduce risk is to make systems more observable. This discussion focuses on hardware, software and operational failure modes.

Latency 53
article thumbnail

A persistent problem: managing pointers in NVM

The Morning Paper

At the start of November I was privileged to attend HPTS (the High Performance Transaction Systems) conference in Asilomar. Byte-addressable non-volatile memory,) NVM will fundamentally change the way hardware interacts, the way operating systems are designed, and the way applications operate on data. PLOS’19.