The Netflix TechBlog

Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

By Vikram Srivastava and Marcelo Mayworm Netflix has one of the most complex data platforms in the cloud on which our data scientists and engineers run batch and streaming workloads.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Netflix: A Culture of Learning

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , Colin McFarland , Mihir Tendulkar , and Travis Brooks This is the last post in an overview series on experimentation at Netflix. Need to catch up?

Fixing Performance Regressions Before they Happen

The Netflix TechBlog

Angus Croll Netflix is used by 222 million members and runs on over 1700 device types ranging from state-of-the-art smart TVs to low-cost mobile devices. At Netflix we’re proud of our reliability and we want to keep it that way.

Bringing AV1 Streaming to Netflix Members’ TVs

The Netflix TechBlog

by Liwei Guo , Ashwin Kumar Gopi Valliammal , Raymond Tam , Chris Pham , Agata Opalach , Weibo Ni AV1 is the first high-efficiency video codec format with a royalty-free license from Alliance of Open Media (AOMedia), made possible by wide-ranging industry commitment of expertise and resources.

Media 205

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

By Alok Tiagi , Hariharan Ananthakrishnan , Ivan Porto Carrero and Keerti Lakshminarayan Netflix has developed a network observability sidecar called Flow Exporter that uses eBPF tracepoints to capture TCP flows at near real time.

What is an A/B Test?

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , and Colin McFarland This is the second post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. See here for Part 1: Decision Making at Netflix.

Snaring the Bad Folks

The Netflix TechBlog

Project by Netflix’s Cloud Infrastructure Security team ( Alex Bainbridge , Mike Grima , Nick Siow) Cloud security is a hard problem, but an even harder one is cloud security at scale.

Practical API Design at Netflix, Part 1: Using Protobuf FieldMask

The Netflix TechBlog

By Alex Borysov , Ricky Gardiner Background At Netflix, we heavily use gRPC for the purpose of backend to backend communication. When we process a request it is often beneficial to know which fields the caller is interested in and which ones they ignore.

Design 203

Netflix Cloud Packaging in the Terabyte Era

The Netflix TechBlog

By Xiaomei Liu , Rosanna Lee , Cyril Concolato Introduction Behind the scenes of the beloved Netflix streaming service and content, there are many technology innovations in media processing. Packaging has always been an important step in media processing.

Cloud 192

Open-Sourcing a Monitoring GUI for Metaflow

The Netflix TechBlog

Open-Sourcing a Monitoring GUI for Metaflow, Netflix’s ML Platform tl;dr Today, we are open-sourcing a long-awaited GUI for Metaflow. The Metaflow GUI allows data scientists to monitor their workflows in real-time, track experiments, and see detailed logs and results for every executed task.

Life of a Netflix Partner Engineer?—?The case of extra 40 ms

The Netflix TechBlog

Life of a Netflix Partner Engineer?—?The The case of the extra 40 ms By: John Blair , Netflix Partner Engineering The Netflix application runs on hundreds of smart TVs, streaming sticks and pay TV set top boxes.

Remote Workstations for the Discerning Artists

The Netflix TechBlog

By Michelle Brenner Netflix is poised to become the world’s most prolific producer of visual effects and original animated content. To meet that demand, we need to attract the world’s best artistic talent.

Building confidence in a decision

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , Michael Lindon , and Colin McFarland This is the fifth post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

Interpreting A/B test results: false negatives and power

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , and Colin McFarland This is the fourth post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

Interpreting A/B test results: false positives and statistical significance

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , and Colin McFarland This is the third post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

By Andrew Nguonly , Armando Magalhães , Obi-Ike Nwoke , Shervin Afshar , Sreyashi Das , Tongliang Liu , Wei Liu , Yucheng Zeng Background Over the next few years, most content on Netflix will come from Netflix’s own Studio.

The Show Must Go On: Securing Netflix Studios At Scale

The Netflix TechBlog

Written by Jose Fernandez , Arthur Gonigberg , Julia Knecht , and Patrick Thomas In 2017, Netflix Studios was hitting an inflection point from a period of merely rapid growth to the sort of explosive growth that throws “how do we scale?” into every conversation.

Practical API Design at Netflix, Part 2: Protobuf FieldMask for Mutation Operations

The Netflix TechBlog

By Ricky Gardiner , Alex Borysov Background In our previous post , we discussed how we utilize FieldMask as a solution when designing our APIs so that consumers can request the data they need when fetched via gRPC.

Design 175

Decision Making at Netflix

The Netflix TechBlog

Martin Tingley with Wenjing Zheng , Simon Ejdemyr , Stephanie Lane , and Colin McFarland This introduction is the first in a multi-part series on how Netflix uses A/B tests to make decisions that continuously improve our products, so we can deliver more joy and satisfaction to our members.

Safe Updates of Client Applications at Netflix

The Netflix TechBlog

By Minal Mishra Quality of a client application is of paramount importance to global digital products, as it is the primary way customers interact with a brand. At Netflix, we have significant investments in ensuring new versions of our applications are well tested.

Optimized shot-based encodes for 4K: Now streaming!

The Netflix TechBlog

by Aditya Mavlankar , Liwei Guo , Anush Moorthy and Anne Aaron Netflix has an ever-expanding collection of titles which customers can enjoy in 4K resolution with a suitable device and subscription plan.

Towards a Reliable Device Management Platform

The Netflix TechBlog

By Benson Ma , Alok Ahuja Introduction At Netflix, hundreds of different device types, from streaming sticks to smart TVs, are tested every day through automation to ensure that new software releases continue to deliver the quality of the Netflix experience that our customers enjoy.

Netflix Video Quality at Scale with Cosmos Microservices

The Netflix TechBlog

by Christos G. Bampis , Chao Chen , Anush K. Moorthy and Zhi Li Introduction Measuring video quality at scale is an essential component of the Netflix streaming pipeline.

Media 149

Data Engineers of Netflix?—?Interview with Pallavi Phadnis

The Netflix TechBlog

Data Engineers of Netflix?—?Interview Interview with Pallavi Phadnis This post is part of our “ Data Engineers of Netflix ” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Evolving Container Security With Linux User Namespaces

The Netflix TechBlog

By Fabio Kung , Sargun Dhillon , Andrew Spyker , Kyle , Rob Gulewich, Nabil Schear , Andrew Leung , Daniel Muino, and Manas Alekar As previously discussed on the Netflix Tech Blog, Titus is the Netflix container orchestration system.

Media 223

CAMBI, a banding artifact detector

The Netflix TechBlog

by Joel Sole, Mariana Afonso, Lukas Krasula, Zhi Li, and Pulkit Tandon Introducing the banding artifacts detector developed by Netflix aiming at further improving the delivered video quality Banding artifacts can be pretty annoying. But, first of all, you may wonder, what is a banding artifact?

Edgar: Solving Mysteries Faster with Observability

The Netflix TechBlog

Edgar helps Netflix teams troubleshoot distributed systems efficiently with the help of a summarized presentation of request tracing, logs, analysis, and metadata. by Elizabeth Carretto Everyone loves Unsolved Mysteries. There’s always someone who seems like the surefire culprit.

ConsoleMe: A Central Control Plane for AWS Permissions and Access

The Netflix TechBlog

ConsoleMe: A Central Control Plane for AWS Permissions and Access By Curtis Castrapel , Patrick Sanders , and Hee Won Kim At AWS re:Invent 2020, we open sourced two new tools for managing multi-account AWS permissions and access.

AWS 202

Optimizing the Aural Experience on Android Devices with xHE-AAC

The Netflix TechBlog

By Phill Williams and Vijay Gondi Introduction At Netflix, we are passionate about delivering great audio to our members. We began streaming 5.1 channel surround sound in 2010, Dolby Atmos in 2017 , and adaptive bitrate audio in 2019.

Open Sourcing the Netflix Domain Graph Service Framework: GraphQL for Spring Boot

The Netflix TechBlog

By Paul Bakker and Kavitha Srinivasan , Images by David Simmer , Edited by Greg Burrell Netflix has developed a Domain Graph Service (DGS) framework and it is now open source. The DGS framework simplifies the implementation of GraphQL, both for standalone and federated GraphQL services.

Revisiting BetterTLS: Certificate Path Building

The Netflix TechBlog

By Ian Haken Last year the AddTrust root certificate expired and lots of clients had a bad time. Some Roku devices weren’t working right, Heroku had problems , and some folks couldn’t even curl.

Netflix Android and iOS Studio Apps?—?now powered by Kotlin Multiplatform

The Netflix TechBlog

Netflix Android and iOS Studio Apps?—?now now powered by Kotlin Multiplatform By David Henry & Mel Yahya Over the last few years Netflix has been developing a mobile app called Prodicle to innovate in the physical production of TV shows and movies. The world of physical production is fast-paced, and needs vary significantly between the country, region, and even from one production to the next.

Cache 219

Data Engineers of Netflix?—?Interview with Kevin Wylie

The Netflix TechBlog

Data Engineers of Netflix?—?Interview Interview with Kevin Wylie This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin Wylie is a Data Engineer on the Content Data Science and Engineering team.

The Netflix Cosmos Platform

The Netflix TechBlog

Orchestrated Functions as a Microservice by Frank San Miguel on behalf of the Cosmos team Introduction Cosmos is a computing platform that combines the best aspects of microservices with asynchronous workflows and serverless functions.

Media 184

How Netflix Scales its API with GraphQL Federation (Part 1)

The Netflix TechBlog

Netflix is known for its loosely coupled and highly scalable microservice architecture. Independent services allow for evolving at different paces and scaling independently. Yet they add complexity for use cases that span multiple services.

Hawkins: Diving into the Reasoning Behind our Design System

The Netflix TechBlog

Stranger Things imagery showcasing the inspiration for the Hawkins Design System by Hawkins team member Joshua Godi ; with art contributions by Wiki Chaves Hawkins may be the name of a fictional town in Indiana, most widely known as the backdrop for one of Netflix’s most popular TV series “Stranger Things,” but the name is so much more.

My (Seemingly) Random Walk to Netflix

The Netflix TechBlog

Part of our series on who works in Analytics at Netflix?—?and and what the role entails By Sean Barnes, Studio Production Data Science & Engineering I am going to tell you a story about a person that works for Netflix. That person grew up dreaming of working in the entertainment industry.

Keeping Netflix Reliable Using Prioritized Load Shedding

The Netflix TechBlog

How viewers are able to watch their favorite show on Netflix while the infrastructure self-recovers from a system failure By Manuel Correa , Arthur Gonigberg , and Daniel West Getting stuck in traffic is one of the most frustrating experiences for drivers around the world.

Edge Authentication and Token-Agnostic Identity Propagation

The Netflix TechBlog

by AIM Team Members Karen Casella , Travis Nelson , Sunny Singh ; with prior art and contributions by Justin Ryan , Satyajit Thadeshwar As most developers can attest, dealing with security protocols and identity tokens, as well as user and device authentication, can be challenging.

Beyond REST

The Netflix TechBlog

Rapid Development with GraphQL Microservices by Dane Avilla The entertainment industry has struggled with COVID-19 restrictions impacting productions around the globe.

Analytics at Netflix: Who we are and what we do

The Netflix TechBlog

Analytics at Netflix: Who We Are and What We Do An Introduction to Analytics and Visualization Engineering at Netflix by Molly Jackman & Meghana Reddy Explained: Season 1 (Photo Credit: Netflix) Across nearly every industry, there is recognition that data analytics is key to driving informed business decision-making.

Introducing Netflix Timed Text Authoring Lineage

The Netflix TechBlog

A Script Authoring Specification By: Bhanu Srikanth, Andy Swan, Casey Wilms, Patrick Pearson The Art of Dubbing and Subtitling Dubbing and subtitling are inherently creative processes.

Energy 150