The Netflix TechBlog

Data Mesh: A Data Movement and Processing Platform @ Netflix

By Bo Lei, Guilherme Pires, James Shao, Kasturi Chatterjee, Sujay Jain, Vlad Sydorenko. Background: Realtime processing technologies (A.K.A

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

by Aryan Mehra with Farnaz Karimdady Sharifabad, Prasanna Vijayanathan, Chaïna Wade, Vishal Sharma and Mike Schassberger. Aim and Purpose: Problem Statement. The purpose of this article is to give insights into analyzing and predicting “out of memory” or OOM kills on the Netflix App.

How Netflix Content Engineering makes a federated graph searchable (Part 2)

By Alex Hutter, Falguni Jhaveri, and Senthil Sayeebaba. In a previous post, we described the indexing architecture of Studio Search and how we scaled the architecture by building a config-driven self-service platform that allowed teams in Content Engineering to spin up search indices easily.

Rapid Event Notification System at Netflix

By: Ankush Gulati, David Gevorkyan. Additional credits: Michael Clark, Gokhan Ozer. Intro: Netflix has more than 220 million active members who perform a variety of actions throughout each session, ranging from renaming a profile to watching a title.

How Netflix Content Engineering makes a federated graph searchable

By Alex Hutter, Falguni Jhaveri and Senthil Sayeebaba. Over the past few years, Content Engineering at Netflix has been transitioning many of its services to use a federated GraphQL platform.

A Survey of Causal Inference Applications at Netflix

At Netflix, we want to entertain the world through creating engaging content and helping members discover the titles they will love. Key to that is understanding causal effects that connect changes we make in the product to indicators of member joy.

Experimentation is a major focus of Data Science across Netflix

How Netflix uses eBPF flow logs at scale for network insight

By Alok Tiagi, Hariharan Ananthakrishnan, Ivan Porto Carrero and Keerti Lakshminarayan. Netflix has developed a network observability sidecar called Flow Exporter that uses eBPF tracepoints to capture TCP flows in near real time.

Auto-Diagnosis and Remediation in Netflix Data Platform

By Vikram Srivastava and Marcelo Mayworm. Netflix has one of the most complex data platforms in the cloud, on which our data scientists and engineers run batch and streaming workloads.

Life of a Netflix Partner Engineer: The case of the extra 40 ms

By: John Blair, Netflix Partner Engineering. The Netflix application runs on hundreds of smart TVs, streaming sticks and pay TV set top boxes.

Bringing AV1 Streaming to Netflix Members’ TVs

by Liwei Guo, Ashwin Kumar Gopi Valliammal, Raymond Tam, Chris Pham, Agata Opalach, Weibo Ni. AV1 is the first high-efficiency video codec format with a royalty-free license from the Alliance for Open Media (AOMedia), made possible by wide-ranging industry commitment of expertise and resources.

Netflix: A Culture of Learning

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, Colin McFarland, Mihir Tendulkar, and Travis Brooks. This is the last post in an overview series on experimentation at Netflix. Need to catch up?

Remote Workstations for the Discerning Artists

By Michelle Brenner. Netflix is poised to become the world’s most prolific producer of visual effects and original animated content. To meet that demand, we need to attract the world’s best artistic talent.

Evolution of ML Fact Store

by Vivek Kaushal. At Netflix, we aim to provide recommendations that match our members’ interests. To achieve this, we rely on Machine Learning (ML) algorithms. ML algorithms can only be as good as the data that we provide to them.

What is an A/B Test?

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This is the second post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. See here for Part 1: Decision Making at Netflix.

How We Build Micro Frontends With Lattice

Written by Michael Possumato, Nick Tomlin, Jordan Andree, Andrew Shim, and Rahul Pilani. As we continue to grow here at Netflix, the needs of Revenue and Growth Engineering are rapidly evolving, and our tools must evolve just as rapidly.

Practical API Design at Netflix, Part 1: Using Protobuf FieldMask

By Alex Borysov, Ricky Gardiner. Background: At Netflix, we rely heavily on gRPC for backend-to-backend communication. When we process a request, it is often beneficial to know which fields the caller is interested in and which ones they ignore.

Netflix Cloud Packaging in the Terabyte Era

By Xiaomei Liu, Rosanna Lee, Cyril Concolato. Introduction: Behind the scenes of the beloved Netflix streaming service and content, there are many technology innovations in media processing. Packaging has always been an important step in media processing.

Demystifying Interviewing for Backend Engineers @ Netflix

By Karen Casella, Director of Engineering, Access & Identity Management. Have you ever experienced one of the following scenarios while looking for your next role? You study and practice coding interview problems for hours/days/weeks/months, only to be asked to merge two sorted lists.

Fixing Performance Regressions Before they Happen

By Angus Croll. Netflix is used by 222 million members and runs on over 1700 device types ranging from state-of-the-art smart TVs to low-cost mobile devices. At Netflix we’re proud of our reliability and we want to keep it that way.

Optimized shot-based encodes for 4K: Now streaming!

by Aditya Mavlankar, Liwei Guo, Anush Moorthy and Anne Aaron. Netflix has an ever-expanding collection of titles which customers can enjoy in 4K resolution with a suitable device and subscription plan.

Open-Sourcing a Monitoring GUI for Metaflow

Open-Sourcing a Monitoring GUI for Metaflow, Netflix’s ML Platform. tl;dr: Today, we are open-sourcing a long-awaited GUI for Metaflow. The Metaflow GUI allows data scientists to monitor their workflows in real time, track experiments, and see detailed logs and results for every executed task.

Data pipeline asset management with Dataflow

by Sam Setegne, Jai Balani, Olek Gorajek. Glossary. asset: any business logic code in a raw (e.g. SQL) or compiled (e.g. JAR) form to be executed as part of the user-defined data pipeline. data pipeline: a set of tasks (or jobs) to be executed in a predefined order (a.k.a.

Data Movement in Netflix Studio via Data Mesh

By Andrew Nguonly, Armando Magalhães, Obi-Ike Nwoke, Shervin Afshar, Sreyashi Das, Tongliang Liu, Wei Liu, Yucheng Zeng. Background: Over the next few years, most content on Netflix will come from Netflix’s own Studio.

Evolving Container Security With Linux User Namespaces

By Fabio Kung, Sargun Dhillon, Andrew Spyker, Kyle, Rob Gulewich, Nabil Schear, Andrew Leung, Daniel Muino, and Manas Alekar. As previously discussed on the Netflix Tech Blog, Titus is the Netflix container orchestration system.

Edgar: Solving Mysteries Faster with Observability

Edgar helps Netflix teams troubleshoot distributed systems efficiently with the help of a summarized presentation of request tracing, logs, analysis, and metadata. By Elizabeth Carretto. Everyone loves Unsolved Mysteries. There’s always someone who seems like the surefire culprit.

Snaring the Bad Folks

Project by Netflix’s Cloud Infrastructure Security team (Alex Bainbridge, Mike Grima, Nick Siow). Cloud security is a hard problem, but an even harder one is cloud security at scale.

The Show Must Go On: Securing Netflix Studios At Scale

Written by Jose Fernandez, Arthur Gonigberg, Julia Knecht, and Patrick Thomas. In 2017, Netflix Studios was hitting an inflection point from a period of merely rapid growth to the sort of explosive growth that throws “how do we scale?” into every conversation.

Practical API Design at Netflix, Part 2: Protobuf FieldMask for Mutation Operations

By Ricky Gardiner, Alex Borysov. Background: In our previous post, we discussed how we use FieldMask when designing our APIs so that consumers can request only the data they need when fetching via gRPC.

Interpreting A/B test results: false positives and statistical significance

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This is the third post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

Decision Making at Netflix

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This introduction is the first in a multi-part series on how Netflix uses A/B tests to make decisions that continuously improve our products, so we can deliver more joy and satisfaction to our members.

Interpreting A/B test results: false negatives and power

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This is the fourth post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

ConsoleMe: A Central Control Plane for AWS Permissions and Access

By Curtis Castrapel, Patrick Sanders, and Hee Won Kim. At AWS re:Invent 2020, we open sourced two new tools for managing multi-account AWS permissions and access.

Optimizing the Aural Experience on Android Devices with xHE-AAC

By Phill Williams and Vijay Gondi. Introduction: At Netflix, we are passionate about delivering great audio to our members. We began streaming 5.1 channel surround sound in 2010, Dolby Atmos in 2017, and adaptive bitrate audio in 2019.

Towards a Reliable Device Management Platform

By Benson Ma, Alok Ahuja. Introduction: At Netflix, hundreds of different device types, from streaming sticks to smart TVs, are tested every day through automation to ensure that new software releases continue to deliver the quality of the Netflix experience that our customers enjoy.

Building confidence in a decision

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, Michael Lindon, and Colin McFarland. This is the fifth post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. Need to catch up?

Open Sourcing the Netflix Domain Graph Service Framework: GraphQL for Spring Boot

By Paul Bakker and Kavitha Srinivasan, Images by David Simmer, Edited by Greg Burrell. Netflix has developed a Domain Graph Service (DGS) framework and it is now open source. The DGS framework simplifies the implementation of GraphQL, both for standalone and federated GraphQL services.

Netflix Android and iOS Studio Apps: now powered by Kotlin Multiplatform

By David Henry & Mel Yahya. Over the last few years, Netflix has been developing a mobile app called Prodicle to innovate in the physical production of TV shows and movies. The world of physical production is fast-paced, and needs vary significantly by country, by region, and even from one production to the next.

Safe Updates of Client Applications at Netflix

By Minal Mishra. The quality of a client application is of paramount importance to global digital products, as it is the primary way customers interact with a brand. At Netflix, we have significant investments in ensuring new versions of our applications are well tested.

CAMBI, a banding artifact detector

by Joel Sole, Mariana Afonso, Lukas Krasula, Zhi Li, and Pulkit Tandon. Introducing the banding artifact detector developed by Netflix, aimed at further improving delivered video quality. Banding artifacts can be pretty annoying. But, first of all, you may wonder, what is a banding artifact?

The Netflix Cosmos Platform

Orchestrated Functions as a Microservice. By Frank San Miguel on behalf of the Cosmos team. Introduction: Cosmos is a computing platform that combines the best aspects of microservices with asynchronous workflows and serverless functions.

How Netflix Scales its API with GraphQL Federation (Part 1)

Netflix is known for its loosely coupled and highly scalable microservice architecture. Independent services allow for evolving at different paces and scaling independently. Yet they add complexity for use cases that span multiple services.

Hawkins: Diving into the Reasoning Behind our Design System

By Hawkins team member Joshua Godi; with art contributions by Wiki Chaves. Hawkins may be the name of a fictional town in Indiana, most widely known as the backdrop for one of Netflix’s most popular TV series “Stranger Things,” but the name is so much more.