Availability, Big Data, Data Engineering and Tuning

Availability

Big Data

Data Engineering

Tuning

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

The Netflix TechBlog

JULY 21, 2022

More importantly, the low resource availability or “out of memory” scenario is one of the common reasons for crashes/kills. We at Netflix, as a streaming service running on millions of devices, have a tremendous amount of data about device capabilities/characteristics and runtime data in our big data platform.

Big Data

Big Data Cache Engineering Data Engineering

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

Network Availability: The expected continued growth of our ecosystem makes it difficult to understand our network bottlenecks and potential limits we may be reaching. And in order to gain visibility into these logs, we need to somehow ingest and enrich this data. It is easier to tune a large Spark job for a consistent volume of data.

Network

Network Tuning AWS Big Data

Join 5,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

Dynatrace

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Nonetheless, Netflix data landscape (see below) is complex and many teams collaborate effectively for sharing the responsibility of our data system management.

Infrastructure

Infrastructure Big Data Transportation Architecture

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

by Jun He , Akash Dwivedi , Natallia Dzenisenka , Snehal Chennuru , Praneeth Yenugutala , Pawan Dixit At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.

Java

Java Scalability Traffic Architecture

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

Some of the optimizations are prerequisites for a high-performance data warehouse. Sometimes Data Engineers write downstream ETLs on ingested data to optimize the data/metadata layouts to make other ETL processes cheaper and faster. Orient: Gather tuning parameters for a particular table that changed.

Storage

Storage Latency Efficiency Data Engineering

Technology Performance Pulse

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

Trending Sources

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

Optimizing data warehouse storage

Stay Connected