
Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

Occasionally, these use cases involve terabytes of data, so we have to pay attention to performance. By targeting @titus, Metaflow tasks benefit from these battle-hardened features out of the box, with no in-depth technical knowledge or engineering required from ML engineers or data scientists.
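The @titus decorator mentioned above is Netflix-internal; as a rough analog, a minimal sketch using the open-source Metaflow API is shown below, where the @resources decorator requests a large container for a heavy step. The flow name, step contents, and resource sizes are illustrative assumptions, not Netflix's actual pipeline.

```python
# Minimal Metaflow sketch (open-source API). Netflix's internal @titus decorator
# plays a role similar to @batch/@kubernetes externally: it ships the step to
# managed compute with the requested resources. Names and sizes are illustrative.
from metaflow import FlowSpec, step, resources


class FeatureFlow(FlowSpec):

    @resources(memory=64000, cpu=16)  # request a large container for the heavy step
    @step
    def start(self):
        # Terabyte-scale inputs would be loaded and transformed here.
        self.rows_processed = 0
        self.next(self.end)

    @step
    def end(self):
        print(f"processed {self.rows_processed} rows")


if __name__ == "__main__":
    FeatureFlow()
```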

Systems 226

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

With usage increasing at >100% a year, the need for a scalable data workflow orchestrator has become paramount for Netflix’s business needs. We started seeing signs of scale issues, like slowness during peak traffic moments such as 12 AM UTC, leading to increased operational burden, and growing workflow sizes (e.g., the number of iterations in a loop statement, etc.).

Java 202

Trending Sources


Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

Curious to learn about what it’s like to be a Data Engineer at Netflix? Hear directly from Samuel Setegne, Dhevi Rajendran, Kevin Wylie, and Pallavi Phadnis in our “Data Engineers of Netflix” interview series. We don’t have unlimited traffic or time, so sometimes we have to make hard choices.


How HubSpot Uses Apache Kafka Swimlanes for Timely Processing of Workflow Actions

InfoQ

HubSpot adopted routing messages from the same producer over multiple Kafka topics (called swimlanes) to avoid build-up in consumer group lag and to prioritize the processing of real-time traffic. By Rafal Gancarz
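As a rough sketch of the swimlane idea (not HubSpot's actual code), the same producer can route each message to a real-time or bulk topic so a slow backlog never delays real-time actions. The topic names, the "bulk" flag, and the broker address below are illustrative assumptions; the example uses the kafka-python client.

```python
# Swimlane sketch: one producer, two topics, so bulk backlogs can't starve
# real-time work. Separate consumer groups drain each topic independently.
import json
from kafka import KafkaProducer

REALTIME_TOPIC = "workflow-actions-realtime"  # hypothetical swimlane topics
BULK_TOPIC = "workflow-actions-bulk"

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_action(action: dict) -> None:
    """Pick a swimlane per message based on whether it is part of a bulk burst."""
    topic = BULK_TOPIC if action.get("bulk") else REALTIME_TOPIC
    producer.send(topic, value=action)

publish_action({"workflow_id": 42, "type": "send_email", "bulk": False})
producer.flush()
```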


How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

As a microservice owner, a Netflix engineer is responsible for its innovation as well as its operation, which includes making sure the service is reliable, secure, efficient, and performant. How can we develop templated detection modules (rules- and ML-based) and data streams to increase the speed of development?


How LinkedIn Serves Over 4.8 Million Member Profiles per Second

InfoQ

LinkedIn introduced Couchbase as a centralized caching tier to scale member profile reads as traffic outgrew their existing database cluster. The new solution achieved a hit rate of over 99%, reduced tail latencies by more than 60%, and cut costs by 10% annually. By Rafal Gancarz
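The pattern described is a centralized cache-aside read tier in front of the profile store. Below is a minimal sketch of that pattern only; the in-memory cache class, the stand-in source store, the key scheme, and the TTL are hypothetical stand-ins, not the Couchbase SDK or LinkedIn's implementation.

```python
# Cache-aside sketch of a centralized profile-read tier. `cache` and `source_db`
# are hypothetical stand-ins for Couchbase and the backing profile store.
import json
import time

class InMemoryCache:
    """Stand-in for a cache bucket: get/set with a TTL, nothing more."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry and entry[1] > time.time():
            return entry[0]
        return None

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.time() + ttl_seconds)

cache = InMemoryCache()
source_db = {"member:123": {"name": "Ada", "headline": "Engineer"}}  # stand-in store

def get_profile(member_id: int) -> dict:
    key = f"member:{member_id}"
    cached = cache.get(key)
    if cached is not None:               # cache hit: the >99% path in the article
        return json.loads(cached)
    profile = source_db[key]             # cache miss: fall back to the source of truth
    cache.set(key, json.dumps(profile), ttl_seconds=3600)
    return profile

print(get_profile(123))
```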

Cache 83

Netflix at AWS re:Invent 2019

The Netflix TechBlog

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. The talk also includes examples of using these tools in the Amazon Elastic Compute Cloud (Amazon EC2) cloud.
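For context on the mechanism, a common way to make an Auto Scaling group follow traffic is a target-tracking scaling policy. The boto3 sketch below shows that configuration; the group name and the 50% CPU target are illustrative assumptions, not values from the Netflix talk.

```python
# Hedged sketch: attach a target-tracking policy to an EC2 Auto Scaling group so
# capacity follows traffic automatically. Group name and CPU target are illustrative.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="playback-api-asg",      # hypothetical group name
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,                      # add/remove instances to hold ~50% CPU
    },
)
```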

AWS 100