1. Streamlining Membership Data Engineering at Netflix with Psyberg

The Netflix TechBlog

By Abhinaya Shetty, Bharath Mummadisetty. At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. What is late-arriving data?

Data Engineers of Netflix: Interview with Pallavi Phadnis

The Netflix TechBlog

This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data scientists and machine learning practitioners to build and manage a variety of ML systems.

Systems 226

Ready-to-go sample data pipelines with Dataflow

The Netflix TechBlog

Dataflow is a command line utility built to improve the experience and streamline data pipeline development at Netflix. The most commonly used one is dataflow project, which helps folks manage their data pipeline repositories through creation, testing, deployment, and a few other activities. test_sparksql_write.py

Data pipeline asset management with Dataflow

The Netflix TechBlog

see “data pipeline” Intro: The problem of managing scheduled workflows and their assets is as old as the use of the cron daemon in early Unix operating systems. The design of a cron job is simple: you take some system command, pick the schedule to run it on, and you are done. Manually constructed continuous delivery system.

Storage 201

Analytics at Netflix: Who we are and what we do

The Netflix TechBlog

Full ownership often means building new data pipelines, navigating complex schemas and large data sets, developing or improving metrics for business performance, and creating intuitive visualizations and dashboards, always ... Others have grown into new areas as part of their professional development at Netflix.

Analytics 240

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

Due to its popularity, the number of workflows managed by the system has grown exponentially. The scheduler on-call has to closely monitor the system during non-business hours. As the usage increased, we had to vertically scale the system to keep up and were approaching AWS instance type limits.

Java 202