Supporting Diverse ML Systems at Netflix
The Netflix TechBlog
MARCH 7, 2024
Data: Fast Data
Our main data lake is hosted on S3, organized as Apache Iceberg tables. For ETL and other heavy lifting of data, we mainly rely on Apache Spark. In addition to Spark, we want to support last-mile data processing in Python, addressing use cases such as feature transformations, batch inference, and training.
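To make "last-mile" processing concrete, here is a minimal sketch of a feature transformation in plain Python. It assumes the rows have already been read out of a table in the data lake; the `ViewingRow` schema, the `minutes_watched` column, and the `transform_features` helper are all hypothetical names chosen for illustration, not part of any Netflix API.

```python
from dataclasses import dataclass
from math import log1p
from statistics import mean, pstdev


@dataclass
class ViewingRow:
    # Hypothetical schema standing in for rows already loaded from the data lake.
    member_id: int
    minutes_watched: float


def transform_features(rows):
    """Log-scale minutes_watched, then standardize it to zero mean and unit variance."""
    if not rows:
        return []
    logged = [log1p(r.minutes_watched) for r in rows]
    mu = mean(logged)
    # Fall back to 1.0 when the column is constant, to avoid dividing by zero.
    sigma = pstdev(logged) or 1.0
    return [
        {"member_id": r.member_id, "minutes_z": (x - mu) / sigma}
        for r, x in zip(rows, logged)
    ]


# Illustrative input only; real rows would come from a table scan.
rows = [ViewingRow(1, 0.0), ViewingRow(2, 30.0), ViewingRow(3, 120.0)]
features = transform_features(rows)
```

The heavy lifting (joins, aggregation over the full table) would still happen upstream, for example in Spark; a step like this only reshapes an already-reduced slice of data right before training or batch inference.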