3. Psyberg: Automated end to end catch up

The Netflix TechBlog

This data pipeline monitors the various stages in the customer lifecycle. In the sequential load ETL, we have the following features: Catchup Threshold: This defines the lookback period for the data being read. This helps overwrite data only when required and minimizes unnecessary reprocessing.

Tuning 244

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

Since then, open-source Metaflow has gained support for Argo Workflows, a Kubernetes-native orchestrator, as well as support for Airflow, which is still widely used by data engineering teams. Internally, we use a production workflow orchestrator called Maestro.

Systems 226

Netflix at AWS re:Invent 2019

The Netflix TechBlog

In this session, we discuss the technologies used to run a global streaming company, growing at scale, billions of metrics, benefits of chaos in production, and how culture affects your velocity and uptime. By watching applications for anomalous actions, security and operations teams can monitor unusual and erroneous behavior.

AWS 100

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

The need for backfilling could be due to a variety of factors, e.g., (1) upstream data sets were repopulated due to changes in the business logic of their data pipeline, (2) business logic was changed in a data pipeline, (3) a new metric was created that needs to be populated for historical time ranges, (4) historical data was found missing, etc.