Data Pipelines: The Hammer for Every Nail

Abhishek Tiwari

Airflow provides rich scheduling and execution semantics, so data engineers can easily define complex pipelines that run at regular intervals. From transportation and logistics to e-commerce and food delivery, the core operations of many successful companies can be viewed as workflow problems.

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” To improve data accuracy, we decided to leverage AWS S3 access logs to identify entity relationships that had not been captured by our traditional ingestion process.