Remove Architecture Remove AWS Remove Data Engineering Remove Tuning
article thumbnail

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

Service Segmentation: The ease of the cloud deployments has led to the organic growth of multiple AWS accounts, deployment practices, interconnection practices, etc. VPC Flow Logs VPC Flow Logs is an AWS feature that captures information about the IP traffic going to and from network interfaces in a VPC. We named this library Sqooby.

Network 150
article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. Some of the optimizations are prerequisites for a high-performance data warehouse.

Storage 203
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Nonetheless, Netflix data landscape (see below) is complex and many teams collaborate effectively for sharing the responsibility of our data system management.

article thumbnail

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

Meson was based on a single leader architecture with high availability. As the usage increased, we had to vertically scale the system to keep up and were approaching AWS instance type limits. increasing at > 100% a year, the need for a scalable data workflow orchestrator has become paramount for Netflix’s business needs.

Java 202