Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

In this article, we cover a few key integrations that we provide for various layers of the Metaflow stack at Netflix, as illustrated above. These integrations are implemented through Metaflow’s extension mechanism, which is publicly available but subject to change and hence not yet part of Metaflow’s stable API.

Systems 226
Ready-to-go sample data pipelines with Dataflow

The Netflix TechBlog

That article was a deep dive into one of the more technical aspects of Dataflow and didn’t properly introduce the tool in the first place. This time we’ll try to do justice to the introduction, and then focus on one of the very first features Dataflow came with. Options: --docker-image TEXT Url of the docker image to run in. --run-in-docker

Trending Sources

Optimizing data warehouse storage

The Netflix TechBlog

Such optimizations have several benefits: savings on storage, faster query times, cheaper downstream processing, and higher developer productivity, since additional ETLs written only to improve query performance can be removed. The article then takes a deep dive into the merging use case of AutoOptimize and shares some results and benefits.

Storage 203
MapReduce Patterns, Algorithms, and Use Cases

Highly Scalable

In this article, I digested a number of MapReduce patterns and algorithms to give a systematic view of the different techniques that can be found on the web or in scientific articles. For instance, given a log file where each record contains a response time, the task is to calculate the average response time.
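
The average-response-time example can be sketched in a few lines of plain Python that mimic the map, shuffle, and reduce phases in a single process. This is only an illustration under stated assumptions, not code from the article: the record format ("<endpoint> <response_time_ms>") and the function names `map_record`, `reduce_pairs`, and `average_response_times` are hypothetical. The mapper emits (sum, count) pairs rather than raw averages, because partial sums and counts can be merged safely while partial averages cannot.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[float, int]  # (sum of response times, number of records)


def map_record(record: str) -> Iterator[Tuple[str, Pair]]:
    # Assumed (hypothetical) record format: "<endpoint> <response_time_ms>"
    endpoint, response_time = record.split()
    yield endpoint, (float(response_time), 1)


def reduce_pairs(pairs: Iterable[Pair]) -> Pair:
    # Merge partial (sum, count) pairs; the same function could serve as a combiner.
    total, count = 0.0, 0
    for partial_sum, partial_count in pairs:
        total += partial_sum
        count += partial_count
    return total, count


def average_response_times(records: Iterable[str]) -> Dict[str, float]:
    # Shuffle phase: group mapper output by key.
    groups: Dict[str, List[Pair]] = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    # Reduce phase: one average per key, computed from the merged sum and count.
    averages = {}
    for key, pairs in groups.items():
        total, count = reduce_pairs(pairs)
        averages[key] = total / count
    return averages


if __name__ == "__main__":
    logs = ["/home 100", "/home 300", "/search 50"]
    print(average_response_times(logs))
```

In a real MapReduce framework the grouping step is done by the runtime across machines; here a dictionary stands in for it.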

C++ 144