Remove Data Engineering Remove Processing Remove Storage Remove Strategy
article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. On the other hand, these optimizations themselves need to be sufficiently inexpensive to justify their own processing cost over the gains they bring.

Storage 203
article thumbnail

Data pipeline asset management with Dataflow

The Netflix TechBlog

Let’s define some requirements that we are interested in delivering to the Netflix data engineers or anyone who would like to schedule a workflow with some external assets in it. This causes the user-managed storage system to be a critical runtime dependency. The slightly improved approach is shown on the diagram below.

Storage 201
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Friends don't let friends build data pipelines

Abhishek Tiwari

In recent times, in order to gain valuable insights or to develop the data-driven products companies such as Netflix, Spotify, Uber, AirBnB have built internal data pipelines. If built correctly, data pipelines can offer strategic advantages to the business. Depending on frameworks, data processing units (a.k.a

Latency 63
article thumbnail

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

In such a data intensive environment, making key business decisions such as running marketing and sales campaigns, logistic planning, financial analysis and ad targeting require deriving insights from these data. However, the data infrastructure to collect, store and process data is geared toward developers (e.g.,

Cloud 137