Data Engineering, Processing, Scalability and Storage

Leveraging Infrastructure as Code for Data Engineering Projects: A Comprehensive Guide

DZone

JULY 3, 2023

Data engineering projects often require the setup and management of complex infrastructures that support data processing, storage, and analysis. Traditionally, this process involved manual configuration, leading to potential inconsistencies, human errors, and time-consuming deployments.

Data Engineering

Data Engineering Infrastructure Engineering Code

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

At this scale, we can gain significant performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands in our warehouse. On the other hand, these optimizations must themselves be inexpensive enough that the gains they bring justify their processing cost.

Storage

Storage Latency Efficiency Data Engineering

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

This entertaining romp through the tech stack serves as an introduction to how we think about and design systems, the Netflix approach to operational challenges, and how other organizations can apply our thought processes and technologies. Technology advancements in content creation and consumption have also increased its data footprint.

AWS

AWS Entertainment Open Source Benchmarking

Back-to-Basics Weekend Reading - The 5 Minute Rule

All Things Distributed

AUGUST 24, 2012

Werner Vogels' weblog on building scalable and robust distributed systems. This week the AWS team launched Amazon Glacier, a cold storage archive service at the very low price point of $0.01. This makes this week a good moment to read up on some of the historical work around the costs of data engineering.

Storage

Storage Hardware AWS Data Engineering

Kubernetes for Big Data Workloads

Abhishek Tiwari

DECEMBER 27, 2017

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2], and data workloads are up next. Native frameworks.

Big Data

Big Data Storage Benchmarking Hardware

Friends don't let friends build data pipelines

Abhishek Tiwari

JULY 12, 2018

Recently, to gain valuable insights or to develop data-driven products, companies such as Netflix, Spotify, Uber, and AirBnB have built internal data pipelines. If built correctly, data pipelines can offer strategic advantages to the business. Depending on frameworks, data processing units (a.k.a

Latency

Latency Analytics Scalability Engineering

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

OCTOBER 7, 2015

In such a data-intensive environment, making key business decisions such as running marketing and sales campaigns, logistics planning, financial analysis, and ad targeting requires deriving insights from these data. However, the data infrastructure to collect, store, and process data is geared toward developers (e.g.,

Cloud

Cloud Big Data AWS Analytics