Remove Big Data Remove Data Engineering Remove Performance Remove Storage
article thumbnail

Data Engineers of Netflix?—?Interview with Pallavi Phadnis

The Netflix TechBlog

Data Engineers of Netflix?—?Interview Interview with Pallavi Phadnis This post is part of our “ Data Engineers of Netflix ” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

article thumbnail

Kubernetes for Big Data Workloads

Abhishek Tiwari

Kubernetes has emerged as go to container orchestration platform for data engineering teams. In 2018, a widespread adaptation of Kubernetes for big data processing is anitcipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next. Performance.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage 203
article thumbnail

A case for ELT

Abhishek Tiwari

Cheap storage and on-demand compute in the cloud coupled with the emergence of new big data frameworks and tools are forcing us to rethink the whole ETL and data warehousing architecture. Then we perform frequent batch ETL from application databases to a data warehouse. Classic ETL. Stateless and elastic.

article thumbnail

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

However, the data infrastructure to collect, store and process data is geared toward developers (e.g., In AWS’ quest to enable the best data storage options for engineers, we have built several innovative database solutions like Amazon RDS, Amazon RDS for Aurora, Amazon DynamoDB, and Amazon Redshift.

Cloud 137
article thumbnail

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

A unified data management (UDM) system combines the best of data warehouses, data lakes, and streaming without expensive and error-prone ETL. It offers reliability and performance of a data warehouse, real-time and low-latency characteristics of a streaming system, and scale and cost-efficiency of a data lake.