Remove Big Data Remove Data Remove Data Engineering Remove Database
article thumbnail

Data Engineers of Netflix?—?Interview with Kevin Wylie

The Netflix TechBlog

Data Engineers of Netflix?—?Interview Interview with Kevin Wylie This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin, what drew you to data engineering?

article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage 203
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

How LinkedIn Serves Over 4.8 Million Member Profiles per Second

InfoQ

LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that has outgrown their existing database cluster. The new solution achieved over 99% hit rate, helped reduce tail latencies by more than 60% and costs by 10% annually. By Rafal Gancarz

Cache 82
article thumbnail

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency By: Di Lin , Girish Lingappa , Jitender Aswani Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard about to make a critical business decision but pausing to ask a question?—?“Can

article thumbnail

A case for ELT

Abhishek Tiwari

Cheap storage and on-demand compute in the cloud coupled with the emergence of new big data frameworks and tools are forcing us to rethink the whole ETL and data warehousing architecture. Then we perform frequent batch ETL from application databases to a data warehouse. Classic ETL. Late transformation.

article thumbnail

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

by Jun He , Akash Dwivedi , Natallia Dzenisenka , Snehal Chennuru , Praneeth Yenugutala , Pawan Dixit At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.

Java 202
article thumbnail

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

ETL refers to extract, transform, load and it is generally used for data warehousing and data integration. ETL is a product of the relational database era and it has not evolved much in last decade. There are several emerging data trends that will define the future of ETL in 2018. Unified data management architecture.