Remove Big Data Remove Engineering Remove Storage Remove Systems
article thumbnail

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

High performance, query optimization, open source and polymorphic data storage are the major Greenplum advantages. When handling large amounts of complex data, or big data, chances are that your main machine might start getting crushed by all of the data it has to process in order to produce your analytics results.

Big Data 321
article thumbnail

Data Engineers of Netflix?—?Interview with Pallavi Phadnis

The Netflix TechBlog

Data Engineers of Netflix?—?Interview Interview with Pallavi Phadnis This post is part of our “ Data Engineers of Netflix ” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage 203
article thumbnail

In-Stream Big Data Processing

Highly Scalable

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. This system has been designed to supplement and succeed the existing Hadoop-based system that had too high latency of data processing and too high maintenance costs.

Big Data 154
article thumbnail

Kubernetes for Big Data Workloads

Abhishek Tiwari

Kubernetes has emerged as go to container orchestration platform for data engineering teams. In 2018, a widespread adaptation of Kubernetes for big data processing is anitcipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next. Performance.

article thumbnail

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber Engineering

Three years ago, Uber Engineering adopted Hadoop as the storage ( HDFS ) and compute ( YARN ) infrastructure for our organization’s big data analysis.

Systems 109
article thumbnail

What is container orchestration?

Dynatrace

Containers enable developers to package microservices or applications with the libraries, configuration files, and dependencies needed to run on any infrastructure, regardless of the target system environment. This means organizations are increasingly using Kubernetes not just for running applications, but also as an operating system.