What is Greenplum Database? Intro to the Big Data Database

Scalegrid

When handling large amounts of complex data, or big data, chances are that your main machine will start getting crushed by all of the data it has to process to produce your analytics results. Greenplum features a cost-based query optimizer for large-scale, big data workloads.

Big Data 321

ScyllaDB Trends – How Users Deploy The Real-Time Big Data Database

Scalegrid

ScyllaDB is an open-source distributed NoSQL data store, reimplemented from the popular Apache Cassandra database. Released just four years ago in 2015, Scylla has averaged over 220% year-over-year growth in popularity according to DB-Engines. ... percentile latency is up to 11X better than Cassandra on AWS EC2 bare metal.

Big Data 187

Trending Sources

Offline Data Pipeline Best Practices Part 1: Optimizing Airflow Job Parameters for Apache Hive

DZone

Welcome to the first post in our series on offline data pipeline best practices, focusing on the combination of Apache Airflow and data processing engines like Hive and Spark. Working together, they form the backbone of many modern data engineering solutions.

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

By Vikram Srivastava and Marcelo Mayworm. Netflix has one of the most complex data platforms in the cloud, on which our data scientists and engineers run batch and streaming workloads. Pensive collects logs for the failed jobs launched by the step from the relevant data platform components and then extracts the stack traces.

Big Data 238

Expanding the AWS Cloud – Introducing the AWS Europe (Stockholm) Region

All Things Distributed

In April 2017, Amazon Web Services announced that it would launch a new AWS infrastructure Region in Sweden. Today, I'm happy to announce that the AWS Europe (Stockholm) Region, our 20th Region globally, is now generally available for use by customers.

AWS 124

Expanding the AWS Cloud: Introducing the AWS Canada (Central) Region

All Things Distributed

Earlier this year, Amazon Web Services (AWS) announced it would launch a new AWS infrastructure region in Montreal, Quebec. The AWS Cloud now operates in 40 Availability Zones within 15 geographic regions around the world, with seven more Availability Zones and three more regions coming online in China, France, and the U.K.

AWS 155

Driving down the cost of Big-Data analytics

All Things Distributed

The Amazon Elastic MapReduce (EMR) team announced today the ability to seamlessly use Amazon EC2 Spot Instances with their service, significantly driving down the cost of data analytics in the cloud. The posting on the AWS developer blog also has some more background.

Big Data 111