article thumbnail

3 Performance Tricks for Dealing With Big Data Sets

DZone

This article describes 3 different tricks that I used in dealing with big data sets (order of 10 million records) and that proved to enhance performance dramatically. This trick enhanced the performance dramatically. Trick 1: CLOB Instead of Result Set.

Big Data 246
article thumbnail

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes. Greenplum uses an MPP database design that can help you develop a scalable, high performance deployment. High performance, query optimization, open source and polymorphic data storage are the major Greenplum advantages.

Big Data 321
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

ScyllaDB Trends – How Users Deploy The Real-Time Big Data Database

Scalegrid

In fact, according to ScyllaDB’s performance benchmark report, their 99.9 So this type of performance has to come at a cost, right? cost reduction compared to running Cassandra, as they can achieve this performance with only 10% of the nodes. percentile latency is up to 11X better than Cassandra on AWS EC2 bare metal.

Big Data 187
article thumbnail

Performance Monitoring Dashboards in the Age of Big Data Pollution

Rigor

Big data is like the pollution of the information age. The Big Data Struggle and Performance Reporting. Alternatively, a number of organizations have created their own internal home-grown systems for managing and distilling web performance and monitoring data. No fuss, no muss.

article thumbnail

In-Stream Big Data Processing

Highly Scalable

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. The engine should be able to ingest both streaming data and data from Hadoop i.e. serve as a custom query engine atop of HDFS. High performance and mobility.

Big Data 154
article thumbnail

Kubernetes for Big Data Workloads

Abhishek Tiwari

Kubernetes has emerged as go to container orchestration platform for data engineering teams. In 2018, a widespread adaptation of Kubernetes for big data processing is anitcipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next. Performance.

article thumbnail

Spark-Radiant: Apache Spark Performance and Cost Optimizer

DZone

Spark-Radiant is Apache Spark Performance and Cost Optimizer. Spark-Radiant will help optimize performance and cost considering catalyst optimizer rules, enhance auto-scaling in Spark, collect important metrics related to a Spark job, Bloom filter index in Spark, etc. Spark-Radiant is now available and ready to use.