Big Data, Hardware and Scalability - Technology Performance Pulse

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

MAY 13, 2020

Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware. This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes. What Exactly is Greenplum? At a glance – TLDR.

Big Data

Big Data Database Artificial Intelligence Open Source

What Should You Know About Graph Database’s Scalability?

DZone

JANUARY 20, 2023

Having a distributed and scalable graph database system is highly sought after in many enterprise scenarios. Do Not Be Misled. Designing and implementing a scalable graph database system has never been a trivial task.

Scalability

Scalability Big Data Hardware Internet

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

Key Takeaways: Distributed storage systems benefit organizations by enhancing data availability, fault tolerance, and system scalability, leading to cost savings from reduced hardware needs, energy consumption, and personnel. Variations within these storage systems are called distributed file systems.

Storage

Storage Systems Big Data Azure

Kubernetes for Big Data Workloads

Abhishek Tiwari

DECEMBER 27, 2017

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next. Key challenges. Performance.

Big Data

Big Data Storage Benchmarking Hardware

In-Stream Big Data Processing

Highly Scalable

AUGUST 20, 2013

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. As a result, the input data typically goes from the data source to the in-stream pipeline via a persistent buffer that allows clients to move their reading pointers back and forth.

Big Data

Big Data Processing Lambda Database

What is IT operations analytics? Extract more data insights from more sources

Dynatrace

MAY 1, 2023

Additionally, ITOA gathers and processes information from applications, services, networks, operating systems, and cloud infrastructure hardware logs in real time. Then, big data analytics technologies, such as Hadoop, NoSQL, Spark, or Grail, the Dynatrace data lakehouse technology, interpret this information.

Analytics

Analytics Artificial Intelligence Big Data Open Source

Kubernetes in the wild report 2023

Dynatrace

JANUARY 16, 2023

Through effortless provisioning, a larger number of small hosts provide a cost-effective and scalable platform. On-premises data centers invest in higher capacity servers since they provide more flexibility in the long run, while the procurement price of hardware is only one of many cost factors.

Open Source

Open Source Java Operating System Programming

Current status, needs, and challenges in Heterogeneous and Composable Memory from the HCM workshop (HPCA’23)

ACM Sigarch

MAY 31, 2023

Heterogeneous and Composable Memory (HCM) offers a feasible solution for terabyte- or petabyte-scale systems, addressing the performance and efficiency demands of emerging big-data applications. The memory bandwidth will be a key player because the traditional method to add memory bandwidth by adding memory channels is not scalable.

Latency

Latency Hardware Cache Architecture

Even more amazing papers at VLDB 2019 (that I didn’t have space to cover yet)

The Morning Paper

SEPTEMBER 19, 2019

Could it be Analyzing efficient stream processing on modern hardware? Hyper Dimension Shuffle describes how Microsoft improved the cost of data shuffling, one of the most costly operations, in their petabyte-scale internal big data analytics platform, SCOPE. What’s their secret???

Blockchain

Blockchain Hardware Google Analytics

Dutch Enterprises and The Cloud

All Things Distributed

SEPTEMBER 6, 2013

Shell leverages AWS for big data analytics to help achieve these goals. Due to the exponential growth of the biology and informatics fields, Unilever needs to maintain this new program within a highly-scalable environment that supports parallel computation and heavy data storage demands.

Cloud

Cloud Energy AWS Healthcare

Expanding the Cloud - Cluster Compute Instances for Amazon EC2.

All Things Distributed

JULY 13, 2010

Werner Vogels' weblog on building scalable and robust distributed systems. Additionally, many high-end HPC applications take advantage of knowing their in-house hardware platforms to achieve major speedup by exploiting the specific processor architecture. Driving down the cost of Big-Data analytics.

Cloud

Cloud AWS Automotive Latency

The Winds of Architecture Changes at the USENIX ATC 2019

ACM Sigarch

NOVEMBER 1, 2019

This blog post gives a glimpse of the computer systems research papers presented at the USENIX Annual Technical Conference (ATC) 2019, with an emphasis on systems that use new hardware architectures. The second work presented a novel scalable distributed capability mechanism for security and protection in such systems.

Architecture

Architecture Hardware Cache Storage

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

OCTOBER 7, 2015

Today, I am excited to share with you a brand new service called Amazon QuickSight that aims to simplify the process of deriving insights from a wide variety of data sources in a fast and affordable manner. QuickSight is a fast, cloud native, scalable, business intelligence service for 1/10th the cost of old-guard BI solutions.

Cloud

Cloud Big Data AWS Analytics

Välkommen till Stockholm – An AWS Region is coming to the Nordics

All Things Distributed

APRIL 4, 2017

After the launch of the AWS EU (Stockholm) Region, there will be 13 Availability Zones in Europe for customers to build flexible, scalable, secure, and highly available applications. It will also give customers another region where they can store their data with the knowledge that it will not leave the EU unless they move it.

AWS

AWS Airlines Latency Games

Amazon EC2 Cluster GPU Instances - All Things Distributed

All Things Distributed

NOVEMBER 15, 2010

Werner Vogels' weblog on building scalable and robust distributed systems. This led to the birth of the Graphics Processing Unit (GPU) which was focused on providing a very fine grained parallel model, with processing organized in multiple stages, where the data would flow through. Driving down the cost of Big-Data analytics.

AWS

AWS Latency Programming Architecture
