Design, Exercise, Latency and Programming - Technology Performance Pulse

A persistent problem: managing pointers in NVM

The Morning Paper

DECEMBER 8, 2019

On the last morning of the conference Daniel Bittman presented some of the work being done in the context of the Twizzler OS project to explore new programming models for NVM. The starting point is a set of three assumptions for an NVM-based programming model: Compared to traditional persistent media, NVM is fast. What about security?

Hardware

Hardware Programming Media Storage

Why I hate MPI (from a performance analysis perspective)

John McCalpin

AUGUST 1, 2018

This is an intellectually challenging and labor-intensive exercise, requiring detailed review of the published details of each of the components of the system, and usually requiring significant “detective work” (using customized microbenchmarks, hardware performance counter analysis, and creative thinking) to fill in the gaps.

Hardware

Hardware Transportation Performance Latency

Transforming enterprise integration with reactive streams

O'Reilly Software

MARCH 7, 2018

Software today is not typically a single program (something that is executed by an operator or user, producing a result for that person) but rather a service: something that runs for the benefit of its consumers, a provider of value. The most common programming task in the world. Let’s dive into this concept for a bit.

Transportation

Transportation Java Programming Architecture

A peculiar throughput limitation on Intel’s Xeon Phi x200 (Knights Landing)

John McCalpin

JANUARY 22, 2018

There was no deep goal — just a desire to see the maximum GFLOPS in action. The exercise seemed simple enough — just fix one item in the Colfax code and we should be finished. Using the minimum number of accumulator registers needed to tolerate the pipeline latency (12), the assembly code for the inner loop is: B1.8:

Latency

Latency Hardware Code Testing

Amazon EC2 Cluster GPU Instances - All Things Distributed

All Things Distributed

NOVEMBER 15, 2010

We believe that making these GPU resources available for everyone to use at low cost will drive new innovation in the application of highly parallel programming models. For example, the most fundamental abstraction trade-off has always been latency versus throughput. General Purpose GPU programming. From CPU to GPU.

AWS