Bandwidth or Latency: When to Optimise for Which

CSS Wizardry

When it comes to network performance, there are two main limiting factors that will slow you down: bandwidth and latency. Where bandwidth deals with capacity, latency is more about the speed of transfer. As a web user—often transferring lots of smaller files—reductions in latency will almost always be a welcome improvement. Put another way, latency cost us 24.6×.
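To make the distinction concrete, here is a rough back-of-the-envelope sketch (all numbers are illustrative assumptions, not figures from the article): fetching many small files sequentially is dominated by round-trip latency, while fetching one large file is dominated by bandwidth.

```python
# Crude model: time to fetch a file over an established connection is roughly
# one round trip plus its size divided by the available bandwidth.
# All numbers below are illustrative assumptions, not measurements.

RTT_S = 0.100             # 100 ms round-trip latency
BANDWIDTH_BPS = 25e6 / 8  # 25 Mbps link, expressed in bytes per second

def fetch_time(size_bytes: float) -> float:
    """Estimate: one round trip plus serialisation time on the wire."""
    return RTT_S + size_bytes / BANDWIDTH_BPS

# 50 small files of 20 kB fetched one after another vs. one 1 MB bundle.
small_files = sum(fetch_time(20_000) for _ in range(50))
one_bundle = fetch_time(1_000_000)
print(f"50 x 20 kB, sequential: {small_files:.2f} s  (latency-dominated)")
print(f"1 x 1 MB:               {one_bundle:.2f} s  (bandwidth-dominated)")
```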

Taskbar Latency and Kernel Calls

Random ASCII

I work quickly on my computer and I get frustrated when I am forced to wait on an operation that should be fast. The fact that this shows up as CPU time suggests that the reads were all hitting in the system cache, and the CPU time was the kernel overhead (note ntoskrnl.exe on the first sampled call stack) of grabbing data from the cache. Remember that these are calls to the operating system – kernel calls.

SLOG: serializable, low-latency, geo-replicated transactions

The Morning Paper

SLOG: serializable, low-latency, geo-replicated transactions, Ren et al. SLOG is another research system motivated by the needs of the application developer (aka, the user!). Building correct applications is much easier when the system provides strict serializability guarantees: strict serializability reduces application code complexity and bugs, since the system behaves as if it were a single machine processing transactions sequentially.

Best Practice for Creating Indexes on your MySQL Tables

Scalegrid

By having appropriate indexes on your MySQL tables, you can greatly enhance the performance of SELECT queries. But while an index is being created, you are also likely to experience degraded query performance, as your system resources are busy with the index-creation work. 95th percentile latency: the 95th percentile latency of queries was also 1.8…
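As a hedged illustration of both points (connection details, table, and column names below are hypothetical, and this is not ScaleGrid's benchmark harness), adding an index with MySQL Connector/Python and then sampling the 95th-percentile latency of a query might look like this:

```python
import time
import mysql.connector  # assumes the MySQL Connector/Python package is installed

# Hypothetical connection and schema details, for illustration only.
conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="shop")
cur = conn.cursor()

# Creating the index is itself expensive on large tables and can degrade
# concurrent query performance while it runs.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Sample per-query latency for a representative SELECT and report the p95.
samples = []
for _ in range(200):
    start = time.perf_counter()
    cur.execute("SELECT COUNT(*) FROM orders WHERE customer_id = 42")
    cur.fetchall()
    samples.append(time.perf_counter() - start)

samples.sort()
p95 = samples[int(0.95 * len(samples)) - 1]
print(f"95th percentile latency: {p95 * 1000:.1f} ms")

cur.close()
conn.close()
```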

Orbital edge computing: nanosatellite constellations as a new class of computer system

The Morning Paper

Orbital edge computing: nanosatellite constellations as a new class of computer system, Denby & Lucia, ASPLOS’20. Only space system architects don’t call it request-response; they call it a ‘bent-pipe architecture’. Nanosatellite systems have a GSD (ground sample distance) of around 3.0 m/px.

How to Improve MySQL AWS Performance 2X Over Amazon RDS at The Same Cost

Scalegrid

As organizations continue to migrate to the cloud, it’s important to get in front of performance issues such as high latency, low throughput, and replication lag caused by greater distances between your users and your cloud infrastructure. ScaleGrid’s MySQL on AWS High Performance deployment can provide 2x-3x the throughput at half the latency of Amazon RDS for MySQL, with the added advantage of having 2 read replicas compared to 1 in RDS. MySQL on AWS latency performance test averages.

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

Scalegrid

A long time ago, in a galaxy far far away, ‘threads’ were a programming novelty rarely used and seldom trusted. On modern Linux systems, the difference in overhead between forking a process and creating a thread is much smaller than it used to be. And while there are plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one: introducing a middleware into the communication path inevitably adds some latency.
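For the client-side flavour of pooling, here is a minimal sketch using psycopg2's built-in pool (the DSN and query are hypothetical). A middleware pooler such as PgBouncer makes the same trade: a little added latency per request in exchange for not paying connection-establishment cost on every query.

```python
from psycopg2 import pool  # assumes psycopg2 is installed; DSN below is hypothetical

# Keep between 1 and 10 server connections open and hand them out on demand,
# so each request skips the fork/authentication cost of a fresh connection.
pg_pool = pool.SimpleConnectionPool(
    1, 10, dsn="dbname=app user=app password=secret host=127.0.0.1")

def fetch_user(user_id: int):
    conn = pg_pool.getconn()          # borrow a pooled connection
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
    finally:
        pg_pool.putconn(conn)         # return it to the pool instead of closing

print(fetch_user(1))
pg_pool.closeall()
```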

Memory Latency on the Intel Xeon Phi x200 “Knights Landing” processor

John McCalpin

The Xeon Phi x200 (Knights Landing) has a lot of modes of operation (selected at boot time), and the latency and bandwidth characteristics are slightly different for each mode. It is also important to remember that the latency can be different for each physical address, depending on the location of the requesting core, the location of the coherence agent responsible for that address, and the location of the memory controller for that address. MCDRAM maximum latency: 156.1 ns.
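The standard way to measure load-to-use latency is a pointer chase through a random permutation, so that every load depends on the previous one and prefetchers cannot hide the latency. The toy sketch below only illustrates the idea: in Python the interpreter overhead swamps actual DRAM or MCDRAM latency, and measurements like McCalpin's are done in carefully written C or assembly, pinned to specific cores and boot-time memory modes.

```python
import random
import time

N = 1 << 20  # ~1M elements, intended to be much larger than the caches

# Build a single random cycle: nxt[i] is a pseudo-random successor of i,
# so every access depends on the result of the previous one.
perm = list(range(N))
random.shuffle(perm)
nxt = [0] * N
for a, b in zip(perm, perm[1:] + perm[:1]):
    nxt[a] = b

idx, hops = 0, 1_000_000
start = time.perf_counter()
for _ in range(hops):
    idx = nxt[idx]
elapsed = time.perf_counter() - start
print(f"~{elapsed / hops * 1e9:.0f} ns per dependent access "
      f"(interpreter overhead included)")
```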

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems

The Morning Paper

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems, Gan et al. Systems built with lots of microservices have different operational characteristics to those built from a small number of monoliths; we’d like to study and better understand those differences. In this paper we explore the implications microservices have across the cloud system stack, including operating system and network implications.

Unlocking Enterprise systems using voice

All Things Distributed

The interfaces to our digital systems have been dictated by the capabilities of our computer systems—keyboards, mice, graphical interfaces, remotes, and touch screens. As a result, they fail to deliver a truly seamless and customer-centric experience that integrates our digital systems into our analog lives. All of these benefits make voice a game changer for interacting with all kinds of digital systems.

Build automated self-healing systems with xMatters and Dynatrace (Part 2 of 3)

Dynatrace

In this alert, xMatters includes all the important incident information from Dynatrace, so there’s no need for you to visit additional system dashboards. Based on this contextual data, resources are prompted with their pre-configured response options, each of which kicks off a workflow across systems (based on the severity of the issue). Step 5 – xMatters triggers a runbook in Ansible to fix the disk latency.

Three Other Models of Computer System Performance: Part 1

ACM Sigarch

Computer systems, from Internet-of-Things devices to datacenters, are complex, and optimizing them can enhance capability and save money. Existing systems can be studied with measurement, while prospective systems are most often studied by extrapolating from measurements of prior systems or via simulation software that mimics target system function and provides performance metrics. Can one both minimize latency and maximize throughput for unscheduled work?

Three Other Models of Computer System Performance: Part 2

ACM Sigarch

How many buffers are needed to track pending requests as a function of needed bandwidth and expected latency? Can one both minimize latency and maximize throughput for unscheduled work? The M/M/1 queue will show us a required trade-off among (a) allowing unscheduled task arrivals, (b) minimizing latency, and (c) maximizing throughput. Let L denote the average total latency to handle a task, equal to Q + S. Figure 2: M/M/1 queue latency vs. throughput trade-off.
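For reference, the closed form behind that trade-off: with mean service time S and utilisation ρ (offered throughput as a fraction of capacity), the M/M/1 average total latency is L = Q + S = S / (1 - ρ), which grows without bound as throughput approaches capacity. A small sketch with illustrative numbers:

```python
# M/M/1 queue: average total latency L = S / (1 - rho), where S = 1/mu is the
# mean service time and rho = lambda/mu is utilisation (offered throughput).
def mm1_latency(service_time_s: float, utilisation: float) -> float:
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1) for a stable queue")
    return service_time_s / (1.0 - utilisation)

S = 0.001  # 1 ms mean service time (illustrative)
for rho in (0.1, 0.5, 0.8, 0.9, 0.99):
    print(f"rho={rho:4.2f}  L={mm1_latency(S, rho) * 1000:6.1f} ms")
# Latency stays near S at low utilisation and explodes as rho -> 1: you cannot
# simultaneously run at full throughput and minimal latency for unscheduled work.
```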

File systems unfit as distributed storage backends: lessons from ten years of Ceph evolution

The Morning Paper

File systems unfit as distributed storage backends: lessons from ten years of Ceph evolution, Aghayev et al. In this case, the assumption under scrutiny is that a distributed storage backend should clearly be layered on top of a local file system. Ceph is a widely-used, open-source distributed file system that followed this convention [of building on top of a local file system] for a decade. Ten years of building on local file systems.

Who monitors the monitoring systems?

Adrian Cockcroft

In reality, in any non-trivial installation, there are multiple tools collecting, storing and displaying overlapping sets of metrics from many types of systems and different levels of abstraction. These monitoring systems provide critical observability capabilities that are needed to successfully configure, deploy, debug and troubleshoot installations and applications. What if your monitoring systems fail? How do you even know when a monitoring system has failed?

Virtual consensus in Delos

The Morning Paper

While ultimately this new system should be able to take advantage of the latest advances in consensus for improved performance, that’s not realistic given a 6-9 month in-production target. Log implementations could be based on a different consensus protocol (e.g., replacing Paxos with Raft), or they could be shims over external storage systems.

The Surprising Effectiveness of Non-Overlapping, Sensitivity-Based Performance Models

John McCalpin

The presentation discusses a family of simple performance models that I developed over the last 20 years — originally in support of processor and system design at SGI (1996-1999), IBM (1999-2005), and AMD (2006-2008), but more recently in support of system procurements at The Texas Advanced Computing Center (TACC) (2009-present). This includes all architectures, all compilers, all operating systems, and all system configurations.

Content Management Systems of the Future: Headless, JAMstack, ADN and Functions at the Edge

Abhishek Tiwari

Recently I was asked about content management systems (CMS) of the future - more specifically, how they are evolving in the era of microservices, APIs, and serverless computing. By using a CDN for the whole website, you can offload most of the website traffic to the CDN, which will not only absorb large traffic spikes but also reduce the latency of content delivery. Raw content data, along with templates, is version controlled using Git or similar versioning systems.

Cluster Diagnostics: Troubleshoot Cluster Issues Using Only SQL Queries

DZone

TiDB is an open-source, distributed SQL database that supports Hybrid Transactional/Analytical Processing (HTAP) workloads. Ideally, a TiDB cluster should always be efficient and problem-free. But through a chain reaction of events, the CPU load maxes out, out-of-memory errors occur, network latency increases, and disk writes and reads slow down.

Software-defined far memory in warehouse scale computers

The Morning Paper

This paper describes a “far memory” system that has been in production deployment at Google since 2016. The requirements boil down to a single-digit µs latency tolerance in the tail for far memory, which, in addition to security and privacy concerns, rules out remote memory solutions. Thus we’re fundamentally trading (de)compression latency at access time for the ability to pack more data in memory.
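As a rough illustration of that trade-off (a sketch only; the system described in the paper is built on kernel-level zswap, not Python), compressing a cold page buys memory capacity, and the price is paid as decompression latency whenever the page is touched again:

```python
import os
import time
import zlib

PAGE = os.urandom(2048) + bytes(2048)  # a 4 KiB "page", half incompressible

t0 = time.perf_counter()
packed = zlib.compress(PAGE, 1)        # fast compression level
t1 = time.perf_counter()
unpacked = zlib.decompress(packed)     # cost paid on every far-memory access
t2 = time.perf_counter()

assert unpacked == PAGE
print(f"compressed 4 KiB -> {len(packed)} bytes "
      f"({len(PAGE) / len(packed):.1f}x packing)")
print(f"compress:   {(t1 - t0) * 1e6:.0f} µs")
print(f"decompress: {(t2 - t1) * 1e6:.0f} µs  (added access latency)")
```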

Seamless offloading of web app computations from mobile device to edge clouds via HTML5 Web Worker migration

The Morning Paper

Edge servers are the middle ground – more compute power than a mobile device, but with latency of just a few ms. The Mobile Web Worker (MWW) System introduces a new client-side Mobile Web Worker Manager component which is responsible for managing web workers, including their migration when this is estimated to be beneficial. The MWW System prototype is implemented in Chrome for the browser, and Node.js…

Cloudburst: stateful functions-as-a-service

The Morning Paper

Cloudburst: stateful functions-as-a-service, Sreekanti et al. On the Cloudburst design team’s wish list: a running function’s ‘hot’ data should be kept physically nearby for low-latency access. A low-latency autoscaling KVS can serve as both global storage and a DHT-like overlay network. Cloudburst has four key components: function executors, caches, function schedulers, and a resource management system.

Invited Talk at SuperComputing 2016!

John McCalpin

“Memory Bandwidth and System Balance in HPC Systems” If you are planning to attend the SuperComputing 2016 conference in Salt Lake City next month, be sure to reserve a spot on your calendar for my talk on Wednesday afternoon (4:15pm-5:00pm). I will be talking about the technology and market trends that have driven changes in deployed HPC systems, with a particular emphasis on the increasing relative performance cost of memory accesses (vs arithmetic).

Mergeable replicated data types – Part I

The Morning Paper

Mergeable replicated data types, Kaki et al., OOPSLA’19. This paper was published at OOPSLA, but it is perhaps amongst the distributed systems community that I expect there to be the greatest interest. The paper sets the discussion in the context of geo-replicated distributed systems, but of course the same mechanisms could be equally useful in the context of local-first applications.

Byzantine Fault Tolerance

cdemi

In distributed computer systems, Byzantine Fault Tolerance is a characteristic of a system that tolerates the class of failures known as the Byzantine Generals’ Problem, for which there is an unsolvability proof. The typical mapping of this story onto computer systems is that the computers are the generals and their digital communication links are the messengers. Several system architectures have been designed that implement Byzantine Fault Tolerance.
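The classic resilience bound is easy to state numerically (this is the standard result, not something specific to this post): tolerating f Byzantine replicas requires at least 3f + 1 replicas and quorums of 2f + 1, so that any two quorums intersect in at least one honest replica.

```python
def bft_requirements(f: int) -> tuple[int, int]:
    """Minimum cluster size and quorum size to tolerate f Byzantine faults."""
    n = 3 * f + 1       # total replicas needed
    quorum = 2 * f + 1  # matching votes needed before acting
    return n, quorum

for f in range(1, 4):
    n, q = bft_requirements(f)
    # Any two quorums overlap in 2*q - n = f + 1 replicas, and since at most
    # f of those can be faulty, at least one honest replica is in both.
    print(f"tolerate {f} faulty: n={n}, quorum={q}, overlap={2 * q - n}")
```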

"0 to 60" : Switching to indirect checkpoints

SQL Performance

In a recent tip, I described a scenario where a SQL Server 2016 instance seemed to be struggling with checkpoint times. I was a bit perplexed by this issue, since the system was certainly no slouch: plenty of cores, 3TB of memory, and XtremIO storage. (MAX_MEMORY = 4096 KB, MAX_DISPATCH_LATENCY = 30 SECONDS, STARTUP_STATE = ON)
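For context, switching a database to indirect checkpoints is done by setting a target recovery time. A hedged sketch of issuing that change from Python with pyodbc (the server, database name, and credentials are hypothetical):

```python
import pyodbc  # assumes an ODBC driver for SQL Server is installed

# Connection details are hypothetical; autocommit so the ALTER DATABASE
# statement is not wrapped in an explicit transaction.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dbhost;"
    "DATABASE=master;UID=sa;PWD=secret",
    autocommit=True)

# A non-zero TARGET_RECOVERY_TIME switches the database to indirect checkpoints
# (60 seconds is the default for databases created on SQL Server 2016 and later).
conn.execute("ALTER DATABASE [MyAppDb] SET TARGET_RECOVERY_TIME = 60 SECONDS")
conn.close()
```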

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices

The Morning Paper

Last time around we looked at the DeathStarBench suite of microservices-based benchmark applications and learned that microservices systems can be especially latency sensitive, and that hotspots can propagate through a microservices architecture in interesting ways. Seer is an online system that observes the behaviour of cloud applications (using the DeathStarBench microservices for the evaluation) and predicts when QoS violations may be about to occur.

A persistent problem: managing pointers in NVM

The Morning Paper

At the start of November I was privileged to attend HPTS (the High Performance Transaction Systems conference) in Asilomar. Byte-addressable non-volatile memory (NVM) will fundamentally change the way hardware interacts, the way operating systems are designed, and the way applications operate on data. This means that the overheads of system calls become much more noticeable.

Why Telcos Need a Real-Time Analytics Strategy

VoltDB

Telco networks and the systems that support those networks are some of the most advanced technology solutions in existence. Telco networks are unique, and their success can be defined in two parts: Being able to successfully process volumes of data off network elements and systems without losing any information; Being able to render an accurate bill from this information and create revenue from the services provided. Historically, telco analytics have been limited and difficult.

Time protection: the missing OS abstraction

The Morning Paper

Just as today’s systems offer memory protection, the authors argue they should also offer time protection. A covert cache-based channel (for example) can be built by the sender modulating its footprint in the cache through its execution, and the receiver probing this footprint by systematically touching cache lines, measuring memory latency, and observing its own execution speed. Enforcement of a system’s security policy must not depend on correct application behaviour.

A tale of two abstractions: the case for object space

The Morning Paper

…software operating on persistent data structures requires "global" pointers that remain valid after a process terminates, while hardware requires that a diverse set of devices all have the same mappings they need for bulk transfers to and from memory, and that they be able to do so for a potentially heterogeneous memory system. The logical object space is an abstraction over physical memory that contains the working set of actively used objects local to one system.

Best Practice for Creating Indexes on your MySQL Tables

High Scalability

By having appropriate indexes on your MySQL tables, you can greatly enhance the performance of SELECT queries. But did you know that adding indexes to your tables is in itself an expensive operation, and may take a long time to complete depending on the size of your tables? During this time, you are also likely to experience degraded query performance, as your system resources are busy with the index-creation work.

Expanding the Cloud: Faster, More Flexible Queries with DynamoDB

All Things Distributed

Werner Vogels’ weblog on building scalable and robust distributed systems. While DynamoDB already allows you to perform low-latency queries based on your table’s… This gives you the ability to perform richer queries while still meeting the low-latency demands of responsive, scalable applications. Milo Milovanovic, Washington Post Principal Systems Architect, reports that databases “will exhibit the same latency and throughput performance as those without any indexes.”
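For a sense of what querying through a secondary index looks like with today's boto3 SDK (the table, index, and attribute names here are made up for illustration):

```python
import boto3
from boto3.dynamodb.conditions import Key

# Hypothetical table with a secondary index on (CustomerId, OrderDate).
table = boto3.resource("dynamodb", region_name="us-east-1").Table("Orders")

resp = table.query(
    IndexName="CustomerId-OrderDate-index",
    KeyConditionExpression=(
        Key("CustomerId").eq("alice")
        & Key("OrderDate").between("2013-01-01", "2013-12-31")
    ),
)
for item in resp["Items"]:
    print(item)
```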

Friends don't let friends build data pipelines

Abhishek Tiwari

In a nutshell, a data pipeline is a distributed system. Here are eight fallacies of data pipelines: the pipeline is reliable; topology is stateless; the pipeline is infinitely scalable; processing latency is minimal; everything is observable; there is no domino effect; the pipeline is cost-effective; data is homogeneous. The pipeline is reliable: the inconvenient truth is that the pipeline is not reliable.

Cloud infrastructure monitoring in action: Dynatrace on Dynatrace

Dynatrace

Sydney, we have a disk write latency problem! It was on August 25th at 14:00 that Davis initially alerted on a disk write latency issue with the Elastic File System (EFS) on one of our EC2 instances in AWS’s Sydney data center.

Helios: hyperscale indexing for the cloud & edge – part 1

The Morning Paper

As a production system within Microsoft, capturing around a quadrillion events and indexing 16 trillion search keys per day, it would be interesting in its own right, but there’s a lot more to it than that. It’s limited by the laws of physics in terms of end-to-end latency.

Expanding the Cloud - Introducing the AWS Asia Pacific (Tokyo) Region

All Things Distributed

Werner Vogels’ weblog on building scalable and robust distributed systems. Japanese companies and consumers have become used to the low latency and high-speed networking available between their businesses, residences, and mobile devices. The advanced Asia Pacific network infrastructure also makes the AWS Tokyo Region a viable low-latency option for customers from South Korea.

Who will watch the watchers? Extended infrastructure observability for WSO2 API Manager

Dynatrace

High latency or a lack of responses. You receive an alert message from Dynatrace (your infrastructure observability hub) letting you know that the average response latency of all deployed APIs has tripled. This increase is clearly correlated with the increased response latencies.

The Fastest Google Fonts

CSS Wizardry

On this site, where performance is the name of the game, I forgo web fonts entirely, opting instead to make use of the visitor’s system font. On a high-latency connection, this spells bad news. However, the execution of this header is bound by the response’s TTFB, which on high-latency networks can be very, very high. Put another-other way, this file is latency-bound, not bandwidth-bound.

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs

The Morning Paper

Last week we learned about the increased tail-latency sensitivity of microservices-based applications with high RPC fan-outs. Seer uses estimates of queue depths to mitigate latency spikes on the order of 10-100ms, in conjunction with a cluster manager. Today’s paper choice, RPCValet, operates at latencies three orders of magnitude lower, targeting reductions in tail latency for services whose own service times are on the order of a small number of µs.

Keeping Netflix Reliable Using Prioritized Load Shedding

The Netflix TechBlog

How viewers are able to watch their favorite show on Netflix while the infrastructure self-recovers from a system failure. By Manuel Correa, Arthur Gonigberg, and Daniel West. Getting stuck in traffic is one of the most frustrating experiences for drivers around the world.

MezzFS — Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

Mounting object storage in Netflix’s media processing platform. By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team). MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Arbitrary ranges of a cloud object can be mounted as separate files on the local file system. One of the challenges of efficiently streaming bits in a FUSE system is that the kernel will break reads into chunks.
