USENIX LISA2021 Computing Performance: On the Horizon

Brendan Gregg

I also wrote about these topics in detail for my recent [Systems Performance 2nd Edition] book. ... “TCP Extensions for Multipath Operation with Multiple Addresses,” [link] Mar 2020 - [Gregg 20] Brendan Gregg, “Systems Performance: Enterprise and the Cloud, Second Edition,” Addison-Wesley, 2020 - [Hruska 20] Joel Hruska, “Intel Demos PCIe 5.0

Towards federated learning at scale: system design

The Morning Paper

Towards federated learning at scale: system design, Bonawitz et al., SysML 2019. This is a high-level paper describing Google’s production system for federated learning. At the core of the system is a federated learning approach called Federated Averaging, with an optional extension for Secure Aggregation.

Stuff The Internet Says On Scalability For March 22nd, 2019

High Scalability

µs of replication latency on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA." It has 41 mostly 5-star reviews. They'll learn a lot and love you even more.5 5 billion: weekly visits to Apple App store; $500m: new US exascale computer; $1.7

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

Uptime Institute’s 2022 Outage Analysis report found that over 60% of system outages resulted in at least $100,000 in total losses, up from 39% in 2019. At the lowest level, SLIs provide a view of service availability, latency, performance, and capacity across systems. Make SLOs realistic.

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

High Scalability

PostgreSQL is an open source object-relational database system that has soared in popularity over the past 30 years thanks to its active, loyal, and growing community. For the 2nd year in a row, PostgreSQL has kept the title of #1 fastest-growing database in the world, according to the DBMS of the Year report by the experts at DB-Engines.

USENIX SREcon APAC 2022: Computing Performance: What's on the Horizon

Brendan Gregg

This talk originated from my updates to [Systems Performance 2nd Edition], and this was the first time I've given this talk in person! CXL in a way allows a custom memory controller to be added to a system, to increase memory capacity, bandwidth, and overall performance. Ford, et al., “TCP

What is OpenTelemetry? An open-source standard for logs, metrics, and traces

Dynatrace

Loosely defined, observability is the ability to understand what’s happening inside a system from the external data it produces, usually logs, metrics, and traces. Logs are important because you’ll naturally want an event-based record of any notable anomalies across the system.