Site reliability engineering: 5 things you need to know

Dynatrace

As a discipline, SRE focuses on improving software system reliability across key categories including availability, performance, latency, efficiency, capacity, and incident response. According to Google, “SRE is what you get when you treat operations as a software problem.”

Trending Sources

What is a Site Reliability Engineer (SRE)?

Dotcom-Monitor

The term site reliability engineering first came into existence at Google in 2003 when a site reliability team was created. [...] that are required to keep the software deployments live are running efficiently. He was asked in 2003 to create and manage a team of seven engineers, which eventually led him to create the new role/title.

Supercomputing Predictions: Custom CPUs, CXL3.0, and Petalith Architectures

Adrian Cockcroft

Here are some predictions I’m making: Jack Dongarra’s efforts to highlight the low efficiency of the HPCG benchmark as an issue will influence the next generation of supercomputer architectures to optimize for sparse matrix computations. Jack Dongarra talked about the scores and pointed out the low efficiency on some important workloads.

The Back-to-Basics Readings of 2012 - All Things Distributed

All Things Distributed

Jul 4 - Leases: An efficient fault-tolerant mechanism for distributed file cache consistency, Gray, Cary, and David Cheriton, Vol. [...] Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, in the Proceedings of the 19th ACM Symposium on Operating Systems Principles, October 19-22, 2003, Bolton Landing, NY USA. [...] Gray and David R.

Fast key-value stores: an idea whose time has come and gone

The Morning Paper

Yes, a bit like those 2nd-level caches we were talking about earlier, e.g. Ehcache from 2003 onwards. The first part of their proposed alternative is to use a local (in-process) in-memory store instead of a RInK. This eliminates marshalling costs to reduce CPU usage, and eliminates network latency. Who knew! ;). From RInK to LInK.

Rethinking the 'production' of data

All Things Distributed

That was the provocative thesis of a much-talked-about article from 2003 in the Harvard Business Review by the US publicist Nicholas Carr. The benefit for customers: Authorized users can view this data and therefore manage their inventories across different sites, making the maintenance processes much more efficient.