James Hamilton on reliability

Don’t trust hardware or software; then you can build trustworthy hardware and software.

James Hamilton on how to write reliable software in a world where anything that can fail, will fail.

2 thoughts on “James Hamilton on reliability”

  1. Tested on 2012-02-27 and 2012-02-28 (Paris time), and the link works.

    Now, after reading the article, I understand the risks (and thus the means to avoid them) related to critical systems like satellites and other space modules, but for a normal application like the ones most of us work on, the prospect of double/triple/checksum testing anything from hardware to software is daunting.

    Quoting the text: “At scale, error detection and correction at lower levels fails to correct or even detect some problems. Software stacks above introduce errors. Hardware introduces more errors. Firmware introduces errors. Errors creep in everywhere and absolutely nobody and nothing can be trusted […] Upon deep investigation at some customer sites, we found the software was fine, but each customer had one, and sometimes several, latent data corruptions on disk. Perhaps it was introduced by hardware, perhaps firmware, or possibly software”

    I just assumed that hardware corruption was a “fact of life” (one of my hard drives died recently, after a long, data-corrupting agony, so I have tasted that bitter medicine) and that I had better things to do, like fixing my own bugs, than trying to protect the customer from hardware faults or other things I had no control over.

    C++ has many virtues, but immunity to hardware problems, power outages, or even alien invasions is not among them (although an alien invasion might be OK, if handled on a Mac by Jeff Goldblum).

    And I still believe this (if not, I would be panicking right now).

    Still, it is interesting to know, because it could very well apply to the different components of a single “application” working together (such as a rich client, a server, and its plugins, which all form one large application as far as the client is concerned)… Food for thought…
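
The end-to-end checking that the quoted passage argues for can be illustrated with a minimal C++ sketch. It is not taken from Hamilton's article: the `CheckedRecord` type and the `seal` and `is_intact` functions are hypothetical names, and FNV-1a is only a stand-in for whatever checksum (CRC-32, for example) a real system would pick. The idea is simply that the application stores a checksum with its data and re-verifies it on every read, instead of assuming the layers below did so.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical record: a payload plus the checksum computed when it was written.
struct CheckedRecord {
    std::string payload;
    std::uint32_t checksum;
};

// 32-bit FNV-1a, a simple non-cryptographic hash. It is an assumption made
// for brevity; it only detects accidental corruption, not tampering.
std::uint32_t fnv1a(const std::string& data) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 16777619u;             // FNV prime
    }
    return hash;
}

// Compute the checksum once, at write time, and keep it with the payload.
CheckedRecord seal(std::string payload) {
    const std::uint32_t sum = fnv1a(payload);  // hash before moving the string
    return CheckedRecord{std::move(payload), sum};
}

// Recompute the checksum at read time; a mismatch means the bytes changed
// somewhere between the write and the read.
bool is_intact(const CheckedRecord& record) {
    return fnv1a(record.payload) == record.checksum;
}
```

A caller would `seal` data before handing it to disk or to another component, and call `is_intact` after getting it back. The check can only report that corruption happened; recovering from it still requires a redundant copy.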
