Life is not a State-Machine

Last week I gave a keynote at the ACM Symposium on Principles of Distributed Computing (PODC) on the topic of technology transfer. My choice of topic was triggered by recent presentations by a number of other research luminaries, who had remarked that the distributed computing research community had failed to make its mark: lots of good ideas, little impact.

I subscribe to a longer-term point of view when it comes to the transformation of technology into successful products. Richard Gabriel created a model that lays out the time it takes, and the means by which, innovations become successful consumable products [1]. It certainly fits the market success of a number of the Xerox PARC innovations, the spreadsheet, or even the Web. To illustrate: hyperlinks and markup languages were developed in the mid-sixties, TCP/IP-based networks came to life in the seventies, and it wasn't until the mid-nineties that the combination of these three turned into the basis for a mass consumer product. Gabriel's presentation on "Models of Software Acceptance: How Winners Win" has more examples and a better connection to the "Crossing the Chasm" [2] style of thinking.

I believe this applies to much of the deep, fundamental distributed-systems material as well. Felipe Cabrera has said, for example, that when Vista ships next year with support for fine-grained transactions in programming languages, it will have been more than 20 years after the concepts were developed in the Quicksilver project at IBM.

But things are accelerating. Amazon.com and other places use high-quality, massively distributed systems to their advantage and require the use of recent research technology to exploit this massive scale. We see increased adoption of advanced distributed-systems techniques beyond established ones such as edge caching, lazy processing, and fusion and aggregation.

Adoption of more recent research technology for use in products is not a walk in the park. Engineers have to be very determined to overcome the many roadblocks that come with early adoption.

Unrealistic assumptions

Research is focused on the details of the technology itself, and not very much on the application context of that technology. Often, to make progress in research, you need to restrict the environment in which it can be applied. For example, many academics will confess to having made the assumption that component failures are not correlated. This absolutely unrealistic assumption will come back to haunt you in real life, where failures frequently are correlated, as they are often triggered by external or environmental events.
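To make the cost of that assumption concrete, here is a minimal simulation sketch; the replication factor, the failure probabilities, and the shared-event model are illustrative assumptions of mine, not measurements from any real system:

```python
import random

# Sketch: chance that 3-way replication loses all replicas in the same
# period, under (a) independent failures and (b) failures correlated by
# a shared environmental event (e.g., a rack losing power).
# All probabilities below are illustrative assumptions.

TRIALS = 1_000_000
P_FAIL = 0.01    # per-replica failure probability per period (assumed)
P_EVENT = 0.001  # probability of an event taking out all replicas at once

def all_replicas_down(correlated: bool) -> bool:
    if correlated and random.random() < P_EVENT:
        return True  # shared event: the three failures are not independent
    return all(random.random() < P_FAIL for _ in range(3))

for label, correlated in (("independent", False), ("correlated", True)):
    losses = sum(all_replicas_down(correlated) for _ in range(TRIALS))
    print(f"{label}: lost all replicas in {losses / TRIALS:.6f} of periods")

# The independent model predicts roughly 1e-6; the shared event alone
# pushes the figure to roughly 1e-3, three orders of magnitude worse.
```

Nothing in the replication scheme changed between the two runs; only the (unstated) assumption about independence did.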

When selecting research technology, it is often a major exercise to discover exactly what assumptions the researchers made. The even more difficult exercise is figuring out whether you can live with those assumptions, whether the assumptions were relevant at all, or whether they may impede adoption of the technology. And, in the latter case, whether there is something that can be done to bring the research up to more realistic standards.

Uncertainty

Many of the insurmountable assumptions deal with reasoning away uncertainty. By turning life into a state machine in which no surprises can occur, one gets a perfect world in which everything is clean and organized. But there is a limit to how much you can trick life into being predictable, and to how much control you think you will have to keep life in check. At small scale you may succeed, but when your systems grow in size and complexity you will lose control. As such, building scalable systems is all about letting go of control (Turing's Type I organizations).

In control theory, researchers were for a long time convinced that practitioners did not want to use their research because it contained too much complex math. It turned out, however, that the research was largely irrelevant in practice because it did not model a realistic world. The moment researchers started to produce work that explicitly took uncertainty into account, their work was rapidly adopted by engineers and architects. Ironically, the math has only grown more complex…

In distributed systems we see a similar pattern arising: research that realistically models uncertainty is more readily adopted. Randomization and self-organizing systems are crucial techniques for scaling systems in the real world.
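As one example of what such randomized, self-organizing techniques look like, here is a minimal sketch of push-based gossip dissemination; the synchronous rounds, the fully connected membership view, and the fanout of one are simplifying assumptions chosen for illustration:

```python
import random

# Sketch of push-based gossip: each node that knows a piece of state
# pushes it to one randomly chosen peer per round. No coordinator, no
# global schedule; cluster size and fanout below are illustrative.

N = 10_000   # number of nodes (assumed)
FANOUT = 1   # peers contacted per round by each informed node (assumed)

informed = {0}  # node 0 learns a new piece of state
rounds = 0
while len(informed) < N:
    rounds += 1
    newly = set()
    for _node in informed:
        for _ in range(FANOUT):
            newly.add(random.randrange(N))  # push to a random peer
    informed |= newly

print(f"all {N} nodes informed after {rounds} rounds")
# Dissemination completes in O(log N) rounds, and the loss of any
# individual node cannot stall the protocol: nobody is special.
```

The system embraces uncertainty, it does not assume it away: which peer gets contacted is random, yet the aggregate behavior is highly predictable.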

A perfect world

The last topic I want to mention is the use of academic publications as a source for technology selection. Academics often battle out subtle competing views in their research papers. But if there are at least ten competing approaches to implementing consensus in distributed systems, an engineer needs to make a judgment call about which approach is best for their problem. If the academics can't even make up their minds about what appears to be the right way, how can their customers be expected to do it for them?

Papers are often written in an extremely positive manner: "This is, once again, the best improvement to life since sliced bread." There is hardly any self-criticism. There are certainly no details about the things that didn't work, and why they didn't work. And let's not even start talking about the use of statistics in systems research papers. Do we really only care about averages? That is, of course, assuming the experiments were realistic in the first place.
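To illustrate why averages alone mislead, here is a small sketch; the latency distribution, a fast common case with a rare slow path, and all its parameters are assumptions invented for the example:

```python
import random
import statistics

# Sketch: reporting only the average hides the tail of the distribution.
# Latency model (assumed): 98% fast requests, 2% slow ones (e.g., a
# retry or a garbage-collection pause).

random.seed(42)  # make the run reproducible
latencies_ms = [
    random.gauss(10, 2) if random.random() < 0.98 else random.gauss(500, 50)
    for _ in range(100_000)
]

latencies_ms.sort()
mean = statistics.fmean(latencies_ms)
p50 = latencies_ms[len(latencies_ms) // 2]
p99 = latencies_ms[int(len(latencies_ms) * 0.99)]

print(f"mean: {mean:.1f} ms  median: {p50:.1f} ms  p99: {p99:.1f} ms")
# The mean (~20 ms) looks healthy and the median (~10 ms) even better,
# yet one in fifty requests takes around half a second. A paper that
# reports only averages never shows you this.
```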

There is no way around the fact that you need to re-execute a number of the most promising research achievements in realistic settings to inform your selection. Because doing so is very time consuming, these results will only be considered when an engineer really needs them.

Occam’s razor

This is an occasion where we actually use Occam's Razor in its original sense: if two approaches produce the same result, select the one with the fewest assumptions. We have frequently seen that this selection criterion leads you to the technology with the greatest likelihood of being adopted.

entia non sunt multiplicanda praeter necessitatem (entities must not be multiplied beyond necessity)

[1] Gabriel, Richard, "Money Through Innovation Reconsidered," in Patterns of Software: Tales from the Software Community, Oxford University Press, reprint edition, 1998.

[2] Moore, Geoffrey A., Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, HarperBusiness, revised edition, 1999.