With Anton’s recent posts on date and time issues developers encounter, “Time Out of Joint” and “More on time and time-zones” I thought it would be worth mentioning “Time, Clocks and the Ordering of events in a Distributed System”.
This paper (https://lamport.azurewebsites.net/pubs/time-clocks.pdf) was written by Leslie Lamport (http://www.lamport.org/) and published in 1978. My first encounter with Lamport was while working on my PhD in Physics – my thesis was written in LaTeX (https://www.latex-project.org/) which Lamport developed and for which he wrote the reference book (https://www.amazon.co.uk/LaTeX-Preparation-Reference-Addison-Wesley-Techniques/dp/0201529831). The book has a pithy comment by Lamport surmising that he will be more famous for LaTeX than his work on distributed systems, since then he has won the Turing Award (though his work on LaTeX was referenced in the commendation! (https://amturing.acm.org/award_winners/lamport_1205376.cfm)).
Lamport himself does a nice introduction of the paper (https://lamport.azurewebsites.net/pubs/pubs.html#time-clocks), the opening of which is worth capturing here:
Jim Gray once told me that he had heard two different opinions of this paper: that it’s trivial and that it’s brilliant. I can’t argue with the former, and I am disinclined to argue with the latter.
Lamport was originally a physicist and when he considered the concept of time in a distributed system he approached it from the point of view of someone who understands Einstein’s theory of special relativity. The speed of light is a constant, nothing can travel faster than the speed of light – so a message cannot be received “before” it was sent, using this we can build an ordering within a distributed system.
The most important contribution of the paper is the concept of replicated state machines, start a set of deterministic state machines in the same state and then make sure they process the same set of messages in the same order and they will naturally be replicated. The key is then to make sure the processors in the system agree the order of the messages as the messages may be arriving at different processors, this problem is now known as Consensus (https://en.wikipedia.org/wiki/Consensus_(computer_science)). In the paper Lamport solves the problem using “logical clocks”, these are now known as “Lamport clocks” (https://en.wikipedia.org/wiki/Lamport_timestamps) and there is a variation known as “Vector clocks” (https://en.wikipedia.org/wiki/Vector_clock). This approach to solving the consensus problem has been superseded by better solutions, including Lamport’s own Paxos algorithm (https://en.wikipedia.org/wiki/Paxos_(computer_science)) an approach he “tried” (https://lamport.azurewebsites.net/pubs/pubs.html#lamport-paxos) to publish nearly 12 years after the time and clocks paper.
All of WANdisco’s products are based on the replicated state machine approach, so we are quite fond of this paper!