Update June-28: Thanks to Eyal for finding a bug and suggesting a fix (already incorporated into the post). Let’s consider a simple service that has two main operations: read and write. We expect the service to persist the value passed in write and retrieve it when calling read. Making this service fault tolerant is non-trivial, and encapsulates many … Continue reading A Simple Single-writer
Since most people know of log-based consensus, like Paxos or Raft, it is easier to describe log-less consensus by comparison to log-based consensus. In log-based consensus, the system keeps a global state which evolves over time. The state is evolved by tracking the actual changes of the global state. For example, the global state might … Continue reading Log-less Consensus
I’d like to discuss a paper we’ve published lately. The paper presents work done at Elastifile.com, building a new kind of consensus algorithm. Before describing what our new system does, let’s see why previous consensus algorithms weren’t sufficient. Most consensus algorithms, like Paxos (which I’ve discussed previously in this blog, here and here), Raft or … Continue reading Bizur: A New Key-value Consensus Algorithm
It's been quite a while since I last published a post. I've started a new job at Elastifile at the beginning of 2014, and it kinda took most of my spare time 🙂 I've resolved to get back to blogging, and I hope I succeed... At Elastifile, we've built a distributed file-system that is truly scalable, and I'd … Continue reading Why Are Distributed File-systems Fun?
I think fault tolerance is the most important aspect of distributed algorithms, for two reasons: 1) in practice, things break, and you want your data / system to continue working. 2) If you assume there are no failures, things get "easy". Many, many problems become trivially solved; and that's just boring! So you're convinced that … Continue reading Fault Tolerance
The previous post gave a general overview of the Paxos algorithm. Here's a quick recap: Paxos implements a resilient distributed log, such that items can be added and each item is assigned a unique (and increasing) index. The algorithm can be split into three main blocks: a leader election, a consensus on a single item (also called … Continue reading Understanding Paxos – Part 2
The first time I heard of the Paxos algorithm was during my bachelor's degree way back in 2004, when I participated in a Distributed Algorithms course. In the past few years Paxos came up multiple times, usually in the context of a robust implementation of some scalable storage system. It is almost always uttered in … Continue reading Understanding Paxos – Part 1
Ever since NoSQL databases came into vogue, we hear more and more about eventual consistency. I want to to try and explain not only the difficulties that eventually consistent databases raise, but also why in some cases they can't be avoided. To simplify the discussion, I'll narrow it down to key-value databases, i.e., databases that support … Continue reading Eventual Consistency