Replication

Some of the promises of distributed systems are increased availability, fault tolerance, and higher performance compared to single-machine systems. However, these benefits do not come for free simply by building a distributed system; the system needs to be designed to achieve them. One of the main techniques for increasing availability and implementing fault tolerance is replication, that is, having more than one machine running a service. This lecture gives an introduction to replication, with a look at different techniques for obtaining fault tolerance or increased availability.

Literature

[DS] Chapter 15 (excluding section 15.5).

Exercises