| N-Way Fail-Over Infrastructure for Reliable Servers and Routers |
ps, ps.gz, pdf.
To appear in the Proceedings of the IEEE International Conference on
Dependable Systems and Networks (DSN03), San Francisco, June 2003. Obsoletes
Technical Report CNDS-2002-5.
Citation: Y. Amir, R. Caudy, A. Munjal, T. Schlossnagle, and C. Tutu, "N-Way Fail-Over Infrastructure for Reliable Servers and Routers", To appear in the Proceedings of the International Conference on Dependable Systems and Networks (DSN03), San Francisco, June 2003.
Authors: Yair Amir, Ryan Caudy, Ashima Munjal, Theo Schlossnagle, and Ciprian Tutu
Abstract: Maintaining the availability of critical servers and routers is an important concern for many organizations. At the lowest level, IP addresses represent the global namespace by which services are accessible on the Internet. We introduce Wackamole, a completely distributed software solution based on a provably correct algorithm that negotiates the assignment of IP addresses among the currently available servers upon detection of faults. This reallocation ensures that at any given time any public IP address of the server cluster is covered exactly once, as long as at least one physical server survives the network fault. The same technique is extended to support highly available routers. The paper presents the design considerations, algorithm specification and correctness proof, discusses the practical usage for server clusters and for routers, and evaluates the performance of the system. |
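The "covered exactly once" property described in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper. It is a minimal, hypothetical reassignment rule that assumes every surviving server already agrees on the same set of live servers, and the function name `reassign` and the sample addresses are invented for illustration.

```python
def reassign(vips, live_servers, current):
    """Return a new {ip: server} map in which every ip has exactly one live owner.

    Hypothetical sketch: assumes all servers share the same view of
    `live_servers`, so each of them computes the same result independently.
    """
    if not live_servers:
        raise ValueError("at least one server must survive")
    live = sorted(live_servers)

    # Keep every assignment whose owner is still alive.
    new = {ip: owner for ip, owner in current.items()
           if ip in vips and owner in live}

    # Track how many addresses each live server currently holds.
    load = {server: 0 for server in live}
    for owner in new.values():
        load[owner] += 1

    # Hand each orphaned address to the least-loaded server,
    # breaking ties by name so the choice is deterministic.
    for ip in sorted(vips):
        if ip not in new:
            target = min(live, key=lambda s: (load[s], s))
            new[ip] = target
            load[target] += 1
    return new


vips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
before = {"10.0.0.1": "a", "10.0.0.2": "b", "10.0.0.3": "c"}
after = reassign(vips, {"a", "c"}, before)  # server "b" has failed
```

Because the result is a mapping from address to a single owner, no address can be held twice, and the loop over `vips` ensures that none is left uncovered. Reaching agreement on the live set in the presence of faults is the hard part, and the sketch simply assumes it.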