About us
Technology Transfer
Secure Spread


On the Performance of Wide-Area Synchronous Database Replication
ps, ps.gz, pdf. Technical Report CNDS-2002-4, November 2002.

Yair Amir, Claudiu Danilov, Michal Miskin-Amir, Jonathan Stanton, and Ciprian Tutu.

A fundamental challenge in database replication is to maintain a low cost of updates while assuring global system consistency. The difficulty of the problem is magnified for wide-area network settings due to the high latency and the increased likelihood of network partitions. As a consequence, most of the research in the area has focused either on improving the performance of local transaction execution or on replication models with weaker semantics, which rely on application knowledge to resolve potential conflicts.
In this work we identify the performance bottleneck of the existing synchronous replication schemes as residing in the update synchronization algorithm. We compare the performance of several such synchronization algorithms and highlight the large performance gap between various methods. We design a generic, synchronous replication scheme that uses an enhanced synchronization algorithm and demonstrate its practicality by building a prototype that replicates a PostgreSQL database system. We claim that the use of an optimized synchronization engine is the key to building a practical synhronous replication system for wide-area network settings.

From Total Order to Database Replication
ps, ps.gz, pdf. In Proceedings of the 22nd IEEE International Conference on Distributed Computing Systems (ICDCS), Vienna, Austria, July 2002. An extended version of the paper is available as Technical Report CNDS-2001-6, ps, ps.gz, pdf, November 2001.

Yair Amir and Ciprian Tutu.

This paper presents in detail an efficient and provably correct algorithm for database replication over partitionable networks. Our algorithm avoids the need for end-to-end acknowledgments for each action while supporting network partitions and merges and allowing dynamic instantiation of new replicas. One round of end-to-end acknowledgments is required only upon a membership change event such as a network partition. New actions may be introduced to the system at any point, not only while in a primary component. We show how performance can be further improved for applications that allow relaxation of consistency requirements. We provide experimental results that demonstrate the efficiency of our approach.

Maintaining Database Consistency in Peer to Peer Networks
ps, ps.gz, pdf. Technical Report CNDS-2002-2, February 2002.

Baruch Awerbuch, and Ciprian Tutu.

We present an algorithm for persistent consistent distributed commit (distributed database commit) in a dynamic, asynchronous, peer to peer network. The algorithm has constant overhead in time and space and almost constant communication complexity, allowing it to scale to very large size networks. Previous solutions required linear overhead in communication and space, making them unscalable.
We introduce a modular solution based on several well defined blocks with clear formal specifications. These blocks can be implemented in a variety of ways and we give examples of possible implementations. Most of the existing solutions require acknowledgments from every participant for each action. Our algorithm is highly efficient by aggregating these acknowledgments. Also, in contrast with existing solutions, our algorithm does not require any membership knowledge. Components are detected based on local information and the information is disseminated on an overlay spanning tree.
The algorithm may prove to be more suited for practical implementation than the existing ones, because of its simplicity.

Practical Wide-Area Database Replication
ps, ps.gz, pdf. Technical Report CNDS-2002-1, February 2002.

Yair Amir, Claudiu Danilov, Michal Miskin-Amir, Jonathan Stanton and Ciprian Tutu.

This paper explores the architecture, implementation and performance of a wide and local area database replication system. The architecture provides synchronous, peer replication, supporting diverse application semantics, based on a group communication paradigm. Network partitions and merges, computer crashes and recoveries, and message omissions are all handled. Using a generic replication engine and the Spread group communication toolkit, we provide replication services for the PostgreSQL database system. We define three different environments to be used as test-beds: a local area cluster, a wide area network that spans the U.S.A, and an emulated wide area test bed. We conduct an extensive set of experiments on these environments, varying the number of replicas and clients, the mix of updates and queries, and the network latency. Our results show that sophisticated algorithms and careful distributed systems design can make symmetric, synchronous, peer database replication a reality for both local and wide area networks.

Replication Using Group Communication Over a Partitioned Network
ps. Ph.D. Thesis, The Hebrew University of Jerusalem, Israel, August 1995.

Yair Amir.

Although this thesis was written before the formation of CNDS, it is put here for the interested reader. Enjoy :)

Questions or comments to:
webmaster (at)
TEL: (410) 516-5562
FAX: (410) 516-6134
Distributed Systems and Networks Lab
Computer Science Department
Johns Hopkins University
3400 N. Charles Street Baltimore, MD 21218-2686