António Casimiro

Elastic State Machine Replication

André Nogueira, António Casimiro, Alysson Bessani

IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 9, pp. 2486-2499, Sept. 1, 2017


Abstract

State machine replication (SMR) is a fundamental technique for implementing stateful dependable systems. A key limitation of this technique is that the performance of a service does not scale with the number of replicas hosting it. Some works have shown that such scalability can be achieved by partitioning the state of the service into shards. The few SMR-based systems that support dynamic partitioning implement ad-hoc state transfer protocols and perform scaling operations as background tasks to minimize the performance degradation during reconfigurations. In this work we go one step further and propose a modular partition transfer protocol for creating and destroying such partitions at runtime, thus providing fast elasticity for crash and Byzantine fault tolerant replicated state machines and making them more suitable for cloud systems.

BibTeX

@article{Nogueira:17a,
  author       = {Andr{\'{e}} Nogueira and Ant{\'{o}}nio Casimiro and Alysson Bessani},
  title        = {Elastic State Machine Replication},
  journal      = {{IEEE} Trans. Parallel Distrib. Syst.},
  volume       = {28},
  number       = {9},
  pages        = {2486--2499},
  year         = {2017},
  url          = {https://doi.org/10.1109/TPDS.2017.2686383},
  doi          = {10.1109/TPDS.2017.2686383},
  abstractURL  = {http://www.di.fc.ul.pt/~casim/papers/tpds17/tpds17.html},
  documentURL  = {http://www.di.fc.ul.pt/~casim/papers/tpds17/tpds17.pdf},
}
