Shriram Rajagopalan, Dan Williams and Hani Jamjoom
ACM Symposium on Cloud Computing (SoCC)
Santa Clara, California, Oct 2013
Abstract. Middleboxes are being rearchitected to be service-oriented,
composable, extensible, and elastic. Yet
system-level support for high availability (HA)
continues to introduce significant performance
overhead. In this paper, we propose Pico
Replication (PR), a system-level framework for
middleboxes that exploits their flow-centric
structure to achieve low-overhead, fully
customizable HA. Unlike generic (virtual-machine-level)
techniques, PR operates at the flow level.
Individual flows can be checkpointed at very high
frequencies while the middlebox continues to process
other flows. Furthermore, each flow can have its
own checkpoint frequency, output buffer and target
for backup, enabling rich and diverse policies that
balance performance and utilization on a per-flow basis.
PR leverages OpenFlow to provide near-instant
flow-level failure recovery by dynamically
rerouting a flow's packets to its replication
target. We have implemented PR and a flow-based HA
policy. In controlled experiments, PR sustains
checkpoint frequencies of 1000 Hz, an order of
magnitude improvement over current VM replication
solutions. As a result, PR drastically reduces
end-to-end latency overhead from 280% to 15.5%
and throughput overhead from 99.5% to 3.2%.
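The per-flow idea described in the abstract can be illustrated with a minimal sketch. This is not PR's implementation: the names `FlowPolicy`, `Flow`, `Replicator`, and `backup-a` are hypothetical, and the sketch assumes that a flow's output is held in its buffer until the next checkpoint, as is common in checkpoint-based replication. Rerouting is modeled as a plain dictionary update rather than real OpenFlow rules.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowPolicy:
    checkpoint_hz: int   # per-flow checkpoint frequency (e.g. 1000)
    backup_target: str   # hypothetical name of the node holding this flow's checkpoints


@dataclass
class Flow:
    flow_id: str
    policy: FlowPolicy
    state: dict = field(default_factory=dict)
    output_buffer: deque = field(default_factory=deque)  # per-flow output buffer


class Replicator:
    """Toy model: checkpoints and failover are handled one flow at a time."""

    def __init__(self):
        self.backups = {}  # (backup_target, flow_id) -> last checkpointed state
        self.routes = {}   # flow_id -> node currently receiving the flow's packets

    def process(self, flow, packet):
        # Update per-flow state and hold the output until the next checkpoint
        # (an assumption of this sketch).
        flow.state["packets"] = flow.state.get("packets", 0) + 1
        flow.output_buffer.append(packet)

    def checkpoint(self, flow):
        # Copy only this flow's state to its own backup target, then release
        # the output that the checkpoint now covers.
        key = (flow.policy.backup_target, flow.flow_id)
        self.backups[key] = dict(flow.state)
        released = list(flow.output_buffer)
        flow.output_buffer.clear()
        return released

    def fail_over(self, flow):
        # Stand-in for rerouting: point the flow at its replication target
        # and resume from the last checkpointed state.
        key = (flow.policy.backup_target, flow.flow_id)
        self.routes[flow.flow_id] = flow.policy.backup_target
        return self.backups.get(key, {})
```

In this model, a packet processed after the last checkpoint is not part of the state recovered on failover, and its output is still sitting unreleased in the buffer. A scheduler would call `checkpoint` for each flow at that flow's own `checkpoint_hz`, which is how different flows could trade performance against protection.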
Keywords. High Availability, Network Function Virtualization, Middleboxes, Software Defined Networking
Bibtex.
@inproceedings{jamjoom-pico-socc-2013,
author = {Shriram Rajagopalan and Dan Williams and Hani Jamjoom},
title = {{Pico Replication: A High Availability Framework for Middleboxes}},
booktitle = {ACM Symposium on Cloud Computing (SoCC)},
address = {Santa Clara, California},
month = {Oct},
year = {2013}
}