Karim Awara, Hani Jamjoom and Panos Kalnis
ACM SIGCOMM Posters and Demos
Hong Kong, China, August 2013
Abstract. The explosive growth of ``big data" is giving rise to a new breed of
large scale graph systems, such as Pregel. This poster describes our
ongoing work in characterizing and minimizing the communication cost
of Bulk Synchronous Parallel (BSP) graph mining systems, like Pregel,
when scaling to 4,096 compute nodes. Existing implementations
generally assume a fixed communication cost. This is sufficient in
small deployments as the BSP programming model (i.e., overlapping
computation and communication) masks small variations in the
underlying network. In large scale deployments, such variations can
dominate the overall runtime characteristics. In this poster, we first
quantify the impact of network communication on the total compute time
of a Pregel system. We then propose an efficient vertex placement
strategy that subsamples highly connected vertices and applies the
Reverse Cuthill-McKee (RCM) algorithm to efficiently partition the
input graph and place partitions closer to each other based on their
expected communication patterns. We finally describe a vertex
replication strategy to further reduce communication overhead.
Keywords. Graph Processing Systems, HPC, Optimization
Bibtex.
@inproceedings{jamjoom-sigcomm-poster-13,
author = {Karim and Awara and Hani and Jamjoom and Panos and Kalnis},
title = {{To 4,000 Compute Nodes and Beyond: Network-aware Vertex
Placement in Large-scale Graph Processing Systems}},
booktitle = {ACM SIGCOMM Posters and Demos},
address = {Hong Kong, China},
month = {August},
year = {2013}
}