TY - GEN
T1 - Scalable facility location for massive graphs on pregel-like systems
AU - Garimella, Kiran
AU - De Francisci Morales, Gianmarco
AU - Gionis, Aristides
AU - Sozio, Mauro
PY - 2015/10/17
Y1 - 2015/10/17
N2 - We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm to finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.
AB - We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm to finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.
U2 - 10.1145/2806416.2806508
DO - 10.1145/2806416.2806508
M3 - Conference contribution
AN - SCOPUS:84958236798
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 273
EP - 282
BT - CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Y2 - 19 October 2015 through 23 October 2015
ER -