TY - JOUR
T1 - Storage-Computation-Communication Tradeoff in Distributed Computing
T2 - Fundamental Limits and Complexity
AU - Yan, Qifa
AU - Yang, Sheng
AU - Wigger, Michele
N1 - Publisher Copyright:
© 1963-2012 IEEE.
PY - 2022/8/1
Y1 - 2022/8/1
N2 - Distributed computing has become one of the most important frameworks in dealing with large computation tasks. In this paper, we propose a systematic construction of coded computing schemes for MapReduce-Type distributed systems. The construction builds upon placement delivery arrays (PDA), originally proposed by Yan et al. for coded caching schemes. The main contributions of our work are three-fold. First, we identify a class of PDAs, called Comp-PDAs, and show how to obtain a coded computing scheme from any Comp-PDA. We also characterize the normalized number of stored files (storage load), computed intermediate values (computation load), and communicated bits (communication load), of the obtained schemes in terms of the Comp-PDA parameters. Then, we show that the performance achieved by Comp-PDAs describing Maddah-Ali and Niesen's coded caching schemes matches a new information-Theoretic converse, thus establishing the fundamental region of all achievable performance triples. In particular, we characterize all the Comp-PDAs achieving the pareto-optimal storage, computation, and communication (SCC) loads of the fundamental region. Finally, we investigate the file complexity of the proposed schemes, i.e., the smallest number of files required for implementation. In particular, we describe Comp-PDAs that achieve pareto-optimal SCC triples with significantly lower file complexity than the originally proposed Comp-PDAs.
AB - Distributed computing has become one of the most important frameworks in dealing with large computation tasks. In this paper, we propose a systematic construction of coded computing schemes for MapReduce-Type distributed systems. The construction builds upon placement delivery arrays (PDA), originally proposed by Yan et al. for coded caching schemes. The main contributions of our work are three-fold. First, we identify a class of PDAs, called Comp-PDAs, and show how to obtain a coded computing scheme from any Comp-PDA. We also characterize the normalized number of stored files (storage load), computed intermediate values (computation load), and communicated bits (communication load), of the obtained schemes in terms of the Comp-PDA parameters. Then, we show that the performance achieved by Comp-PDAs describing Maddah-Ali and Niesen's coded caching schemes matches a new information-Theoretic converse, thus establishing the fundamental region of all achievable performance triples. In particular, we characterize all the Comp-PDAs achieving the pareto-optimal storage, computation, and communication (SCC) loads of the fundamental region. Finally, we investigate the file complexity of the proposed schemes, i.e., the smallest number of files required for implementation. In particular, we describe Comp-PDAs that achieve pareto-optimal SCC triples with significantly lower file complexity than the originally proposed Comp-PDAs.
KW - Distributed computing
KW - MapReduce
KW - communication
KW - placement delivery array
KW - storage
U2 - 10.1109/TIT.2022.3158828
DO - 10.1109/TIT.2022.3158828
M3 - Article
AN - SCOPUS:85126684179
SN - 0018-9448
VL - 68
SP - 5496
EP - 5512
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 8
ER -