TY - GEN
T1 - On the weakest failure detector ever
AU - Guerraoui, Rachid
AU - Herlihy, Maurice
AU - Kouznetsov, Petr
AU - Lynch, Nancy
AU - Newport, Calvin
PY - 2007/12/14
Y1 - 2007/12/14
AB - Many problems in distributed computing are impossible when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. This paper asks what information about failures is needed to circumvent any impossibility and sufficient to circumvent some impossibility. In other words, what is the minimal yet non-trivial failure information? We present an abstraction, denoted , that provides very little failure information. In every run of the distributed system, eventually informs the processes that some set of processes in the system cannot be the set of correct processes in that run. Although seemingly weak, for it might provide random information for an arbitrarily long period of time, and it only excludes one possibility of correct set among many, still captures non-trivial failure information. We show that is sufficient to circumvent the fundamental wait-free set-agreement impossibility. While doing so, we (a) disprove previous conjectures about the weakest failure detector to solve set-agreement and we (b) prove that solving set-agreement with registers is strictly weaker than solving (n+1)-process consensus using n-process consensus. We prove that is, in a precise sense, minimal to circumvent any wait-free impossibility. Roughly, we show that is the weakest eventually stable failure detector to circumvent any wait-free impossibility. Our results are generalized through an abstraction f that we introduce and prove necessary to solve any problem that cannot be solved in an f-resilient manner, and yet sufficient to solve f-resilient f-set-agreement.
KW - Failure detectors
KW - Set-agreement
KW - Wait-free impossibilities
KW - Weakest failure detector ever
U2 - 10.1145/1281100.1281135
DO - 10.1145/1281100.1281135
M3 - Conference contribution
AN - SCOPUS:36849049248
SN - 1595936165
SN - 9781595936165
T3 - Proceedings of the Annual ACM Symposium on Principles of Distributed Computing
SP - 235
EP - 243
BT - PODC'07
T2 - PODC'07: 26th Annual ACM Symposium on Principles of Distributed Computing
Y2 - 12 August 2007 through 15 August 2007
ER -