Correlating node centrality metrics with node resilience in self-healing systems with limited neighbourhood information

Research output: Contribution to journal › Article › peer-review

Abstract

Resilient systems must self-heal their components and connections to maintain their topology and function when failures occur. This ability is essential to many networked and distributed systems, e.g., virtualisation platforms, cloud services, microservice architectures and decentralised algorithms. This paper builds upon a self-healing approach where failed nodes are recreated and reconnected automatically based on topology information, which is maintained within each node's neighbourhood. The paper makes two novel contributions. First, it offers a generic method for establishing the minimum size of the network neighbourhood that each node must know in order to recover the system's component interconnection topology under a certain probability of node failure. This improves on the previous proposal by reducing resource consumption, as only local information is communicated and stored. Second, it adopts analysis techniques from complex networks theory to correlate a node's recovery probability with its closeness centrality within the self-healing system. This allows a system's resilience to be strengthened by analysing its topological characteristics and rewiring weakly-connected nodes. These contributions are supported by extensive simulation experiments on different systems with various topological characteristics. The results confirm that nodes which propagate their topology information to more neighbours are more likely to be recovered, at the cost of more resources. The proposed contributions can help practitioners to: identify the most fragile nodes in their distributed systems; consider corrective measures by increasing each node's connectivity; and establish a suitable compromise between system resilience and costs.
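
The two quantities the abstract relates, a node's closeness centrality and the neighbourhood within a limited number of hops, can be illustrated with a minimal sketch. This is not the paper's method or simulation code: the function names, the 5-node path topology and the standard closeness definition (n - 1 divided by the sum of hop distances) are assumptions chosen for illustration, and no recovery probability is computed here.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_centrality(graph, node):
    """Closeness as (reachable nodes - 1) / sum of hop distances.

    Assumes an undirected graph; returns 0.0 for an isolated node.
    """
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def k_hop_neighbourhood(graph, node, k):
    """Nodes at most `k` hops away from `node`, excluding `node` itself."""
    dist = bfs_distances(graph, node)
    return {n for n, d in dist.items() if 0 < d <= k}


# Hypothetical topology (an assumption): the path a - b - c - d - e.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

if __name__ == "__main__":
    for name in graph:
        print(name,
              round(closeness_centrality(graph, name), 3),
              sorted(k_hop_neighbourhood(graph, name, 2)))
```

In this toy topology the end node `a` has lower closeness than the middle node `c` and sees fewer nodes within two hops, which is the kind of weakly-connected node the abstract suggests identifying and rewiring.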

Original language: English
Article number: 107553
Journal: Future Generation Computer Systems
Volume: 163
Publication status: Published - 1 Feb 2025

Keywords

  • Centrality metrics
  • Correlation
  • Limited hop information
  • Network topology
  • Self-healing system
