TY - GEN
T1 - Adaptive algorithms for estimating betweenness and k-path centralities
AU - Chehreghani, Mostafa Haghir
AU - Bifet, Albert
AU - Abdessalem, Talel
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/11/3
Y1 - 2019/11/3
N2 - Betweenness centrality and k-path centrality are two important indices that are widely used to analyze social, technological and information networks. In the current paper, first given a directed network G and a vertex r ∈ V(G), we present a novel adaptive algorithm for estimating betweenness score of r. Our algorithm first computes two subsets of the vertex set of G, called RF (r) and RT (r). They define the sample spaces of the start-points and the end-points of the samples. Then, it adaptively samples from RF (r) and RT (r) and stops as soon as some condition is satisfied. The stopping condition depends on the samples met so far, |RF (r)| and |RT (r)|. We show that compared to the well-known existing algorithms, our algorithm gives a better (λ, δ)-approximation. Then, we propose a novel algorithm for estimating k-path centrality of r. Our algorithm is based on computing two sets RF (r) and D(r). While RF (r) defines the sample space of the source vertices of the sampled paths, D(r) defines the sample space of the other vertices of the paths. We show that in order to give a (λ, δ)-approximation of the k-path score of r, our algorithm requires considerably less samples. Moreover, it processes each sample faster and with less memory. Finally, we empirically evaluate our proposed algorithms and show their superior performance. Also, we show that they can be used to efficiently compute centrality scores of a set of vertices.
AB - Betweenness centrality and k-path centrality are two important indices that are widely used to analyze social, technological and information networks. In the current paper, first given a directed network G and a vertex r ∈ V(G), we present a novel adaptive algorithm for estimating betweenness score of r. Our algorithm first computes two subsets of the vertex set of G, called RF (r) and RT (r). They define the sample spaces of the start-points and the end-points of the samples. Then, it adaptively samples from RF (r) and RT (r) and stops as soon as some condition is satisfied. The stopping condition depends on the samples met so far, |RF (r)| and |RT (r)|. We show that compared to the well-known existing algorithms, our algorithm gives a better (λ, δ)-approximation. Then, we propose a novel algorithm for estimating k-path centrality of r. Our algorithm is based on computing two sets RF (r) and D(r). While RF (r) defines the sample space of the source vertices of the sampled paths, D(r) defines the sample space of the other vertices of the paths. We show that in order to give a (λ, δ)-approximation of the k-path score of r, our algorithm requires considerably less samples. Moreover, it processes each sample faster and with less memory. Finally, we empirically evaluate our proposed algorithms and show their superior performance. Also, we show that they can be used to efficiently compute centrality scores of a set of vertices.
KW - Adaptive algorithm
KW - Approximate algorithm
KW - Betweenness centrality
KW - Coverage centrality
KW - Directed graphs
KW - K-path centrality
KW - Social network analysis
U2 - 10.1145/3357384.3358064
DO - 10.1145/3357384.3358064
M3 - Conference contribution
AN - SCOPUS:85075428089
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1231
EP - 1240
BT - CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 28th ACM International Conference on Information and Knowledge Management, CIKM 2019
Y2 - 3 November 2019 through 7 November 2019
ER -