TY - GEN
T1 - Reconstructing the Unseen
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
AU - Serrano, Richard
AU - Laclau, Charlotte
AU - Jeudy, Baptiste
AU - Largeron, Christine
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - In recent years, there has been a significant surge in machine learning techniques, particularly in the domain of deep learning, tailored for handling attributed graphs. Nevertheless, to work, these methods assume that the attributes values are fully known, which is not realistic in numerous real-world applications. This paper explores the potential of Optimal Transport (OT) to impute missing attributes on graphs. To proceed, we design a novel multi-view OT loss function that can encompass both node feature data and the underlying topological structure of the graph by utilizing multiple graph representations. We then utilize this novel loss to train efficiently a Graph Convolutional Neural Network (GCN) architecture capable of imputing all missing values over the graph at once. We evaluate the interest of our approach with experiments both on synthetic data and real-world graphs, including different missingness mechanisms and a wide range of missing data. These experiments demonstrate that our method is competitive with the state-of-the-art in all cases and of particular interest on weakly homophilic graphs.
AB - In recent years, there has been a significant surge in machine learning techniques, particularly in the domain of deep learning, tailored for handling attributed graphs. Nevertheless, to work, these methods assume that the attributes values are fully known, which is not realistic in numerous real-world applications. This paper explores the potential of Optimal Transport (OT) to impute missing attributes on graphs. To proceed, we design a novel multi-view OT loss function that can encompass both node feature data and the underlying topological structure of the graph by utilizing multiple graph representations. We then utilize this novel loss to train efficiently a Graph Convolutional Neural Network (GCN) architecture capable of imputing all missing values over the graph at once. We evaluate the interest of our approach with experiments both on synthetic data and real-world graphs, including different missingness mechanisms and a wide range of missing data. These experiments demonstrate that our method is competitive with the state-of-the-art in all cases and of particular interest on weakly homophilic graphs.
KW - Attributed Graph
KW - Missing Data Imputation
KW - Optimal Transport
UR - https://www.scopus.com/pages/publications/85203880002
U2 - 10.1007/978-3-031-70365-2_16
DO - 10.1007/978-3-031-70365-2_16
M3 - Conference contribution
AN - SCOPUS:85203880002
SN - 9783031703645
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 269
EP - 286
BT - Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2024, Proceedings
A2 - Bifet, Albert
A2 - Davis, Jesse
A2 - Krilavičius, Tomas
A2 - Kull, Meelis
A2 - Ntoutsi, Eirini
A2 - Žliobaitė, Indrė
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 9 September 2024 through 13 September 2024
ER -