TY - GEN
T1 - Understanding Semantics in Feature Selection for Fault Diagnosis in Network Telemetry Data
AU - Feltin, Thomas
AU - Fuertes, Juan Antonio Cordero
AU - Brockners, Frank
AU - Clausen, Thomas Heide
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023/1/1
Y1 - 2023/1/1
AB - Expert systems for fault diagnosis are computationally expensive to build and maintain, and lack scalability and inherent adaptability to unknown events or modifications in the topology of the monitored system. While data-driven feature selection mechanisms can facilitate diagnosis without the burden of developing and maintaining expert systems, purely data-driven mechanisms lack understanding of semantic importance within a feature set, and would benefit from additional domain knowledge. Part of this additional knowledge can be extracted from metadata. The proposed approach combines data-driven metrics and semantic information contained in the feature names to produce selections of features that best represent an underlying event. This study extends a cross-entropy-based optimization method to join semantic importance with data behavior. A benchmarking architecture is introduced to evaluate the benefits of semantic analysis, and to demonstrate the performance and robustness of semantic feature selection on different types of faults in network telemetry datasets, modeled with the YANG data modeling language. The results illustrate the value of such complementary metadata analysis for data-driven fault diagnosis, and highlight the robustness of the studied approach against variations in the input feature set.
KW - Fault Diagnosis
KW - Feature Selection
KW - Telemetry
UR - https://www.scopus.com/pages/publications/85164694266
U2 - 10.1109/NOMS56928.2023.10154455
DO - 10.1109/NOMS56928.2023.10154455
M3 - Conference contribution
AN - SCOPUS:85164694266
T3 - Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023
BT - Proceedings of IEEE/IFIP Network Operations and Management Symposium 2023, NOMS 2023
A2 - Akkaya, Kemal
A2 - Festor, Olivier
A2 - Fung, Carol
A2 - Rahman, Mohammad Ashiqur
A2 - Granville, Lisandro Zambenedetti
A2 - dos Santos, Carlos Raniery Paula
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 36th IEEE/IFIP Network Operations and Management Symposium, NOMS 2023
Y2 - 8 May 2023 through 12 May 2023
ER -