TY - GEN
T1 - Predicting over-indebtedness on batch and streaming data
AU - Montiel, Jacob
AU - Bifet, Albert
AU - Abdessalem, Talel
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/1
Y1 - 2017/7/1
AB - Detecting over-indebtedness, the difficulty of meeting household payment commitments, poses multiple Big Data challenges for banking institutions. We present a novel data-driven framework for predicting over-indebtedness on real-world data. The framework acts as a warning mechanism that generates predictions 6 months ahead, improving the chances of financial recovery. It combines feature selection and supervised learning techniques, and uses data balancing to fine-tune the predictive models. We propose two versions of the framework based on state-of-the-art batch and streaming learning techniques. To the best of our knowledge, the proposed framework is the first to cast over-indebtedness prediction as a stream learning problem. The appeal of stream learning arises from the large amount of data continuously generated and from the fact that batch models become obsolete over time as financial data evolves, whereas stream models are continuously updated as new data becomes available. We use credit data from two banks of the Groupe BPCE (the second-largest banking institution in France) and apply multi-metric criteria to evaluate model performance and fairness. Test results show the framework's interbank applicability and that the proposed batch and stream versions of the framework outperform the current solution for both single and multi-metric criteria. Additionally, the generic structure of the framework serves as a template for systematically approaching similar classification problems.
KW - Batch/Stream Mining
KW - Data-Driven
KW - Government/Banking Regulations
KW - Over-indebtedness
KW - Risk Prediction
KW - Warning Mechanism
U2 - 10.1109/BigData.2017.8258084
DO - 10.1109/BigData.2017.8258084
M3 - Conference contribution
AN - SCOPUS:85047879113
T3 - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
SP - 1504
EP - 1513
BT - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
A2 - Nie, Jian-Yun
A2 - Obradovic, Zoran
A2 - Suzumura, Toyotaro
A2 - Ghosh, Rumi
A2 - Nambiar, Raghunath
A2 - Wang, Chonggang
A2 - Zang, Hui
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
A2 - Kepner, Jeremy
A2 - Cuzzocrea, Alfredo
A2 - Tang, Jian
A2 - Toyoda, Masashi
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th IEEE International Conference on Big Data, Big Data 2017
Y2 - 11 December 2017 through 14 December 2017
ER -