DGHNL: A new deep genetic hierarchical network of learners for prediction of credit scoring

Title: DGHNL: A new deep genetic hierarchical network of learners for prediction of credit scoring
Publication Type: Journal Article
Year of Publication: 2020
Authors: Pławiak P, Abdar M, Pławiak J, Makarenkov V, U Acharya R
Journal: Information Sciences
Volume: 516
Pagination: 401 - 418
ISSN: 0020-0255
Keywords: Credit scoring, Data mining, Deep learning, Ensemble learning, Feature extraction and selection, Genetic algorithm, Machine learning
Abstract

Credit scoring (CS) is an effective and crucial approach to risk management in banks and other financial institutions. It provides guidance on granting loans and reduces risk in the financial sector. Hence, companies and banks are adopting novel automated solutions to the CS challenge to protect their own finances and their customers. Various machine learning (ML) and data mining (DM) algorithms have been used to improve different aspects of CS prediction. In this paper, we introduce a novel methodology named Deep Genetic Hierarchical Network of Learners (DGHNL). The proposed methodology comprises different types of learners, including Support Vector Machines (SVM), k-Nearest Neighbors (kNN), Probabilistic Neural Networks (PNN), and fuzzy systems. The Statlog German credit approval dataset (1000 instances), available in the UCI machine learning repository, is used to test the effectiveness of our model in the CS domain. Our DGHNL model encompasses five kinds of learners, two kinds of data normalization procedures, two feature extraction methods, three kinds of kernel functions, and three kinds of parameter optimization. Furthermore, the model applies deep learning, ensemble learning, supervised training, layered learning, genetic selection of features (attributes), genetic optimization of learners' parameters, and novel genetic layered training (selection of learners), all used together with the stratified 10-fold cross-validation (CV) training-testing method. The novelty of our approach lies in a proper flow and fusion of information (the DGHNL structure and its optimization). We show that the proposed DGHNL model with a 29-layer structure can achieve a prediction accuracy of 94.60% (54 errors per 1000 classifications) on the Statlog German credit approval data. This is the best prediction performance reported for this well-known credit scoring dataset compared with existing work in the field.
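
The evaluation protocol named in the abstract (stratified 10-fold cross-validation) and the reported accuracy figure can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `stratified_k_fold` and `accuracy_from_errors` are hypothetical, the labels are synthetic placeholders rather than the real dataset, and the 700/300 class split is an assumption not stated in the abstract. No learners are trained here; the sketch only shows how stratified folds are formed and how 54 errors in 1000 classifications correspond to 94.60% accuracy.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def accuracy_from_errors(n_errors, n_total):
    """Accuracy in percent, given the number of misclassified samples."""
    return 100.0 * (n_total - n_errors) / n_total


# Synthetic placeholder labels: 1000 samples, assumed 700/300 class split.
labels = [0] * 700 + [1] * 300
folds = stratified_k_fold(labels, k=10)

fold_sizes = [len(fold) for fold in folds]
minority_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# 54 errors per 1000 classifications, as quoted in the abstract.
reported_accuracy = accuracy_from_errors(54, 1000)
```

In each round of cross-validation one fold serves as the test set and the remaining nine as the training set, so every instance is classified exactly once and the error count is taken over all 1000 instances.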

URL: http://www.sciencedirect.com/science/article/pii/S0020025519311569
DOI: 10.1016/j.ins.2019.12.045