BIROn - Birkbeck Institutional Research Online

    NOTE: non-parametric oversampling technique for explainable credit scoring

    Han, Seongil and Jung, H. and Yoo, Paul and Provetti, Alessandro and Cali, Andrea (2024) NOTE: non-parametric oversampling technique for explainable credit scoring. Scientific Reports 14 (26070), ISSN 2045-2322.

    s41598-024-78055-5.pdf - Published Version of Record
    Available under License Creative Commons Attribution.


    Abstract

    Credit scoring models are critical for financial institutions to assess borrower risk and maintain profitability. Although machine learning models have improved credit scoring accuracy, imbalanced class distributions remain a major challenge. The widely used Synthetic Minority Oversampling TEchnique (SMOTE) struggles with high-dimensional, non-linear data and may introduce noise through class overlap. Generative Adversarial Networks (GANs) have emerged as an alternative, offering the ability to model complex data distributions. Conditional Wasserstein GANs (cWGANs) have shown promise in handling both numerical and categorical features in credit scoring datasets. However, research on extracting latent features from non-linear data and improving model explainability remains limited. To address these challenges, this paper introduces the Non-parametric Oversampling Technique for Explainable credit scoring (NOTE). The NOTE offers a unified approach that integrates a Non-parametric Stacked Autoencoder (NSA) for capturing non-linear latent features, cWGAN for oversampling the minority class, and a classification process designed to enhance explainability. The experimental results demonstrate that NOTE surpasses state-of-the-art oversampling techniques by improving classification accuracy and model stability, particularly in non-linear and imbalanced credit scoring datasets, while also enhancing the explainability of the results.
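
    The abstract notes that SMOTE struggles with non-linear data. As a rough illustration of why, the sketch below generates synthetic minority rows the way SMOTE-style methods do: by picking a minority sample, picking one of its nearest minority neighbours, and interpolating linearly between the two. This is not the NOTE method and not the authors' code; the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for illustration, and it handles numerical features only.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; numerical features only.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = X_min.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of each sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for every new row.
    base = rng.integers(0, n, size=n_new)
    pick = rng.integers(0, k, size=n_new)
    nbr = neighbours[base, pick]

    # Linear interpolation: new = base + lam * (neighbour - base), lam in [0, 1).
    lam = rng.random((n_new, 1))
    return X_min[base] + lam * (X_min[nbr] - X_min[base])


# Toy minority class: the four corners of the unit square.
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_minority, n_new=10, k=2, seed=1)
```

    Every synthetic row lies on a straight segment between two existing minority samples. When the minority class follows a curved or fragmented region of feature space, such segments can cut through majority-class territory, which is one way the class overlap mentioned in the abstract can arise.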

    Metadata

    Item Type: Article
    Keyword(s) / Subject(s): Conditional Wasserstein generative adversarial networks, Stacked autoencoder, Explainable AI, Imbalanced class, Oversampling, Credit scoring
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Depositing User: Paul Yoo
    Date Deposited: 05 Nov 2024 16:05
    Last Modified: 06 Nov 2024 01:28
    URI: https://eprints.bbk.ac.uk/id/eprint/54507
