Document Type
Thesis
First Faculty Advisor
Son Nguyen
Second Faculty Advisor
Rick Gorvett
Keywords
bankruptcy; imbalance; sampling; machine learning
Publisher
Bryant University
Rights Management
CC BY-NC-SA 4.0
Abstract
Bankruptcy prediction is a widely researched topic, but few studies address the class imbalance problem. This paper proposes a new technique that applies a bagging undersampling procedure to balance the data and compares it with random undersampling and five oversampling techniques. Each technique is evaluated by the balanced accuracy, sensitivity, and specificity of a random forest trained on the resampled data. The results show that models trained after oversampling are prone to overfitting, while the model trained after applying the proposed method achieved the highest balanced accuracy without overfitting.
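The abstract does not spell out the procedure, so the following is only a minimal sketch of one common form of bagging undersampling together with the three reported metrics, not the thesis's confirmed method. It assumes binary labels with 1 marking the minority (bankrupt) class, that each bag keeps every minority case plus an equal-sized random draw of majority cases, and that per-bag predictions are combined by majority vote. The function names `balanced_bags`, `majority_vote`, and `evaluate` are hypothetical, and the random forest that would be fitted on each bag is left out.

```python
import random


def balanced_bags(labels, n_bags, seed=0):
    """Build n_bags balanced index sets (assumed scheme, not the thesis's).

    Each bag holds every minority index (label 1) and an equal number of
    majority indices (label 0) drawn without replacement.
    Assumes the minority class is no larger than the majority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    bags = []
    for _ in range(n_bags):
        sampled_majority = rng.sample(majority, len(minority))
        bags.append(minority + sampled_majority)
    return bags


def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several models; ties go to class 0."""
    n_models = len(predictions_per_model)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*predictions_per_model)]


def evaluate(y_true, y_pred):
    """Return sensitivity, specificity, and their mean (balanced accuracy).

    Assumes y_true contains at least one case of each class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of bankrupt firms detected
    specificity = tn / (tn + fp)  # share of solvent firms detected
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Illustrative data only: 95 solvent firms and 5 bankrupt firms.
labels = [0] * 95 + [1] * 5
bags = balanced_bags(labels, n_bags=10)
```

In this sketch a classifier would be trained on the rows indexed by each bag, and `majority_vote` would merge the resulting predictions before `evaluate` is applied to held-out data. Balanced accuracy is the mean of sensitivity and specificity, so a model that labels every firm as solvent scores 0.5 rather than the high plain accuracy that imbalanced data would otherwise give it.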