Dimension Reduction in Bankruptcy Prediction: A Case Study of North American Companies
bankruptcy prediction; solvency prediction; imbalanced data; rare events; classification; sliced inverse regression
Emerald Publishing Limited
Advances in Business and Management Forecasting (Advances in Business and Management Forecasting, Vol. 13)
Bankruptcy prediction has attracted a great deal of research in the data mining and machine learning community because of its significance in accounting, finance, and investment. This chapter examines how different dimension reduction techniques influence a decision tree model applied to the bankruptcy prediction problem. The techniques studied are principal component analysis (PCA), sliced inverse regression (SIR), sliced average variance estimation (SAVE), and factor analysis (FA). To isolate the impact of the dimension reduction techniques, we use a decision tree as the only predictive model and “undersampling” as the only remedy for data imbalance. Our computations show that the choice of dimension reduction technique greatly affects the performance of the predictive model and that dimension reduction can improve the predictive power of the decision tree. We also propose a method to estimate the true dimension of the data.
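As a rough illustration of the undersampling step mentioned in the abstract, the sketch below randomly discards majority-class records until every class is as small as the rarest one. It is only a minimal sketch under stated assumptions: the abstract does not give the sampling ratio or procedure used in the chapter, so the 1:1 balance, the function name `undersample`, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly keep n_min rows per class, where n_min is the rarest class size.

    Assumption: a 1:1 class balance is the target; the chapter may use another ratio.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Size of the smallest (minority) class.
    n_min = min(len(members) for members in by_class.values())

    # Sample n_min rows without replacement from every class.
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.sample(members, n_min))
        out_labels.extend([label] * n_min)

    # Shuffle so the classes are not stored in contiguous blocks.
    order = list(range(len(out_rows)))
    rng.shuffle(order)
    return [out_rows[i] for i in order], [out_labels[i] for i in order]


# Hypothetical toy data: 95 solvent firms (label 0) and 5 bankrupt firms (label 1),
# each described by two made-up numeric features.
rows = [[i / 100.0, (i % 7) / 7.0] for i in range(100)]
labels = [1 if i % 20 == 0 else 0 for i in range(100)]

balanced_rows, balanced_labels = undersample(rows, labels)
print(Counter(balanced_labels))
```

The balanced sample would then be passed to a dimension reduction step (PCA, SIR, SAVE, or FA) before the decision tree is fitted. Fixing the seed keeps the comparison across reduction techniques on the same subsample.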