Title
An Oversampling Technique for Classifying Imbalanced Datasets
Document Type
Book Chapter
Keywords
oversampling; imbalanced data; rare events; classification; SMOTE; sensitivity
Identifier Data
https://doi.org/10.1108/S1477-407020170000012004
Publisher
Emerald Insight
Publication Source
Advances in Business and Management Forecasting, Volume 13
Abstract
We propose an oversampling technique to increase the true positive rate (sensitivity) when classifying imbalanced datasets (i.e., those in which one value of the target variable occurs with low frequency) and hence improve overall performance measures such as balanced accuracy, G-mean, and the area under the receiver operating characteristic (ROC) curve (AUC). The method applies the Synthetic Minority Oversampling Technique (SMOTE) to only a selected portion of the dataset instead of the entire dataset. We demonstrate the effectiveness of our oversampling method with four real and simulated datasets generated from three models.
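As an illustration of how the measures named in the abstract relate to sensitivity, the following is a minimal sketch using the standard definitions of sensitivity, specificity, balanced accuracy, and G-mean. It is not the chapter's method or code: the function names and the toy labels are hypothetical, and the "before" and "after" predictions are invented only to show how a gain in sensitivity can raise both summary measures even at a small cost in specificity.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp

def imbalance_metrics(y_true, y_pred, positive=1):
    """Standard measures; assumes both classes occur in y_true."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

# Hypothetical toy labels: 2 rare positives among 10 cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Invented predictions: one of two positives found, no false alarms.
y_pred_before = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Invented predictions: both positives found, one false alarm.
y_pred_after = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

before = imbalance_metrics(y_true, y_pred_before)
after = imbalance_metrics(y_true, y_pred_after)
print(before)
print(after)
```

In this toy case sensitivity rises from 0.5 to 1.0 while specificity falls from 1.0 to 0.875, and both balanced accuracy and G-mean increase.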
Comments
Is part of Advances in Business and Management Forecasting.