Abstract
Imbalanced datasets are common in real-world applications, and identifying the minority class in such datasets tends to be the focus of classification. As an enhanced variant of the support vector machine (SVM), the twin support vector machine (TWSVM) provides an effective technique for data classification. TWSVM assumes a relatively balanced training set and class distribution in order to improve classification accuracy over the whole dataset; however, it is not effective in dealing with imbalanced data classification problems. In this paper, we propose to combine a re-sampling technique, which utilizes over-sampling and under-sampling to balance the training data, with TWSVM to deal with imbalanced data classification. Experimental results show that our proposed approach outperforms other state-of-the-art methods.
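The re-sampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which over-sampling and under-sampling schemes the paper uses, so this example assumes plain random re-sampling and is not the authors' method. The function name `rebalance` and the choice of the mean class size as the target are hypothetical, and the TWSVM training step that would follow is omitted.

```python
import random
from collections import Counter


def rebalance(X, y, seed=0):
    """Randomly resample every class to the mean class size.

    Assumption: plain random re-sampling, chosen only for illustration.
    Classes larger than the target are under-sampled without replacement;
    classes smaller than the target are over-sampled with replacement.
    """
    rng = random.Random(seed)

    # Group the feature vectors by class label.
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)

    # Hypothetical target size: the mean number of samples per class.
    target = round(len(y) / len(by_class))

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            # Under-sample: keep a random subset of the majority class.
            chosen = rng.sample(rows, target)
        else:
            # Over-sample: keep all rows and add random duplicates.
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data: 90 majority samples (label -1) and 10 minority samples (label +1).
X = [[float(i)] for i in range(100)]
y = [-1] * 90 + [1] * 10

X_bal, y_bal = rebalance(X, y)
counts = Counter(y_bal)
print(counts)
```

On this toy data both classes end up with 50 samples each. The balanced set would then be passed to a TWSVM (or any other classifier) for training, with the original imbalanced test set left untouched.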
Original language | English |
---|---|
Pages (from-to) | 579-595 |
Number of pages | 17 |
Journal | Computer Science and Information Systems |
Volume | 14 |
Issue number | 3 |
Publication status | Published - Sept 2017 |
Externally published | Yes |
Keywords
- Classification
- Imbalanced dataset
- Over-sampling
- TWSVM
- Under-sampling