Imbalanced data classification based on hybrid re-sampling and twin support vector machine

Lu Cao, Hong Shen

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Imbalanced datasets are common in real-world applications, and identifying the minority class is usually the main goal when classifying them. As an enhanced variant of the support vector machine (SVM), the twin support vector machine (TWSVM) provides an effective technique for data classification. TWSVM relies on a relatively balanced training set and class distribution to improve classification accuracy over the whole dataset; however, it is not effective on imbalanced data classification problems. In this paper, we propose combining TWSVM with a re-sampling technique that uses both over-sampling and under-sampling to balance the training data. Experimental results show that the proposed approach outperforms other state-of-the-art methods.
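The abstract does not say which over-sampling and under-sampling procedures the paper uses, so the following is only a minimal sketch of the general idea of hybrid re-sampling, under stated assumptions: purely random over-sampling of the minority class and purely random under-sampling of the majority class, both aimed at the mean of the two class sizes. The function name `hybrid_resample` and its parameters are hypothetical, not taken from the paper. The balanced set it returns would then be passed to a TWSVM trainer, which is not shown here.

```python
import random
from collections import Counter


def hybrid_resample(X, y, minority_label, seed=0):
    """Balance a two-class dataset by meeting in the middle.

    Assumption (not from the paper): the minority class is randomly
    over-sampled with replacement and the majority class is randomly
    under-sampled without replacement, until both reach the same target
    size, the mean of the two original class counts.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")
    if len(minority) > len(majority):
        raise ValueError("minority_label must name the smaller class")
    target = (len(minority) + len(majority)) // 2

    # Over-sampling: keep every minority point, then add random duplicates.
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    # Under-sampling: keep a random subset of the majority points.
    kept = rng.sample(majority, target)

    idx = minority + extra + kept
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


# Toy data: 10 minority samples (label 1) and 90 majority samples (label 0).
X = [[float(i)] for i in range(100)]
y = [1] * 10 + [0] * 90
X_bal, y_bal = hybrid_resample(X, y, minority_label=1)
print(sorted(Counter(y_bal).items()))  # [(0, 50), (1, 50)]
```

Meeting at the mean class size is one design choice among several; it limits both the number of duplicated minority points and the number of discarded majority points, compared with using either technique alone.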

Original language: English
Pages (from-to): 579-595
Number of pages: 17
Journal: Computer Science and Information Systems
Volume: 14
Issue number: 3
DOIs
Publication status: Published - Sept 2017
Externally published: Yes

Keywords

  • Classification
  • Imbalanced dataset
  • Over-sampling
  • TWSVM
  • Under-sampling
