Virtual Sample Generation Approach for Imbalanced Classification

Cao Lu, Hong Shen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

Imbalanced classification is an active topic in machine learning and data mining. Traditional classification algorithms assume that the class distribution is balanced, and they perform poorly on imbalanced datasets. In this paper, the support vector machine (SVM) is used as the base classifier. Because an SVM relies heavily on its support vectors, a virtual sample generation method based on support vectors is proposed to address imbalanced classification and to improve the recognition rate of the minority class. First, an SVM is trained on the training set to obtain the support vectors of the minority class. Then, a number of virtual samples are generated around these minority-class support vectors, following the smoothness hypothesis, to balance the dataset. The generated samples conform to the statistical characteristics of the original training data, which indicates that the generated virtual samples are reasonable. Finally, an SVM is trained on the new dataset. Experimental results show that the method is effective on both artificial datasets and UCI standard datasets.
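The generation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how samples are drawn "around" a support vector, so the Gaussian perturbation used here is an assumption, not the authors' method. The minority-class support vectors are taken as given (for example, from an already trained SVM), and the names `generate_virtual_samples`, `balance_minority`, `sigma`, and `seed` are hypothetical.

```python
import random


def generate_virtual_samples(support_vectors, n_needed, sigma=0.05, seed=0):
    """Return n_needed virtual points near the given support vectors.

    ASSUMPTION: each virtual point is a support vector plus small Gaussian
    noise (std. dev. sigma per feature). The source abstract only says the
    samples are generated "around" the support vectors.
    """
    if n_needed > 0 and not support_vectors:
        raise ValueError("need at least one support vector")
    rng = random.Random(seed)
    virtual = []
    for i in range(n_needed):
        # Cycle through the support vectors so each one gets a similar share.
        base = support_vectors[i % len(support_vectors)]
        virtual.append([x + rng.gauss(0.0, sigma) for x in base])
    return virtual


def balance_minority(majority, minority, minority_svs, sigma=0.05, seed=0):
    """Pad the minority class with virtual samples until it matches the
    majority class in size. Original minority points are kept unchanged."""
    n_needed = max(0, len(majority) - len(minority))
    return minority + generate_virtual_samples(minority_svs, n_needed, sigma, seed)


# Toy data: 6 majority points, 2 minority points, one of which is assumed
# to be a support vector of a previously trained SVM.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.0, 0.2]]
minority = [[1.0, 1.0], [1.2, 0.9]]
minority_svs = [[1.0, 1.0]]

balanced_minority = balance_minority(majority, minority, minority_svs)
print(len(majority), len(balanced_minority))
```

After this step the two classes have equal size, and a classifier would be retrained on `majority` plus `balanced_minority`, mirroring the final step in the abstract. Keeping `sigma` small keeps the virtual points close to the decision boundary region that the support vectors define.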

Original language: English
Title of host publication: Proceedings - 2018 9th International Conference on Parallel Architectures, Algorithms and Programming, PAAP 2018
Publisher: IEEE Computer Society
Pages: 177-182
Number of pages: 6
ISBN (Electronic): 9781538694039
DOIs
Publication status: Published - 2 Jul 2018
Externally published: Yes
Event: 9th International Conference on Parallel Architectures, Algorithms and Programming, PAAP 2018 - Taipei, Taiwan, Province of China
Duration: 26 Dec 2018 → 28 Dec 2018

Publication series

Name: Proceedings - International Symposium on Parallel Architectures, Algorithms and Programming, PAAP
Volume: 2018-December
ISSN (Print): 2168-3034
ISSN (Electronic): 2168-3042

Conference

Conference: 9th International Conference on Parallel Architectures, Algorithms and Programming, PAAP 2018
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 26/12/18 → 28/12/18

Keywords

  • Imbalanced-classification
  • Oversampling
  • Support-vector
  • Support-vector-machine
