Abstract
Small sample size (SSS) problems pose a tremendous challenge in modeling tasks due to insufficient training samples, especially in process industry where thousands of useless samples overwhelm very limited valuable samples, leading to deterioration on the prediction ability of trained models for key variables. In this study, the prediction ability to forecast models is enhanced by generating virtual samples. Considering the integrated effects of attributes, a new data augment approach, called ITNN-VSG, which integrates virtual sample generation (VSG) with input-training neural network (ITNN), was put forward to enlarge training datasets for improving the performance of forecasting models. In the absence of any available domain-specific knowledge about target models, a query-driven interpolation process was first developed to explore the overall tendency of data distribution in both sparse regions and dense regions. Second, an ITNN with fixed weights was used to calculate the input corresponding to the virtual output generated by the interpolation process. To validate the effectiveness of the proposed approach, several in silico experiments were carried out on a benchmark dataset from sinc(x) function, followed by a real-world application to purified terephthalic acid (PTA) solvent system. The experimental results demonstrated that the proposed approach outperformed other existing approaches such as mega-trend-diffusion and tree-based-trend-diffusion.
Original language | English |
---|---|
Pages (from-to) | 6489-6504 |
Number of pages | 16 |
Journal | Soft Computing |
Volume | 25 |
Issue number | 8 |
DOIs | |
Publication status | Published - Apr 2021 |
Externally published | Yes |
Keywords
- Input-training neural network
- Interpolation
- Modeling
- Purified terephthalic acid solvent system
- Small sample size problems
- Virtual sample generation