The support vector machine (SVM), a novel learning machine, was used for the first time to develop a non-linear quantitative structure-mobility relationship model for peptides based on calculated descriptors. The molecular descriptors representing the structural features of the compounds comprised constitutional and topological descriptors calculated with the CODESSA program, which can be obtained easily without optimizing the molecular structure, and CPSA (charged partial surface area) descriptors obtained with the SYBYL software. Multiple linear regression (MLR) was used to select the descriptors responsible for the electrophoretic mobility of the peptides and to develop the linear model. The predictions of the SVM model (ε = 0.04, γ = 0.002 and C = 100) are much better than those obtained by the MLR method. The RMS errors for the training set, the test set and the whole set are 0.0569, 0.0553 and 0.0565, and the corresponding prediction correlation coefficients are 0.925, 0.912 and 0.922, respectively. The predicted values are in agreement with the experimental values. This paper provides a new and effective method for predicting the electrophoretic behavior of peptides and offers some insight into which structural features are related to their electrophoretic mobility. Moreover, it suggests an approach to handling structural optimization and obtaining structural descriptors for biomacromolecules.
- Electrophoretic mobility
- Quantitative structure-mobility relationship
- Support vector machine
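
As a rough illustration of the quantities named in the abstract, the following Python sketch shows how an RMS error and a correlation coefficient between experimental and predicted mobilities might be computed, together with the ε-insensitive loss and a radial basis function kernel of the kind commonly used in SVM regression. This is not the authors' implementation. The function names, the kernel form exp(-γ‖u - v‖²), and the sample mobility values are assumptions introduced for illustration only; the defaults ε = 0.04 and γ = 0.002 are simply the values quoted in the abstract.

```python
import math

def rbf_kernel(u, v, gamma=0.002):
    # Assumed kernel form: exp(-gamma * ||u - v||^2); the abstract does not state it.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def eps_insensitive_loss(y_true, y_pred, eps=0.04):
    # Errors smaller than eps are ignored; larger errors are penalized linearly.
    return max(0.0, abs(y_true - y_pred) - eps)

def rms_error(y_true, y_pred):
    # Root-mean-square error between experimental and predicted values.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    # Pearson correlation coefficient between experimental and predicted values.
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

if __name__ == "__main__":
    # Hypothetical mobility values, NOT taken from the paper.
    experimental = [0.52, 0.61, 0.73, 0.80, 0.95]
    predicted = [0.55, 0.58, 0.70, 0.86, 0.93]
    print("RMS error:", rms_error(experimental, predicted))
    print("Correlation coefficient:", pearson_r(experimental, predicted))
```

In practice the same two metrics would be evaluated separately on the training set, the test set and the whole set, which is how the three pairs of values in the abstract are reported.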