TY - JOUR
T1 - Recognition of protein folding kinetics pathways based on amino acid properties information derived from primary sequence
AU - Xi, Lili
AU - Li, Shuyan
AU - Wei, Yuhui
AU - Wu, Xin'an
AU - Liu, Huanxiang
AU - Yao, Xiaojun
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China (Grant No. 21175063) and the Fundamental Research Funds for the Central Universities (Grant Nos. lzujbky-2011-19 to Xiaojun Yao and lzujbky-2013-153 to Lili Xi).
PY - 2013/7/5
Y1 - 2013/7/5
AB - Recognizing protein folding kinetics pathways is an effective approach to studying protein folding behavior and, in turn, to better understanding how a protein folds into a functional structure. In this study, we present a novel method for classifying protein folding kinetics pathways based on a new class of features, weighted by amino acid properties, that are derived from the protein primary sequence. According to the leave-one-out and bootstrap cross-validation results, the model with eight features was the best, achieving a prediction accuracy of 91.67% on the training set; n-fold cross-validation was also performed, and the results showed that the model was stable. An external test set was then used to evaluate the predictive ability of the model, yielding an accuracy of 88.24% and a Matthews correlation coefficient (MCC) of 0.79. Finally, the selected features were analyzed to gain a better understanding of protein folding mechanisms. The analysis suggests that long-range interactions and the unfolding Gibbs free energy change are important factors in determining protein folding kinetics pathways. Hydrophobicity, secondary structure, and charge also appear to be important properties affecting protein folding behavior.
KW - Amino acid properties
KW - Least Squares-Support Vector Machines (LS-SVMs)
KW - Protein folding kinetics pathway
KW - Support Vector Machine-Recursive Feature Elimination (SVM-RFE)
UR - http://www.scopus.com/inward/record.url?scp=84878508517&partnerID=8YFLogxK
U2 - 10.1016/j.chemolab.2013.04.019
DO - 10.1016/j.chemolab.2013.04.019
M3 - Article
AN - SCOPUS:84878508517
SN - 0169-7439
VL - 126
SP - 76
EP - 82
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
ER -