Abstract
In this study, a novel method was developed to predict protein-ligand binding affinity from a comprehensive set of structurally diverse protein-ligand complexes (PLCs). The models were built on 1300 PLCs with known binding affinity (493 complexes with Kd and 807 complexes with Ki) taken from the refined set of the PDBbind database (release 2007). Each complex was described using descriptors calculated from three blocks: protein sequence, ligand structure, and binding pocket. The PLC data were then rationally split into representative training and test sets so that the models could be properly validated. The molecular descriptors relevant to binding affinity were selected on the training set using the ReliefF method combined with least squares support vector machines (LS-SVMs) modeling. Two final optimized LS-SVM models were developed using the selected descriptors, one to predict Kd and one to predict Ki. The correlation coefficients (R) for the training and test sets of the Kd model were 0.890 and 0.833, respectively. The corresponding correlation coefficients for the Ki model were 0.922 and 0.742. The proposed prediction method can give better generalization ability than other recently published methods and can be used as an alternative fast filter in the virtual screening of large chemical databases.
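As a rough illustration of the modeling step described above, the following is a minimal sketch of LS-SVM regression with an RBF kernel, evaluated by the correlation coefficient R on a training and a test split. It is not the authors' implementation. The synthetic data, the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`, `pearson_r`), and the hyperparameter values `gamma` and `sigma` are all assumptions made for the example, and the ReliefF descriptor selection step is omitted.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian (RBF) kernel between the rows of A and the rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    # LS-SVM regression reduces to one linear system:
    #   [ 0   1^T         ] [ b     ]   [ 0 ]
    #   [ 1   K + I/gamma ] [ alpha ] = [ y ]
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    # f(x) = sum_i alpha_i * k(x, x_i) + b
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def pearson_r(y_true, y_pred):
    # Correlation coefficient R between observed and predicted values.
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Synthetic stand-in for descriptors and affinities (assumption, not PDBbind data).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=120)

# Simple ordered split for illustration; the paper uses a rational split instead.
X_train, y_train = X[:90], y[:90]
X_test, y_test = X[90:], y[90:]

gamma, sigma = 10.0, 2.0  # illustrative hyperparameters, not tuned
b, alpha = lssvm_fit(X_train, y_train, gamma, sigma)

pred_train = lssvm_predict(X_train, b, alpha, X_train, sigma)
pred_test = lssvm_predict(X_train, b, alpha, X_test, sigma)
r_train = pearson_r(y_train, pred_train)
r_test = pearson_r(y_test, pred_test)
print(f"R(train) = {r_train:.3f}, R(test) = {r_test:.3f}")
```

In practice `gamma` and `sigma` would be tuned, for example by cross-validation on the training set, and separate models would be fitted for the Kd and Ki subsets as described in the abstract.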
Original language | English |
---|---|
Pages (from-to) | 900-909 |
Number of pages | 10 |
Journal | Journal of Computational Chemistry |
Volume | 30 |
Issue number | 6 |
Publication status | Published - 30 Apr 2009 |
Externally published | Yes |
Keywords
- Least squares support vector machines (LS-SVMs)
- Model validation
- Protein-ligand binding affinity
- ReliefF method