TY - JOUR
T1 - Accurate prediction of aquatic toxicity of aromatic compounds based on Genetic Algorithm and Least Squares Support Vector Machines
AU - Lei, Beilei
AU - Li, Jiazhong
AU - Liu, Huanxiang
AU - Yao, Xiaojun
PY - 2008/7
Y1 - 2008/7
N2 - Quantitative Structure-Toxicity Relationship (QSTR) modeling plays an important role in ecotoxicology because it offers a fast and practical way to assess the potential negative effects of chemicals. The aim of this investigation was to develop accurate QSTR models for the aquatic toxicity of a large set of aromatic compounds by combining Least Squares Support Vector Machines (LS-SVMs) with a Genetic Algorithm (GA). Molecular descriptors calculated by the DRAGON package, together with log P, were used to describe the molecular structures. The descriptors most relevant to building the QSTR models were selected by a GA-Variable Subset Selection procedure. Multiple Linear Regression (MLR) and nonlinear LS-SVM methods were employed to build the QSTR models. The predictive ability of the derived models was validated using both a test set, selected from the whole data set by the Kennard-Stone algorithm, and an external prediction set. The applicability domain of the models was checked by the leverage approach, and the external prediction set was used to verify their predictive reliability. The results indicated that the proposed QSTR models are robust and satisfactory, and can provide a feasible and promising tool for the rapid assessment of the toxicity of chemicals.
AB - Quantitative Structure-Toxicity Relationship (QSTR) modeling plays an important role in ecotoxicology because it offers a fast and practical way to assess the potential negative effects of chemicals. The aim of this investigation was to develop accurate QSTR models for the aquatic toxicity of a large set of aromatic compounds by combining Least Squares Support Vector Machines (LS-SVMs) with a Genetic Algorithm (GA). Molecular descriptors calculated by the DRAGON package, together with log P, were used to describe the molecular structures. The descriptors most relevant to building the QSTR models were selected by a GA-Variable Subset Selection procedure. Multiple Linear Regression (MLR) and nonlinear LS-SVM methods were employed to build the QSTR models. The predictive ability of the derived models was validated using both a test set, selected from the whole data set by the Kennard-Stone algorithm, and an external prediction set. The applicability domain of the models was checked by the leverage approach, and the external prediction set was used to verify their predictive reliability. The results indicated that the proposed QSTR models are robust and satisfactory, and can provide a feasible and promising tool for the rapid assessment of the toxicity of chemicals.
KW - Applicability domain
KW - Aquatic toxicity
KW - Genetic algorithm
KW - Kennard-Stone algorithm
KW - Least squares support vector machines
KW - Quantitative structure-toxicity relationships
UR - http://www.scopus.com/inward/record.url?scp=54949109779&partnerID=8YFLogxK
U2 - 10.1002/qsar.200760167
DO - 10.1002/qsar.200760167
M3 - Article
AN - SCOPUS:54949109779
SN - 1611-020X
VL - 27
SP - 850
EP - 865
JO - QSAR and Combinatorial Science
JF - QSAR and Combinatorial Science
IS - 7
ER -