Applicability domains for classification problems: Benchmarking of distance to models for Ames mutagenicity set

  • Iurii Sushko
  • Sergii Novotarskyi
  • Robert Körner
  • Anil Kumar Pandey
  • Artem Cherkasov
  • Jiazhong Li
  • Paola Gramatica
  • Katja Hansen
  • Timon Schroeter
  • Klaus Robert Müller
  • Lili Xi
  • Huanxiang Liu
  • Xiaojun Yao
  • Tomas Öberg
  • Farhad Hormozdiari
  • Phuong Dao
  • Cenk Sahinalp
  • Roberto Todeschini
  • Pavel Polishchuk
  • Anatoliy Artemenko
  • Victor Kuz'Min
  • Todd M. Martin
  • Douglas M. Young
  • Denis Fourches
  • Eugene Muratov
  • Alexander Tropsha
  • Igor Baskin
  • Dragos Horvath
  • Gilles Marcou
  • Christophe Muller
  • Alexander Varnek
  • Volodymyr V. Prokopenko
  • Igor V. Tetko
  • Helmholtz Zentrum München - German Research Center for Environmental Health
  • University of British Columbia
  • University of Insubria
  • Technical University of Berlin
  • Bayer AG
  • Lanzhou University
  • Linnaeus University
  • Simon Fraser University
  • University of Milan - Bicocca
  • NASU - Bogatsky Physico-Chemical Institute
  • United States Environmental Protection Agency
  • University of North Carolina at Chapel Hill
  • Lomonosov Moscow State University
  • Université de Strasbourg
  • NASU - Institute of Bioorganic Chemistry and Petrochemistry

Research output: Article, peer-reviewed

236 citations (Scopus)

Abstract

The estimation of accuracy and applicability of QSAR and QSPR models for biological and physicochemical properties represents a critical problem. The developed parameter of "distance to model" (DM) is defined as a metric of similarity between the training and test set compounds that have been subjected to QSAR/QSPR modeling. In our previous work, we demonstrated the utility and optimal performance of DM metrics that have been based on the standard deviation within an ensemble of QSAR models. The current study applies such analysis to 30 QSAR models for the Ames mutagenicity data set that were previously reported within the 2009 QSAR challenge. We demonstrate that the DMs based on an ensemble (consensus) model provide systematically better performance than other DMs. The presented approach identifies 30-60% of compounds having an accuracy of prediction similar to the interlaboratory accuracy of the Ames test, which is estimated to be 90%. Thus, the in silico predictions can be used to halve the cost of experimental measurements by providing a similar prediction accuracy. The developed model has been made publicly available at http://ochem.eu/models/1.
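The idea of using the standard deviation within an ensemble as a distance to model can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, and the toy predictions below are all assumptions made for illustration. Each compound gets a DM equal to the spread of its ensemble members' predictions, and accuracy is then measured on the fraction of compounds with the smallest DM.

```python
from statistics import mean, pstdev

# Hypothetical toy data: each row holds three ensemble members' predicted
# probabilities of mutagenicity for one compound. Values are invented.
ensemble_preds = [
    [0.90, 0.95, 0.85],  # members agree: small DM
    [0.10, 0.05, 0.15],  # members agree: small DM
    [0.20, 0.80, 0.60],  # members disagree: large DM
    [0.70, 0.30, 0.40],  # members disagree: large DM
]
labels = [1, 0, 0, 1]  # invented "experimental" classes


def ensemble_dm(member_predictions):
    """Spread of the ensemble's predictions for one compound (assumed DM)."""
    return pstdev(member_predictions)


def accuracy_at_coverage(preds, true_labels, coverage):
    """Consensus accuracy on the `coverage` fraction with the smallest DM."""
    scored = []
    for member_predictions, label in zip(preds, true_labels):
        # Assumed consensus rule: average the members, threshold at 0.5.
        consensus = 1 if mean(member_predictions) >= 0.5 else 0
        scored.append((ensemble_dm(member_predictions), consensus == label))
    scored.sort(key=lambda item: item[0])  # most confident compounds first
    n_kept = max(1, round(coverage * len(scored)))
    return sum(correct for _, correct in scored[:n_kept]) / n_kept


acc_half = accuracy_at_coverage(ensemble_preds, labels, 0.5)
acc_all = accuracy_at_coverage(ensemble_preds, labels, 1.0)
print(acc_half, acc_all)
```

On this toy data the accuracy is higher for the low-DM half than for the full set, which is the kind of accuracy-versus-coverage trade-off the abstract describes when it reports 30-60% of compounds predicted at roughly the interlaboratory accuracy.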

Original language: English
Pages (from-to): 2094-2111
Number of pages: 18
Journal: Journal of Chemical Information and Modeling
Volume: 50
Issue number: 12
Publication status: Published - 27 Dec 2010
Externally published
