TY - JOUR
T1 - RetroPrime
T2 - A Diverse, plausible and Transformer-based method for Single-Step retrosynthesis predictions
AU - Wang, Xiaorui
AU - Li, Yuquan
AU - Qiu, Jiezhong
AU - Chen, Guangyong
AU - Liu, Huanxiang
AU - Liao, Benben
AU - Hsieh, Chang Yu
AU - Yao, Xiaojun
N1 - Publisher Copyright: © 2021 Elsevier B.V.
PY - 2021/9/15
Y1 - 2021/9/15
N2 - Retrosynthesis prediction is a crucial task for organic synthesis. In this work, we propose a single-step, template-free, Transformer-based method dubbed RetroPrime, which integrates chemists’ retrosynthetic strategy of (1) decomposing a molecule into synthons and then (2) generating reactants by attaching leaving groups. These two stages are each accomplished with a versatile Transformer model. RetroPrime achieves Top-1 accuracies of 64.8% and 51.4% on the USPTO-50K dataset when the reaction type is known and unknown, respectively. On the larger USPTO-full dataset, its Top-1 accuracy is close to that of the state-of-the-art Transformer-based method. Outputs of Transformer-based retrosynthesis models are known to suffer from insufficient diversity and high chemical implausibility. These problems may limit the potential of Transformer-based methods in real practice, yet few works address both issues simultaneously. RetroPrime is designed to tackle these challenges.
AB - Retrosynthesis prediction is a crucial task for organic synthesis. In this work, we propose a single-step, template-free, Transformer-based method dubbed RetroPrime, which integrates chemists’ retrosynthetic strategy of (1) decomposing a molecule into synthons and then (2) generating reactants by attaching leaving groups. These two stages are each accomplished with a versatile Transformer model. RetroPrime achieves Top-1 accuracies of 64.8% and 51.4% on the USPTO-50K dataset when the reaction type is known and unknown, respectively. On the larger USPTO-full dataset, its Top-1 accuracy is close to that of the state-of-the-art Transformer-based method. Outputs of Transformer-based retrosynthesis models are known to suffer from insufficient diversity and high chemical implausibility. These problems may limit the potential of Transformer-based methods in real practice, yet few works address both issues simultaneously. RetroPrime is designed to tackle these challenges.
KW - Deep Learning
KW - Natural Language Processing
KW - Template-free Single-Step Retrosynthesis
UR - http://www.scopus.com/inward/record.url?scp=85105832744&partnerID=8YFLogxK
U2 - 10.1016/j.cej.2021.129845
DO - 10.1016/j.cej.2021.129845
M3 - Article
AN - SCOPUS:85105832744
SN - 1385-8947
VL - 420
JO - Chemical Engineering Journal
JF - Chemical Engineering Journal
M1 - 129845
ER -