TY - JOUR
T1 - Could statistical potential models achieve comparable or better performance than deep learning models?
AU - Wang, Zhihao
AU - Wang, Sheng
AU - Guo, Jingjing
AU - Mu, Yuguang
AU - Liu, Xiangdong
AU - Zheng, Liangzhen
AU - Li, Weifeng
N1 - Publisher Copyright: © The Author(s) 2026. Published by Oxford University Press.
PY - 2026/3/1
Y1 - 2026/3/1
AB - Accurately predicting protein–ligand interactions is vital for structure-based drug discovery. Although deep learning (DL) models have shown strong performance, the potential of traditional statistical potentials under data-limited conditions remains underexplored. Here, we systematically assess several statistical potential models in docking and virtual screening. We find that docking benefits from distance-dependent pairwise atom–atom potentials with clear physical meanings, while screening relies more on orientation-dependent atom–residue potentials that capture local chemical environments. Based on these findings, we propose HybridSP, a hybrid potential combining distance-dependent atom–atom, atom–residue, and orientation-dependent atom–residue terms. An affinity-weighted scheme is applied to correct biases in statistical distributions. On the CASF-2016 benchmark, HybridSP achieves a 91.6% docking success rate and an enrichment factor of 29.35 at the top 1%, rivaling and even surpassing state-of-the-art DL models. Its strong screening ability is further validated on the Directory of Useful Decoys-Enhanced and Directory of Useful Decoys-Adjusted datasets. These results demonstrate that well-designed statistical potentials can achieve high performance and interpretability without complex DL architectures, offering an efficient alternative for scoring function design. The models are available at: https://github.com/zelixirSH/HybridSP.git.
KW - protein–ligand interaction
KW - scoring function
KW - statistical potential
UR - https://www.scopus.com/pages/publications/105031626152
U2 - 10.1093/bib/bbag088
DO - 10.1093/bib/bbag088
M3 - Article
C2 - 41766645
AN - SCOPUS:105031626152
SN - 1467-5463
VL - 27
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 2
M1 - bbag088
ER -