TY - JOUR
T1 - BERT-AmPEP60
T2 - A BERT-Based Transfer Learning Approach to Predict the Minimum Inhibitory Concentrations of Antimicrobial Peptides for Escherichia coli and Staphylococcus aureus
AU - Cai, Jianxiu
AU - Yan, Jielu
AU - Un, Chonwai
AU - Wang, Yapeng
AU - Campbell-Valois, François Xavier
AU - Siu, Shirley W.I.
N1 - Publisher Copyright: © 2025 The Authors. Published by American Chemical Society.
PY - 2025/4/14
Y1 - 2025/4/14
N2 - Antimicrobial peptides (AMPs) are a promising alternative for combating bacterial drug resistance. While current computer prediction models excel at binary classification of AMPs based on sequences, there is a lack of regression methods to accurately quantify AMP activity against specific bacteria, making the identification of highly potent AMPs a challenge. Here, we present a deep learning method, BERT-AmPEP60, based on the fine-tuned Bidirectional Encoder Representations from Transformers (BERT) architecture to extract embedding features from input sequences. Using the transfer learning strategy, we built regression models to predict the minimum inhibitory concentration (MIC) of peptides for Escherichia coli (EC) and Staphylococcus aureus (SA). In five independent experiments with 10% leave-out sequences as the test sets, the optimal EC and SA models outperformed the state-of-the-art regression method and traditional machine learning methods, achieving an average mean squared error of 0.2664 and 0.3032 (log μM), respectively. They also showed a Pearson correlation coefficient of 0.7955 and 0.7530, and a Kendall correlation coefficient of 0.5797 and 0.5222, respectively. Our models outperformed existing deep learning and machine learning methods that rely on conventional sequence features. This work underscores the effectiveness of utilizing BERT with transfer learning for training quantitative AMP prediction models specific for different bacterial species. The web server of BERT-AmPEP60 can be found at https://app.cbbio.online/ampep/home. To facilitate development, the program source codes are available at https://github.com/janecai0714/AMP_regression_EC_SA.
UR - http://www.scopus.com/inward/record.url?scp=86000719737&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.4c01749
DO - 10.1021/acs.jcim.4c01749
M3 - Article
AN - SCOPUS:86000719737
SN - 1549-9596
VL - 65
SP - 3186
EP - 3202
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 7
ER -