
Research on Plant RNA-Binding Protein Prediction Method Based on Improved Ensemble Learning

Research output: Article (peer-reviewed)

Abstract

(1) Background: RNA-binding proteins (RBPs) play a crucial role in regulating gene expression in plants, affecting growth, development, and stress responses. Accurate prediction of plant-specific RBPs is vital for understanding gene regulation and supporting genetic improvement. (2) Methods: We propose an ensemble learning method that combines shallow and deep learning by feeding the prediction results of four shallow classifiers (SVM, LR, LDA, and LightGBM) into an enhanced TextCNN. Sequences are encoded with K-Peptide Composition (KPC, k = 1, 2), which yields a 420-dimensional feature vector (20 single-residue components plus 400 dipeptide components); appending the four prediction outputs extends it to 424 dimensions. Redundancy is minimized using a Pearson correlation threshold of 0.80. (3) Results: On the benchmark dataset of 4992 sequences, our method achieved an accuracy (ACC) of 97.20% and 97.06% under 5-fold and 10-fold cross-validation, respectively. On an independent dataset of 1086 sequences, our method attained an ACC of 99.72%, an (Formula presented.) of 99.72%, a Matthews correlation coefficient (MCC) of 99.45%, a sensitivity (SN) of 99.63%, and a specificity (SP) of 99.82%, outperforming RBPLight by 12.98 percentage points in ACC and the original TextCNN by 25.23 percentage points. (4) Conclusions: These results highlight our method’s superior accuracy and efficiency over PSSM-based approaches, enabling large-scale plant RBP prediction.
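
To make the feature dimensions in the Methods concrete, the following is a minimal sketch of KPC encoding for k = 1, 2, not the authors' implementation. It assumes the 20 standard amino acids, normalizes counts to frequencies, and skips windows containing non-standard residues; none of these choices is stated in the abstract. The names `kpc_features` and `extend_with_base_outputs` are hypothetical, and the four base-classifier outputs are placeholder values.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)


def kpc_features(sequence, ks=(1, 2)):
    """Concatenate normalized k-peptide frequencies for each k in ks.

    For ks=(1, 2) this gives 20 + 400 = 420 values.
    Normalizing to frequencies is an assumption, not taken from the paper.
    """
    seq = sequence.upper()
    features = []
    for k in ks:
        # Fixed, sorted-by-construction ordering of all 20**k peptides.
        kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
        counts = dict.fromkeys(kmers, 0)
        total = 0
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in counts:  # skip windows with non-standard residues
                counts[kmer] += 1
                total += 1
        features.extend(counts[m] / total if total else 0.0 for m in kmers)
    return features


def extend_with_base_outputs(features, base_outputs):
    """Append base-classifier outputs (here four values) to the KPC vector."""
    return list(features) + list(base_outputs)


if __name__ == "__main__":
    vec = kpc_features("MKVLAAGIVK")  # toy sequence, not from the dataset
    # Placeholder outputs standing in for SVM, LR, LDA, and LightGBM scores.
    full = extend_with_base_outputs(vec, [0.9, 0.8, 0.7, 0.95])
    print(len(vec), len(full))
```

Because the peptide ordering is fixed, every sequence maps to a vector of the same length regardless of sequence length, which is what allows the four extra outputs to be appended at fixed positions.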

Original language: English
Article number: 672
Journal: Biology
Volume: 14
Issue number: 6
DOIs
Publication status: Published - Jun 2025
