Abstract
(1) RNA-binding proteins (RBPs) play a crucial role in regulating gene expression in plants, affecting growth, development, and stress responses. Accurate prediction of plant-specific RBPs is vital for understanding gene regulation and enhancing genetic improvement. (2) Methods: We propose an ensemble learning method that integrates shallow and deep learning. It integrates prediction results from SVM, LR, LDA, and LightGBM into an enhanced TextCNN, using K-Peptide Composition (KPC) encoding (k = 1, 2) to form a 420-dimensional feature vector, extended to 424 dimensions by including those four prediction outputs. Redundancy is minimized using a Pearson correlation threshold of 0.80. (3) Results: On the benchmark dataset of 4992 sequences, our method achieved an ACC of 97.20% and 97.06% under 5-fold and 10-fold cross-validation, respectively. On an independent dataset of 1086 sequences, our method attained an ACC of 99.72%, an (Formula presented.) of 99.72%, an MCC of 99.45%, an SN of 99.63%, and an SP of 99.82%, outperforming RBPLight by 12.98 percentage points in ACC and the original TextCNN by 25.23 percentage points. (4) Conclusions: These results highlight our method’s superior accuracy and efficiency over PSSM-based approaches, enabling large-scale plant RBP prediction.
| Original language | English |
|---|---|
| Article number | 672 |
| Journal | Biology |
| Volume | 14 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - Jun 2025 |
Keywords
- RBPs
- RNA-binding proteins
- TextCNN
- ensemble learning
- plant
Fingerprint
Dive into the research topics of 'Research on Plant RNA-Binding Protein Prediction Method Based on Improved Ensemble Learning'. Together they form a unique fingerprint.Press/Media
Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver