摘要
Recent advancements in large language models (LLMs) have significantly enhanced text generation across various sectors; however, their medical application faces critical challenges regarding both accuracy and real-time responsiveness. To address these dual challenges, we propose a novel two-step retrieval and ranking retrieval-augmented generation (RAG) framework that synergistically combines embedding search with Elasticsearch technology. Built upon a dynamically updated medical knowledge base incorporating expert-reviewed documents from leading healthcare institutions, our hybrid architecture employs ColBERTv2 for context-aware result ranking while maintaining computational efficiency. Experimental results show a 10% improvement in accuracy for complex medical queries compared to standalone LLM and single-search RAG variants, while acknowledging that latency challenges remain in emergency situations requiring sub-second responses in an experimental setting, which can be achieved in real-time using more powerful hardware in real-world deployments. This work establishes a new paradigm for reliable medical AI assistants that successfully balances accuracy and practical deployment considerations.
| 原文 | English |
|---|---|
| 文章編號 | 18062 |
| 期刊 | Scientific Reports |
| 卷 | 15 |
| 發行號 | 1 |
| DOIs | |
| 出版狀態 | Published - 12月 2025 |
指紋
深入研究「Dual retrieving and ranking medical large language model with retrieval augmented generation」主題。共同形成了獨特的指紋。引用此
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver