TY - JOUR
T1 - MDD-LLM
T2 - Towards accuracy large language models for major depressive disorder diagnosis
AU - Sha, Yuyang
AU - Pan, Hongxin
AU - Xu, Wei
AU - Meng, Weiyu
AU - Luo, Gang
AU - Du, Xinyu
AU - Zhai, Xiaobing
AU - Tong, Henry H.Y.
AU - Shi, Caijuan
AU - Li, Kefeng
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/11/1
Y1 - 2025/11/1
AB - Background: Major depressive disorder (MDD) affects more than 300 million people worldwide and is a significant public health issue. However, the uneven distribution of medical resources and the complexity of diagnostic methods have resulted in inadequate attention to this disorder in numerous countries and regions. Methods: This paper introduces a high-performance MDD diagnosis tool named MDD-LLM, an AI-driven framework that utilizes fine-tuned large language models (LLMs) and extensive real-world samples to tackle challenges in MDD diagnosis. Specifically, we select 274,348 individual records from the UK Biobank cohort and design three tabular data transformation methods to create a large corpus for training and evaluating the proposed method. To illustrate the advantages of MDD-LLM, we perform comprehensive experiments and provide several comparative analyses against existing model-based solutions across multiple evaluation metrics. Results: Experimental results show that MDD-LLM (70B) achieves an accuracy of 0.8378 and an AUC of 0.8919 (95% CI: 0.8799–0.9040), significantly outperforming existing machine learning and deep learning frameworks for MDD diagnosis. Given the limited exploration of LLMs in MDD diagnosis, we examine numerous factors that may influence the performance of our proposed method, including tabular data transformation techniques and different fine-tuning strategies. We also analyze the model's interpretability by requiring MDD-LLM to explain its predictions and provide corresponding reasons. Conclusion: This paper investigates the application of LLMs and large-scale training samples for diagnosing MDD. The findings indicate that LLM-driven schemes offer significant potential for accuracy, robustness, and interpretability in MDD diagnosis compared to traditional model-based solutions.
KW - Artificial intelligence
KW - Large language models
KW - Major depressive disorder
KW - Medical data processing
KW - Supervised fine-tuning
UR - http://www.scopus.com/inward/record.url?scp=105009347770&partnerID=8YFLogxK
U2 - 10.1016/j.jad.2025.119774
DO - 10.1016/j.jad.2025.119774
M3 - Article
AN - SCOPUS:105009347770
SN - 0165-0327
VL - 388
JO - Journal of Affective Disorders
JF - Journal of Affective Disorders
M1 - 119774
ER -