TY - GEN
T1 - Large Language Model-Guided Credit Scoring
AU - Shi, Si
AU - Yuan, Hongxu
AU - Li, Huijie
AU - Luo, Wuman
AU - Pau, Giovanni
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Large Language Models (LLMs) have revolutionized financial analysis. In credit scoring, where a borrower's default probability is estimated, LLMs show great potential. Three issues remain in current research: 1) alignment with credit risk expertise; 2) model efficiency and heavy computing power consumption; 3) model hallucination. They are challenging because of the scarcity of annotated credit data, the high complexity of the models, and the diversity of LLM hallucinations. To address these issues, we proposed a novel lightweight LLM-based credit scoring framework: Demographic and Behavioral-Credit-Large Language Model (DB-Credit-LLM). In our framework, we combined financial domain knowledge with LLMs to achieve alignment and captured demographic and transactional characteristics with a newly designed linear attention mechanism. Additionally, we simplified hallucination mitigation with two-step prompting techniques. We also improved the efficiency of LLMs by fine-tuning DeepSeek with an efficient framework and removing Chain-of-Thought. We demonstrated that DeepSeek achieves satisfactory zero-shot learning performance. Finally, we generated a simulated dataset with LLMs, which demonstrates the generalization ability of our framework and alleviates label scarcity. We verified our framework on three open-source credit scoring datasets and one simulated dataset. To the best of our knowledge, it achieves state-of-the-art performance among 7B-level LLM-based methods in most cases and is one of the most efficient LLMs to date. Our code is available at https://github.com/puding26/DB-Credit-LLM.git.
AB - Large Language Models (LLMs) have revolutionized financial analysis. In credit scoring, where a borrower's default probability is estimated, LLMs show great potential. Three issues remain in current research: 1) alignment with credit risk expertise; 2) model efficiency and heavy computing power consumption; 3) model hallucination. They are challenging because of the scarcity of annotated credit data, the high complexity of the models, and the diversity of LLM hallucinations. To address these issues, we proposed a novel lightweight LLM-based credit scoring framework: Demographic and Behavioral-Credit-Large Language Model (DB-Credit-LLM). In our framework, we combined financial domain knowledge with LLMs to achieve alignment and captured demographic and transactional characteristics with a newly designed linear attention mechanism. Additionally, we simplified hallucination mitigation with two-step prompting techniques. We also improved the efficiency of LLMs by fine-tuning DeepSeek with an efficient framework and removing Chain-of-Thought. We demonstrated that DeepSeek achieves satisfactory zero-shot learning performance. Finally, we generated a simulated dataset with LLMs, which demonstrates the generalization ability of our framework and alleviates label scarcity. We verified our framework on three open-source credit scoring datasets and one simulated dataset. To the best of our knowledge, it achieves state-of-the-art performance among 7B-level LLM-based methods in most cases and is one of the most efficient LLMs to date. Our code is available at https://github.com/puding26/DB-Credit-LLM.git.
KW - credit scoring
KW - large language models
KW - linear attention mechanism
KW - prompt engineering
KW - zero-shot learning
UR - https://www.scopus.com/pages/publications/105035380230
U2 - 10.1109/ICDMW69685.2025.00110
DO - 10.1109/ICDMW69685.2025.00110
M3 - Conference contribution
AN - SCOPUS:105035380230
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 927
EP - 936
BT - Proceedings - 25th IEEE International Conference on Data Mining Workshops, ICDMW 2025
PB - IEEE Computer Society
T2 - 25th IEEE International Conference on Data Mining Workshops, ICDMW 2025
Y2 - 12 November 2025 through 15 November 2025
ER -