Code Retrieval with Mixture of Experts Prototype Learning Based on Classification

  • Feng Ling
  • Guoheng Huang
  • Jingchao Wang
  • Xiaochen Yuan
  • Xuhang Chen
  • Xue Yong Zhang
  • Fanlong Zhang
  • Chi Man Pun

Research output: Conference contribution (peer-reviewed)

Abstract

The semantic connection between code and queries is crucial for code retrieval, but many human-written queries fail to capture the code's core intent accurately, which makes them ambiguous. This ambiguity complicates code search because the queries do not give a clear overview of the code's purpose. Our analysis reveals that although ambiguous queries may not precisely summarize the intent of the code, they often share the same general topics as the corresponding code. In light of this finding, we propose Code Retrieval with Mixture of Experts Prototype Learning Based on Classification (CRME), a novel approach that combines classification for prototype-based representation learning with result ensembling. CRME utilizes specialized pre-trained models focused on the specific domains of ambiguous queries. It consists of two key components: Multiple Classification Prototype and Representation Learning with a Prototype-based Multi-model Contrastive (PMC) Loss during training, and a Multi-Prototype Mixture of Experts Integration (MP-MoE) module for fine-grained ensemble inference. Our method effectively addresses the issue of query ambiguity and improves search precision. Experimental results on the CodeSearchNet dataset, covering six sub-datasets, show that CRME outperforms existing methods, achieving an average MRR score of 81.4%. When applied to pre-trained models such as CodeBERT, GraphCodeBERT, UniXcoder, and CodeT5+, CRME effectively boosts their performance.
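The abstract names two ideas that a small example can make concrete: combining the scores of several expert models, and evaluating retrieval with mean reciprocal rank (MRR). The following is a minimal sketch only, not the authors' MP-MoE module. It assumes each expert returns one similarity score per candidate code snippet and that a topic classifier supplies one non-negative gate weight per expert. The function names `ensemble_scores`, `reciprocal_rank`, and `mean_reciprocal_rank` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def ensemble_scores(
    expert_scores: Sequence[Sequence[float]],
    gate_weights: Sequence[float],
) -> List[float]:
    """Weighted average of per-expert candidate scores.

    Assumption: gate_weights are non-negative (for example, topic-classifier
    probabilities); they are normalized here so that they sum to 1.
    """
    total = sum(gate_weights)
    if total <= 0:
        raise ValueError("gate weights must sum to a positive value")
    combined = [0.0] * len(expert_scores[0])
    for weight, scores in zip(gate_weights, expert_scores):
        for i, score in enumerate(scores):
            combined[i] += (weight / total) * score
    return combined


def reciprocal_rank(scores: Sequence[float], correct_index: int) -> float:
    """Return 1 / rank of the correct candidate (rank 1 = highest score)."""
    target = scores[correct_index]
    rank = 1 + sum(1 for s in scores if s > target)
    return 1.0 / rank


def mean_reciprocal_rank(
    all_scores: Sequence[Sequence[float]],
    correct_indices: Sequence[int],
) -> float:
    """Average the reciprocal rank over all queries."""
    ranks = [reciprocal_rank(s, c) for s, c in zip(all_scores, correct_indices)]
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    # Two hypothetical experts scoring three candidates for one query.
    scores = ensemble_scores([[0.9, 0.1, 0.3], [0.2, 0.8, 0.4]], [0.75, 0.25])
    print(scores)
    print(mean_reciprocal_rank([scores], [0]))
```

Under this reading, an MRR of 81.4% means the reciprocal rank of the correct snippet, averaged over all queries, is 0.814; the reported figure is itself an average over the six CodeSearchNet sub-datasets.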

Original language: English
Title of host publication: 16th International Conference on Internetware, Internetware 2025 - Proceedings
Editors: Hong Mei, Jian Lv, Zhi Jin, Xuandong Li, Thomas Zimmermann, Ge Li, Lei Bu, Xin Xia
Publisher: Association for Computing Machinery, Inc
Pages: 47-58
Number of pages: 12
ISBN (electronic): 9798400719264
DOIs
Publication status: Published - 27 Oct 2025
Event: 16th International Conference on Internetware, Internetware 2025 - Trondheim, Norway
Duration: 20 Jun 2025 to 22 Jun 2025

Publication series

Name: 16th International Conference on Internetware, Internetware 2025 - Proceedings

Conference

Conference: 16th International Conference on Internetware, Internetware 2025
Country/Territory: Norway
City: Trondheim
Period: 20/06/25 to 22/06/25
