Code Retrieval with Mixture of Experts Prototype Learning Based on Classification

  • Feng Ling
  • Guoheng Huang
  • Jingchao Wang
  • Xiaochen Yuan
  • Xuhang Chen
  • Xue Yong Zhang
  • Fanlong Zhang
  • Chi Man Pun

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

The semantic connection between code and queries is crucial for code retrieval, but many human-written queries fail to accurately capture the code’s core intent, leading to ambiguity. This ambiguity complicates the code search process, as the queries do not provide a clear overview of the code’s purpose. Our analysis reveals that while ambiguous queries may not precisely summarize the intent of the code, they often share the same general topics as the corresponding code. In light of this finding, we propose Code Retrieval with Mixture of Experts Prototype Learning Based on Classification (CRME), a novel approach that combines classification for prototype-based representation learning with result ensembling. CRME utilizes specialized pre-trained models focused on the specific domains of ambiguous queries. It consists of two key components: Multiple Classification Prototype and Representation Learning with a Prototype-based Multi-model Contrastive (PMC) Loss during training, and a Multi-Prototype Mixture of Experts Integration (MP-MoE) module for fine-grained ensemble inference. Our method effectively addresses the issue of query ambiguity and improves search precision. Experimental results on the CodeSearchNet dataset, covering six sub-datasets, show that CRME outperforms existing methods, achieving an average MRR score of 81.4%. When applied to pre-trained models such as CodeBERT, GraphCodeBERT, UniXcoder, and CodeT5+, CRME effectively boosts their performance.
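
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean reciprocal rank (MRR) is generally computed for a retrieval task. It is not taken from the paper: the function name `mean_reciprocal_rank` and the example ranks are hypothetical, and the paper's exact evaluation protocol (candidate pool, averaging across the six sub-datasets) is not described in this record.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Return the mean of 1/rank over all queries.

    ranks[i] is the 1-based position at which the correct code snippet
    appears in the ranked results for query i.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Made-up ranks for four queries (illustration only, not data from the paper).
example_ranks = [1, 2, 1, 4]
score = mean_reciprocal_rank(example_ranks)
print(f"MRR = {score:.4f}")
```

Under this definition, a score near 1 means the correct snippet is usually ranked first, and a lower score means it tends to appear further down the result list.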

Original language: English
Title of host publication: 16th International Conference on Internetware, Internetware 2025 - Proceedings
Editors: Hong Mei, Jian Lv, Zhi Jin, Xuandong Li, Thomas Zimmermann, Ge Li, Lei Bu, Xin Xia
Publisher: Association for Computing Machinery, Inc
Pages: 47-58
Number of pages: 12
ISBN (Electronic): 9798400719264
DOIs
Publication status: Published - 27 Oct 2025
Event: 16th International Conference on Internetware, Internetware 2025 - Trondheim, Norway
Duration: 20 Jun 2025 to 22 Jun 2025

Publication series

Name: 16th International Conference on Internetware, Internetware 2025 - Proceedings

Conference

Conference: 16th International Conference on Internetware, Internetware 2025
Country/Territory: Norway
City: Trondheim
Period: 20/06/25 to 22/06/25

Keywords

  • Code Retrieval
  • Mixture of Experts
  • Prototype Learning
