A Neural Architecture for Detecting Identifier Renaming from Diff

Qiqi Gu, Wei Ke

Research output: Conference contribution (peer-reviewed)

3 Citations (Scopus)

Abstract

In software engineering, code review controls code quality and prevents bugs. Although many commits to a codebase add features, some are refactorings, including the renaming of identifiers. Reviewing a refactoring requires somewhat different effort from reviewing a functional change. For instance, the reviewer of a renaming has to make sure that the new name is more descriptive, follows the naming convention of the institution, and does not collide with any other identifier. In this paper we propose a machine learning model that automatically identifies commits consisting purely of identifier renaming, using only the diff files. This technique helps code review enforce the institution's naming and coding conventions, and lets quality assurance testers focus more on functional changes. The traditional way of detecting such changes parses the full source code before and after the commit, which is less efficient and requires the code to be syntactically complete and correct. In contrast, our approach, based on neural networks, reads only the diff and outputs a confidence value for whether the commit is a renaming. Since no labeled dataset of repository commits existed, we labeled a dataset drawn from more than 1,000 GitHub repositories by Java syntax analysis. We then trained a neural network to classify these commits as renaming or not, obtaining a test accuracy of 85.65% and a false positive rate of 2.03%. The methods in our experiment are also relevant to general static analysis with neural network approaches.
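To make the task concrete, the following is a minimal sketch of a naive, rule-based baseline that reads only a unified diff and guesses whether it is a pure identifier renaming. It is not the neural model described in the abstract. The function names (`changed_lines`, `looks_like_pure_rename`), the token regular expression, and the sample diff are all assumptions introduced for illustration.

```python
import re

# Hypothetical tokenizer: an identifier-like word, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*|\S")
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def changed_lines(diff_text):
    """Collect removed and added lines from a unified diff, skipping headers."""
    removed, added = [], []
    for line in diff_text.splitlines():
        if line.startswith(("--- ", "+++ ", "@@")):
            continue  # file and hunk headers (naive: a removed line "-- x" would be misread)
        if line.startswith("-"):
            removed.append(line[1:])
        elif line.startswith("+"):
            added.append(line[1:])
    return removed, added


def looks_like_pure_rename(diff_text):
    """Return True if every changed token is an identifier swapped consistently.

    Limitations of this sketch: keywords and type names count as identifiers,
    and removed/added lines are paired purely by position.
    """
    removed, added = changed_lines(diff_text)
    if not removed or len(removed) != len(added):
        return False
    mapping = {}  # old identifier -> new identifier
    for old_line, new_line in zip(removed, added):
        old_tokens = TOKEN.findall(old_line)
        new_tokens = TOKEN.findall(new_line)
        if len(old_tokens) != len(new_tokens):
            return False
        for old, new in zip(old_tokens, new_tokens):
            if old == new:
                continue
            if not (IDENTIFIER.fullmatch(old) and IDENTIFIER.fullmatch(new)):
                return False
            if mapping.setdefault(old, new) != new:
                return False  # the same old name was renamed two different ways
    return bool(mapping)


RENAME_DIFF = """\
--- a/Foo.java
+++ b/Foo.java
@@ -1,3 +1,3 @@
 class Foo {
-    int cnt = 0;
-    void inc() { cnt++; }
+    int count = 0;
+    void inc() { count++; }
 }
"""

print(looks_like_pure_rename(RENAME_DIFF))  # True
```

A heuristic like this cannot tell whether the new name collides with an identifier elsewhere in the file, and it returns a hard yes or no where the abstract describes a confidence value.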

Original language: English
Title of host publication: Intelligent Data Engineering and Automated Learning - 22nd International Conference, IDEAL 2021, Proceedings
Editors: David Camacho, Peter Tino, Richard Allmendinger, Hujun Yin, Antonio J. Tallón-Ballesteros, Ke Tang, Sung-Bae Cho, Paulo Novais, Susana Nascimento
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 33-44
Number of pages: 12
ISBN (Print): 9783030916077
DOIs
Publication status: Published - 2021
Event: 22nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2021 - Virtual, Online
Duration: 25 Nov 2021 → 27 Nov 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13113 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2021
City: Virtual, Online
Period: 25/11/21 → 27/11/21
