Zhang, X., Zhou, Y., Han, T. and Chen, T. (2021) Training deep code comment generation models via data augmentation. In: Internetware '20: 12th Asia-Pacific Symposium on Internetware, 12-14 May 2020, Singapore.
Abstract
With the development of deep neural networks (DNNs) and publicly available source code repositories, deep code comment generation models have demonstrated reasonable performance on test datasets. However, it has been confirmed in computer vision (CV) and natural language processing (NLP) that DNNs are vulnerable to adversarial examples. In this paper, we investigate how to maintain the performance of such models against these perturbed samples. We propose a simple but effective method that improves robustness by training the model via data augmentation. We conduct experiments to evaluate our approach on two mainstream sequence-to-sequence (seq2seq) architectures, based on the LSTM and the Transformer respectively, using a large-scale publicly available dataset. The experimental results demonstrate that our method can efficiently improve the capability of different models to defend against perturbed samples.
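To illustrate the general idea of data augmentation for robustness, the following is a minimal sketch (not the paper's actual method): semantics-preserving perturbations, here a toy identifier-renaming transformation, are applied to code snippets, and the perturbed snippets are paired with the original comments and appended to the training set. The function names, the choice of transformation, and the keyword list are all illustrative assumptions.

```python
import random
import re

# Python keywords to leave untouched in the toy transformation (illustrative subset).
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}

def rename_identifiers(code, rng):
    """Perturb a code snippet by renaming some identifiers.

    A toy stand-in for the semantics-preserving transformations
    used to generate adversarial/perturbed training samples.
    """
    names = sorted(set(re.findall(r"\b[a-z_][a-z0-9_]*\b", code)) - KEYWORDS)
    chosen = rng.sample(names, k=min(2, len(names)))
    mapping = {old: f"var{i}" for i, old in enumerate(chosen)}
    for old, new in mapping.items():
        code = re.sub(rf"\b{old}\b", new, code)
    return code

def augment(dataset, rng):
    """Append perturbed (code, comment) pairs to the training set.

    Each perturbed snippet keeps the original comment as its label,
    so the model learns to produce the same comment for both versions.
    """
    extra = [(rename_identifiers(code, rng), comment) for code, comment in dataset]
    return dataset + extra

rng = random.Random(0)
data = [("def add(a, b): return a + b", "adds two numbers")]
augmented = augment(data, rng)
```

In a real pipeline the transformation set would include richer perturbations (e.g. dead-code insertion or statement reordering), but the training loop is unchanged: the model simply sees the enlarged dataset.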
Metadata
Item Type: Conference or Workshop Item (Paper)
School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
Depositing User: Tingting Han
Date Deposited: 21 Mar 2023 16:23
Last Modified: 09 Aug 2023 12:50
URI: https://eprints.bbk.ac.uk/id/eprint/44291