Yang, G. and Zhou, Y. and Chen, X. and Zhang, X. and Han, Tingting and Chen, Taolue (2022) ExploitGen: template-augmented exploit code generation based on CodeBERT. Journal of Systems and Software, ISSN 0164-1212.
Abstract
Exploit code is widely used for detecting vulnerabilities and implementing defensive measures. However, automatic generation of exploit code for security assessment is a challenging task. In this paper, we propose ExploitGen, a novel template-augmented exploit code generation approach based on CodeBERT. Specifically, we first propose a rule-based Template Parser to generate template-augmented natural language (NL) descriptions. Both the raw and the template-augmented NL sequences are encoded into context vectors by their respective encoders. To better learn semantic information, ExploitGen incorporates a semantic attention layer, which uses the attention mechanism to extract and weight each layer’s representational information. In addition, ExploitGen computes the interaction information between the template information and the semantics of the raw NL, and uses a residual connection to append the template information to the semantics of the raw NL. Comprehensive experiments on two datasets demonstrate the effectiveness of ExploitGen in comparison with six state-of-the-art baselines. In addition to the automatic evaluation, we conduct a human study to evaluate the quality of the generated code in terms of syntactic and semantic correctness. The results also confirm the effectiveness of ExploitGen.
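The abstract names two mechanisms without giving formulas: attention over per-layer representations, and a residual connection that adds template information to the raw NL semantics. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names (`layer_attention`, `fuse_with_residual`), the dot-product scoring against a `query` vector, and the element-wise product used as the "interaction" are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def layer_attention(layer_vectors, query):
    """Pool one vector per encoder layer into a single vector.

    Assumption: each layer is scored by its dot product with a
    (hypothetical) query vector, and the scores are softmax-normalized.
    """
    weights = softmax([dot(query, vec) for vec in layer_vectors])
    dim = len(layer_vectors[0])
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, layer_vectors))
        for i in range(dim)
    ]
    return pooled, weights


def fuse_with_residual(raw_vec, template_vec):
    """Add template information to the raw NL vector.

    Assumption: the interaction term is an element-wise product of the
    two vectors; the residual connection then adds it back to raw_vec.
    """
    interaction = [r * t for r, t in zip(raw_vec, template_vec)]
    return [r + i for r, i in zip(raw_vec, interaction)]


# Toy usage: two 2-dimensional "layers" for each of the two encoders.
raw_pooled, _ = layer_attention([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
tpl_pooled, _ = layer_attention([[0.2, 0.4], [0.6, 0.8]], [0.5, 0.5])
fused = fuse_with_residual(raw_pooled, tpl_pooled)
```

Because of the residual form, a template vector of all zeros leaves the raw representation unchanged, which is the usual motivation for adding auxiliary information this way.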
Metadata
Item Type: Article
School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
Depositing User: Tingting Han
Date Deposited: 06 Dec 2022 15:29
Last Modified: 09 Aug 2023 12:54
URI: https://eprints.bbk.ac.uk/id/eprint/50138