
Commit 6471c26

deploy: 4160bb7
1 parent 014ae90 commit 6471c26

21 files changed: +4,028 −3,747 lines

paper-abstracts.json

Lines changed: 1 addition & 0 deletions
@@ -221,6 +221,7 @@
{"key": "li2021learning", "year": "2021", "title":"Learning to Extend Program Graphs to Work-in-Progress Code", "abstract": "<p>Source code spends most of its time in a broken or incomplete state during software development. This presents a challenge to machine learning for code, since high-performing models typically rely on graph structured representations of programs derived from traditional program analyses. Such analyses may be undefined for broken or incomplete code. We extend the notion of program graphs to work-in-progress code by learning to predict edge relations between tokens, training on well-formed code before transferring to work-in-progress code. We consider the tasks of code completion and localizing and repairing variable misuse in a work-in-process scenario. We demonstrate that training relation-aware models with fine-tuned edges consistently leads to improved performance on both tasks.</p>\n", "tags": ["Transformer","autocomplete","repair"] },
{"key": "li2021toward", "year": "2021", "title":"Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models", "abstract": "<p>Code completion is widely used by software developers to provide coding suggestions given a partially written code snippet. Apart from the traditional code completion methods, which only support single token completion at minimal positions, recent studies show the ability to provide longer code completion at more flexible positions. However, such frequently triggered and longer completion results reduce the overall precision as they generate more invalid results. Moreover, different studies are mostly incompatible with each other. Thus, it is vital to develop an ensemble framework that can combine results from multiple models to draw merits and offset defects of each model.\nThis paper conducts a coding simulation to collect data from code context and different code completion models and then apply the data in two tasks. First, we introduce an acceptance model which can dynamically control whether to display completion results to the developer. It uses simulation features to predict whether correct results exist in the output of these models. Our best model reduces the percentage of false-positive completion from 55.09% to 17.44%. Second, we design a fusion ranking scheme that can automatically identify the priority of the completion results and reorder the candidates from multiple code completion models. This scheme is flexible in dealing with various models, regardless of the type or the length of their completion results. We integrate this ranking scheme with two frequency models and a GPT-2 styled language model, along with the acceptance model to yield 27.80% and 37.64% increase in TOP1 and TOP5 accuracy, respectively. In addition, we propose a new code completion evaluation metric, Benefit-Cost Ratio(BCR), taking into account the benefit of keystrokes saving and hidden cost of completion list browsing, which is closer to real coder experience scenario.</p>\n", "tags": ["autocomplete","language model","optimization","Transformer"] },
{"key": "li2022codereviewer", "year": "2022", "title":"CodeReviewer: Pre-Training for Automating Code Review Activities", "abstract": "<p>Code review is an essential part to software development lifecycle since it aims at guaranteeing the quality of codes. Modern code review activities necessitate developers viewing, understanding and even running the programs to assess logic, functionality, latency, style and other factors. It turns out that developers have to spend far too much time reviewing the code of their peers. Accordingly, it is in significant demand to automate the code review process. In this research, we focus on utilizing pre-training techniques for the tasks in the code review scenario. We collect a large-scale dataset of real world code changes and code reviews from open-source projects in nine of the most popular programming languages. To better understand code diffs and reviews, we propose CodeReviewer, a pre-trained model that utilizes four pre-training tasks tailored specifically for the code review senario. To evaluate our model, we focus on three key tasks related to code review activities, including code change quality estimation, review comment generation and code refinement. Furthermore, we establish a high-quality benchmark dataset based on our collected data for these three tasks and conduct comprehensive experiments on it. The experimental results demonstrate that our model outperforms the previous state-of-the-art pre-training approaches in all tasks. Further analysis show that our proposed pre-training tasks and the multilingual pre-training dataset benefit the model on the understanding of code changes and reviews.</p>\n", "tags": ["review"] },
+{"key": "li2022exploring", "year": "2022", "title":"Exploring Representation-Level Augmentation for Code Search", "abstract": "<p>Code search, which aims at retrieving the most relevant code fragment for a given natural language query, is a common activity in software development practice. Recently, contrastive learning has been widely used in code search research, where many data augmentation approaches for source code (e.g., semantic-preserving program transformation) are proposed to learn better representations. However, these augmentations are at the raw-data level, which requires additional code analysis in the preprocessing stage and additional training costs in the training stage. In this paper, we explore augmentation methods that augment data (both code and query) at the representation level, which does not require additional data processing and training, and based on this we propose a general format of representation-level augmentation that unifies existing methods. Then, we propose three new augmentation methods (linear extrapolation, binary interpolation, and Gaussian scaling) based on the general format. Furthermore, we theoretically analyze the advantages of the proposed augmentation methods over traditional contrastive learning methods on code search. We experimentally evaluate the proposed representation-level augmentation methods with state-of-the-art code search models on a large-scale public dataset consisting of six programming languages. The experimental results show that our approach can consistently boost the performance of the studied code search models.</p>\n", "tags": ["search","Transformer"] },
{"key": "liguori2021shellcode_ia32", "year": "2021", "title":"Shellcode_IA32: A Dataset for Automatic Shellcode Generation", "abstract": "<p>We take the first step to address the task of automatically generating shellcodes, i.e., small pieces of code used as a payload in the exploitation of a software vulnerability, starting from natural language comments. We assemble and release a novel dataset (Shellcode_IA32), consisting of challenging but common assembly instructions with their natural language descriptions. We experiment with standard methods in neural machine translation (NMT) to establish baseline performance levels on this task.</p>\n", "tags": ["code generation","dataset"] },
{"key": "lin2017program", "year": "2017", "title":"Program Synthesis from Natural Language Using Recurrent Neural Networks", "abstract": "<p>Oftentimes, a programmer may have difficulty implementing a\ndesired operation. Even when the programmer can describe her\ngoal in English, it can be difficult to translate into code. Existing\nresources, such as question-and-answer websites, tabulate specific\noperations that someone has wanted to perform in the past, but\nthey are not effective in generalizing to new tasks, to compound\ntasks that require combining previous questions, or sometimes even\nto variations of listed tasks.</p>\n\n<p>Our goal is to make programming easier and more productive by\nletting programmers use their own words and concepts to express\nthe intended operation, rather than forcing them to accommodate\nthe machine by memorizing its grammar. We have built a system\nthat lets a programmer describe a desired operation in natural language, then automatically translates it to a programming language\nfor review and approval by the programmer. Our system, Tellina,\ndoes the translation using recurrent neural networks (RNNs), a\nstate-of-the-art natural language processing technique that we augmented with slot (argument) filling and other enhancements.</p>\n\n<p>We evaluated Tellina in the context of shell scripting. We trained\nTellina’s RNNs on textual descriptions of file system operations\nand bash one-liners, scraped from the web. Although recovering\ncompletely correct commands is challenging, Tellina achieves top-3\naccuracy of 80% for producing the correct command structure. In a\ncontrolled study, programmers who had access to Tellina outperformed those who did not, even when Tellina’s predictions were\nnot completely correct, to a statistically significant degree.</p>\n", "tags": ["bimodal","code generation"] },
{"key": "lin2018nl2bash", "year": "2018", "title":"NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System", "abstract": "<p>We present new data and semantic parsing methods for the problem of mapping english sentences to Bash commands (NL2Bash). Our long-term goal is to enable any user to easily solve otherwise repetitive tasks (such as file manipulation, search, and application-specific scripting) by simply stating their intents in English. We take a first step in this domain, by providing a large new dataset of challenging but commonly used commands paired with their English descriptions, along with the baseline methods to establish performance levels on this task.</p>\n", "tags": ["bimodal","code generation"] },

Comments (0)