
Commit 6a2dfd9

deploy: dfc4de4
1 parent 4f0ce75 commit 6a2dfd9

30 files changed, +5886 -4591 lines changed

index.html

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ <h4 id="-browse-papers-by-tag">🏷 Browse Papers by Tag</h4>
<tag><a href="/tags.html#documentation">documentation</a></tag>
<tag><a href="/tags.html#dynamic">dynamic</a></tag>
<tag><a href="/tags.html#edit">edit</a></tag>
+ <tag><a href="/tags.html#editing">editing</a></tag>
<tag><a href="/tags.html#education">education</a></tag>
<tag><a href="/tags.html#evaluation">evaluation</a></tag>
<tag><a href="/tags.html#execution">execution</a></tag>

paper-abstracts.json

Lines changed: 1 addition & 0 deletions
@@ -148,6 +148,7 @@
{"key": "gupta2018deep", "year": "2018", "title":"Deep Reinforcement Learning for Programming Language Correction", "abstract": "<p>Novice programmers often struggle with the formal\nsyntax of programming languages. To assist them,\nwe design a novel programming language correction framework amenable to reinforcement learning. The framework allows an agent to mimic human actions for text navigation and editing. We\ndemonstrate that the agent can be trained through\nself-exploration directly from the raw input, that is,\nprogram text itself, without any knowledge of the\nformal syntax of the programming language. We\nleverage expert demonstrations for one tenth of the\ntraining data to accelerate training. The proposed\ntechnique is evaluated on 6975\nerroneous C programs with typographic errors, written by students\nduring an introductory programming course. Our\ntechnique fixes 14%\nmore programs and 29% more\ncompiler error messages relative to those fixed by\na state-of-the-art tool, DeepFix, which uses a fully\nsupervised neural machine translation approach.</p>\n", "tags": ["repair","code generation"] },
{"key": "gupta2018intelligent", "year": "2018", "title":"Intelligent code reviews using deep learning", "abstract": "<p>Peer code review is a best practice in Software Engineering where source code is reviewed manually by one or more peers(reviewers) of the code author. It is widely acceptable both in industry and open-source software (OSS) systems as a process for early detection and reduction of software defects. A larger chunk of reviews given during peer reviews are related to common issues such as coding style, documentations, and best practices. This makes the code review process less effective as reviewers focus less on finding important defects. Hence, there is a need to automatically find such common issues and help reviewers perform focused code reviews. Some of this is solved by rule based systems called linters but they are rigid and needs a lot of manual effort to adapt them for a new issue.</p>\n\n<p>In this work, we present an automatic, flexible, and adaptive code analysis system called DeepCodeReviewer (DCR). DCR learns how to recommend code reviews related to common issues using historical peer reviews and deep learning. DCR uses deep learning to learn review relevance to a code snippet and recommend the right review from a repository of common reviews. DCR is trained on histroical peer reviews available from internal code repositories at Microsoft. Experiments demonstrate strong performance of developed deep learning model in classifying relevant and non-relevant reviews w.r.t to a code snippet, and ranking reviews given a code snippet. We have also evaluated DCR recommentations using a user study and survey. The results of our user study show good acceptance rate and answers of our survey questions are strongly correlated with our system’s goal of making code reviews focused on finding defects.</p>\n", "tags": ["representation","review"] },
{"key": "gupta2019neural", "year": "2019", "title":"Neural Attribution for Semantic Bug-Localization in Student Programs", "abstract": "<p>Providing feedback is an integral part of teaching. Most open online courses on programming make use of automated grading systems to support programming assignments and give real-time feedback. These systems usually rely on test results to quantify the programs’ functional correctness. They return failing tests to the students as feedback. However, students may find it difficult to debug their programs if they receive no hints about where the bug is and how to fix it. In this work, we present NeuralBugLocator, a deep learning based technique, that can localize the bugs in a faulty program with respect to a failing test, without even running the program. At the heart of our technique is a novel tree convolutional neural network which is trained to predict whether a program passes or fails a given test. To localize the bugs, we analyze the trained network using a state-of-the-art neural prediction attribution technique and see which lines of the programs make it predict the test outcomes. Our experiments show that NeuralBugLocator is generally more accurate than two state-of-the-art program-spectrum based and one syntactic difference based bug-localization baselines.</p>\n", "tags": ["defect","representation"] },
+ {"key": "gupta2023grace", "year": "2023", "title":"Grace: Language Models Meet Code Edits", "abstract": "<p>Developers spend a significant amount of time in editing code for a variety of reasons such as bug fixing or adding new features. Designing effective methods to predict code edits has been an active yet challenging area of research due to the diversity of code edits and the difficulty of capturing the developer intent. In this work, we address these challenges by endowing pre-trained large language models (LLMs) with the knowledge of relevant prior associated edits, which we call the Grace (Generation conditioned on Associated Code Edits) method. The generative capability of the LLMs helps address the diversity in code changes and conditioning code generation on prior edits helps capture the latent developer intent. We evaluate two well-known LLMs, codex and CodeT5, in zero-shot and fine-tuning settings respectively. In our experiments with two datasets, Grace boosts the performance of the LLMs significantly, enabling them to generate 29% and 54% more correctly edited code in top-1 suggestions relative to the current state-of-the-art symbolic and neural approaches, respectively.</p>\n", "tags": ["editing"] },
{"key": "gvero2015synthesizing", "year": "2015", "title":"Synthesizing Java expressions from free-form queries", "abstract": "<p>We present a new code assistance tool for integrated development environments. Our system accepts as input free-form queries containing a mixture of English and Java, and produces Java code expressions that take the query into account and respect syntax, types, and scoping rules of Java, as well as statistical usage patterns. In contrast to solutions based on code search, the results returned by our tool need not directly correspond to any previously seen code fragment. As part of our system we have constructed a probabilistic context free grammar for Java constructs and library invocations, as well as an algorithm that uses a customized natural language processing tool chain to extract information from free-form text queries. We present the results on a number of examples showing that our technique (1) often produces the expected code fragments, (2) tolerates much of the flexibility of natural language, and (3) can repair incorrect Java expressions that use, for example, the wrong syntax or missing arguments.</p>\n", "tags": ["synthesis","code generation","bimodal"] },
{"key": "habib2019neural", "year": "2019", "title":"Neural Bug Finding: A Study of Opportunities and Challenges", "abstract": "<p>Static analysis is one of the most widely adopted techniques to find software bugs before code is put in production. Designing and implementing effective and efficient static analyses is difficult and requires high expertise, which results in only a few experts able to write such analyses. This paper explores the opportunities and challenges of an alternative way of creating static bug detectors: neural bug finding. The basic idea is to formulate bug detection as a classification problem, and to address this problem with neural networks trained on examples of buggy and non-buggy code. We systematically study the effectiveness of this approach based on code examples labeled by a state-of-the-art, static bug detector. Our results show that neural bug finding is surprisingly effective for some bug patterns, sometimes reaching a precision and recall of over 80%, but also that it struggles to understand some program properties obvious to a traditional analysis. A qualitative analysis of the results provides insights into why neural bug finders sometimes work and sometimes do not work. We also identify pitfalls in selecting the code examples used to train and validate neural bug finders, and propose an algorithm for selecting effective training data.</p>\n", "tags": ["program analysis"] },
{"key": "hajipour2019samplefix", "year": "2019", "title":"SampleFix: Learning to Correct Programs by Sampling Diverse Fixes", "abstract": "<p>Automatic program correction is an active topic of research, which holds the potential of dramatically improving productivity of programmers during the software development process and correctness of software in general. Recent advances in machine learning, deep learning and NLP have rekindled the hope to eventually fully automate the process of repairing programs. A key challenges is ambiguity, as multiple codes – or fixes – can implement the same functionality. In addition, dataset by nature fail to capture the variance introduced by such ambiguities. Therefore, we propose a deep generative model to automatically correct programming errors by learning a distribution of potential fixes. Our model is formulated as a deep conditional variational autoencoder that samples diverse fixes for the given erroneous programs. In order to account for ambiguity and inherent lack of representative datasets, we propose a novel regularizer to encourage the model to generate diverse fixes. Our evaluations on common programming errors show for the first time the generation of diverse fixes and strong improvements over the state-of-the-art approaches by fixing up to 61% of the mistakes.</p>\n", "tags": ["repair","code generation"] },
