
Commit 322dad1

deploy: bac9255
1 parent 27e7197 commit 322dad1

21 files changed: +3579 −3300 lines

paper-abstracts.json

Lines changed: 1 addition & 0 deletions
@@ -417,6 +417,7 @@
{"key": "waunakh2019idbench", "year": "2021", "title":"IdBench: Evaluating Semantic Representations of Identifier Names in Source Code", "abstract": "<p>Identifier names convey useful information about the intended semantics of code. Name-based program analyses use this information, e.g., to detect bugs, to predict types, and to improve the readability of code. At the core of namebased analyses are semantic representations of identifiers, e.g., in the form of learned embeddings. The high-level goal of such a representation is to encode whether two identifiers, e.g., len and size, are semantically similar. Unfortunately, it is currently unclear to what extent semantic representations match the semantic relatedness and similarity perceived by developers. This paper presents IdBench, the first benchmark for evaluating semantic representations against a ground truth created from thousands of ratings by 500 software developers. We use IdBench to study state-of-the-art embedding techniques proposed for natural language, an embedding technique specifically designed for source code, and lexical string distance functions. Our results show that the effectiveness of semantic representations varies significantly and that the best available embeddings successfully represent semantic relatedness. On the downside, no existing technique provides a satisfactory representation of semantic similarities, among other reasons because identifiers with opposing meanings are incorrectly considered to be similar, which may lead to fatal mistakes, e.g., in a refactoring tool. Studying the strengths and weaknesses of the different techniques shows that they complement each other. As a first step toward exploiting this complementarity, we present an ensemble model that combines existing techniques and that clearly outperforms the best available semantic representation.</p>\n", "tags": ["representation"] },
{"key": "wei2019code", "year": "2019", "title":"Code Generation as a Dual Task of Code Summarization", "abstract": "<p>Code summarization (CS) and code generation (CG) are two crucial tasks in the field of automatic software development. Various neural network-based approaches are proposed to solve these two tasks separately. However, there exists a specific intuitive correlation between CS and CG, which have not been exploited in previous work. In this paper, we apply the relations between two tasks to improve the performance of both tasks. In other words, exploiting the duality between the two tasks, we propose a dual training framework to train the two tasks simultaneously. In this framework, we consider the dualities on probability and attention weights, and design corresponding regularization terms to constrain the duality. We evaluate our approach on two datasets collected from GitHub, and experimental results show that our dual framework can improve the performance of CS and CG tasks over baselines.</p>\n", "tags": ["code generation","summarization"] },
{"key": "wei2020lambdanet", "year": "2020", "title":"LambdaNet: Probabilistic Type Inference using Graph Neural Networks", "abstract": "<p>As gradual typing becomes increasingly popular in languages like Python and TypeScript, there is a growing need to infer type annotations automatically. While type annotations help with tasks like code completion and static error catching, these annotations cannot be fully inferred by compilers and are tedious to annotate by hand. This paper proposes a probabilistic type inference scheme for TypeScript based on a graph neural network. Our approach first uses lightweight source code analysis to generate a program abstraction called a type dependency graph, which links type variables with logical constraints as well as name and usage information. Given this program abstraction, we then use a graph neural network to propagate information between related type variables and eventually make type predictions. Our neural architecture can predict both standard types, like number or string, as well as user-defined types that have not been encountered during training. Our experimental results show that our approach outperforms prior work in this space by 14% (absolute) on library types, while having the ability to make type predictions that are out of scope for existing techniques.</p>\n", "tags": ["GNN","types"] },
+{"key": "wei2023typet5", "year": "2023", "title":"TypeT5: Seq2seq Type Inference using Static Analysis", "abstract": "<p>There has been growing interest in automatically predicting missing type annotations in programs written in Python and JavaScript. While prior methods have achieved impressive accuracy when predicting the most common types, they often perform poorly on rare or complex types. In this paper, we present a new type inference method that treats type prediction as a code infilling task by leveraging CodeT5, a state-of-the-art seq2seq pre-trained language model for code. Our method uses static analysis to construct dynamic contexts for each code element whose type signature is to be predicted by the model. We also propose an iterative decoding scheme that incorporates previous type predictions in the model’s input context, allowing information exchange between related code elements. Our evaluation shows that the proposed approach, TypeT5, not only achieves a higher overall accuracy (particularly on rare and complex types) but also produces more coherent results with fewer type errors – while enabling easy user intervention.</p>\n", "tags": ["types","Transformer"] },
{"key": "white2015toward", "year": "2015", "title":"Toward Deep Learning Software Repositories", "abstract": "<p>Deep learning subsumes algorithms that automatically learn compositional representations. The ability of these\nmodels to generalize well has ushered in tremendous advances\nin many fields such as natural language processing (NLP).\nRecent research in the software engineering (SE) community\nhas demonstrated the usefulness of applying NLP techniques to\nsoftware corpora. Hence, we motivate deep learning for software\nlanguage modeling, highlighting fundamental differences between\nstate-of-the-practice software language models and connectionist\nmodels. Our deep learning models are applicable to source\ncode files (since they only require lexically analyzed source\ncode written in any programming language) and other types\nof artifacts. We show how a particular deep learning model\ncan remember its state to effectively model sequential data,\ne.g., streaming software tokens, and the state is shown to be\nmuch more expressive than discrete tokens in a prefix. Then we\ninstantiate deep learning models and show that deep learning\ninduces high-quality models compared to n-grams and cache-based n-grams on a corpus of Java projects. We experiment\nwith two of the models’ hyperparameters, which govern their\ncapacity and the amount of context they use to inform predictions,\nbefore building several committees of software language models\nto aid generalization. Then we apply the deep learning models to\ncode suggestion and demonstrate their effectiveness at a real SE\ntask compared to state-of-the-practice models. Finally, we propose\navenues for future work, where deep learning can be brought to\nbear to support model-based testing, improve software lexicons,\nand conceptualize software artifacts. Thus, our work serves as\nthe first step toward deep learning software repositories.</p>\n", "tags": ["representation"] },
{"key": "white2016deep", "year": "2016", "title":"Deep Learning Code Fragments for Code Clone Detection", "abstract": "<p>Code clone detection is an important problem for software\nmaintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These\ntechniques also depend on generic, handcrafted features to\nrepresent code fragments. We introduce learning-based detection techniques where everything for representing terms\nand fragments in source code is mined from the repository.\nOur code analysis supports a framework, which relies on\ndeep learning, for automatically linking patterns mined at\nthe lexical level with patterns mined at the syntactic level.\nWe evaluated our novel learning-based approach for code\nclone detection with respect to feasibility from the point\nof view of software maintainers. We sampled and manually\nevaluated 398 file- and 480 method-level pairs across eight\nreal-world Java systems; 93% of the file- and method-level\nsamples were evaluated to be true positives. Among the true\npositives, we found pairs mapping to all four clone types. We\ncompared our approach to a traditional structure-oriented\ntechnique and found that our learning-based approach detected clones that were either undetected or suboptimally\nreported by the prominent tool Deckard. Our results affirm\nthat our learning-based approach is suitable for clone detection and a tenable technique for researchers.</p>\n", "tags": ["clone"] },
{"key": "white2017sorting", "year": "2017", "title":"Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities", "abstract": "<p>In the field of automated program repair, the redundancy assumption claims large programs contain the seeds\nof their own repair. However, most redundancy-based program\nrepair techniques do not reason about the repair ingredients—the code that is reused to craft a patch. We aim to reason about\nthe repair ingredients by using code similarities to prioritize and\ntransform statements in a codebase for patch generation. Our\napproach, DeepRepair, relies on deep learning to reason about\ncode similarities. Code fragments at well-defined levels of granularity in a codebase can be sorted according to their similarity\nto suspicious elements (i.e., code elements that contain suspicious\nstatements) and statements can be transformed by mapping out-of-scope identifiers to similar identifiers in scope. We examined\nthese new search strategies for patch generation with respect to\neffectiveness from the viewpoint of a software maintainer. Our\ncomparative experiments were executed on six open-source Java\nprojects including 374 buggy program revisions and consisted\nof 19,949 trials spanning 2,616 days of computation time. DeepRepair’s search strategy using code similarities generally found\ncompilable ingredients faster than the baseline, jGenProg, but\nthis improvement neither yielded test-adequate patches in fewer\nattempts (on average) nor found significantly more patches than\nthe baseline. Although the patch counts were not statistically\ndifferent, there were notable differences between the nature of\nDeepRepair patches and baseline patches. The results demonstrate that our learning-based approach finds patches that cannot\nbe found by existing redundancy-based repair techniques</p>\n", "tags": ["repair"] },
