From 347e9410cd08b8dbd67a2f71761c316cef2902c1 Mon Sep 17 00:00:00 2001
From: Aditya Kanade
Date: Mon, 6 Nov 2023 15:53:49 +0530
Subject: [PATCH] Update README.md

---
 README.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 11b1f46..07a54aa 100644
--- a/README.md
+++ b/README.md
@@ -6,8 +6,8 @@ This repository hosts the official code and data artifact for the paper ["Monito
 
 ## Repository Contents
 1. [Datasets](#1-datasets): PragmaticCode and DotPrompts
-2. [Evaluation scripts](#2-evaluation-scripts): Scripts to evaluate LMs by taking as input inferences (code generated by the model) for testcases in DotPrompts and producing score@k scores for the metrics reported in the paper: Compilation Rate (CR), Next-Identifier Match (NIM), Identifier-Sequence Match (ISM) and Prefix Match (PM).
-3. [Inference Results over DotPrompts](#3-inference-results-over-dotprompts): generated code for testcases in DotPrompts with various model configurations reported in the paper. The graphs and tables reported in the paper can be reproduced by running the evaluation scripts on the provided inference results.
+2. [Evaluation scripts](#2-evaluation-scripts): Scripts to evaluate LMs by taking as input inferences (code generated by the model) for examples in DotPrompts and producing score@k scores for the metrics reported in the paper: Compilation Rate (CR), Next-Identifier Match (NIM), Identifier-Sequence Match (ISM) and Prefix Match (PM).
+3. [Inference Results over DotPrompts](#3-inference-results-over-dotprompts): Generated code for examples in DotPrompts with various model configurations reported in the paper. The graphs and tables reported in the paper can be reproduced by running the evaluation scripts on the provided inference results.
 4. [`multilspy`](#4-multilspy): A language server client, to easily obtain and use results of various static analyses provided by a large variety of language servers that communicate over the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/). `multilspy` is intended to be used as a library to easily query various language servers, without having to worry about setting up their configurations and implementing the client-side of language server protocol. `multilspy` currently supports running language servers for Java, Rust, C# and Python, and we aim to expand this list with the help of the community.
 5. [Monitor-Guided Decoding](#5-monitor-guided-decoding): Implementation of various monitors monitoring for different properties reported in the paper (for example: monitoring for type-valid identifier dereferences, monitoring for correct number of arguments to method calls, monitoring for typestate validity of method call sequences, etc.), spanning 3 programming languages.
 
@@ -44,7 +44,7 @@ pip3 install -r requirements.txt
 |--------------|:-----:|
 | Number of repositories in PragmaticCode | 100 |
 | Number of methods in DotPrompts | 1420 |
-| Number of testcases in DotPrompts | 10538 |
+| Number of examples in DotPrompts | 10538 |
 
 ### PragmaticCode
 PragmaticCode is a dataset of real-world open-source Java projects complete with their development environments and dependencies (through their respective build systems). The authors tried to ensure that all the repositories in PragmaticCode were released publicly only after the determined training dataset cutoff date (31 March 2022) for the CodeGen, SantaCoder and text-davinci-003 family of models, which were used to evaluate MGD.
@@ -52,9 +52,9 @@ PragmaticCode is a dataset of real-world open-source Java projects complete with
 The list of repositories along with their respective licenses consisting PragmaticCode is available in [datasets/PragmaticCode/repos.csv](datasets/PragmaticCode/repos.csv). The contents of the files required for inference for each of the repositories is available in [datasets/PragmaticCode/fileContentsByRepo.json](datasets/PragmaticCode/fileContentsByRepo.json).
 
 ### DotPrompts
-DotPrompts is a set of testcases derived from PragmaticCode, such that each testcase consists of a prompt to a dereference location (a code location having the "." operator in Java). The scenario described in [motivating example above](#monitor-guided-decoding-motivating-example) is an example of a testcase in DotPrompts.
+DotPrompts is a set of examples derived from PragmaticCode, such that each example consists of a prompt to a dereference location (a code location having the "." operator in Java). The scenario described in [motivating example above](#monitor-guided-decoding-motivating-example) is an example in DotPrompts.
 
-The complete description of a testcase in DotPrompts is a tuple - `(repo, classFileName, methodStartIdx, methodStopIdx, dot_idx)`. The dataset is available at [datasets/DotPrompts/dataset.csv](datasets/DotPrompts/dataset.csv).
+The complete description of an example in DotPrompts is a tuple - `(repo, classFileName, methodStartIdx, methodStopIdx, dot_idx)`. The dataset is available at [datasets/DotPrompts/dataset.csv](datasets/DotPrompts/dataset.csv).
 
 ## 2. Evaluation Scripts
 ### Running the evaluation script
@@ -87,7 +87,7 @@ Description of expected columns in the inference results csv input to the evalua
 * `compilationSucceeded`: Result of compiling the generated method in the context of the full repository. 1 if success, 0 otherwise. Values from: `[1, 0]`
 
 ## 3. Inference Results over DotPrompts
-We provide all inferences (generated code) generated by all model configurations reported in the paper, for every testcase in DotPrompts. This consists of 6 independently sampled inferences for 18 different model configurations (spanning parameter scale, prompt templates, use of FIM context, etc.) for every testcase in DotPrompts.
+We provide all inferences (generated code) generated by all model configurations reported in the paper, for every example in DotPrompts. This consists of 6 independently sampled inferences for 18 different model configurations (spanning parameter scale, prompt templates, use of FIM context, etc.) for every example in DotPrompts.
 
 The generated samples along with their compilation status, following the format described [above](#description-of-inference-results-csv-file-format), is available at [inference_results/dotprompts_results.csv](inference_results/dotprompts_results.csv). The file is stored using [git lfs](https://git-lfs.com/). If the file is not available locally after cloning this repository, please check the [git lfs website](https://git-lfs.com/) for instructions on setup, and clone the repository again after git lfs setup.
 
@@ -101,7 +101,7 @@ python3 evaluation_scripts/eval_results.py inference_results/dotprompts_results.
 The above command creates a directory [results](results/) (already included in the repository), containing all the figures and tables provided in the paper along with extra details. The command also generates a report in the output directory which relates the generated figures to sections in the paper. In case of above command, the report is generated at [results/Report.md](results/Report.md).
 
 ## 4. `multilspy`
-`multilspy` is a cross-platform library to set up and interact with various language servers in a unified and easy way. [Language servers]((https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/)) are tools that perform a variety of static analyses on source code and provide useful information such as type-directed code completion suggestions, symbol definition locations, symbol references, etc., over the [Language Server Protocol (LSP)](https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/). `multilspy` intends to ease the process of using language servers, by abstracting the setting up of the language servers, performing language-specific configuration and handling communication with the server over the json-rpc based protocol, while exposing a simple interface to the user.
+`multilspy` is a cross-platform library that we have built to set up and interact with various language servers in a unified and easy way. [Language servers]((https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/)) are tools that perform a variety of static analyses on source code and provide useful information such as type-directed code completion suggestions, symbol definition locations, symbol references, etc., over the [Language Server Protocol (LSP)](https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/). `multilspy` intends to ease the process of using language servers, by abstracting the setting up of the language servers, performing language-specific configuration and handling communication with the server over the json-rpc based protocol, while exposing a simple interface to the user.
 
 Since LSP is language-agnostic, `multilspy` can provide the results for static analyses of code in different languages over a common interface. `multilspy` is easily extensible to any language that has a Language Server and currently supports Java, Rust, C# and Python and we aim to support more language servers from the [list of language server implementations](https://microsoft.github.io/language-server-protocol/implementors/servers/).
 
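
The README text in this patch says `multilspy` handles communication with the server over the json-rpc based protocol. As a rough illustration of what that communication involves, the sketch below frames a `textDocument/definition` request using the LSP base-protocol `Content-Length` header. It is not `multilspy`'s implementation or API: the function names `frame_lsp_message`, `parse_lsp_message` and `definition_request`, and the file URI in the usage note, are hypothetical and chosen only for this example.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Encode a JSON-RPC payload with LSP base-protocol framing.

    Hypothetical helper for illustration; not part of multilspy.
    """
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # The header carries the byte length of the body, then a blank line.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def parse_lsp_message(data: bytes) -> dict:
    """Decode a single framed message back into a dict (hypothetical helper)."""
    header, _, body = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(body[:length].decode("utf-8"))


def definition_request(request_id: int, file_uri: str, line: int, character: int) -> dict:
    """Build a JSON-RPC 2.0 request for the LSP `textDocument/definition` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": file_uri},
            # LSP positions are zero-based line and character offsets.
            "position": {"line": line, "character": character},
        },
    }
```

For instance, `frame_lsp_message(definition_request(1, "file:///tmp/Example.java", 10, 4))` yields the bytes a client would write to a language server's input stream. A real client also has to perform the `initialize` handshake and match responses to request ids, which is the kind of work the README describes `multilspy` as abstracting away.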