DocuMint: Docstring Generation for Python using Small Language Models

Background

Large Language Models (LLMs) are having their moment right now. The latest trend, however, is Small Language Models (SLMs, generally 13B parameters or fewer), such as the Mistral, Gemma, and Llama families of models. Compared with larger models, SLMs are significantly more cost-effective in terms of latency, memory, throughput, and energy consumption. They are also small enough to fit on consumer GPUs, i.e., they can be deployed locally.

This study evaluates the effectiveness of SLMs in generating documentation for a Python file. Worldwide, software developers are estimated to spend significant time on code documentation. If this step can be effectively automated, it would have a huge impact on the software development pipeline.

Research

Our study explores whether we can leverage SLMs to automatically generate docstrings (for the classes and functions in a Python file). More specifically:

Conduct a user preference study on various open-source SLMs to establish a ranking of which SLMs programmers prefer.
Fine-tune available SLMs on World of Code data.
Study the emergent properties of SLMs on documentation generation (as the number of parameters increases, can SLMs generate better docs?).

Data

Extract well-documented Python files from World of Code and split the data into fine-tuning and validation sets. World of Code is a large dataset, so we need to mine it for the information we are looking for:

Look at commits (and blobs, not just files) that have “added docstring” in the commit message.
Look for projects that list documentation links in the README (parse the README).
Look at code released by corporations (which tends to have high-quality documentation).
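
The mining heuristics above could be sketched roughly as follows. This is a minimal illustration and not the project's extraction script: the regular expressions, the function names, and the 10% validation fraction are all assumptions, and the sketch works on plain strings rather than on the World of Code interfaces.

```python
import ast
import random
import re

# Assumed pattern: matches messages such as "added docstring" or "add docstrings".
DOCSTRING_COMMIT = re.compile(r"\badd(?:ed|s|ing)?\s+docstrings?\b", re.IGNORECASE)

# Assumed heuristic: a URL that mentions "docs", "documentation", or Read the Docs.
DOC_LINK = re.compile(
    r"https?://\S*(?:readthedocs|docs|documentation)\S*", re.IGNORECASE
)


def commit_mentions_docstring(message: str) -> bool:
    """Return True if a commit message says a docstring was added."""
    return DOCSTRING_COMMIT.search(message) is not None


def readme_lists_doc_links(readme_text: str) -> bool:
    """Return True if the README text contains a documentation-like URL."""
    return DOC_LINK.search(readme_text) is not None


def docstring_coverage(source: str) -> float:
    """Fraction of functions and classes in `source` that have a docstring."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        return 0.0
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return documented / len(nodes)


def split_dataset(items, validation_fraction=0.1, seed=0):
    """Shuffle deterministically and split into (fine_tuning, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

A file could then be kept only if, for example, `docstring_coverage` exceeds a chosen threshold; the threshold itself would need to be tuned against the data.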

Deliverables

User preference study (human ranking) on the docstrings generated by various SLMs.
Evaluation of the “emergent behaviors” in generating docstrings.
- Do human coders prefer the documentation from larger models, i.e., does increasing the size of the model improve the quality of the docstrings?
Fine-tuning vs. base model.
- Does the fine-tuning step help?
- NOTE: This will depend on time constraints and may not be feasible in the short term.
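
One simple way to turn the human rankings into a single ordering of models is to average each model's rank position across participants. The sketch below assumes each participant returns a best-first list of model names; the function name and the input format are hypothetical, and mean rank is only one of several possible aggregation schemes.

```python
from collections import defaultdict


def mean_ranks(rankings):
    """Average rank per model, given best-first orderings from each participant.

    Returns a list of (model, mean_rank) pairs sorted from most to least preferred.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for ranking in rankings:
        for position, model in enumerate(ranking, start=1):
            totals[model] += position
            counts[model] += 1
    means = {model: totals[model] / counts[model] for model in totals}
    return sorted(means.items(), key=lambda item: item[1])
```

Comparing the resulting mean ranks against model size would give a first look at whether larger models are preferred.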

People

Shelah Ameli (@ShelahAmeli / [email protected])

Adam Cook (@ajcook247 / [email protected])

Bibek Poudel (@poudel-bibek / [email protected])

Sekou Traore (@Sekou2077 / [email protected])

Bi-weekly reports are available as issues.
