ReadmeReady: Free and Customizable Code
Documentation with LLMs - A Fine-Tuning Approach
Sayak Chakrabarty¹ and Souradip Pal²
¹ Northwestern University, ² Purdue University
DOI: 10.21105/joss.07489
Software
• Review
• Repository
• Archive
Editor: Chris Vernon
Reviewers:
• @Manvi-Agrawal
• @camilochs
Submitted: 13 November 2024
Published: 12 April 2025
License
Authors of papers retain copyright and release the work under a
Creative Commons Attribution 4.0 International License (CC BY 4.0).
Summary
Automated documentation of programming source code is a challenging task with significant
practical and scientific implications for the developer community. ReadmeReady is a large
language model (LLM)-based application that developers can use as a support tool to generate
basic documentation for any publicly available or custom repository. Over the last decade,
several studies have addressed generating documentation for source code using neural
network architectures. With the recent advancements in LLM technology, some open-source
applications have been developed to address this problem. However, these applications
typically rely on the OpenAI APIs, which incur substantial financial costs, particularly for large
repositories. Moreover, none of these open-source applications offer a fine-tuned model or
features to enable users to fine-tune custom LLMs. Additionally, finding suitable data for
fine-tuning is often challenging. Our application addresses these issues.
Statement of Need
The integration of natural and programming languages is a research area that addresses tasks
such as automatic documentation of source code, code generation from natural language
descriptions, and searching for code using natural language queries. These tasks are highly
practical, as they can significantly enhance programmer efficiency, and they are scientifically
intriguing due to their complexity and the proposed relationships between natural language,
computation, and reasoning (Chomsky, 1956; Graves et al., 2014; Miller, 2003).
State of the Field
Recently, large language models (LLMs) have become increasingly significant, demonstrating
human-like abilities across various fields (Brown et al., 2020; Ouyang et al., 2022; Radford et
al., 2019). LLMs typically employ transformer architecture variants and are trained on massive
data volumes to detect patterns (Vaswani et al., 2017).
We present an LLM-based application that developers can use as a support tool to generate
basic documentation for any code repository. Several open-source applications have been
developed to address this problem, including:
• AutoDoc-ChatGPT (Awekrx, 2023)
• AutoDoc (Labs, 2023)
• Auto-GitHub-Docs-Generator (Microsoft, 2023)
However, these applications suffer from two major issues. Firstly, all of them are built on top
of the OpenAI APIs, requiring users to have an OpenAI API key and incurring a cost with each
API request. Generating documentation for a large repository could result in costs reaching
hundreds of dollars. Our application allows users to choose among OpenAI’s GPT, Meta’s
Chakrabarty, & Pal. (2025). ReadmeReady: Free and Customizable Code Documentation with LLMs - A Fine-Tuning Approach. Journal of Open
Source Software, 10(108), 7489. https://doi.org/10.21105/joss.07489.