CodeTag is a powerful, interactive toolkit for analysing, understanding, and summarising software repositories. It turns sprawling projects into concise, AI-ready context through a guided, text-based user interface (TUI).
- Guided UX – a menu-driven TUI makes every feature discoverable.
- Insightful Metrics – language breakdowns, complexity scores & dependency graphs in seconds.
- AI-Ready Context – pack or distil code into a single file tailor-made for LLMs.
- Integrated Security – optional OSV-Scanner & Semgrep checks before you share code.
- Hybrid Power – friendly TUI for humans; rich CLI for automation.
The easiest way to install CodeTag is via pipx – it creates an isolated environment automatically and places the codetag command on your $PATH.
# macOS / Linux
python3 -m pip install --user pipx
python3 -m pipx ensurepath  # may require terminal restart
pipx install git+https://github.com/mescuwa/codetag.git
That's it – codetag is now available everywhere:
codetag  # launches the interactive TUI
The audit command orchestrates standalone security tools; install them inside CodeTag’s pipx environment:
pipx inject codetag osv-scanner   # dependency vulnerability scanner
pipx inject codetag semgrep       # static code analysis
For tree-sitter distillation, follow the instructions in the advanced section below.
Prefer a classic editable install? Use a virtual environment:
# clone & enter repo
git clone https://github.com/mescuwa/codetag.git
cd codetag
# create & activate venv
python -m venv venv
source venv/bin/activate  # Windows: .\venv\Scripts\activate
# editable install with extras
pip install -e ".[audit,dev]"
For a full guide on workflow, coding standards, and how to submit a pull request, please see CONTRIBUTING.md.
CodeTag can load project-specific defaults from a .codetag.yaml file located
in the root of your repository.  Any value passed on the command line always
overrides the configuration file.
# .codetag.yaml — example
# Exclude build artefacts and large data files during *scan*
scan:
  exclude_dirs:
    - build
    - dist
    - .venv
    - data/
  exclude_patterns:
    - "*.log"
    - "*.tmp"
    - "*.pkl"
# Increase the default token budget for *pack*
pack:
  max_tokens: 150000
Keep the file in version control so your whole team shares the same defaults.
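To make the effect of the scan exclusions concrete, here is a minimal Python sketch of how such rules could be applied to repo-relative paths. It is not CodeTag's actual implementation: the helper `is_excluded` is hypothetical, and the matching rules (directory names match at any depth, patterns are shell-style globs tested against the file name) are assumptions made for illustration.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Values copied from the example .codetag.yaml above.
EXCLUDE_DIRS = ["build", "dist", ".venv", "data/"]
EXCLUDE_PATTERNS = ["*.log", "*.tmp", "*.pkl"]


def is_excluded(relative_path: str) -> bool:
    """Return True if a repo-relative path is skipped under the assumed rules."""
    path = PurePosixPath(relative_path)
    excluded_dirs = {d.rstrip("/") for d in EXCLUDE_DIRS}
    # Assumption: an excluded directory name matches at any depth in the path.
    if any(part in excluded_dirs for part in path.parts[:-1]):
        return True
    # Assumption: patterns are shell-style globs tested against the file name.
    return any(fnmatch(path.name, pattern) for pattern in EXCLUDE_PATTERNS)


for candidate in ["src/app.py", "build/lib/app.py", "logs/run.log", "data/train.pkl"]:
    print(candidate, is_excluded(candidate))
```

Under these assumptions only `src/app.py` survives the filter; the other three paths are dropped either by directory or by pattern.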
The TUI shows the equivalent CLI after every run, but here are common commands:
# scan a repo into JSON
codetag scan ./project --output report.json
# pack with 100k token budget
codetag pack ./project --output packed.txt --max-tokens 100000
# distill (level 2)
codetag distill ./project --output summary.txt --level 2
# audit with stricter rules
codetag audit ./project --strict
Settings can be stored in a .codetag.yaml; flags override file values.
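The precedence rule can be sketched in a few lines of Python. This is an illustration of the documented behaviour, not CodeTag's own code: `resolve_options` is a hypothetical helper, the file values are written as a plain dict so the sketch runs without a YAML parser, and treating `None` as "flag not given" is an assumption.

```python
from typing import Any, Optional

# Mirrors the pack section of the example .codetag.yaml (max_tokens: 150000).
FILE_CONFIG: dict[str, dict[str, Any]] = {
    "pack": {"max_tokens": 150000},
}


def resolve_options(
    command: str,
    file_config: dict[str, dict[str, Any]],
    cli_options: dict[str, Optional[Any]],
) -> dict[str, Any]:
    """Merge file defaults with CLI options; a CLI value that was given wins."""
    merged = dict(file_config.get(command, {}))
    for key, value in cli_options.items():
        if value is not None:  # None stands for "flag not passed" (assumption)
            merged[key] = value
    return merged


# No --max-tokens flag: the file value is used.
from_file = resolve_options("pack", FILE_CONFIG, {"max_tokens": None})
# --max-tokens 100000 on the command line: the flag overrides the file.
from_cli = resolve_options("pack", FILE_CONFIG, {"max_tokens": 100000})
print(from_file["max_tokens"], from_cli["max_tokens"])
```

With the example configuration, running pack without a flag would use 150000 tokens, while the `--max-tokens 100000` command shown above would use 100000.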
pip install "tree-sitter<0.22"  # Language.build_library was removed in tree-sitter 0.22
# clone grammars you need e.g.
git clone https://github.com/tree-sitter/tree-sitter-python vendor/tree-sitter-python
python - <<'PY'
from tree_sitter import Language
Language.build_library('build/my-languages.so', ['vendor/tree-sitter-python'])
PY
export CODETAG_TS_LIB=build/my-languages.so
CodeTag is released under the MIT Licence (see LICENSE).