👋 Hi there! I'm unclecode, the author of Crawl4AI, a No. 1 trending GitHub repository that crawls the web in an LLM-friendly way. While working with LLMs like Claude and GPT, I often need to provide codebase context efficiently. That's why I created gitin, a simple yet powerful tool that helps you extract and format GitHub repository content for LLM consumption.
When chatting with AI models about code, providing the right context is crucial. gitin helps you:
- Extract relevant code files from any GitHub repository
- Format them into a clean, token-efficient markdown file
- Filter files by type, size, and content
- Get token estimates for LLM context windows
pip install gitin
Basic usage - get all Python files from a repository:
gitin https://github.com/unclecode/crawl4ai -o output.md --include="*.py"
Extract Python files from Crawl4AI, excluding tests:
gitin https://github.com/unclecode/crawl4ai \
--include="*.py" \
--exclude="tests/*" \
-o basic_example.md
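To make the include/exclude behaviour concrete, here is a minimal Python sketch of comma-separated glob filtering. It is an illustration only and not gitin's actual implementation: the helper names `split_patterns` and `is_selected` are hypothetical, and I'm assuming patterns are matched against the full relative path with `fnmatch`-style rules, where `*` also matches `/`.

```python
from fnmatch import fnmatchcase


def split_patterns(value):
    """Split a comma-separated option value such as "*.py,*.md" into a list."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_selected(path, include, exclude=""):
    """Keep a path if it matches any include pattern and no exclude pattern.

    Assumption: patterns are tested against the whole relative path.
    """
    included = any(fnmatchcase(path, pat) for pat in split_patterns(include))
    excluded = any(fnmatchcase(path, pat) for pat in split_patterns(exclude))
    return included and not excluded


# Hypothetical file listing, mirroring --include="*.py" --exclude="tests/*"
paths = ["crawl4ai/crawler.py", "tests/test_crawler.py", "README.md"]
selected = [p for p in paths if is_selected(p, "*.py", "tests/*")]
print(selected)
```

If the real tool matches on file names only, or treats `**` specially, the results can differ from this sketch.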
Find files containing async functions:
gitin https://github.com/unclecode/crawl4ai \
--include="*.py" \
--search="async def" \
-o async_functions.md
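The content search can be pictured roughly like this. This is a sketch under assumptions, not gitin's code: `matches_search` is a hypothetical helper, and I'm assuming a plain case-sensitive substring check where a file is kept if any one of the comma-separated terms appears in it.

```python
def matches_search(text, search):
    """Return True if any comma-separated search term occurs in the text.

    Assumption: plain substring match, and one matching term is enough.
    """
    terms = [t.strip() for t in search.split(",") if t.strip()]
    return any(term in text for term in terms)


# Hypothetical file contents, mirroring --search="async def"
files = {
    "crawler.py": "async def crawl(url):\n    return url\n",
    "utils.py": "def slugify(s):\n    return s.lower()\n",
}
hits = [path for path, text in files.items() if matches_search(text, "async def")]
print(hits)
```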
Get both Python and Markdown files under 5KB:
gitin https://github.com/unclecode/crawl4ai \
--include="*.py,*.md" \
--exclude="tests/*,docs/*" \
--max-size=5000 \
-o small_files.md
Extract markdown files for documentation:
gitin https://github.com/unclecode/crawl4ai \
--include="docs/**/*.md" \
-o documentation.md
The tool generates a clean markdown file with:
- Repository structure
- File contents with syntax highlighting
- Clear separators between files
- Token count estimation for LLMs
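As a rough picture of that output, the sketch below renders a file listing, one fenced section per file, and a token estimate. Everything here is an assumption for illustration: the exact layout, the `render` and `estimate_tokens` helpers, and the "about 4 characters per token" heuristic are mine, and gitin's real output and estimator may look different.

```python
import math

FENCE = "`" * 3  # built at runtime so this example does not nest code fences


def estimate_tokens(text):
    """Very rough heuristic: about 4 characters per token (an assumption)."""
    return math.ceil(len(text) / 4)


def render(files):
    """Render {path: content} as markdown with a listing and file separators."""
    parts = ["Repository structure:"]
    parts += [f"- {path}" for path in files]
    for path, content in files.items():
        lang = "python" if path.endswith(".py") else ""
        parts += ["", "---", f"File: {path}", FENCE + lang, content.rstrip("\n"), FENCE]
    body = "\n".join(parts)
    return body + f"\n\nEstimated tokens: {estimate_tokens(body)}\n"


print(render({"hello.py": "print('hi')\n"}))
```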
Options:
  --version           Show the version and exit
  --exclude TEXT      Comma-separated glob patterns to exclude
                      Example: --exclude="test_*,*.tmp,docs/*"
  --include TEXT      Comma-separated glob patterns to include
                      Example: --include="*.py,src/*.js,lib/*.rb"
  --search TEXT       Comma-separated strings to search in file contents
                      Example: --search="TODO,FIXME,HACK"
  --max-size INTEGER  Maximum file size in bytes (default: 1MB)
  -o, --output TEXT   Output markdown file path [required]
  --help              Show this message and exit
When using the output with AI models:
- Generate the markdown file:
gitin https://github.com/your/repo -o context.md --include="*.py"
- Copy the content to your conversation with the AI model
- The AI model will now have context about your codebase and can help with:
  - Code review
  - Bug fixing
  - Feature implementation
  - Documentation
  - Refactoring suggestions
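If you'd rather script the first two steps, something like the following could work. It is a sketch, assuming gitin is installed and on your PATH; `build_gitin_command` and `generate_context` are hypothetical helper names, and only the flags listed in the options above are used.

```python
import subprocess


def build_gitin_command(repo_url, output, include=None, exclude=None,
                        search=None, max_size=None):
    """Build the gitin argument list from the documented options."""
    cmd = ["gitin", repo_url, "-o", output]
    if include:
        cmd.append(f"--include={include}")
    if exclude:
        cmd.append(f"--exclude={exclude}")
    if search:
        cmd.append(f"--search={search}")
    if max_size is not None:
        cmd.append(f"--max-size={max_size}")
    return cmd


def generate_context(repo_url, output, **filters):
    """Run gitin and return the generated markdown (requires gitin installed)."""
    subprocess.run(build_gitin_command(repo_url, output, **filters), check=True)
    with open(output, encoding="utf-8") as fh:
        return fh.read()


# Building the command has no side effects; generate_context() actually runs it.
command = build_gitin_command("https://github.com/your/repo", "context.md", include="*.py")
print(" ".join(command))
```

The returned text from `generate_context` is what you would paste into the conversation with the model.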
- Token Efficiency: Use --max-size to limit file sizes and stay within context windows
- Relevant Context: Use --search to find specific code patterns or TODO comments
- Multiple Patterns: Combine patterns with commas: --include="*.py,*.js,*.md"
- Exclude Tests: Use --exclude="tests/*,*_test.py" to focus on main code
- Documentation: Include only docs with --include="docs/**/*.md"
I'm unclecode, and I love building tools that make AI development easier. Check out my other project Crawl4AI and follow me on X @unclecode.
Contributions are welcome! Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
I'm extremely busy with Crawl4AI, so I may not be able to check this repository frequently. However, feel free to send a pull request, and I will try to review and approve it.
MIT License - feel free to use in your projects!
See CHANGELOG.md for release history.