danghung1202/Tutorial-Codebase-Knowledge

 
 

AI Codebase Knowledge Builder

Generate beginner-friendly tutorials for any codebase using AI!

Features

  • 🔍 Codebase Analysis: Automatically identify key abstractions and relationships
  • 📊 Interactive Visualization: View relationships between components
  • 📝 Tutorial Generation: Create a multi-chapter tutorial with code examples and diagrams
  • 🧠 Multiple LLM Options: Use Google Gemini, OpenAI, Claude, or DeepSeek models
  • 🌐 User-Friendly Interface: Easy-to-use Streamlit web interface for configuration and generation

🚀 Getting Started

  1. Clone this repository:

    git clone https://github.com/danghung1202/Tutorial-Codebase-Knowledge.git
    cd Tutorial-Codebase-Knowledge
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up your environment variables in a .env file:

    GEMINI_API_KEY=your_gemini_api_key  # Required for default Gemini model
    OPENAI_API_KEY=your_openai_api_key  # Optional, for OpenAI models
    ANTHROPIC_API_KEY=your_anthropic_api_key  # Optional, for Claude models
    GITHUB_TOKEN=your_github_token  # Optional, for private repositories
    
  4. Launch the web interface:

    python -m streamlit run Home.py

    or

    streamlit run Home.py

    This will open a browser window where you can:

    • Enter a GitHub repository URL or upload a local project
    • Choose your preferred LLM model
    • Configure analysis settings
    • Generate and view your tutorial in real-time
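Before launching, it can help to confirm that the keys from step 3 are actually set. The following is a minimal sketch, not part of the project: `parse_env` and `check_keys` are hypothetical helpers, and the required/optional split simply mirrors the comments in the `.env` example.

```python
# Hypothetical helpers (not part of this repository) for sanity-checking a .env file.
REQUIRED = ["GEMINI_API_KEY"]  # needed for the default Gemini model
OPTIONAL = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GITHUB_TOKEN"]


def parse_env(text):
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def check_keys(env):
    """Return (missing_required, missing_optional) for a mapping of settings."""
    missing_required = [k for k in REQUIRED if not env.get(k)]
    missing_optional = [k for k in OPTIONAL if not env.get(k)]
    return missing_required, missing_optional


if __name__ == "__main__":
    sample = "GEMINI_API_KEY=your_gemini_api_key  # Required\nGITHUB_TOKEN=\n"
    required, optional = check_keys(parse_env(sample))
    print("Missing required keys:", required)
    print("Missing optional keys:", optional)
```

This simple parser does not handle quoted values or values that contain `#`.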

For advanced users who prefer the command line, you can also use:

    # Analyze a GitHub repository
    python main.py --repo https://github.com/username/repo --include "*.py" "*.js" --exclude "tests/*"

    # Analyze a local directory
    python main.py --dir /path/to/your/codebase --include "*.py" --exclude "*test*"

    # Generate in different languages
    python main.py --repo https://github.com/username/repo --language "Chinese"

Common CLI options:

  • --repo or --dir - GitHub repo URL or local directory path
  • -n, --name - Project name (optional)
  • -t, --token - GitHub token for private repos
  • -o, --output - Output directory (default: ./output)
  • -i, --include - Files to include (e.g., "*.py" "*.js")
  • -e, --exclude - Files to exclude (e.g., "tests/*")
  • -s, --max-size - Maximum file size in bytes (default: 100KB)
  • --language - Tutorial language (default: "english")
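The options above map naturally onto an `argparse` parser. The sketch below is an illustration only and may differ from the real `main.py`: treating `--repo` and `--dir` as mutually exclusive, and reading the 100KB default as 100,000 bytes, are both assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Generate a tutorial for a codebase.")

    # Assumption: exactly one source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--repo", help="GitHub repo URL")
    source.add_argument("--dir", help="Local directory path")

    parser.add_argument("-n", "--name", help="Project name (optional)")
    parser.add_argument("-t", "--token", help="GitHub token for private repos")
    parser.add_argument("-o", "--output", default="./output", help="Output directory")
    parser.add_argument("-i", "--include", nargs="+", help='Patterns to include, e.g. "*.py"')
    parser.add_argument("-e", "--exclude", nargs="+", help='Patterns to exclude, e.g. "tests/*"')
    # Assumption: "100KB" is taken to mean 100,000 bytes.
    parser.add_argument("-s", "--max-size", type=int, default=100_000,
                        help="Maximum file size in bytes")
    parser.add_argument("--language", default="english", help="Tutorial language")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With `nargs="+"`, several patterns can follow a single flag, which matches the `--include "*.py" "*.js"` usage shown earlier.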

How It Works

  1. Source Analysis: The tool processes your repository files, filtering by specified patterns
  2. Abstraction Identification: An LLM identifies the key abstractions in your codebase
  3. Relationship Analysis: Connections between abstractions are determined
  4. Chapter Ordering: The abstractions are ordered into a logical tutorial structure
  5. Tutorial Generation: Detailed chapters are created with explanations, diagrams, and code examples
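Step 1 can be pictured as a filter over file paths and sizes. The following is a rough sketch under stated assumptions, not the tool's actual implementation: `select_files` is a hypothetical function, patterns are matched with `fnmatch` against POSIX-style relative paths, and the size limit is taken to be 100,000 bytes.

```python
from fnmatch import fnmatchcase


def select_files(files, include=None, exclude=None, max_size=100_000):
    """Pick the paths worth analyzing.

    files: iterable of (relative_path, size_in_bytes) pairs.
    include / exclude: optional lists of glob patterns such as "*.py" or "tests/*".
    """
    selected = []
    for path, size in files:
        if size > max_size:
            continue  # skip files over the size limit
        if include and not any(fnmatchcase(path, p) for p in include):
            continue  # an include list was given and nothing matched
        if exclude and any(fnmatchcase(path, p) for p in exclude):
            continue  # explicitly excluded
        selected.append(path)
    return selected


if __name__ == "__main__":
    files = [
        ("src/app.py", 1_200),
        ("tests/test_app.py", 800),
        ("static/bundle.js", 250_000),
        ("README.md", 500),
    ]
    print(select_files(files, include=["*.py", "*.js"], exclude=["tests/*"]))
```

Note that in `fnmatch` patterns `*` also matches path separators, so `"*.py"` matches files in any subdirectory.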

Customization

The web interface provides easy access to all customization options:

  • LLM Provider: Choose between Google Gemini, Anthropic Claude, OpenAI, or DeepSeek
  • Model Parameters: Select specific models and configure API settings
  • File Filtering: Include/exclude specific file patterns with an intuitive interface
  • Output Settings: Customize the output directory and format
  • Language Selection: Generate tutorials in different languages
  • Real-time Preview: View the generated content as it's being created

Output Examples

The generated tutorial includes:

  • A markdown index file with project summary and relationship diagram
  • Individual chapter markdown files for each abstraction
  • Mermaid diagrams visualizing concepts and relationships
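To give a feel for that layout, here is a small sketch that assembles an index file with a Mermaid relationship diagram and links to chapter files. It is an assumption-laden illustration: `build_index`, the `A0`/`A1` node IDs, and the `01_name.md` file naming are hypothetical and may not match what the tool actually writes.

```python
import re

FENCE = "`" * 3  # Markdown code fence


def slug(name):
    """Turn an abstraction name into a filename-safe slug."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def build_index(project, summary, abstractions, relationships):
    """Return the Markdown text of an index page.

    abstractions: list of names, in chapter order.
    relationships: list of (from_index, to_index, label) tuples.
    """
    lines = [f"# Tutorial: {project}", "", summary, "", FENCE + "mermaid", "flowchart TD"]
    for i, name in enumerate(abstractions):
        lines.append(f'    A{i}["{name}"]')
    for src, dst, label in relationships:
        lines.append(f'    A{src} -- "{label}" --> A{dst}')
    lines += [FENCE, "", "## Chapters", ""]
    for i, name in enumerate(abstractions, start=1):
        lines.append(f"{i}. [{name}]({i:02d}_{slug(name)}.md)")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_index("Demo", "A small demo project.", ["Node", "Flow"], [(1, 0, "runs")]))
```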

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Turns Codebase into Easy Tutorial with AI

License: MIT

Ever stared at a new codebase written by others feeling completely lost? This tutorial shows you how to build an AI agent that analyzes GitHub repositories and creates beginner-friendly tutorials explaining exactly how the code works.

This is a tutorial project of Pocket Flow, a 100-line LLM framework. It crawls GitHub repositories and builds a knowledge base from the code. It analyzes entire codebases to identify core abstractions and how they interact, and transforms complex code into beginner-friendly tutorials with clear visualizations.

  🔸 🎉 Reached the Hacker News front page (April 2025) with more than 800 upvotes.

⭐ Example Results for Popular GitHub Repositories!

🤯 All these tutorials are generated entirely by AI by crawling the GitHub repo!

  • AutoGen Core - Build AI teams that talk, think, and solve problems together like coworkers!

  • Browser Use - Let AI surf the web for you, clicking buttons and filling forms like a digital assistant!

  • Celery - Supercharge your app with background tasks that run while you sleep!

  • Click - Turn Python functions into slick command-line tools with just a decorator!

  • Codex - Turn plain English into working code with this AI terminal wizard!

  • Crawl4AI - Train your AI to extract exactly what matters from any website!

  • CrewAI - Assemble a dream team of AI specialists to tackle impossible problems!

  • DSPy - Build LLM apps like Lego blocks that optimize themselves!

  • FastAPI - Create APIs at lightning speed with automatic docs that clients will love!

  • Flask - Craft web apps with minimal code that scales from prototype to production!

  • Google A2A - The universal language that lets AI agents collaborate across borders!

  • LangGraph - Design AI agents as flowcharts where each step remembers what happened before!

  • LevelDB - Store data at warp speed with Google's engine that powers blockchains!

  • MCP Python SDK - Build powerful apps that communicate through an elegant protocol without sweating the details!

  • NumPy Core - Master the engine behind data science that makes Python as fast as C!

  • OpenManus - Build AI agents with digital brains that think, learn, and use tools just like humans do!

  • Pydantic Core - Validate data at rocket speed with just Python type hints!

  • Requests - Talk to the internet in Python with code so simple it feels like cheating!

  • smolagents - Build tiny AI agents that punch way above their weight class!

  • Showcase Your AI-Generated Tutorials in Discussions!

💡 Development Tutorial

  • I built this using Agentic Coding, the fastest development paradigm, where humans simply design and agents code.

  • The secret weapon is Pocket Flow, a 100-line LLM framework that lets agents (e.g., Cursor AI) build for you.

  • Check out the step-by-step YouTube development tutorial.


