A Chrome extension that generates visual diagrams from web pages using Chrome's built-in Gemini Nano AI. After the initial model download, it runs entirely offline, extracting key relationships and structures from any web page to create interactive diagrams.
- Multi-Turn AI Pipeline: Advanced 6-stage conversational AI processing for superior accuracy
- Content Understanding: Deep analysis of tone, intent, and content type with confidence scoring
- Entity-Relationship Extraction: Discovers connections between people, organizations, concepts, and processes
- Self-Critique & Refinement: AI reviews and improves its own output for better quality
- Visual Editor: Full-featured diagram editor with drag-and-drop node/edge manipulation
- Code Editor: Direct Mermaid syntax editing with real-time validation and preview
- Undo/Redo System: History management that tracks up to 50 operations
- Clipboard Operations: Copy, cut, paste, and duplicate elements with smart positioning
- Multi-Selection: Select and modify multiple elements simultaneously
- Focus Mode: Click any node to explore its connections with navigation history
- Diagram Diffing: Visual comparison showing changes between page visits
- Progressive Processing: Real-time progress updates during AI analysis
- Live Validation: Instant Mermaid syntax checking with error reporting
- Language Detection: Automatic programming language identification
- Specialized Diagrams: Code-specific diagram types (execution flow, data flow, dependencies)
- Syntax Highlighting: Enhanced code block processing and visualization
- Auto-Detection: Intelligent diagram type selection based on content
- Flowcharts: Process and workflow visualization
- Mind Maps: Hierarchical concept organization
- Timelines: Chronological event sequences
- State Diagrams: System state and transition modeling
- Relationship Graphs: Entity connection visualization
- 100% Offline: Uses Chrome's built-in Gemini Nano AI - no data leaves your device
- Multiple Export Formats: SVG, PNG, PDF, Interactive HTML, JSON Data, and Markdown
- Smart Content Extraction: Processes headers, paragraphs, lists, tables, and code blocks with enhanced validation
- Robust Error Handling: Comprehensive content validation with specific error messages for different scenarios
- Intelligent Caching: Fast diagram recall with 7-day expiration and content-based invalidation
- Text Selection Support: Generate diagrams from selected text on any page
- Comprehensive Settings: Customizable preferences with import/export functionality
- Chrome Browser: Chrome 140+ with AI features enabled
- System Requirements:
- 22GB+ free storage space (for AI model download)
- 4GB+ VRAM OR 16GB+ RAM with 4+ CPU cores
- Unmetered internet connection (for initial model download)
- Clone and Build:

      git clone <repository-url>
      cd visual-diagram-generator-extension
      npm install
      npm run build

- Load Extension:
  - Open Chrome → chrome://extensions/
  - Enable "Developer mode" (top-right toggle)
  - Click "Load unpacked" → Select extension/dist/ folder
  - Extension appears in toolbar
- Enable Chrome AI:
  - Go to chrome://flags/
  - Search for "AI" or "Gemini"
  - Enable AI-related flags
  - Restart Chrome
- First Use:
  - Click extension icon to open sidepanel
  - AI model will download automatically (may take several minutes)
  - Wait for "🟢 AI Model Ready" status
- Open Sidepanel: Click extension icon in Chrome toolbar
- Navigate to Content: Visit any webpage with substantial text content
- Generate Diagram: Click "Generate Diagram" button
- View Results: Interactive diagram appears with AI analysis
- Export: Use export buttons for SVG, PNG, PDF, or other formats
- Enable "Focus Mode" toggle in settings (enableFocusMode)
- Click any node to explore its connections
- Navigate between related concepts with back/forward controls
- View breadcrumb navigation history
- Click "Edit Diagram" to enter edit mode
- Visual Editor: Drag nodes, add/remove connections, modify properties
- Code Editor: Direct Mermaid syntax editing with validation
- History: Undo/redo with Ctrl+Z/Ctrl+Y
- Clipboard: Copy/paste elements with Ctrl+C/Ctrl+V
- Progressive Results: Real-time progress updates during AI processing (enableProgressiveResults)
- Inferred Relationships: AI-powered relationship suggestions (enableInferredRelationships)
- Tone Analysis: Content tone and intent analysis with confidence indicators (enableToneAnalysis)
- Evidence Display: Supporting evidence for extracted relationships (enableEvidenceDisplay)
- Diagram Diffing: Visual comparison showing changes between page visits (enableDiagramDiff)
- Content Intelligence: Automatic detection of content type and optimal diagram style
- Smart Caching: Intelligent cache invalidation based on content changes
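Diagram diffing between page visits can be pictured as a set comparison over relationships. The sketch below is only an illustration under assumed names (`DiffEdge`, `diffDiagrams`); the extension's actual diff logic is not shown in this README.

```typescript
// Hypothetical edge shape; the extension's real data model may differ.
interface DiffEdge {
  from: string;
  to: string;
  label: string;
}

interface DiagramDiff {
  added: DiffEdge[];
  removed: DiffEdge[];
  unchanged: DiffEdge[];
}

// Build a stable key so two edges with the same endpoints and label compare equal.
const edgeKey = (edge: DiffEdge): string =>
  JSON.stringify([edge.from, edge.label, edge.to]);

// Compare the diagram from a previous visit with the current one.
function diffDiagrams(previous: DiffEdge[], current: DiffEdge[]): DiagramDiff {
  const before = new Set(previous.map(edgeKey));
  const after = new Set(current.map(edgeKey));
  return {
    added: current.filter((edge) => !before.has(edgeKey(edge))),
    removed: previous.filter((edge) => !after.has(edgeKey(edge))),
    unchanged: current.filter((edge) => before.has(edgeKey(edge))),
  };
}
```

A renderer could then color `added` edges one way and `removed` edges another to produce the visual comparison.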
- Auto-Detection: System suggests optimal diagram type
- Manual Selection: Choose from flowchart, mindmap, timeline, state diagram
- Code Intelligence: Specialized diagrams for programming content
- Access via gear icon in sidepanel
- Configure diagram generation preferences
- Customize visual themes and export options
- Import/export settings and cache data
- Toggle advanced features on/off individually
- Status Check: Look for "🟢 AI Model Ready" in sidepanel
- Download Failed: Click "Retry Download" button
- Chrome Version: Ensure Chrome 140+ with AI flags enabled
- Storage Space: Verify 22GB+ free space available
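As a rough sketch, the sidepanel status could be derived from the availability state that Chrome's Prompt API reports. The four state strings below match what that API is understood to return at the time of writing. The function name, the `StatusView` shape, and the two intermediate labels are assumptions; only "AI Model Ready", "AI Model Not Available", "Download Failed", and the "Retry Download" button come from this README.

```typescript
// Availability states as reported by Chrome's built-in Prompt API (assumed current).
type ModelAvailability = "unavailable" | "downloadable" | "downloading" | "available";

interface StatusView {
  label: string;
  showRetry: boolean; // whether to show the "Retry Download" button
}

// Labels for the two intermediate states are hypothetical.
const STATUS_LABELS: Record<ModelAvailability, string> = {
  available: "🟢 AI Model Ready",
  downloading: "Downloading AI model...",
  downloadable: "AI model not downloaded yet",
  unavailable: "AI Model Not Available",
};

// Map an availability state (plus a failure flag) to what the sidepanel shows.
function describeModelStatus(
  availability: ModelAvailability,
  downloadFailed: boolean = false,
): StatusView {
  if (downloadFailed) {
    return { label: "Download Failed", showRetry: true };
  }
  return { label: STATUS_LABELS[availability], showRetry: false };
}
```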
- No Content: Ensure page has readable text (not just images)
- Look for specific error: "No content found on this page to extract"
- Extension validates all content types: headers, paragraphs, lists, tables, code blocks
- Restricted Pages: Cannot process browser internal pages (chrome://, about:, etc.)
- Error message: "Cannot extract content from browser internal pages"
- Processing Errors: Check browser console for detailed error messages
- Slow Performance: Try selecting specific text instead of entire page
- Cache Issues: Clear extension cache in settings if diagrams seem outdated
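The two content errors above can be sketched as a small pre-check. This is a minimal illustration, not the extension's actual validator: the function name and prefix list are assumptions, while the error strings are the ones quoted in this README.

```typescript
// Prefixes named in this README; a real list would likely include more schemes.
const RESTRICTED_PREFIXES: string[] = ["chrome://", "about:"];

// Returns an error message, or null when extraction can proceed.
function validateExtraction(url: string, extractedText: string): string | null {
  const normalized = url.trim().toLowerCase();
  if (RESTRICTED_PREFIXES.some((prefix) => normalized.startsWith(prefix))) {
    return "Cannot extract content from browser internal pages";
  }
  if (extractedText.trim().length === 0) {
    return "No content found on this page to extract";
  }
  return null;
}
```

Checking the URL first avoids attempting extraction on pages where content scripts cannot run at all.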
The extension uses an advanced 6-stage conversational AI system:
- Content Understanding (20%): Analyzes topic, type, and themes
- Entity Discovery (40%): Identifies key people, organizations, concepts
- Relationship Extraction (60%): Finds connections between entities
- Self-Critique (80%): AI reviews and improves its own output
- Prioritization (90%): Ranks relationships by importance
- Normalization (100%): Deduplicates and merges similar entities
Each stage builds on previous context, resulting in more accurate and comprehensive diagrams.
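The staged flow with progress reporting can be sketched as follows. This is a simplified, synchronous illustration: the real stages call the AI model and are presumably asynchronous, and `PipelineStage`, `runPipeline`, and `DemoContext` are hypothetical names. The stage names and percentages are taken from the list above.

```typescript
// One pipeline stage: transforms the shared context and reports a progress value.
interface PipelineStage<C> {
  name: string;
  progress: number; // percent reported once this stage completes
  run: (context: C) => C;
}

// Run stages in order, passing each stage's output context to the next.
function runPipeline<C>(
  stages: PipelineStage<C>[],
  initial: C,
  onProgress: (stageName: string, percent: number) => void,
): C {
  let context = initial;
  for (const stage of stages) {
    context = stage.run(context);
    onProgress(stage.name, stage.progress);
  }
  return context;
}

// Demo context that only records which stages ran (a stand-in for real AI output).
interface DemoContext {
  completed: string[];
}

const STAGE_TABLE: Array<[string, number]> = [
  ["Content Understanding", 20],
  ["Entity Discovery", 40],
  ["Relationship Extraction", 60],
  ["Self-Critique", 80],
  ["Prioritization", 90],
  ["Normalization", 100],
];

const demoStages: PipelineStage<DemoContext>[] = STAGE_TABLE.map(
  ([name, progress]) => ({
    name,
    progress,
    run: (context: DemoContext): DemoContext => ({
      completed: context.completed.concat(name),
    }),
  }),
);
```

Because each stage receives the context produced by the one before it, later stages such as Self-Critique can refer back to entities and relationships found earlier.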
- Node Operations: Add, edit, delete, and reposition nodes with drag-and-drop
- Edge Management: Create connections by clicking nodes, modify relationship labels
- Multi-Selection: Select multiple elements with Ctrl+click or selection box
- Smart Positioning: Automatic layout to prevent overlaps and maintain readability
- Direct Syntax Editing: Modify Mermaid code with syntax highlighting
- Real-Time Validation: Instant error checking with helpful suggestions
- Live Preview: See changes reflected immediately in diagram view
- Comprehensive Undo/Redo: Track up to 50 operations with automatic compression
- Clipboard Operations: Copy, cut, paste, and duplicate with Ctrl+C/V/X shortcuts
- Checkpoints: Create named save points for major changes
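A bounded undo/redo history like the one described can be sketched with two stacks. The class below is an assumption-laden illustration (`BoundedHistory` is a hypothetical name); it caps history at 50 entries by dropping the oldest state and does not attempt the compression or named checkpoints mentioned above.

```typescript
// Undo/redo over snapshots of type T (for example, Mermaid source strings).
class BoundedHistory<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;
  private readonly limit: number;

  constructor(initial: T, limit: number = 50) {
    this.present = initial;
    this.limit = limit;
  }

  current(): T {
    return this.present;
  }

  // Record a new state; the oldest undo entry is dropped once the limit is hit.
  push(next: T): void {
    this.past.push(this.present);
    if (this.past.length > this.limit) {
      this.past.shift();
    }
    this.present = next;
    this.future = []; // a fresh edit invalidates the redo stack
  }

  // Step back one state; returns false when there is nothing to undo.
  undo(): boolean {
    if (this.past.length === 0) return false;
    this.future.push(this.present);
    this.present = this.past.pop() as T;
    return true;
  }

  // Step forward one state; returns false when there is nothing to redo.
  redo(): boolean {
    if (this.future.length === 0) return false;
    this.past.push(this.present);
    this.present = this.future.pop() as T;
    return true;
  }
}
```

In this sketch, Ctrl+Z would call `undo()` and Ctrl+Y would call `redo()`, re-rendering from `current()` afterwards.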
- Node-Centric Exploration: Click any node to center and highlight its connections
- Navigation History: Back/forward buttons with breadcrumb trail
- Smooth Transitions: Animated diagram updates for better user experience
- Context Preservation: Maintains selection and view state during navigation
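Node-centric exploration comes down to two pieces: finding the edges that touch the focused node, and keeping a back/forward trail. The following is a hedged sketch with hypothetical names (`FocusEdge`, `focusOn`, `FocusHistory`), not the extension's implementation.

```typescript
interface FocusEdge {
  from: string;
  to: string;
}

// Collect the edges touching a node and the distinct nodes on their other end.
function focusOn(
  nodeId: string,
  edges: FocusEdge[],
): { neighbors: string[]; edges: FocusEdge[] } {
  const connected = edges.filter(
    (edge) => edge.from === nodeId || edge.to === nodeId,
  );
  const neighbors = new Set<string>();
  for (const edge of connected) {
    neighbors.add(edge.from === nodeId ? edge.to : edge.from);
  }
  neighbors.delete(nodeId); // ignore self-loops
  return { neighbors: Array.from(neighbors).sort(), edges: connected };
}

// Browser-style navigation trail for focused nodes.
class FocusHistory {
  private trail: string[] = [];
  private index = -1;

  // Focus a new node; any forward entries are discarded.
  visit(nodeId: string): void {
    this.trail = this.trail.slice(0, this.index + 1);
    this.trail.push(nodeId);
    this.index = this.trail.length - 1;
  }

  back(): string | null {
    if (this.index <= 0) return null;
    this.index--;
    return this.trail[this.index] ?? null;
  }

  forward(): string | null {
    if (this.index >= this.trail.length - 1) return null;
    this.index++;
    return this.trail[this.index] ?? null;
  }

  // Nodes visited up to and including the current one.
  breadcrumb(): string[] {
    return this.trail.slice(0, this.index + 1);
  }
}
```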
- Tone Analysis: Identifies content tone (Analytical, Promotional, Critical, etc.)
- Intent Detection: Determines author purpose (To Inform, To Persuade, To Instruct)
- Confidence Scoring: Each relationship includes confidence level (40-100%)
- Programming Language Detection: Specialized processing for code content
- Multiple Formats: SVG (vector), PNG (raster), PDF (document), HTML (interactive)
- Data Export: JSON (structured data), Markdown (text format)
- Quality Options: Configurable resolution and compression settings
- Batch Export: Export multiple diagrams simultaneously
- Default Diagram Type: Set preferred diagram style (auto-detect, flowchart, mindmap, etc.)
- Processing Mode: Choose between speed-optimized or quality-optimized AI processing
- Confidence Threshold: Filter relationships below specified confidence level
- Entity Limits: Configure maximum entities and relationships per diagram
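The confidence threshold and relationship limit could be applied roughly as below. The field and function names are assumptions for illustration; confidence is treated as a percentage to match the 40-100% range mentioned earlier.

```typescript
// Hypothetical relationship shape with a confidence percentage.
interface ScoredRelationship {
  source: string;
  target: string;
  label: string;
  confidence: number; // percent, e.g. 40-100
}

interface GenerationLimits {
  confidenceThreshold: number; // drop relationships below this percent
  maxRelationships: number; // keep at most this many
}

// Keep the most confident relationships that meet the threshold.
function selectRelationships(
  relationships: ScoredRelationship[],
  limits: GenerationLimits,
): ScoredRelationship[] {
  return relationships
    .filter((rel) => rel.confidence >= limits.confidenceThreshold)
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, limits.maxRelationships);
}
```

Since `filter` returns a new array, the sort does not reorder the caller's input.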
- Theme Selection: Light, dark, or system-based theme switching
- Color Schemes: Multiple color palettes for different diagram types
- Font Settings: Customize text size and font family for diagrams
- Layout Options: Grid alignment, spacing, and positioning preferences
- Progressive Results: Enable real-time progress updates during AI processing (enableProgressiveResults)
- Inferred Relationships: Toggle AI-powered relationship suggestions (enableInferredRelationships)
- Focus Mode: Configure navigation behavior and animation settings (enableFocusMode)
- Diagram Diffing: Enable visual comparison of diagram changes (enableDiagramDiff)
- Tone Analysis: Show content tone and intent analysis (enableToneAnalysis)
- Evidence Display: Display supporting evidence for relationships (enableEvidenceDisplay)
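The six flags above could be modeled as a typed settings object that is merged over defaults when settings are loaded or imported. This is a sketch: the flag names come from this README, but the default values and the `resolveFlags` helper are assumptions.

```typescript
interface FeatureFlags {
  enableProgressiveResults: boolean;
  enableInferredRelationships: boolean;
  enableFocusMode: boolean;
  enableDiagramDiff: boolean;
  enableToneAnalysis: boolean;
  enableEvidenceDisplay: boolean;
}

// Default values here are illustrative assumptions, not the extension's defaults.
const DEFAULT_FLAGS: FeatureFlags = {
  enableProgressiveResults: true,
  enableInferredRelationships: true,
  enableFocusMode: false,
  enableDiagramDiff: true,
  enableToneAnalysis: true,
  enableEvidenceDisplay: true,
};

// Merge stored or imported values over the defaults, ignoring unknown keys
// and any value that is not a boolean.
function resolveFlags(stored: Record<string, unknown>): FeatureFlags {
  const resolved: FeatureFlags = { ...DEFAULT_FLAGS };
  for (const key of Object.keys(DEFAULT_FLAGS) as Array<keyof FeatureFlags>) {
    const value = stored[key];
    if (typeof value === "boolean") {
      resolved[key] = value;
    }
  }
  return resolved;
}
```

Validating each value keeps a malformed settings import from silently disabling or corrupting a feature.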
- Caching: Control diagram cache duration and storage limits
- Export Defaults: Set preferred export formats and quality settings
- Import/Export Settings: Backup and restore all extension preferences
- Cache Management: View, export, or clear stored diagram cache
- Privacy Controls: Configure data retention and processing options
Content → Understanding → Entities → Relationships → Refinement → Prioritization → Normalization → Diagram
Core AI Modules:
- AI Pipeline (extension/src/lib/aiPipeline/): Multi-turn conversational AI system
  - Stages: 6 specialized processing stages with fallback mechanisms
  - Parsers: Robust AI response parsing with error recovery
  - Context Management: Maintains conversation state across AI turns
  - Dynamic Loading: PipelineOrchestrator is dynamically imported to optimize bundle size
- Content Extractor (extension/src/lib/extractor.ts): DOM parsing and semantic chunking
- Diagram Cache (extension/src/lib/diagramCache.ts): Intelligent caching with 7-day expiration
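Caching with 7-day expiration and content-based invalidation can be sketched as a lookup keyed by URL that also stores a hash of the page content. Everything named here is an assumption: `InMemoryDiagramCache` is a hypothetical in-memory stand-in for the extension's storage-backed cache, and the FNV-1a-style hash is a simple placeholder for whatever hashing the real module uses.

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

interface CacheEntry {
  contentHash: string;
  diagram: string;
  createdAt: number; // epoch milliseconds
}

// FNV-1a-style 32-bit hash over UTF-16 code units (illustrative stand-in).
function hashContent(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

class InMemoryDiagramCache {
  private entries = new Map<string, CacheEntry>();

  set(url: string, content: string, diagram: string, now: number = Date.now()): void {
    this.entries.set(url, {
      contentHash: hashContent(content),
      diagram,
      createdAt: now,
    });
  }

  // Returns the cached diagram, or null if it is missing, expired,
  // or the page content has changed since it was cached.
  get(url: string, content: string, now: number = Date.now()): string | null {
    const entry = this.entries.get(url);
    if (!entry) return null;
    const expired = now - entry.createdAt > SEVEN_DAYS_MS;
    const changed = entry.contentHash !== hashContent(content);
    if (expired || changed) {
      this.entries.delete(url);
      return null;
    }
    return entry.diagram;
  }
}
```

Passing `now` explicitly keeps the expiry logic easy to exercise in unit tests.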
Generation → Validation → Rendering → Interaction → Export
Diagram Modules:
- Core System (extension/src/lib/diagram/core/): Modular diagram generation with factory pattern
- Editor System (extension/src/lib/diagram/editor/): Interactive editing with CRUD operations
- Interactive Features (extension/src/lib/diagram/interactive/): Focus mode and navigation
- Export System (extension/src/lib/diagram/export/): Multi-format export (SVG, PNG, PDF, HTML, JSON, Markdown)
- Validation (extension/src/lib/diagram/core/validator.ts): Mermaid syntax validation and sanitization
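To make the generation step concrete, here is a minimal sketch of turning extracted relationships into Mermaid flowchart source. The names (`DiagramEdge`, `toMermaidFlowchart`) are hypothetical, and the character stripping is a crude stand-in for the validation and sanitization the extension performs; it is not a replacement for it.

```typescript
interface DiagramEdge {
  from: string;
  to: string;
  label: string;
}

// Remove characters that would break Mermaid node or edge label syntax.
function cleanLabel(text: string): string {
  return text.replace(/["|\[\]{}<>]/g, " ").replace(/\s+/g, " ").trim();
}

// Emit Mermaid flowchart source with generated node ids (n0, n1, ...).
function toMermaidFlowchart(edges: DiagramEdge[]): string {
  const ids = new Map<string, string>();
  const idFor = (name: string): string => {
    let id = ids.get(name);
    if (id === undefined) {
      id = "n" + ids.size;
      ids.set(name, id);
    }
    return id;
  };

  // Building edge lines first also registers every node id.
  const edgeLines = edges.map((edge) => {
    const label = cleanLabel(edge.label);
    const arrow = label.length > 0 ? "-->|" + label + "|" : "-->";
    return "  " + idFor(edge.from) + " " + arrow + " " + idFor(edge.to);
  });
  const nodeLines = Array.from(ids.entries()).map(
    ([name, id]) => "  " + id + '["' + cleanLabel(name) + '"]',
  );
  return ["flowchart TD"].concat(nodeLines, edgeLines).join("\n");
}
```

Using generated ids rather than entity names as node identifiers sidesteps most quoting problems, since only the display label carries user-derived text.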
Sidepanel (React UI) → Background Service Worker → Offscreen Document (AI Processing)
                       ↓
                       Content Script (DOM Access)
Extension Components:
- Sidepanel (extension/src/sidepanel/): React-based UI with Tailwind CSS and real-time status updates
- Background Service Worker (extension/src/background/): Enhanced message coordination with lifecycle management, error handling, and badge notifications
- Offscreen Document (extension/src/offscreen/): Dedicated AI processing context with direct window.ai access
- Content Script (extension/src/content/): DOM extraction and content analysis
- Options Page (extension/src/options/): Settings and preferences management
- Message Router (extension/src/utils/messageRouter.ts): Advanced message routing with retries and broadcasting
- Error Boundary (extension/src/utils/errorBoundary.ts): Comprehensive error handling with recovery strategies
- AI Telemetry (extension/src/lib/aiTelemetry.ts): Performance monitoring and usage analytics
Chrome Extension MV3 Enhancements:
- Service Worker Lifecycle: Keep-alive mechanism prevents termination during AI processing
- Type-Safe Messaging: Comprehensive message passing with error recovery
- Advanced Message Router: Reliable message routing with automatic retries and broadcasting
- Optimized Storage: Strategic use of session/local/sync storage areas
- Error Handling: Global error tracking with recovery action suggestions
- Comprehensive Error Boundary: Automatic error classification with recovery strategies
- Badge Notifications: Real-time visual feedback for processing status
- AI Telemetry: Performance monitoring and usage analytics with automated recommendations
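Type-safe messaging between extension contexts is commonly built on a discriminated union plus a runtime guard, since messages arrive as untyped values. The sketch below illustrates that pattern only; the message names and payload fields are hypothetical and are not the extension's actual protocol.

```typescript
// Hypothetical message protocol, discriminated on the `type` field.
type ExtensionMessage =
  | { type: "EXTRACT_CONTENT"; tabId: number }
  | { type: "GENERATE_DIAGRAM"; text: string }
  | { type: "PROGRESS"; stage: string; percent: number };

type ExtensionResponse =
  | { ok: true; data: string }
  | { ok: false; error: string };

// Runtime guard for values received from another context.
function isExtensionMessage(value: unknown): value is ExtensionMessage {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  switch (candidate["type"]) {
    case "EXTRACT_CONTENT":
      return typeof candidate["tabId"] === "number";
    case "GENERATE_DIAGRAM":
      return typeof candidate["text"] === "string";
    case "PROGRESS":
      return (
        typeof candidate["stage"] === "string" &&
        typeof candidate["percent"] === "number"
      );
    default:
      return false;
  }
}

// The compiler narrows `message` in each case, so payload access is checked.
function handleMessage(message: ExtensionMessage): ExtensionResponse {
  switch (message.type) {
    case "EXTRACT_CONTENT":
      return { ok: true, data: "extract from tab " + message.tabId };
    case "GENERATE_DIAGRAM":
      return { ok: true, data: "generate from " + message.text.length + " chars" };
    case "PROGRESS":
      return { ok: true, data: message.stage + ": " + message.percent + "%" };
    default: {
      // Exhaustiveness check: adding a message type without a case fails to compile.
      const unreachable: never = message;
      return { ok: false, error: "Unknown message: " + JSON.stringify(unreachable) };
    }
  }
}
```

A router would typically call `isExtensionMessage` on each incoming value and reply with an `ok: false` response when the guard fails, before dispatching to `handleMessage`.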
# Development with hot reload
npm run dev
# Production build
npm run build
# Type checking
npm run type-check
# Code linting
npm run lint
# Unit tests (Vitest)
npm test
npm run test:coverage
# E2E tests (Playwright)
npm run test:e2e
npm run test:e2e:ui
npm run test:e2e:headed
# Integration tests (Puppeteer)
npm run test:puppeteer
npm run test:puppeteer:advanced
# Live API testing (requires Chrome AI)
npm run test:live
- Setup: npm install → npm run build
- Load Extension: Chrome Extensions → Load unpacked → extension/dist/
- Development: npm run dev for hot reload
- Testing: npm test for unit tests, npm run test:e2e for full workflow
- Quality: npm run lint and npm run type-check before commits
- Prompt API: Multi-turn conversational AI for entity-relationship extraction
- Language Model: Chrome's built-in Gemini Nano for offline processing
- Availability: Chrome 140+ with AI flags enabled
- Storage: 22GB+ free space for AI model download
- Memory: 4GB+ VRAM OR 16GB+ RAM with 4+ CPU cores
- Network: Unmetered connection for initial model download (offline after setup)
- activeTab: Access current page content
- storage: Cache diagrams and settings with optimized storage strategy
- sidePanel: Display extension UI with real-time status updates
- scripting: Inject content scripts for DOM access
- offscreen: Dedicated AI processing context for improved reliability
- 100% Offline Processing: All AI analysis happens locally using Chrome's built-in Gemini Nano
- Zero External Requests: No user data ever leaves your device
- Local Storage Only: Diagrams cached in browser's local storage with 7-day expiration
- No Telemetry: No usage tracking, analytics, or data collection
- XSS Prevention: Comprehensive DOMPurify sanitization prevents script injection
- Input Validation: All user inputs validated and sanitized before processing
- Content Security Policy: Strict CSP prevents unauthorized script execution
- Secure Defaults: Conservative security settings with user control over data retention
- GDPR Ready: No personal data processing or storage
- Enterprise Safe: Suitable for corporate environments with strict data policies
- Audit Trail: All processing happens transparently with optional debug logging
"AI Model Not Available":
- Verify Chrome 140+ with AI flags enabled at chrome://flags/
- Check system requirements (22GB storage, sufficient RAM/VRAM)
- Ensure stable internet connection for initial model download
"Download Failed":
- Click "Retry Download" button in extension sidepanel
- Restart Chrome and try again
- Check available storage space (need 22GB+ free)
Extension Won't Load:
- Verify build completed: npm run build
- Check extension/dist/ folder exists and contains files
- Review Chrome extension console for error messages
Diagram Generation Fails:
- Ensure page has substantial text content (not just images/videos)
- Try selecting specific text instead of processing entire page
- Check browser console (F12) for detailed error messages
- Clear extension cache in settings if diagrams seem outdated
Slow Processing:
- Large pages take longer - try text selection for specific sections
- Check system resources (CPU/memory usage)
- Clear browser cache and restart Chrome
Memory Problems:
- Close other Chrome tabs to free memory
- Restart Chrome to clear memory leaks
- Check system meets minimum requirements
- Manual Testing Guide: MANUAL_TESTING_GUIDE.md - Comprehensive testing procedures
- Feature Documentation: This README covers all user-facing features
- Developer Guide: CLAUDE.md - Complete development guidance and architecture
- AI Pipeline Guide: docs/AI_PIPELINE_COMPLETE_GUIDE.md - Multi-turn AI system architecture
- Diagram Editor System: docs/DIAGRAM_EDITOR_SYSTEM.md - Interactive editing documentation
- Enhancement Priorities: docs/CURRENT_ENHANCEMENT_PRIORITIES.md - Current roadmap and priorities
- Project Structure: .kiro/steering/structure.md - File organization and naming conventions
- Technology Stack: .kiro/steering/tech.md - Build system and dependencies
- Product Overview: .kiro/steering/product.md - Features and target users
- Documentation Structure: docs/DOCUMENTATION_STRUCTURE.md - Organization guide
- Archived Analysis: docs/archive/ - Historical implementation and analysis documents
- Fork & Clone: Fork the repository and clone your fork
- Setup: npm install → npm run build → Load extension in Chrome
- Development: Use npm run dev for hot reload during development
- Testing: Run npm test and npm run test:e2e before submitting
- Code Standards: Follow TypeScript and React best practices
- Testing: Add unit tests for new functionality, update E2E tests for UI changes
- Documentation: Update relevant documentation for new features
- Commit Messages: Use conventional commit format (feat:, fix:, docs:, etc.)
- Feature Branch: Create from main with descriptive name
- Quality Checks: Ensure npm run lint and npm run type-check pass
- Testing: Verify all tests pass with npm test and npm run test:e2e
- Documentation: Update README and relevant docs for new features
- Review: Submit PR with clear description of changes and testing performed
Refer to .kiro/steering/ files for detailed coding standards and project structure guidelines.
MIT License - see LICENSE file for details
- Troubleshooting: Check the troubleshooting section above for common issues
- Documentation: Review MANUAL_TESTING_GUIDE.md for detailed usage instructions
- GitHub Issues: Search existing issues or create new ones with detailed information
When creating an issue, include:
- Chrome Version: From chrome://version/
- Extension Version: From chrome://extensions/
- System Info: OS, available RAM/storage
- Console Logs: Browser console errors (F12 → Console)
- Steps to Reproduce: Detailed steps to recreate the issue
- Expected vs Actual: What you expected vs what happened
Use the enhancement template in .github/ISSUE_TEMPLATE/ and include:
- Use Case: Why this feature would be valuable
- Proposed Solution: How you envision it working
- Alternatives: Other approaches you've considered