GhostTypes/gemini-discord-bot

Gemini Discord Bot 🤖

Built with: Node.js, TypeScript, discord.js, Genkit, GenAI, ESLint, Vite, Prisma

An advanced AI-powered Discord bot featuring multimodal content processing, sophisticated game systems, and intelligent conversation management 🚀

🎯 Core Features

| Feature | Description |
| --- | --- |
| Real-time Streaming Chat | AI responses stream in real-time with intelligent message editing and context awareness |
| Dynamic Personality System | AI-powered conversation tone detection with configurable personalities and adaptive response styles |
| Advanced Image Editing | Google Imagen integration for multi-image editing operations with direct prompt processing |
| Multimodal Content Processing | Comprehensive support for images, videos, PDFs, URLs, and rich media analysis |
| Complete Game System | Six fully-featured games with AI integration including TicTacToe, RPG, GeoGuesser, and more |
| Intelligent Message Routing | AI-powered intent classification and specialized flow orchestration |
| Advanced Security Framework | Standalone prompt injection testing, hierarchical operators, and comprehensive security validation |
| Real-time Debug UI | Modular flow monitoring interface with live statistics, WebSocket updates, and comprehensive debugging |
| Message Cache System | Sliding window conversation cache with 64-message threshold and context optimization |
| Content Detection Engine | Generic attachment caching and multimodal content analysis |
| Voice Generation | Multi-voice text-to-speech with Google AI integration |
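
Discord caps a single message at 2,000 characters, so a streamed reply that keeps growing eventually has to spill into follow-up messages. The sketch below shows one way that splitting step could work. The helper name `splitForDiscord` and its whitespace-preferring strategy are illustrative assumptions, not code taken from this repository.

```typescript
// Hypothetical helper (not from the repository): split accumulated
// streamed text into bodies that each fit in one Discord message.
const DISCORD_MESSAGE_LIMIT = 2000;

function splitForDiscord(text: string, limit: number = DISCORD_MESSAGE_LIMIT): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const parts: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer to break at the last newline or space inside the window.
    const window = rest.slice(0, limit);
    let cut = Math.max(window.lastIndexOf("\n"), window.lastIndexOf(" "));
    if (cut <= 0) cut = limit; // no usable whitespace: hard split at the limit
    parts.push(rest.slice(0, cut));
    // Drop the whitespace we broke on so the next part starts cleanly.
    rest = rest.slice(cut).replace(/^\s+/, "");
  }
  if (rest.length > 0) parts.push(rest);
  return parts;
}
```

With this shape, the first part would be written by editing the message that is already streaming, and any later parts would be sent as new messages.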

🗣️ Voice Generation System

| Feature | Description |
| --- | --- |
| Multi-Voice Support | 26 distinct AI voices with diverse personalities including Charon (professional), Fenrir (excitable), Aoede (musical), and Kore (authoritative) |
| Advanced Audio Processing | PCM to WAV conversion, duration calculation, and waveform visualization |
| Discord Integration | Seamless audio file delivery with metadata and interactive voice selection |
| Content Filtering | Built-in safety validation and content moderation |
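
The PCM to WAV conversion and duration calculation listed above come down to prepending a 44-byte RIFF header and dividing the byte count by the byte rate. The sketch below illustrates both steps. The function names are hypothetical, and the sample rate, channel count, and bit depth are parameters because this README does not state what the TTS output uses.

```typescript
// Hypothetical helpers (not from the repository).
// Wrap raw little-endian PCM samples in a canonical 44-byte RIFF/WAVE header.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate: number,
  channels: number,
  bitsPerSample: number = 16,
): Uint8Array {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // size of the fmt chunk
  view.setUint16(20, 1, true);              // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true);     // size of the sample data
  out.set(pcm, 44);
  return out;
}

// Duration in seconds of a raw PCM buffer with the given format.
function pcmDurationSeconds(
  byteLength: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number = 16,
): number {
  return byteLength / (sampleRate * channels * (bitsPerSample / 8));
}
```

For example, 48,000 bytes of 16-bit mono audio at an assumed 24 kHz sample rate work out to exactly one second.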

🎮 Complete Game System

| Game | Description |
| --- | --- |
| AI Uprising | Immersive text-based RPG with AI-powered narrative, combat system, character progression, and equipment management |
| TicTacToe | Classic game with intelligent AI opponent featuring three difficulty levels and strategic decision-making |
| GeoGuesser | Geographic guessing game with AI validation, curated location database, and Mapillary street imagery |
| Blackjack | Full casino-style card game with betting system, chip management, and standard blackjack rules |
| Hangman | Word guessing game with AI word generation, visual progression, and progressive hint system |
| WordScramble | Programming-focused vocabulary puzzle with multiplayer support and technology terms |

🎭 Dynamic Personality System

| Feature | Description |
| --- | --- |
| AI-Powered Tone Detection | Analyzes user messages and conversation context to determine optimal response personality |
| Configurable Personalities | JSON-based personality definitions with custom system prompts and behavioral patterns |
| Context-Aware Selection | Intelligent personality routing based on message content, user intent, and conversation history |
| Enhanced Media Awareness | Personality system includes advanced media processing capabilities with conversation history context and intelligent media filtering |
| Fallback Reliability | Graceful degradation to general personality for consistent user experience |
| Structured Output Validation | Zod schema validation ensuring reliable personality detection and selection |
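
Structured output validation and the fallback to the general personality fit together naturally: if the model's answer does not match the expected shape, the bot can fall back rather than fail. The sketch below uses a plain TypeScript type guard as a dependency-free stand-in for the Zod schema the project uses. The personality ids other than `general`, the `confidence` field, and the function names are assumptions made for illustration.

```typescript
// Hypothetical ids; the real set lives in the project's JSON personality definitions.
const PERSONALITIES = ["general", "technical", "playful"] as const;
type Personality = (typeof PERSONALITIES)[number];

interface PersonalityChoice {
  personality: Personality;
  confidence: number; // assumed field, expected in the range 0..1
}

// Hand-written guard standing in for a Zod schema check.
function isPersonalityChoice(value: unknown): value is PersonalityChoice {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.personality === "string" &&
    (PERSONALITIES as readonly string[]).includes(v.personality) &&
    typeof v.confidence === "number" &&
    v.confidence >= 0 &&
    v.confidence <= 1
  );
}

// Parse the model's raw JSON answer; degrade to "general" on any problem.
function selectPersonality(rawModelOutput: string): Personality {
  try {
    const parsed: unknown = JSON.parse(rawModelOutput);
    if (isPersonalityChoice(parsed)) return parsed.personality;
  } catch {
    // Malformed JSON falls through to the fallback below.
  }
  return "general";
}
```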

✨ Advanced Image Editing

| Capability | Description |
| --- | --- |
| Multi-Image Operations | Support for combining, merging, and complex editing operations across multiple images |
| Direct Prompt Processing | Takes user editing instructions exactly as-is without parsing or modification |
| Google Imagen Integration | Optimized parameters and specialized prompting for professional image editing |
| Comprehensive Scenarios | Single image edits, multi-image combining, style modifications, and object manipulation |
| Type-Safe Processing | Zod schema validation throughout the image editing pipeline |

🎨 AI-Powered Content Generation

| Capability | Description |
| --- | --- |
| Image Generation | Gemini 2.0 Flash image generation with natural language prompt parsing and multiple artistic styles |
| Code Execution | Server-side Python code execution with real-time streaming and comprehensive error handling |
| Search Grounding | Real-time web search integration with citation support and source verification |
| Video Analysis | YouTube and general video processing with multimodal AI understanding from attachments and conversation history |
| PDF Processing | Document analysis and content extraction with streaming responses from both current and cached documents |
| URL Context | Web page analysis and content summarization with intelligent routing |
| Conversation History Media | AI-powered intelligent access and analysis of images, videos, and PDFs from recent conversation history when contextually relevant |
| Smart Media Filtering | AI-powered relevance detection that automatically filters conversation media to include only contextually relevant items, reducing token usage and improving response accuracy |

🧠 Intelligent Message Processing

| Component | Description |
| --- | --- |
| Flow Orchestrator | Central routing hub analyzing content and directing to appropriate specialized processing flows with configurable AI routing models |
| Content Detection Service | Comprehensive analysis of message content to determine optimal processing strategies |
| Message Validator | Response strategy determination based on mentions, replies, game mode, and autonomous opportunities |
| Context Optimization | AI-powered conversation history filtering with relevance scoring and token optimization |
| Media Relevance Engine | Advanced AI-powered system that analyzes cached media from conversation history to determine contextual relevance to current user requests, featuring configurable thresholds, multi-dimensional scoring, and performance optimization |
| Attachment Caching | Generic caching system eliminating duplicate downloads and enabling instant access |
| MCP Integration | Model Context Protocol server management with dynamic tool discovery, parameter resolution, and intelligent tool chaining capabilities |
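
The configurable thresholds mentioned for the Media Relevance Engine can be pictured as a simple filter applied once each cached item has a relevance score. The sketch below covers only that thresholding step and assumes the scores were already produced by the AI relevance check. The type, function name, and default values are hypothetical.

```typescript
// Hypothetical shape for a cached media item that has already been scored.
interface ScoredMedia {
  id: string;
  relevance: number; // 0..1, assumed to come from the AI relevance check
}

// Keep only items at or above the threshold, most relevant first,
// capped at maxItems to bound token usage. Defaults are illustrative.
function selectRelevantMedia(
  items: readonly ScoredMedia[],
  threshold: number = 0.6,
  maxItems: number = 4,
): ScoredMedia[] {
  return items
    .filter((m) => m.relevance >= threshold) // filter() copies, so the input is not mutated
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, maxItems);
}
```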

🔧 Model Context Protocol (MCP) Integration

| Feature | Description |
| --- | --- |
| Dynamic Server Discovery | Automatic MCP server detection and connection management with configuration caching |
| Intelligent Tool Selection | AI-powered analysis of available MCP tools with automatic parameter mapping and validation |
| Tool Chaining | Advanced capability to chain multiple MCP tools together for complex multi-step operations |
| Parameter Resolution | Dynamic parameter resolution system that intelligently maps Discord message content to MCP tool requirements |
| Configurable Routing | Flexible routing model configuration allowing different AI models for different analysis tasks |
| Real-time Tool Discovery | Live discovery and integration of new MCP tools without bot restart |
| Schema Validation | Comprehensive validation of MCP tool schemas with automatic compatibility checking |

Security & Advanced Testing Framework

| Feature | Description |
| --- | --- |
| Prompt Injection Testing | Standalone testing framework for validating system prompt resilience against injection attacks |
| Centralized System Prompts | Unified prompt management system with personality-specific prompt loading and caching |
| Hierarchical Operators | Primary operator from environment with sub-operator management capabilities |
| Channel Whitelisting | Separate bot and autonomous response whitelists with database persistence |
| Domain Security | Strict domain whitelisting for media processing (Discord CDN, approved platforms only) |
| Content Validation | Comprehensive file size, type, and security validation for all attachments |
| Natural Language Auth | Conversational interface for managing operators and whitelists through AI understanding |
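
Strict domain whitelisting is easy to get subtly wrong, for example by checking whether a URL merely contains an approved host name. The sketch below compares the parsed hostname exactly against an allow list and requires HTTPS. The two hosts shown are Discord CDN hosts used here as an assumed example list, and `isAllowedMediaUrl` is a hypothetical name. The project's actual list of approved platforms is not given in this README.

```typescript
// Assumed example allow list; the project's real list is not shown in this README.
const ALLOWED_MEDIA_HOSTS = new Set<string>([
  "cdn.discordapp.com",
  "media.discordapp.net",
]);

// Hypothetical helper: accept only HTTPS URLs whose host matches exactly.
function isAllowedMediaUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw); // throws on strings that are not absolute URLs
  } catch {
    return false;
  }
  // Exact hostname match, so "cdn.discordapp.com.evil.example" is rejected.
  return url.protocol === "https:" && ALLOWED_MEDIA_HOSTS.has(url.hostname);
}
```

Matching on the parsed hostname rather than on the raw string is the design choice that matters here, since substring or prefix checks can be bypassed with look-alike domains.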

📊 Advanced Data Management

| System | Description |
| --- | --- |
| Message Cache Service | Sliding window conversation cache with automatic initialization and context management |
| Game State Persistence | Complete game state serialization and recovery with timeout management |
| Prisma Database | SQLite-based storage for users, channels, messages, and game sessions |
| Relevance Scoring | Multi-dimensional conversation analysis for intelligent context optimization |
| Generic Attachment System | Unified caching architecture for images, PDFs, videos, and future content types |
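
A sliding window cache keeps only the most recent messages and drops the oldest as new ones arrive. The sketch below assumes the 64-message threshold mentioned under Core Features is the window size, which is an interpretation rather than something this README confirms. The class name `SlidingWindowCache` is hypothetical.

```typescript
// Hypothetical sketch (not from the repository) of a fixed-size sliding window.
class SlidingWindowCache<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;

  // Default of 64 assumes the README's "64-message threshold" is the window size.
  constructor(capacity: number = 64) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Append the newest item and evict the oldest once the window is full.
  add(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift();
  }

  // Return a copy, oldest first, so callers cannot mutate the cache.
  snapshot(): T[] {
    return [...this.items];
  }
}
```

In a bot, one such window would typically be kept per channel, for instance in a `Map` keyed by channel id.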

🔧 Advanced Flow Debug UI

| Component | Description |
| --- | --- |
| Real-time Flow Monitoring | Live WebSocket-powered monitoring of all AI flow executions with instant updates |
| Modular Architecture | Split from monolithic to component-based architecture with AppHeader, ConnectionStatus, ControlPanel, and FlowList |
| Intelligent Flow Sequences | Automatic detection and visualization of flow delegation patterns (MESSAGE_ROUTING → CONVERSATION → AUTH_FLOW) |
| Interactive Flow Cards | Comprehensive flow metadata display with expandable content, action buttons, and performance analysis |
| Advanced Filtering System | Real-time filtering by flow type, user ID, with debounced input handling and persistent state |
| Connection Management | Detailed connection status tracking, uptime statistics, and automatic reconnection handling |
| Accessibility Support | WCAG compliance with keyboard navigation, screen reader announcements, and keyboard shortcuts |
| Export & Analysis | Data export capabilities, performance metrics, and comprehensive debugging information |

Usage:

# Start debug UI development server
npm run dev:debug

# Run bot with debug UI simultaneously  
npm run dev:debug-full

# Run bot with monitoring only (no UI)
npm run dev:monitor

The debug interface runs on http://localhost:3001 with a dark-themed interface featuring Bootstrap 5 styling, real-time WebSocket updates, and comprehensive flow analysis capabilities.

🚀 Getting Started

Prerequisites

  • Node.js 18+ with the npm package manager
  • Discord Bot Token from Discord Developer Portal
  • Google AI API Key with Genkit access
  • Windows environment (optimized for Windows development)

Installation

# 1. Clone the repository
git clone https://github.com/your-username/gemini-discord-bot-rewrite.git
cd gemini-discord-bot-rewrite
# 2. Install dependencies
npm install
# 3. Set up environment variables
cp .env.example .env
# Edit .env with your API keys and configuration
# 4. Initialize the database
npm run db:init
# 5. Build the project
npm run build
# 6. Start the bot
npm start
# 7. For development with hot reload
npm run dev

About

Versatile Discord Bot powered by Gemini
