A full-stack web application for real-time video calls with AI companions, featuring WebRTC peer-to-peer streaming, intelligent conversations powered by Google Gemini, voice synthesis via ElevenLabs, lifelike avatars from D-ID, and conversation memory via LangMem.
- Frontend: React + TypeScript + Vite + Tailwind CSS (deployed on Vercel)
- Backend: Python + FastAPI + Socket.IO (deployed on Render)
- Database: Supabase (PostgreSQL + Auth + Storage)
- AI Services: Google Gemini, ElevenLabs, D-ID, LangMem
- Real-time: WebRTC + Socket.IO for signaling
- User authentication with Supabase Auth
- Browse and select AI companions
- Real-time video calls with WebRTC
- AI-powered conversations with LangMem context retention
- Voice synthesis for AI responses via ElevenLabs
- D-ID animated talking avatars
- Real-time text chat during calls
- Call recording and playback
- Responsive design for mobile and desktop
- WebRTC signaling via Socket.IO
- Row-level security for data protection
- Node.js 18+ and npm
- Python 3.11+
- Supabase account
- Google Gemini API key
- ElevenLabs API key
- D-ID API key
- LangMem (auto-initializes)
- Redis instance (optional, for caching)
- Twilio account (optional, for TURN server)
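Before starting setup, the local toolchain can be checked against the version requirements above. The following is a minimal sketch, not part of the project: the function names `parse_node_major` and `check_prerequisites` are hypothetical, and it only inspects Python, Node.js, and npm (service accounts and API keys are not verified).

```python
import re
import shutil
import subprocess
import sys
from typing import List, Optional


def parse_node_major(version_output: str) -> Optional[int]:
    """Extract the major version from `node --version` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def check_prerequisites() -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []

    # Python 3.11+ is required for the backend.
    if sys.version_info < (3, 11):
        problems.append("Python 3.11+ is required, found " + sys.version.split()[0])

    # Node.js 18+ is required for the frontend.
    node = shutil.which("node")
    if node is None:
        problems.append("Node.js not found on PATH")
    else:
        result = subprocess.run([node, "--version"], capture_output=True, text=True)
        major = parse_node_major(result.stdout)
        if major is None or major < 18:
            problems.append("Node.js 18+ is required, found " + result.stdout.strip())

    if shutil.which("npm") is None:
        problems.append("npm not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```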
- Navigate to the frontend directory:
  cd frontend
- Install dependencies:
  npm install
- Copy .env.example to .env and fill in your values:
  cp .env.example .env
- Start the development server:
  npm run dev
- Navigate to the backend directory:
  cd backend
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Copy .env.example to .env and fill in your values:
  cp .env.example .env
- Start the server:
  python main.py

Run the migration file to create all required tables:
- Go to your Supabase project dashboard
- Navigate to SQL Editor
- Copy the contents of supabase/migrations/001_initial_schema.sql
- Execute the SQL
This creates the following tables with Row Level Security:
- profiles - User profiles
- companions - AI companion data
- video_rooms - Video call rooms
- messages - Chat messages
- call_recordings - Recording metadata
- conversation_contexts - Conversation history
See docs/architecture.md for detailed schema documentation.
VITE_SUPABASE_URL=your_supabase_url
VITE_SUPABASE_ANON_KEY=your_supabase_anon_key
VITE_BACKEND_URL=http://localhost:8000
VITE_WS_URL=http://localhost:8000
SUPABASE_URL=your_supabase_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
GEMINI_API_KEY=your_gemini_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
DID_API_KEY=your_did_api_key
REDIS_URL=redis://localhost:6379
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
FRONTEND_URL=http://localhost:5173
PORT=8000
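The backend variables above (everything without the `VITE_` prefix) could be read and validated at startup along the following lines. This is a sketch under assumptions rather than the project's actual configuration code: the `Settings` class and `load_settings` function are hypothetical names, and the split into required and optional keys is inferred from the prerequisites list, which marks Redis and Twilio as optional.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these keys are mandatory; Redis and Twilio are treated as optional.
REQUIRED_KEYS = (
    "SUPABASE_URL",
    "SUPABASE_SERVICE_KEY",
    "GEMINI_API_KEY",
    "ELEVENLABS_API_KEY",
    "DID_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    supabase_url: str
    supabase_service_key: str
    gemini_api_key: str
    elevenlabs_api_key: str
    did_api_key: str
    redis_url: Optional[str]
    twilio_account_sid: Optional[str]
    twilio_auth_token: Optional[str]
    frontend_url: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing fast if a required key is missing."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        supabase_url=env["SUPABASE_URL"],
        supabase_service_key=env["SUPABASE_SERVICE_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        did_api_key=env["DID_API_KEY"],
        # Empty or absent optional values are normalized to None.
        redis_url=env.get("REDIS_URL") or None,
        twilio_account_sid=env.get("TWILIO_ACCOUNT_SID") or None,
        twilio_auth_token=env.get("TWILIO_AUTH_TOKEN") or None,
        # Defaults mirror the local development values shown above.
        frontend_url=env.get("FRONTEND_URL") or "http://localhost:5173",
        port=int(env.get("PORT") or "8000"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.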
- Push your code to a Git repository
- Import the project in Vercel
- Set the root directory to frontend
- Add environment variables in Vercel dashboard
- Deploy
- Push your code to a Git repository
- Create a new Web Service in Render
- Set the root directory to backend
- Add environment variables in Render dashboard
- Deploy
Render will use the render.yaml configuration file automatically.
Once the backend is running, visit http://localhost:8000/docs for interactive API documentation.
- GET /api/companions - List all companions
- POST /api/video/rooms - Create a video room
- GET /api/webrtc/config - Get WebRTC configuration
- POST /api/did/streams - Create D-ID avatar stream
- POST /api/video/recordings - Upload call recording

- join, offer, answer, candidate - WebRTC signaling
- chat_message - Real-time chat
- end_call - End video session
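To illustrate how the Socket.IO events (join, offer, answer, candidate, chat_message, end_call) might be routed between peers, here is a minimal in-memory sketch. It is not the project's handler code: the `SignalingRelay` class, the `room` payload field, and the injected `send` callback are all hypothetical, and a real server would register equivalent handlers with Socket.IO instead.

```python
from typing import Callable, Dict, List, Set

# Event names come from the list above; payload shapes are assumptions.
RELAYED_EVENTS = {"offer", "answer", "candidate", "chat_message"}

# Callback signature: (recipient_sid, event, payload)
Send = Callable[[str, str, dict], None]


class SignalingRelay:
    """Tracks room membership and forwards events to the other peers in a room."""

    def __init__(self, send: Send) -> None:
        self._send = send
        self._rooms: Dict[str, Set[str]] = {}

    def _peers(self, room: str, sid: str) -> List[str]:
        """Everyone in the room except the sender, in a stable order."""
        return sorted(self._rooms.get(room, set()) - {sid})

    def handle(self, sid: str, event: str, payload: dict) -> None:
        room = payload["room"]  # hypothetical field carrying the room id
        if event == "join":
            self._rooms.setdefault(room, set()).add(sid)
        elif event in RELAYED_EVENTS:
            # The server never inspects SDP or ICE data; it only forwards it.
            for peer in self._peers(room, sid):
                self._send(peer, event, payload)
        elif event == "end_call":
            # Notify the remaining peers, then discard the room.
            for peer in self._peers(room, sid):
                self._send(peer, event, payload)
            self._rooms.pop(room, None)
        else:
            raise ValueError(f"unknown event: {event}")
```

The point of the sketch is that the signaling server only relays messages: the media itself flows peer to peer once the offer, answer, and candidates have been exchanged.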
- React 18
- TypeScript
- Vite
- Tailwind CSS
- React Router
- Zustand (state management)
- Socket.IO Client
- Supabase JS Client
- date-fns
- FastAPI
- Python Socket.IO
- Supabase Python Client
- Google Generative AI (Gemini)
- ElevenLabs
- D-ID
- LangMem
- Redis
- Pydantic
project/
├── frontend/
│ ├── src/
│ │ ├── components/ # React components
│ │ ├── pages/ # Page components
│ │ ├── hooks/ # Custom React hooks
│ │ ├── services/ # API and WebSocket services
│ │ ├── stores/ # Zustand stores
│ │ ├── contexts/ # React contexts
│ │ ├── types/ # TypeScript types
│ │ └── utils/ # Utility functions
│ └── package.json
└── backend/
├── routes/ # API routes
├── services/ # Business logic services
├── websocket/ # WebSocket handlers
├── models/ # Pydantic models
├── utils/ # Utility functions
├── config/ # Configuration
└── requirements.txt
cd frontend
npm run dev # Start dev server
npm run build # Build for production
npm run lint # Run linter
npm run typecheck # Type checking
cd backend
python main.py # Start dev server

License: MIT
- Architecture Documentation - Detailed system architecture, data flows, and technical specifications
- Setup Guide - Comprehensive setup and deployment instructions
- API Documentation - Interactive API docs (when backend is running)
- Conversation memory across sessions
- Context-aware AI responses
- User interaction history tracking
- Real-time animated talking avatars
- WebRTC-based avatar streaming
- Text-to-avatar speech synthesis
- Row Level Security on all tables
- JWT-based authentication
- Secure WebSocket connections
- Environment-specific configurations
- Modular service layer
- Proper error handling
- Production-ready configuration
- Comprehensive documentation
For issues and questions:
- Check the documentation in docs/
- Review the setup guide
- Open an issue on the repository