This project provides real-time audio captioning using OpenAI's Whisper model and AI-powered summarization.
- Real-time audio recording
- Whisper transcription
- Gemini transcription
- AI summary
- Node.js (v18 or higher)
- Yarn (v4.5.3 or higher)
- Python (for Whisper)
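One way to confirm the version requirements above before installing is a small shell helper. The sketch below is illustrative and not part of this project: the function names (`strip_v`, `version_ge`, `check_tool`) are hypothetical, and it assumes each tool prints a plain dotted version for `--version` (as `node` and `yarn` do), with no pre-release suffix.

```shell
#!/bin/sh
# Sketch: check that a tool is installed and meets a minimum version.
# All function names here are hypothetical helpers, not project code.

# strip_v VERSION: drop a leading "v" (Node prints "v18.17.0").
strip_v() {
  printf '%s\n' "${1#v}"
}

# version_ge A B: succeed when dotted version A >= B.
# Only numeric fields are handled; missing fields count as 0.
version_ge() {
  a=$(strip_v "$1")
  b=$(strip_v "$2")
  while [ -n "$a" ] || [ -n "$b" ]; do
    # Take the next dot-separated field from each version.
    fa=${a%%.*}
    fb=${b%%.*}
    # Remove the consumed field (and its dot) from each version.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    fa=${fa:-0}
    fb=${fb:-0}
    if [ "$fa" -gt "$fb" ]; then return 0; fi
    if [ "$fa" -lt "$fb" ]; then return 1; fi
  done
  return 0
}

# check_tool NAME MINIMUM: report whether NAME is on PATH and new enough.
check_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1: not found"
    return 1
  fi
  found=$("$1" --version)
  if version_ge "$found" "$2"; then
    echo "$1: $found (ok, need >= $2)"
  else
    echo "$1: $found (too old, need >= $2)"
    return 1
  fi
}
```

After sourcing this file, `check_tool node 18` and `check_tool yarn 4.5.3` report whether each tool satisfies the requirement listed above. Python is left out because no minimum version is stated for it.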
- Install dependencies:
    yarn install
- Set up Whisper:
    # Install Whisper dependencies (Mac only)
    pip install -U mlx-whisper
- Start the development server:
    yarn dev
- In a separate terminal, start the backend server:
    cd server
    yarn dev
- Open your browser and navigate to http://localhost:5173
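The two server steps above can also be run from a single terminal. The following is a minimal sketch under that assumption, not the project's own tooling: `start_dev` and the `DRY_RUN` variable are hypothetical names, and the script is assumed to be run from the repository root.

```shell
#!/bin/sh
# Sketch: start the backend and frontend dev servers from one terminal.
# start_dev and DRY_RUN are hypothetical names, not part of the project.

start_dev() {
  # With DRY_RUN=1, only print the equivalent commands and return.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "(cd server && yarn dev) &"
    echo "yarn dev"
    return 0
  fi

  # Backend: run `yarn dev` inside ./server in the background.
  # `exec` makes the recorded PID belong to yarn itself.
  ( cd server && exec yarn dev ) &
  backend_pid=$!

  # On exit, send SIGTERM to the backend; turn Ctrl-C into a normal exit
  # so the EXIT trap also runs in shells that skip it on signals.
  trap 'kill "$backend_pid" 2>/dev/null' EXIT
  trap 'exit 130' INT TERM

  # Frontend: run `yarn dev` in the foreground.
  yarn dev
}
```

Calling `start_dev` from the repository root starts both servers. Setting `DRY_RUN=1` first prints the two commands without running them, which is a quick way to see what the function would do.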
- /src: Frontend React application
- /server: Backend server with Whisper integration