GemTTS - Qwen3-TTS Desktop Application

A cross-platform desktop application for Text-to-Speech using Qwen3-TTS models with voice cloning, custom voices, and voice design capabilities.

Features

Voice Cloning: Clone any voice from a reference audio sample
TTS Custom Voice: Generate speech using preset voice models
Voice Design: Create custom voices by adjusting age, gender, accent, and emotion
Auto Model Download: Automatically downloads Qwen3-TTS models from HuggingFace
Cross-Platform: Works on Windows, macOS, and Linux
Dark/Light Mode: Comfortable interface for any lighting condition

Architecture

Frontend: Electron + React + TypeScript + Tailwind CSS
Backend: Python + FastAPI + PyTorch
Models: Qwen3-TTS-1.7B from HuggingFace

Prerequisites

Node.js 18+ and npm
Python 3.9+
Git
8GB+ RAM recommended
10GB+ disk space for models
GPU optional but recommended (CUDA-compatible)
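
The tool and interpreter requirements above can be checked before installing anything. The following is a minimal sketch, not part of the GemTTS codebase: the function name `check_prerequisites` is made up for illustration, and it only confirms that the executables are on PATH. It does not verify the Node.js version, RAM, or disk space.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)                      # from the prerequisites list
REQUIRED_TOOLS = ("node", "npm", "git")  # executables expected on PATH


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("All prerequisite checks passed." if not issues else "\n".join(issues))
```

Run it with the same Python interpreter you intend to use for the backend virtual environment, since the version check applies to the interpreter executing the script.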

Installation

1. Clone the repository

git clone https://github.com/MTGMAD/GemTTS.git
cd GemTTS

2. Set up Python Backend

cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
# source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

3. Set up Frontend

cd ../frontend

# Install dependencies
npm install

Running the Application

Development Mode

Option 1: Run Backend and Frontend Separately

Terminal 1 - Backend:

cd backend
venv\Scripts\activate  # On Windows (macOS/Linux: source venv/bin/activate)
python main.py

Terminal 2 - Frontend:

cd frontend
npm run electron:dev

Option 2: Run Everything Together

cd frontend
npm run electron:dev

(This will automatically start the Python backend)

Production Build

cd frontend
npm run electron:build:win   # For Windows
npm run electron:build:mac   # For macOS
npm run electron:build:linux # For Linux

The built application will be in frontend/dist-electron/

Usage

1. First Launch: The application will prompt you to download the Qwen3-TTS models. This is a one-time setup that may take 10-30 minutes depending on your internet connection.
2. Voice Cloning Tab:
   - Upload a reference audio file (WAV, MP3, etc.)
   - Enter the text you want to speak
   - Adjust similarity and speed parameters
   - Click "Generate Voice"
3. TTS Custom Voice Tab:
   - Select a preset voice from the dropdown
   - Enter your text
   - Adjust speed and pitch
   - Click "Generate Speech"
4. Voice Design Tab:
   - Enter your text
   - Adjust age, gender, accent, and emotion sliders
   - Save/load presets for later use
   - Click "Generate Voice"

Project Structure

GemTTS/
├── backend/
│   ├── main.py              # FastAPI server
│   ├── model_manager.py     # Model download and management
│   ├── tts_processor.py     # TTS inference logic
│   ├── requirements.txt     # Python dependencies
│   ├── models/              # Downloaded models (auto-created)
│   ├── uploads/             # Uploaded audio files (auto-created)
│   └── outputs/             # Generated audio (auto-created)
├── frontend/
│   ├── electron/
│   │   ├── main.js          # Electron main process
│   │   └── preload.js       # Electron preload script
│   ├── src/
│   │   ├── components/      # React components
│   │   │   ├── VoiceCloning.tsx
│   │   │   ├── TTSCustomVoice.tsx
│   │   │   ├── VoiceDesign.tsx
│   │   │   └── ModelsStatus.tsx
│   │   ├── api/
│   │   │   └── apiService.ts # API client
│   │   ├── App.tsx          # Main app component
│   │   ├── App.css          # Styles
│   │   └── main.tsx         # Entry point
│   ├── package.json
│   ├── vite.config.ts
│   └── tsconfig.json
├── README.md
└── LICENSE

Configuration

Backend Configuration

Edit backend/main.py to change:

API host/port (default: 127.0.0.1:8000)
Model paths
Output settings
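
If you would rather not edit main.py each time the host or port changes, one option is to read them from environment variables and fall back to the documented default of 127.0.0.1:8000. This is a sketch under assumptions, not the project's actual configuration code: the variable names `GEMTTS_HOST` and `GEMTTS_PORT` and the function `load_server_config` are hypothetical.

```python
import os

DEFAULT_HOST = "127.0.0.1"  # documented default host
DEFAULT_PORT = 8000         # documented default port


def load_server_config(env=None):
    """Read host and port from environment variables, falling back to the defaults.

    GEMTTS_HOST and GEMTTS_PORT are hypothetical names used for illustration.
    """
    env = os.environ if env is None else env
    host = env.get("GEMTTS_HOST", DEFAULT_HOST)
    port = int(env.get("GEMTTS_PORT", DEFAULT_PORT))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port
```

If main.py launches the server through uvicorn, the returned pair could be passed along as `uvicorn.run(app, host=host, port=port)`. Any change to the port must also be reflected in the API endpoint URL in frontend/src/api/apiService.ts.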

Frontend Configuration

Edit frontend/src/api/apiService.ts to change:

API endpoint URL
Request timeouts

Troubleshooting

Models Not Downloading

Check your internet connection
Ensure you have enough disk space (10GB+)
Check that HuggingFace is accessible from your network
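
The disk-space and network checks above can be scripted with the Python standard library. This is a rough diagnostic sketch, not part of GemTTS: the helper names `free_space_gb` and `can_reach` are made up for illustration, and the URL is simply the HuggingFace home page rather than the exact model endpoint.

```python
import shutil
import urllib.error
import urllib.request

REQUIRED_GB = 10  # disk space figure from the prerequisites


def free_space_gb(path="."):
    """Free space in GiB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def can_reach(url="https://huggingface.co", timeout=5.0):
    """True if the server answers at all, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, so the network path works
    except (urllib.error.URLError, OSError):
        return False  # DNS failure, refused connection, timeout, and so on
```

Call `free_space_gb("backend")` to measure the drive that will hold backend/models/ and compare the result with `REQUIRED_GB`. A `False` from `can_reach()` usually points at a proxy, firewall, or DNS problem rather than at GemTTS itself.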

Audio Generation Fails

Ensure models are fully downloaded
Check Python backend logs in the terminal
Verify your system has enough RAM (8GB+ recommended)

GPU Not Being Used

Install CUDA toolkit (11.8+)

Install PyTorch with CUDA support:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
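
To confirm whether the installed PyTorch build can see the GPU, a short check like the one below can be run inside the backend virtual environment. It is a diagnostic sketch only: the function name `describe_torch_device` is made up for illustration, and the `torch` calls it relies on are `torch.cuda.is_available()`, `torch.cuda.get_device_name()`, and `torch.version.cuda`.

```python
def describe_torch_device():
    """Report which device PyTorch would use, without failing if torch is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)} (torch {torch.__version__})"
    # torch.version.cuda is None for CPU-only builds
    return f"CUDA not available (torch {torch.__version__}, built for CUDA {torch.version.cuda})"


print(describe_torch_device())
```

If the message reports "built for CUDA None", the installed wheel is CPU-only and reinstalling with the CUDA index URL above is the likely fix.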

Electron App Won't Start

Ensure Python backend is running first
Check that port 8000 is not in use
Look for errors in the terminal
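
Whether port 8000 is already taken can be checked with a few lines of Python. This is a minimal sketch with a made-up helper name, `port_in_use`. It only tells you that something is listening, not whether that something is the GemTTS backend.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8000)` before starting the backend should return `False`. If it returns `True`, another process (possibly a previous backend instance) is holding the port.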

Development

Adding New Features

Backend: Add new endpoints in main.py and processing logic in tts_processor.py
Frontend: Create new components in src/components/ and wire them up in App.tsx

Testing

# Backend
cd backend
pytest

# Frontend
cd frontend
npm test

License

MIT License - see LICENSE file for details

Acknowledgments

Qwen3-TTS by Alibaba Cloud
Built with Electron, React, FastAPI, and PyTorch

Support

For issues and questions, please open an issue on GitHub.

Roadmap

Batch processing multiple texts
Voice preset library with community voices
Real-time voice morphing
SSML support for advanced text markup
Multi-language support
Voice fine-tuning interface
Audio effects and post-processing
Export to multiple formats

Repository: MTGMAD/GemTTS
