All-in-one language processing toolkit: translation, transcription & TTS via GUI, API or CLI

Asi0Flammeus/Language-Toolkit

Language Toolkit

A Python application for language processing tasks, available as both a GUI and a REST API. The toolkit provides document translation, audio transcription, text-to-speech conversion, and multimedia processing.

🚀 Quick Start

GUI App

python main.py

API Server

python api_server.py

Access points:

  • API documentation: http://localhost:8000/docs

✨ Features

Core Tools

  • PPTX Translation: Translate PowerPoint presentations with full formatting preservation
  • Text Translation: Multi-language text file translation using DeepL
  • Audio Transcription: Convert audio to text using OpenAI Whisper
  • Text-to-Speech: Generate natural speech from text using ElevenLabs
  • PPTX to PDF: Convert presentations to PDF format
  • Video Merging: Combine audio and images into video files
  • Transcript Cleaning: Advanced text processing and formatting
  • Reward Evaluation: Assess text quality based on custom metrics

Key Capabilities

  • Batch processing with recursive directory support
  • Real-time progress tracking
  • Multi-language support (29 languages)
  • Asynchronous task processing
  • Smart file handling (single files or ZIP archives)
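
To illustrate what recursive batch processing involves, here is a minimal sketch of collecting input files from a folder tree. The function name `find_inputs` and the extension filter are assumptions made for this example; the toolkit's own discovery logic may differ.

```python
from pathlib import Path
from typing import Iterable, List


def find_inputs(root: Path, extensions: Iterable[str]) -> List[Path]:
    """Return every file under root (recursively) whose suffix matches.

    Hypothetical helper for illustration only, not the toolkit's API.
    Matching is case-insensitive, so "deck.PPTX" matches ".pptx".
    """
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in root.rglob("*")  # rglob walks all subdirectories
        if path.is_file() and path.suffix.lower() in wanted
    )


if __name__ == "__main__":
    for path in find_inputs(Path("."), [".txt", ".pptx"]):
        print(path)
```

Sorting the result keeps the processing order stable between runs, which makes progress reporting easier to follow.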

🌍 Supported Languages

The Language Toolkit supports 29 languages across translation (txt/pptx) and text-to-speech (TTS) services.

Language Support Matrix

| Language            | Code    | Translation Provider | TXT | PPTX | ElevenLabs TTS |
|---------------------|---------|----------------------|-----|------|----------------|
| Czech               | cs      | DeepL                | ✅  | ✅   | ✅             |
| German              | de      | DeepL                | ✅  | ✅   | ✅             |
| English             | en      | DeepL                | ✅  | ✅   | ✅             |
| Spanish             | es      | DeepL                | ✅  | ✅   | ✅             |
| Estonian            | et      | DeepL                | ✅  | ✅   | ❌             |
| Farsi               | fa      | OpenAI               | ✅  | ✅   | ❌             |
| Finnish             | fi      | DeepL                | ✅  | ✅   | ✅             |
| French              | fr      | DeepL                | ✅  | ✅   | ✅             |
| Hindi               | hi      | Google               | ✅  | ✅   | ✅             |
| Indonesian          | id      | DeepL                | ✅  | ✅   | ✅             |
| Italian             | it      | DeepL                | ✅  | ✅   | ✅             |
| Japanese            | ja      | DeepL                | ✅  | ✅   | ✅             |
| Korean              | ko      | DeepL                | ✅  | ✅   | ✅             |
| Norwegian           | nb-NO   | DeepL                | ✅  | ✅   | ❌             |
| Dutch               | nl      | DeepL                | ✅  | ✅   | ✅             |
| Polish              | pl      | DeepL                | ✅  | ✅   | ✅             |
| Portuguese          | pt      | DeepL                | ✅  | ✅   | ✅             |
| Rundi               | rn      | Google               | ✅  | ✅   | ❌             |
| Romanian            | ro      | DeepL                | ✅  | ✅   | ✅             |
| Russian             | ru      | DeepL                | ✅  | ✅   | ✅             |
| Sinhala             | si      | OpenAI               | ✅  | ✅   | ❌             |
| Serbian (Latin)     | sr-Latn | OpenAI               | ✅  | ✅   | ❌             |
| Swedish             | sv      | DeepL                | ✅  | ✅   | ✅             |
| Swahili             | sw      | Google               | ✅  | ✅   | ❌             |
| Thai                | th      | OpenAI               | ✅  | ✅   | ❌             |
| Turkish             | tr      | DeepL                | ✅  | ✅   | ✅             |
| Vietnamese          | vi      | Google               | ✅  | ✅   | ❌             |
| Chinese Simplified  | zh-Hans | DeepL                | ✅  | ✅   | ✅             |
| Chinese Traditional | zh-Hant | DeepL                | ✅  | ✅   | ✅             |

Provider Summary

Translation Providers:

  • DeepL (21 languages): Premium European & Asian language translation
  • Google Translate (4 languages): Broad language coverage for Hindi, Rundi, Swahili, Vietnamese
  • OpenAI GPT-4 (4 languages): Context-aware translation for Farsi, Sinhala, Serbian, Thai

Text-to-Speech:

  • ElevenLabs Multilingual V2 (20 languages): Natural voice synthesis with high quality
  • Not Supported (9 languages): Estonian, Farsi, Norwegian, Rundi, Sinhala, Serbian, Swahili, Thai, Vietnamese

Notes

  • Translation (TXT/PPTX): All 29 languages supported with automatic provider selection
  • TTS: 20 languages supported via ElevenLabs multilingual_v2 model
  • Provider Selection: Automatic based on target language (see language_provider.json)
  • Configuration: See elevenlabs_languages.json for TTS language mapping
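
As a rough illustration of automatic provider selection, the sketch below hard-codes the language sets from the support matrix and looks up the provider for a target language. The names `select_provider` and `supports_tts` are hypothetical, and the real `language_provider.json` and `elevenlabs_languages.json` files may use a different structure.

```python
from typing import Dict, FrozenSet

# Language codes copied from the support matrix above.
# Assumption: the toolkit reads an equivalent mapping from
# language_provider.json; its actual format is not shown here.
PROVIDERS: Dict[str, FrozenSet[str]] = {
    "DeepL": frozenset({
        "cs", "de", "en", "es", "et", "fi", "fr", "id", "it", "ja", "ko",
        "nb-NO", "nl", "pl", "pt", "ro", "ru", "sv", "tr", "zh-Hans", "zh-Hant",
    }),
    "Google": frozenset({"hi", "rn", "sw", "vi"}),
    "OpenAI": frozenset({"fa", "si", "sr-Latn", "th"}),
}

# Languages the matrix marks as unsupported for ElevenLabs TTS.
TTS_UNSUPPORTED = frozenset(
    {"et", "fa", "nb-NO", "rn", "si", "sr-Latn", "sw", "th", "vi"}
)


def select_provider(code: str) -> str:
    """Return the translation provider name for a target language code."""
    for provider, codes in PROVIDERS.items():
        if code in codes:
            return provider
    raise ValueError(f"Unsupported language code: {code!r}")


def supports_tts(code: str) -> bool:
    """Return True if the language is marked as TTS-capable in the matrix."""
    select_provider(code)  # raises ValueError for unknown codes
    return code not in TTS_UNSUPPORTED


if __name__ == "__main__":
    for code in ("fr", "hi", "th"):
        print(code, select_provider(code), supports_tts(code))
```

Keeping the mapping as data rather than branching logic means a new language only needs a new entry, which matches the JSON-driven configuration described in the notes.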

📋 Prerequisites

  • Python 3.8 or higher
  • API keys for:
    • DeepL (translation)
    • OpenAI (transcription; also translation for Farsi, Sinhala, Serbian, and Thai)
    • ElevenLabs (text-to-speech)
    • ConvertAPI (PDF conversion)
    • Anthropic (optional, for reward evaluation)

🔧 Installation

1. Clone the Repository

git clone https://github.com/Asi0Flammeus/Language-Toolkit.git
cd Language-Toolkit

2. Set Up Python Environment

# Create virtual environment
python3 -m venv env

# Activate environment
source env/bin/activate    # Linux/Mac
.\env\Scripts\activate      # Windows

# Install dependencies
pip3 install -r requirements.txt

3. Configure API Keys

Copy the example environment file and add your API keys:

cp .env.example .env

Then edit .env with your API keys:

# API Keys
DEEPL_API_KEY=your-deepl-api-key
OPENAI_API_KEY=your-openai-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
CONVERTAPI_SECRET=your-convertapi-secret
ANTHROPIC_API_KEY=your-anthropic-api-key
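
Before launching, it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch, assuming the variables from `.env` have already been loaded into the environment (for example by the application itself or by a dotenv loader). The helper name `missing_keys` is hypothetical.

```python
import os
from typing import List, Mapping

# Key names taken from the .env example above.
REQUIRED_KEYS = [
    "DEEPL_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
    "CONVERTAPI_SECRET",
]
# Optional per the prerequisites: only needed for reward evaluation.
OPTIONAL_KEYS = ["ANTHROPIC_API_KEY"]


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required key names that are unset or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```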

4. Configure Languages

Create supported_languages.json:

{
  "source_languages": {
    "en": "English",
    "fr": "French",
    "de": "German",
    "es": "Spanish"
  },
  "target_languages": {
    "en": "English",
    "fr": "French",
    "de": "German",
    "es": "Spanish"
  }
}
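
A quick way to catch mistakes in this file is to parse it and check that both sections are present. The sketch below is an assumption about reasonable validation, not the toolkit's own loader, and `load_languages` is a hypothetical name.

```python
import json
from typing import Dict

SECTIONS = ("source_languages", "target_languages")


def load_languages(text: str) -> Dict[str, Dict[str, str]]:
    """Parse supported_languages.json content and validate its shape.

    Each section must be a non-empty object mapping a language code
    to a non-empty display name.
    """
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("Top-level JSON value must be an object")
    for section in SECTIONS:
        languages = config.get(section)
        if not isinstance(languages, dict) or not languages:
            raise ValueError(f"Missing or empty section: {section}")
        for code, name in languages.items():
            if not isinstance(name, str) or not name.strip():
                raise ValueError(f"Invalid name for {code!r} in {section}")
    return config


if __name__ == "__main__":
    with open("supported_languages.json", encoding="utf-8") as handle:
        languages = load_languages(handle.read())
    print(sorted(languages["target_languages"]))
```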

🖥️ Usage

Quick Start Script

Use the provided script to pull latest changes and start the application:

Linux/Mac:

./start_app.sh

Windows:

start_app.bat

The startup script will:

  1. Pull latest changes from git
  2. Detect and activate the virtual environment (venv or env)
  3. Update dependencies from requirements.txt
  4. Start the GUI application

GUI Application

  1. Launch the application: python main.py
  2. Select the desired tool tab
  3. Choose processing mode (single file or folder)
  4. Select languages (for translation tools)
  5. Choose input files and output directory
  6. Click "Process" to start

API Server

  1. Start the server: python api_server.py
  2. Access documentation at http://localhost:8000/docs
  3. Use an authentication token for API requests
  4. Monitor task progress via task endpoints
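
To see which endpoints the server exposes without opening a browser, the sketch below reads the OpenAPI schema that FastAPI serves by default at `/openapi.json`. This assumes the server keeps FastAPI's default schema URL and that the schema is readable without authentication. The helper names are hypothetical, and no specific toolkit endpoint is assumed.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_endpoints(schema: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI schema dict."""
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items can also hold non-operation keys such as
            # "parameters"; keep only real HTTP methods.
            if method in HTTP_METHODS:
                endpoints.append((method.upper(), path))
    return endpoints


def fetch_schema(base_url: str = "http://localhost:8000") -> Dict[str, Any]:
    """Download the schema (assumes FastAPI's default /openapi.json URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for method, path in list_endpoints(fetch_schema()):
        print(f"{method:7} {path}")
```

Run this while `api_server.py` is running. The interactive page at `/docs` is generated from the same schema.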

📁 Project Structure

Language-Toolkit/
├── main.py                # GUI application entry point
├── api_server.py          # FastAPI server
├── ui/                    # GUI components
│   ├── base_tool.py       # Base tool class
│   └── mixins.py          # Shared UI mixins
├── tools/                 # Tool implementations
│   ├── text_to_speech.py
│   ├── audio_transcription.py
│   ├── pptx_translation.py
│   └── ...
├── services/              # Business logic
│   ├── translation.py
│   ├── transcription.py
│   └── ...
├── utils/                 # Utility functions
├── docs/                  # Documentation
│   ├── api/               # API documentation
│   ├── deployment/        # Deployment guides
│   └── development/       # Development guides
└── tests/                 # Test suite

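📚 Documentation

Guides live in the docs/ directory: API documentation (docs/api/), deployment guides (docs/deployment/), and development guides (docs/development/).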

🧪 Testing

# Run test suite
pytest tests/

# Run with coverage
pytest --cov=. tests/

🐳 Docker Support

# Build and run with Docker Compose
docker-compose up --build

# Or use individual containers
docker build -t language-toolkit .
docker run -p 8000:8000 language-toolkit

🀝 Contributing

We welcome contributions! Please see our Development Guide for details on:

  • Setting up your development environment
  • Code style guidelines
  • Testing requirements
  • Pull request process

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support


Made with ❤️ by asi0 and Claude agents
