🎙️ Real-Time Voice Translator (English ↔ Hindi)

A real-time, two-way voice translation agent built with Python that enables spoken conversations between two people who speak different languages. The system listens to live speech, transcribes it, translates it using Google Gemini, and plays the translation back as speech.

Perfect for cross-language conversations, demos, learning projects, and AI-powered communication tools.

✨ Features

🎤 Live speech recognition using microphone input
🌍 Bidirectional translation between two speakers
🤖 AI-powered translation via Google Gemini (gemini-1.5-flash)
🔊 Text-to-speech playback for translated output
🔁 Turn-based conversation flow
🛠️ Easily configurable language pairs

🧠 How it works

1. Person 1 speaks in their native language.
2. Speech is captured and transcribed using Google Speech Recognition.
3. The text is translated using the Gemini Generative AI model.
4. The translated text is converted into speech.
5. The translated output is played aloud for Person 2.
6. Roles switch and the process repeats.
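
The steps above can be sketched as a single conversational turn. This is a minimal sketch and not the repository's actual main.py: the function names (transcribe, build_prompt, translate, speak, run_turn) and the prompt wording are assumptions made for illustration. The library calls themselves are ordinary SpeechRecognition, google-generativeai, gTTS, and playsound usage. Third-party imports are deferred into the functions so the turn logic can be tried with stand-in callables.

```python
import os
import tempfile


def transcribe(lang_code):
    """Capture one utterance from the microphone and return its text."""
    import speech_recognition as sr  # deferred: needs SpeechRecognition and PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    # Raises sr.UnknownValueError if the speech cannot be understood.
    return recognizer.recognize_google(audio, language=lang_code)


def build_prompt(text, src, dst):
    """Hypothetical prompt wording; the project's real prompt may differ."""
    return (
        f"Translate the following text from {src} to {dst}. "
        f"Reply with the translation only.\n\n{text}"
    )


def translate(text, src, dst):
    """Translate text with Gemini, assuming GOOGLE_API_KEY is set."""
    import google.generativeai as genai  # deferred

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    return model.generate_content(build_prompt(text, src, dst)).text.strip()


def speak(text, lang_code):
    """Convert text to speech and play it aloud."""
    from gtts import gTTS  # deferred
    from playsound import playsound  # deferred

    # Assumption: gTTS wants the short code ("hi"), not the full tag ("hi-IN").
    tts = gTTS(text=text, lang=lang_code.split("-")[0])
    path = os.path.join(tempfile.mkdtemp(), "translated.mp3")
    tts.save(path)
    playsound(path)


def run_turn(src, dst, listen=transcribe, translator=translate, player=speak):
    """Run one turn: listen in src, translate to dst, play the result."""
    heard = listen(src)
    translated = translator(heard, src, dst)
    player(translated, dst)
    return heard, translated
```

Alternating run_turn("en-US", "hi-IN") and run_turn("hi-IN", "en-US") gives the turn-based flow described in step 6.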

🗣️ Supported languages (default)

These are the typical defaults included in the example. You can extend the list with other language codes:

English (en-US)
Hindi (hi-IN)
Spanish (es-ES)
French (fr-FR)
German (de-DE)
Italian (it-IT)
Portuguese (pt-BR)
Japanese (ja-JP)
Korean (ko-KR)
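
One way to hold this list in code is a plain dictionary keyed by language code. The sketch below is illustrative only: SUPPORTED_LANGUAGES, language_name, and tts_code are hypothetical names, and the short-code conversion assumes the text-to-speech step accepts the primary subtag (for example "hi" rather than "hi-IN"), which holds for the languages listed here but not necessarily for every language.

```python
# Language codes from the list above, mapped to display names.
SUPPORTED_LANGUAGES = {
    "en-US": "English",
    "hi-IN": "Hindi",
    "es-ES": "Spanish",
    "fr-FR": "French",
    "de-DE": "German",
    "it-IT": "Italian",
    "pt-BR": "Portuguese",
    "ja-JP": "Japanese",
    "ko-KR": "Korean",
}


def language_name(code):
    """Return the display name for a code, or raise ValueError if unknown."""
    try:
        return SUPPORTED_LANGUAGES[code]
    except KeyError:
        raise ValueError(f"Unsupported language code: {code!r}") from None


def tts_code(code):
    """Return the short code for text-to-speech, e.g. "hi-IN" becomes "hi"."""
    language_name(code)  # validates the code first
    return code.split("-")[0].lower()
```

Extending the list is then a one-line change, such as adding a "ru-RU": "Russian" entry.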

🧩 Tech stack

Python 3
SpeechRecognition
google-generativeai (Gemini API)
gTTS (Google Text-to-Speech)
playsound
python-dotenv
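
The pip package names above differ from the names used in import statements, which is a common source of confusion after installation. The following sketch checks which packages are importable. REQUIRED_MODULES, is_importable, and missing_packages are hypothetical helpers and are not part of the project.

```python
import importlib.util

# pip package name -> import name
REQUIRED_MODULES = {
    "SpeechRecognition": "speech_recognition",
    "google-generativeai": "google.generativeai",
    "gTTS": "gtts",
    "playsound": "playsound",
    "python-dotenv": "dotenv",
}


def is_importable(module_name):
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "google") raises instead of returning None.
        return False


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose modules cannot be found."""
    return sorted(pkg for pkg, mod in required.items() if not is_importable(mod))
```

Calling missing_packages() before starting the translator gives a readable list of what still needs to be installed.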

⚙️ Setup instructions

Clone the repository

git clone https://github.com/YashJha52/Translator.git
cd Translator

Install dependencies

pip install -r requirements.txt

Add environment variables

Create a .env file in the project root containing:

GOOGLE_API_KEY=your_google_gemini_api_key

Run the translator

python main.py
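
A missing or placeholder API key is the most likely first-run failure. The sketch below shows one way the key could be read from .env and checked before any API call is made. load_api_key is a hypothetical helper and may not match how main.py actually loads its configuration.

```python
import os


def load_api_key(env_path=".env"):
    """Read GOOGLE_API_KEY from a .env file or the process environment."""
    try:
        from dotenv import load_dotenv  # provided by python-dotenv

        # Does not override variables already set in the environment.
        load_dotenv(env_path)
    except ImportError:
        # Fall back to the process environment if python-dotenv is absent.
        pass

    key = os.environ.get("GOOGLE_API_KEY", "").strip()
    if not key or key == "your_google_gemini_api_key":
        raise RuntimeError(
            "GOOGLE_API_KEY is missing or still set to the placeholder value"
        )
    return key
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of a conversation.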

🔧 Configuration

You can change the language pair in main.py, for example:

agent = RealTimeTranslator(person1_lang='en-US', person2_lang='hi-IN')

Use any BCP-47 language code that Google Speech Recognition supports, for example es-ES or fr-FR.
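
A typo in a language code tends to surface only when the first recognition request fails, so it can help to validate the pair up front. The sketch below is an assumption-laden illustration: LanguagePair is a hypothetical class, and the pattern only covers the simple language-region form used in this README (such as en-US), not the full BCP-47 grammar.

```python
import re
from dataclasses import dataclass

# Simplified check: two or three lowercase letters, a hyphen, a two-letter region.
_LANG_TAG = re.compile(r"[a-z]{2,3}-[A-Z]{2}")


@dataclass(frozen=True)
class LanguagePair:
    person1_lang: str = "en-US"
    person2_lang: str = "hi-IN"

    def __post_init__(self):
        for code in (self.person1_lang, self.person2_lang):
            if not _LANG_TAG.fullmatch(code):
                raise ValueError(f"Not a language-region code: {code!r}")
        if self.person1_lang == self.person2_lang:
            raise ValueError("The two speakers must use different languages")

    def direction(self, turn):
        """Return (source, target) for a zero-based turn; speakers alternate."""
        if turn % 2 == 0:
            return self.person1_lang, self.person2_lang
        return self.person2_lang, self.person1_lang
```

A validated pair can then be passed on, for example as RealTimeTranslator(person1_lang=pair.person1_lang, person2_lang=pair.person2_lang).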

🚀 Use cases

🧑‍🤝‍🧑 Cross-language conversations
🎓 Language learning
🤝 International meetings & demos
🧠 AI & NLP experimentation
📢 Accessibility and communication tools

⚠️ Limitations

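Requires an active internet connection, because speech recognition, translation, and text-to-speech all call online Google services.
Usage limits and costs may apply to Google Speech Recognition and the Gemini API.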
Accuracy depends on microphone quality and background noise.
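
Network drops and rate limits are usually transient, so a small retry wrapper can soften the first two limitations. The sketch below is a generic exponential-backoff helper. retry_with_backoff is a hypothetical name and is not taken from the project. In practice it would wrap a recognition or translation call and retry only on connection-type errors such as speech_recognition.RequestError, not on unintelligible speech.

```python
import time


def retry_with_backoff(
    operation,
    attempts=3,
    base_delay=1.0,
    retry_on=(Exception,),
    sleep=time.sleep,
):
    """Call operation(), retrying on the given exceptions with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... by default
```

The sleep parameter exists so the delay can be replaced in tests. Exceptions not listed in retry_on propagate immediately.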

🛣️ Future improvements

🔄 Continuous conversation detection (non turn-based)
🧑‍🤝‍🧑 Multi-speaker support
📱 GUI / Web interface
⚡ Streaming-based, lower-latency translation
🎧 Noise suppression & improved voice activity detection

