This repository was archived by the owner on Nov 5, 2024. It is now read-only.

This project converts written material into speech by using Google AI (Gemini) for text creation or internet searches.

Gemini Text-To-Speech

Transform written content into speech using Google AI (Gemini) for text generation and internet-based information retrieval.

Google Gemini | Made with TypeScript | Powered by Bun | Documentation | SonarCloud Reliability Rating


📜 Table of Contents

  1. How It Works
  2. Project Note
  3. Project Installation
  4. Project Examples
  5. Contributors

❓ How It Works

The workflow below follows the example in test/app.ts, which performs these steps:

  1. Captures voice input
  2. Sends a request to the Google Gemini API to receive an AI-generated response
  3. Automatically converts the response to speech using Text-To-Speech (TTS) technology
  4. Plays the generated audio
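
The four steps above can be sketched as a small pipeline. This is only an illustration of the control flow, not the code in test/app.ts: the `Stages` type, `runPipeline`, and the stub stages are hypothetical names, and the sketch assumes that voice capture and transcription are folded into a single stage that returns text.

```typescript
// Hypothetical stage signatures; these are NOT the @stawa/gtts API.
type Stages = {
  captureVoice: () => Promise<string>; // step 1: returns the transcribed prompt
  askGemini: (prompt: string) => Promise<string>; // step 2: AI-generated reply
  synthesize: (text: string) => Promise<Uint8Array>; // step 3: text to audio bytes
  play: (audio: Uint8Array) => Promise<void>; // step 4: audio playback
};

// Runs the stages strictly in order, passing each result to the next stage.
async function runPipeline(stages: Stages): Promise<string> {
  const prompt = await stages.captureVoice();
  const reply = await stages.askGemini(prompt);
  const audio = await stages.synthesize(reply);
  await stages.play(audio);
  return reply;
}

// Stub stages so the sketch runs without a microphone, API keys, or SoX.
const calls: string[] = [];
const stubStages: Stages = {
  captureVoice: async () => {
    calls.push("capture");
    return "When was Facebook launched?";
  },
  askGemini: async (prompt) => {
    calls.push("gemini");
    return `Echo: ${prompt}`;
  },
  synthesize: async (text) => {
    calls.push("tts");
    return new Uint8Array(text.length); // placeholder bytes, not real audio
  },
  play: async () => {
    calls.push("play");
  },
};
```

Passing the stages in as functions keeps the ordering logic separate from the real microphone, network, and audio dependencies, so each stage can be swapped or tested on its own.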

📌 Project Note

This project has been tested on Linux (Ubuntu 24.04 LTS x86_64). Windows users can install SoX via SourceForge. macOS-specific information is currently unavailable.

Task                                Priority  Status
Implement Gemini Chat               High      ✅ Completed
Develop Voice Recognition           High      ✅ Completed
Implement Audio Language Detection  High      ✅ Completed
Implement Text Language Detection   Medium    ✅ Completed
Implement an Audio Player           Low       ✅ Completed
Define Enums                        Low       ✅ Completed
Integrate Debugging                 Low       ✅ Completed

📦 Project Installation

Before using this repository, ensure the following dependencies are installed on your system:

Linux

  • SoX: sudo apt-get install sox
  • libsox-fmt-all: sudo apt-get install libsox-fmt-all
  • FFmpeg: sudo apt-get install ffmpeg

Windows

  • SoX: install via SourceForge

macOS

macOS-specific installation instructions are not available at this time.

To install the package, use one of the following commands based on your preferred package manager:

# npm
$ npm install git+https://github.com/Stawa/GTTS.git --legacy-peer-deps
# Bun
$ bun install git+https://github.com/Stawa/GTTS.git --trust

📄 Project Examples

Before diving into the examples, ensure you have the following API keys and credentials:

  • Google Gemini API Key (lib.GoogleGemini)
  • TikTok SessionID (lib.TextToSpeech)
    • Extract from TikTok browser cookies after logging in
  • Google Speech API Key (lib.VoiceRecognition.fetchTranscriptGoogle)
  • Deepgram API Key (lib.VoiceRecognition.fetchTranscriptDeepgram)
  • EdenAI API Key (lib.SummarizeText)

Store these API keys securely and never commit them to version control. Consider using environment variables or a secure key management system.
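
As a minimal sketch of the environment-variable approach, the helper below checks that every required key is present before the app starts. `requireKeys` is a hypothetical helper, not part of this package, and every variable name other than GEMINI_API_KEY is an assumption chosen for illustration.

```typescript
// A plain map of variable names to values; process.env has this shape.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the requested keys or throws, naming every missing one.
function requireKeys(env: Env, names: string[]): Record<string, string> {
  const keys: Record<string, string> = {};
  const missing: string[] = [];
  for (const name of names) {
    const value = env[name];
    if (value) {
      keys[name] = value;
    } else {
      missing.push(name); // unset or empty counts as missing
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return keys;
}

// Demo with a literal object; DEEPGRAM_API_KEY is an assumed variable name.
const demoKeys = requireKeys(
  { GEMINI_API_KEY: "example-key", DEEPGRAM_API_KEY: "example-key" },
  ["GEMINI_API_KEY", "DEEPGRAM_API_KEY"],
);
```

In an application you would pass `process.env` (after `dotenv.config()`) instead of the literal object, so a missing key fails at startup rather than on the first API call.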

Here's a concise example demonstrating how to generate a response using the Google Gemini API:

import { GoogleGemini } from "@stawa/gtts";
import dotenv from "dotenv";

// Load variables from a local .env file into process.env
dotenv.config();

// Create the Gemini client; the API key is read from the environment
const gemini = new GoogleGemini({
  apiKey: process.env.GEMINI_API_KEY,
  model: "gemini-1.5-flash",
  enableLogging: true,
});

async function main() {
  try {
    const question = "When was Facebook launched?";
    console.log(`Question: ${question}`);

    // Send the question and wait for the AI-generated response
    const response = await gemini.chat(question);
    console.log(`Gemini's response: ${response}`);
  } catch (error) {
    // Report request or configuration failures
    console.error("An error occurred:", error);
  }
}

main();
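
If the caller needs to react to a failure instead of only logging it, the try/catch can be moved into a small wrapper that returns a result value. This is a sketch under assumptions: `ChatClient`, `ChatResult`, and `askSafely` are hypothetical names, and the `chat` signature is inferred only from the README usage (awaited and printed as a string), not from the library's type definitions.

```typescript
// Assumed shape, inferred from `await gemini.chat(question)` in the README.
interface ChatClient {
  chat(prompt: string): Promise<string>;
}

// Either the reply text or a human-readable failure reason.
type ChatResult = { ok: true; text: string } | { ok: false; reason: string };

// Hypothetical wrapper: converts a thrown error into a value the caller can inspect.
async function askSafely(client: ChatClient, question: string): Promise<ChatResult> {
  try {
    const text = await client.chat(question);
    return { ok: true, text };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return { ok: false, reason };
  }
}

// Stand-in clients so the sketch runs without an API key.
const workingClient: ChatClient = {
  chat: async (prompt) => `Answer to: ${prompt}`,
};
const failingClient: ChatClient = {
  chat: async () => {
    throw new Error("quota exceeded");
  },
};
```

Because the wrapper depends only on the `ChatClient` interface, any object with a compatible `chat` method can be passed in, which also makes the calling code testable offline.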

👥 Contributors

We appreciate the contributions of all our collaborators. Special thanks to everyone who has helped shape this project!
