Gemini Text-To-Speech

Transform written content into speech using Google AI (Gemini) for text generation and internet-based information retrieval.

❓ How It Works

This project is based on an example in test/app.ts. It performs the following steps:

Captures voice input
Sends a request to the Google Gemini API to receive an AI-generated response
Automatically converts the response to speech using Text-To-Speech (TTS) technology
Plays the generated audio
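
The four steps above can be sketched as a small pipeline. This is only an illustration: the `Stages` type, `runPipeline`, and the stub stages are hypothetical names invented for this sketch and are not the `@stawa/gtts` API. In a real setup each stage would be backed by the corresponding class from the package.

```typescript
// Hypothetical stage signatures for illustration; NOT the @stawa/gtts API.
type Stages = {
  captureVoice: () => Promise<string>; // step 1: voice input, returned as a transcript
  askGemini: (prompt: string) => Promise<string>; // step 2: AI-generated response
  synthesize: (text: string) => Promise<Uint8Array>; // step 3: text to audio bytes
  play: (audio: Uint8Array) => Promise<void>; // step 4: audio playback
};

// Runs the four steps in order and returns the text reply.
async function runPipeline(stages: Stages): Promise<string> {
  const transcript = await stages.captureVoice();
  const reply = await stages.askGemini(transcript);
  const audio = await stages.synthesize(reply);
  await stages.play(audio);
  return reply;
}

// Stub stages so the sketch runs without a microphone, network, or API keys.
const played: number[] = [];
const stubStages: Stages = {
  captureVoice: async () => "When was Facebook launched?",
  askGemini: async (prompt) => `You asked: ${prompt}`,
  synthesize: async (text) => new TextEncoder().encode(text),
  play: async (audio) => {
    played.push(audio.length); // record the byte count instead of playing sound
  },
};
```

Calling `runPipeline(stubStages)` exercises the flow end to end. Keeping each stage behind a function makes it straightforward to swap a stub for a real implementation one step at a time.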

📌 Project Note

This project has been tested on Linux (Ubuntu 24.04 LTS x86_64). Windows users can install SoX from SourceForge. macOS-specific information is currently unavailable.

Task	Priority	Status
Implement Gemini Chat	High	✅ Completed
Develop Voice Recognition	High	✅ Completed
Implement Audio Language Detection	High	✅ Completed
Implement Text Language Detection	Medium	✅ Completed
Implement an Audio Player	Low	✅ Completed
Define Enums	Low	✅ Completed
Integrate Debugging	Low	✅ Completed

📦 Project Installation

Before using this repository, ensure the following dependencies are installed on your system:

Linux

SoX: sudo apt-get install sox
libsox-fmt-all: sudo apt-get install libsox-fmt-all
FFmpeg: sudo apt install ffmpeg

Windows

SoX: Download from SourceForge
FFmpeg: choco install ffmpeg (using Chocolatey), or download it from the official website

macOS

macOS-specific installation instructions are not available at this time.
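
To confirm that the system dependencies listed above are reachable before running the project, a check along these lines may help. It is a minimal sketch using Node's built-in `child_process` module; the helper name `isAvailable` is hypothetical and not part of this package.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `command` can be started and exits with status 0.
function isAvailable(command: string, versionFlag: string): boolean {
  const result = spawnSync(command, [versionFlag], { stdio: "ignore" });
  // `result.error` is set (for example ENOENT) when the binary is not on PATH.
  return result.error === undefined && result.status === 0;
}

// SoX accepts `--version`; FFmpeg accepts `-version`.
const tools: Array<[string, string]> = [
  ["sox", "--version"],
  ["ffmpeg", "-version"],
];

for (const [name, flag] of tools) {
  console.log(`${name}: ${isAvailable(name, flag) ? "found" : "missing"}`);
}
```

A "missing" result usually means the tool is either not installed or not on the `PATH` of the shell that launched Node.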

To install the package, use one of the following commands based on your preferred package manager:

# npm
$ npm install git+https://github.com/Stawa/GTTS.git --legacy-peer-deps
# Bun
$ bun install git+https://github.com/Stawa/GTTS.git --trust

📄 Project Examples

Before diving into the examples, ensure you have the following API keys and credentials:

Google Gemini API Key (lib.GoogleGemini)
- Obtain from Google Cloud Console
TikTok SessionID (lib.TextToSpeech)
- Extract from TikTok browser cookies after logging in
Google Speech API Key (lib.VoiceRecognition.fetchTranscriptGoogle)
- Generate from Google Cloud Console Credentials
Deepgram API Key (lib.VoiceRecognition.fetchTranscriptDeepgram)
- Create an account and obtain from Deepgram Console
EdenAI API Key (lib.SummarizeText)
- Sign up and retrieve from EdenAI Dashboard

Store these API keys securely and never commit them to version control. Consider using environment variables or a secure key management system.
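
One way to follow the environment-variable advice is to read each key through a small helper that fails loudly when a variable is missing. The sketch below is an assumption-laden illustration: `requireEnv` and `demoEnv` are hypothetical names, and only `GEMINI_API_KEY` is a variable name that appears in this README's example.

```typescript
// Hypothetical helper: returns the variable's value or throws if it is unset or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Demo with an in-memory environment so the sketch runs without a real key.
const demoEnv = { GEMINI_API_KEY: "example-key" };
const geminiKey = requireEnv("GEMINI_API_KEY", demoEnv);
console.log(`Loaded a key of length ${geminiKey.length}`);
```

In application code the second argument can be omitted so the helper reads from `process.env`, which `dotenv` populates from a local `.env` file that is kept out of version control.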

Here's a concise example demonstrating how to generate a response using the Google Gemini API:

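import { GoogleGemini } from "@stawa/gtts";
import dotenv from "dotenv";

// Load variables from a local .env file into process.env.
dotenv.config();

// Fail early with a clear message if the key is not configured.
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) {
  throw new Error("GEMINI_API_KEY is not set");
}

const gemini = new GoogleGemini({
  apiKey,
  model: "gemini-1.5-flash",
  enableLogging: true,
});

async function main() {
  try {
    const question = "When was Facebook launched?";
    console.log(`Question: ${question}`);

    // Send the question to Gemini and wait for the generated reply.
    const response = await gemini.chat(question);
    console.log(`Gemini's response: ${response}`);
  } catch (error) {
    console.error("An error occurred:", error);
  }
}

main();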

👥 Contributors

We appreciate everyone who has contributed to this project. Each contribution helps make it better.
