This tool automates the process of transcribing video files using multiple transcription services: Azure OpenAI, Groq, and OpenAI APIs. It converts video files to MP3 format, transcribes the audio, and saves the transcription as a text file.
- Converts video files to MP3 format using ffmpeg
- Supports transcription using Azure OpenAI, Groq, and OpenAI APIs
- Supports processing of individual video files or entire directories
- Cleans up temporary MP3 files after transcription
- Provides flexibility in selecting the transcription service via configuration
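The convert, transcribe, and clean-up flow listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: the function names, the `QUALITY_TO_QSCALE` mapping, and the choice of ffmpeg options are all hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, List

# Hypothetical mapping from the quality letter to the libmp3lame VBR level
# passed to ffmpeg's -q:a option (a lower number means higher quality).
QUALITY_TO_QSCALE = {"L": "9", "M": "5", "H": "0"}


def build_ffmpeg_command(video: Path, mp3: Path, quality: str = "M") -> List[str]:
    """Return an ffmpeg argument list that extracts the audio track as MP3."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(video),          # input video
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode the audio as MP3
        "-q:a", QUALITY_TO_QSCALE[quality],
        str(mp3),
    ]


def transcribe_video(video: Path, transcribe: Callable[[Path], str],
                     quality: str = "M") -> str:
    """Convert to a temporary MP3, transcribe it, and always clean up."""
    mp3 = video.with_suffix(".mp3")
    try:
        # check=True raises CalledProcessError if ffmpeg fails
        subprocess.run(build_ffmpeg_command(video, mp3, quality), check=True)
        return transcribe(mp3)
    finally:
        # Remove the temporary MP3 whether or not transcription succeeded
        if mp3.exists():
            mp3.unlink()
```

Here `transcribe` stands in for whichever service client is selected; passing it as a callable keeps the conversion and clean-up logic independent of the API in use.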
- Python 3.6+
- ffmpeg installed and available in the system PATH
- API access for one or more supported services:
  - Azure OpenAI API
  - Groq Cloud API
  - OpenAI API
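A quick way to verify the first two requirements before running anything is sketched below. The helper name `check_prerequisites` is hypothetical and is not part of the tool.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 6)):
    """Return a list of human-readable problems; empty if everything is fine."""
    problems = []
    if sys.version_info < min_version:
        problems.append("Python %d.%d+ is required" % min_version)
    # shutil.which returns None when the executable is not on the PATH
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on the system PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```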
- Clone this repository:

      git clone https://github.com/nkkko/sapat.git
      cd sapat

- Install the required Python packages:

      pip install -r requirements.txt

- Create a .env file in the project root and add your API credentials:

      # Azure OpenAI
      AZURE_OPENAI_API_KEY=your_azure_api_key_here
      AZURE_OPENAI_ENDPOINT=https://DEPLOYMENTENDPOINTNAME.openai.azure.com
      AZURE_OPENAI_DEPLOYMENT_NAME_WHISPER=whisper
      AZURE_OPENAI_API_VERSION_WHISPER=2024-06-01
      AZURE_OPENAI_DEPLOYMENT_NAME_CHAT=gpt-4o
      AZURE_OPENAI_API_VERSION_CHAT=2023-03-15-preview

      # Groq
      GROQCLOUD_API_KEY=your_groq_api_key_here
      GROQCLOUD_MODEL=whisper-large-v3-turbo
      GROQCLOUD_API_ENDPOINT=https://api.groq.com/openai/v1/audio/transcriptions
      GROQCLOUD_MODEL_NAME_CHAT=llama3-8b-8192

      # OpenAI
      OPENAI_API_KEY=your_openai_api_key_here
      OPENAI_MODEL=whisper-1
      OPENAI_API_ENDPOINT=https://api.openai.com/v1/audio/transcriptions
      OPENAI_MODEL_NAME_CHAT=gpt-4o
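Only the variables for the service you actually use need to be set. A minimal sketch of how the settings for one service might be validated is shown below. It assumes the .env values have already been loaded into the process environment (how the tool does this is not stated here), and the names `REQUIRED_VARS` and `load_service_config` are hypothetical.

```python
import os

# Transcription-related variables per service, taken from the .env example.
# Treating exactly these as required is an assumption.
REQUIRED_VARS = {
    "azure": [
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_DEPLOYMENT_NAME_WHISPER",
        "AZURE_OPENAI_API_VERSION_WHISPER",
    ],
    "groq": ["GROQCLOUD_API_KEY", "GROQCLOUD_MODEL", "GROQCLOUD_API_ENDPOINT"],
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL", "OPENAI_API_ENDPOINT"],
}


def load_service_config(api, env=None):
    """Return the settings for one service, or raise if any are missing."""
    env = os.environ if env is None else env
    if api not in REQUIRED_VARS:
        raise ValueError("Unknown API: %s" % api)
    # Treat unset and empty values the same way
    missing = [name for name in REQUIRED_VARS[api] if not env.get(name)]
    if missing:
        raise ValueError("Missing settings for %s: %s" % (api, ", ".join(missing)))
    return {name: env[name] for name in REQUIRED_VARS[api]}
```

Failing early with the list of missing variable names is usually easier to debug than an authentication error returned by the remote API.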
- Build the distribution: Use the build package to create the wheel (if it is not installed, run pip install build first):

      python -m build

  This creates the wheel file in the dist directory.

- Install the wheel:

      pip install dist/sapat-0.1.1-py3-none-any.whl  # Replace with the actual filename

- Add to PATH (if needed): pip installs the sapat script into a directory that is usually already on your PATH. If it is not, add that directory to your system's PATH environment variable. Typical locations are:
  - Linux/macOS: ~/.local/bin
  - Windows: %USERPROFILE%\AppData\Local\Programs\Python\Python39\Scripts (replace Python39 with your Python version)

  For example, on Linux/macOS:

      echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc
      source ~/.zshrc
Steps to Set Up Daytona Workspace
- Create Daytona Workspace:

      daytona create https://github.com/nkkko/sapat --code

- There is no second step.
Run the script with a video file or directory as an argument:
    sapat <video_file_or_directory> [--language <language>] [--prompt <prompt>] [--temperature <temperature>] [--quality <L|M|H>] [--api <azure|groq|openai>]

- --language: Specify the language of the audio (default: "en").
- --prompt: Optional prompt to guide the model's transcription.
- --temperature: The sampling temperature, between 0 and 1 (default: 0).
- --quality: Quality of the MP3 audio: 'L' for low, 'M' for medium, and 'H' for high (default: 'M').
- --api: Specify the API to use for transcription.
  - --api azure for Azure OpenAI API
  - --api groq for Groq Cloud API
  - --api openai for OpenAI API
Example:
sapat my_video.mp4 --quality H --language es --prompt "This is a test prompt" --temperature 0.5 --api groq
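For readers who want to see how these options fit together, here is a hypothetical argparse reconstruction of the command line. It mirrors the documented flags and defaults, but it is a sketch and not the tool's actual parser. The README does not state a default for --api, so None is used as a placeholder.

```python
import argparse


def unit_interval(text):
    """Parse a float and require it to lie between 0 and 1 inclusive."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("temperature must be between 0 and 1")
    return value


def build_parser():
    parser = argparse.ArgumentParser(
        prog="sapat", description="Transcribe video files."
    )
    parser.add_argument("path", help="video file or directory of .mp4 files")
    parser.add_argument("--language", default="en", help="language of the audio")
    parser.add_argument("--prompt", default=None, help="optional guiding prompt")
    parser.add_argument("--temperature", type=unit_interval, default=0.0)
    parser.add_argument("--quality", choices=["L", "M", "H"], default="M")
    # Assumption: no default service is documented, so None marks "not chosen"
    parser.add_argument("--api", choices=["azure", "groq", "openai"], default=None)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```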
- If a file is provided, it will process that single file.
- If a directory is provided, it will process all .mp4 files in that directory.
The script will create a .txt file with the same name as the input video file, containing the transcription.
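The input handling and output naming described above could look roughly like this. The function names are hypothetical, and the non-recursive, case-sensitive match on .mp4 is an assumption rather than confirmed behavior.

```python
from pathlib import Path
from typing import List


def collect_videos(target: Path) -> List[Path]:
    """Return the video files to process for a file or directory argument."""
    if target.is_dir():
        # Assumption: only the top level of the directory is scanned
        return sorted(target.glob("*.mp4"))
    if target.is_file():
        return [target]
    raise FileNotFoundError("No such file or directory: %s" % target)


def transcript_path(video: Path) -> Path:
    """Return the .txt path that sits next to the input video."""
    return video.with_suffix(".txt")
```

For example, `transcript_path(Path("talks/my_video.mp4"))` yields `talks/my_video.txt`.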
This tool is designed for use with multiple APIs (Azure OpenAI, Groq, and OpenAI). Ensure you have valid API credentials configured in the .env file and the necessary permissions and credits for the API service you plan to use.
_____ _____ ____________________ _____
__/\____\\___ /_____\\\\\\\\\\\\\\\\\\\\\/____/___/\__
/__/\/_____\\_//_______\\\\\\\\\\\\\\\\\\\\\/__/_/\___\/\
\_\/ 444444 666666 1111 88888 222222 \_\/ \_\/
/\ 4 4 6 1 1 8 8 2 2 /\ /\
/ \ 4 4 6 1 1 8 8 2 / \ / \
\ \ 444444 666666 1 1 888888 2 / \/
\ / 4 6 6 1 1 8 8 2 / /\ /\
\/ 4 6 6 1 1 8 8 2 / / \/ \
/\ 4 66666 11111 888888 222222/ / \
/ \ \_/ \
/ \ >>>>>>> S.A.P.A.T. SYSTEM ONLINE <<<<<<< \
\ \ >>>> SYNTHESIZING AUDIO PROCESSING <<<<< /
\ / >>>> AND TRANSCRIPTION TECHNOLOGY <<<<< __________/
\ / >>>>> INITIALIZING NEURAL MATRIX <<<<<</
\/ >>>>>>>> PREPARE FOR UPLOAD <<<<<<<<<</
/\ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
/ \/ \____________
/ /\ /\ /\ /\ /\ /\ /\ /\ /\ /\ /\ /\ /\
/______/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \_