
yt-transcript

A cross-platform command-line tool for downloading YouTube transcripts given a video URL. Built in Rust for zero runtime dependencies and fast execution.

Features

  • Download transcripts from YouTube videos that have captions (currently English only; see Known Limitations)
  • Multiple output formats: plain text, JSON, SRT
  • Accepts full URLs or raw video IDs
  • Save to file or output to stdout
  • Single binary with no runtime dependencies

System Requirements

To Run (Pre-built Binary)

  • Linux: x86_64 or aarch64, glibc 2.31+
  • macOS: 10.15+ (x86_64 or Apple Silicon)
  • Windows: 10+ (x86_64)
  • No additional runtime required

To Build from Source

  • Rust 1.70+ (install via rustup)
  • Internet connection (to download crate dependencies during the build; fetching transcripts also needs one at run time)

Installation

macOS (Recommended)

Install via Homebrew to avoid Gatekeeper warnings and to receive updates through brew upgrade:

# Add the tap
brew tap lightandfuture/yt-transcript

# Install
brew install yt-transcript

# Run
yt-transcript --help

From Source

git clone https://github.com/lightandfuture/yt-transcript.git
cd yt-transcript
cargo install --path .

From Pre-built Binary

  1. Download the latest release from Releases

  2. Choose the binary matching your system:

    • yt-transcript-macos-arm64 (Apple Silicon Macs)
    • yt-transcript-macos-amd64 (Intel Macs)
    • yt-transcript-linux-amd64 (Linux x86_64)
    • yt-transcript-linux-arm64 (Linux ARM64)
    • yt-transcript-windows-amd64.exe (Windows)
  3. Make it executable and move it to a directory on your PATH:

macOS / Linux:

chmod +x yt-transcript-macos-arm64
sudo mv yt-transcript-macos-arm64 /usr/local/bin/yt-transcript

Windows:

.\yt-transcript-windows-amd64.exe --help

macOS Gatekeeper Warning

If macOS blocks the downloaded binary:

Option 1: Right-click Open

  1. Right-click the binary in Finder
  2. Select Open → click Open in the dialog

Option 2: System Settings

  1. Go to System Settings > Privacy & Security
  2. Scroll to the bottom and click "Open Anyway"

Option 3: Remove Quarantine Attribute

xattr -d com.apple.quarantine yt-transcript-macos-arm64
chmod +x yt-transcript-macos-arm64

Usage

If installed in PATH:

yt-transcript [OPTIONS] <URL>

If running the downloaded binary directly:

./yt-transcript-macos-arm64 [OPTIONS] <URL>

Arguments

Argument    Description
<URL>       YouTube video URL or video ID

Options

Option                   Description                                                 Default
-l, --lang <LANG>        Language code (currently ignored; see Known Limitations)    en
-f, --format <FORMAT>    Output format: text, json, srt                              text
-o, --output <FILE>      Output file path                                            stdout
-v, --verbose            Enable verbose output                                       off
-h, --help               Print help                                                  -
-V, --version            Print version                                               -
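
The three --format values map naturally onto an enum. The sketch below is a std-only illustration of that idea, not the tool's actual code: the real CLI is defined with clap, the name OutputFormat is hypothetical, and exact lowercase matching is an assumption.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the documented `--format` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OutputFormat {
    /// `text` is the documented default.
    #[default]
    Text,
    Json,
    Srt,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Parses the value given to `-f` / `--format`.
    /// Assumption: values are matched exactly, in lowercase.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "text" => Ok(OutputFormat::Text),
            "json" => Ok(OutputFormat::Json),
            "srt" => Ok(OutputFormat::Srt),
            other => Err(format!(
                "unknown format '{other}' (expected text, json, or srt)"
            )),
        }
    }
}
```

With this in place, "srt".parse::<OutputFormat>() yields the Srt variant, and an unsupported value such as "xml" produces an error message listing the accepted formats.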

Examples

Basic usage:

yt-transcript https://www.youtube.com/watch?v=dQw4w4WgXcQ
# Or with downloaded binary:
./yt-transcript-macos-arm64 https://www.youtube.com/watch?v=dQw4w4WgXcQ

Output as JSON:

yt-transcript -f json https://www.youtube.com/watch?v=dQw4w4WgXcQ

Save to file as SRT subtitles:

yt-transcript -f srt -o subtitles.srt https://www.youtube.com/watch?v=dQw4w4WgXcQ

Use raw video ID:

yt-transcript dQw4w4WgXcQ

Verbose mode:

yt-transcript -v https://www.youtube.com/watch?v=dQw4w4WgXcQ

Supported URL Formats

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/embed/VIDEO_ID
  • https://www.youtube.com/shorts/VIDEO_ID
  • Raw video ID (11 characters)
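
As an illustration of how the forms listed above can be reduced to a video ID, here is a minimal sketch using only standard-library string methods. The tool itself lists regex as its URL-matching dependency, so this is not its implementation; the function names are hypothetical, and the ID character set (ASCII letters, digits, '-' and '_') is an assumption.

```rust
/// Returns true if `s` looks like a video ID: exactly 11 characters drawn
/// from ASCII letters, digits, '-' and '_' (assumed character set).
fn is_video_id(s: &str) -> bool {
    s.len() == 11
        && s.bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Hypothetical helper: extracts the video ID from one of the documented
/// input forms, or returns None if the input matches none of them.
pub fn extract_video_id(input: &str) -> Option<&str> {
    let input = input.trim();
    if is_video_id(input) {
        return Some(input); // raw 11-character ID
    }

    // Drop the scheme, an optional "www." prefix, and any #fragment.
    let rest = input
        .strip_prefix("https://")
        .or_else(|| input.strip_prefix("http://"))?;
    let rest = rest.strip_prefix("www.").unwrap_or(rest);
    let rest = rest.split('#').next().unwrap_or(rest);

    let candidate = if let Some(query) = rest.strip_prefix("youtube.com/watch?") {
        // watch URLs carry the ID in the `v` query parameter.
        query.split('&').find_map(|pair| pair.strip_prefix("v="))?
    } else {
        // youtu.be, /embed/ and /shorts/ URLs carry the ID as a path segment.
        let path = rest
            .strip_prefix("youtu.be/")
            .or_else(|| rest.strip_prefix("youtube.com/embed/"))
            .or_else(|| rest.strip_prefix("youtube.com/shorts/"))?;
        path.split(|c: char| c == '?' || c == '/').next()?
    };

    is_video_id(candidate).then_some(candidate)
}
```

Validating the extracted candidate at the end means a URL with the right shape but a malformed ID is rejected rather than passed on to the fetch step.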

Output Formats

Text (default):

[00:00] Welcome to this video
[00:05] Today we'll be discussing...
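
A minimal sketch of how lines in this shape could be produced. The Segment struct and function names are hypothetical, with field names borrowed from the tool's JSON output. How the tool renders videos longer than an hour is not documented; in this sketch the minutes field simply keeps counting past 59.

```rust
/// Hypothetical transcript segment; field names follow the JSON output.
pub struct Segment {
    pub text: String,
    pub start: f64,    // seconds from the start of the video
    pub duration: f64, // seconds
}

/// Formats one segment as "[MM:SS] text".
pub fn format_text_line(segment: &Segment) -> String {
    // Truncate the start time to whole seconds; negative values clamp to 0.
    let total = segment.start.max(0.0) as u64;
    format!("[{:02}:{:02}] {}", total / 60, total % 60, segment.text)
}

/// Joins all segments into the plain-text output, one line per segment.
pub fn format_text(segments: &[Segment]) -> String {
    segments
        .iter()
        .map(format_text_line)
        .collect::<Vec<_>>()
        .join("\n")
}
```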

JSON:

[
  {"text": "Welcome to this video", "start": 0.0, "duration": 5.0},
  {"text": "Today we'll be discussing...", "start": 5.0, "duration": 3.0}
]

SRT:

1
00:00:00,000 --> 00:00:05,000
Welcome to this video

2
00:00:05,000 --> 00:00:08,000
Today we'll be discussing...
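
The SRT sample implies that each cue ends at start plus duration (0.0 + 5.0 and 5.0 + 3.0 in the JSON sample). The sketch below shows one way to build such output under that assumption; Segment, srt_timestamp, and format_srt are hypothetical names, not the tool's API.

```rust
/// Hypothetical transcript segment; field names follow the JSON output.
pub struct Segment {
    pub text: String,
    pub start: f64,    // seconds from the start of the video
    pub duration: f64, // seconds
}

/// Formats a time in seconds as the SRT timestamp "HH:MM:SS,mmm".
fn srt_timestamp(seconds: f64) -> String {
    // Convert to whole milliseconds once, then split into fields with
    // integer arithmetic to avoid floating-point drift.
    let ms = (seconds.max(0.0) * 1000.0).round() as u64;
    format!(
        "{:02}:{:02}:{:02},{:03}",
        ms / 3_600_000,
        ms / 60_000 % 60,
        ms / 1_000 % 60,
        ms % 1_000
    )
}

/// Renders segments as numbered SRT cues separated by blank lines.
pub fn format_srt(segments: &[Segment]) -> String {
    segments
        .iter()
        .enumerate()
        .map(|(i, s)| {
            format!(
                "{}\n{} --> {}\n{}\n",
                i + 1, // SRT cue numbers start at 1
                srt_timestamp(s.start),
                srt_timestamp(s.start + s.duration),
                s.text
            )
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```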

Architecture

Technology Stack

Built in Rust for performance, type safety, and zero runtime dependencies. See design.md for the full language selection analysis.

Module Structure

yt-transcript/
├── Cargo.toml
├── src/
│   ├── main.rs          # CLI entry point, async main
│   ├── cli.rs           # Clap CLI definition
│   ├── transcript.rs    # Transcript fetching logic
│   ├── formatter.rs     # Output formatting (text, json, srt)
│   ├── url_parser.rs    # YouTube URL validation/extraction
│   └── error.rs         # Custom error types
├── tests/
│   └── integration.rs   # Integration tests
└── .github/
    └── workflows/
        └── ci.yml       # Build/test/release pipeline

Data Flow

User Input (URL)
    → url_parser.rs (validate & extract video ID)
    → transcript.rs (fetch transcript via youtube-transcript crate)
    → formatter.rs (format to requested output)
    → stdout or file output
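
The flow above can be sketched as a chain of fallible stages. Everything here is an illustrative assumption rather than the project's code: the sketch is synchronous (the real entry point is async), the fetch stage is injected as a closure so it runs without network access, the URL stage is deliberately simplified, and AppError, Segment, run, and write_output are hypothetical names.

```rust
use std::fmt;
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical error type standing in for error.rs.
#[derive(Debug, PartialEq)]
pub enum AppError {
    InvalidUrl(String),
    Fetch(String),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::InvalidUrl(input) => {
                write!(f, "invalid YouTube URL or video ID: {input}")
            }
            AppError::Fetch(reason) => write!(f, "failed to fetch transcript: {reason}"),
        }
    }
}

impl std::error::Error for AppError {}

/// Hypothetical transcript segment (text plus start time in seconds).
pub struct Segment {
    pub text: String,
    pub start: f64,
}

/// Stage 1 (url_parser.rs stand-in). Simplified: takes whatever follows the
/// last "v=", or the whole input, and checks that it is an 11-character ID.
fn parse_video_id(input: &str) -> Result<String, AppError> {
    let candidate = input.rsplit("v=").next().unwrap_or(input);
    let valid = candidate.len() == 11
        && candidate
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_');
    if valid {
        Ok(candidate.to_string())
    } else {
        Err(AppError::InvalidUrl(input.to_string()))
    }
}

/// Stage 3 (formatter.rs stand-in): plain-text output only.
fn format_text(segments: &[Segment]) -> String {
    segments
        .iter()
        .map(|s| {
            let total = s.start.max(0.0) as u64;
            format!("[{:02}:{:02}] {}", total / 60, total % 60, s.text)
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Runs stages 1 to 3. `fetch` stands in for transcript.rs (stage 2).
pub fn run<F>(input: &str, fetch: F) -> Result<String, AppError>
where
    F: Fn(&str) -> Result<Vec<Segment>, AppError>,
{
    let video_id = parse_video_id(input)?; // validate and extract the ID
    let segments = fetch(&video_id)?; // fetch the transcript
    Ok(format_text(&segments)) // format to the requested output
}

/// Stage 4: write to the given file, or to stdout when no path is given.
pub fn write_output(content: &str, path: Option<&Path>) -> io::Result<()> {
    match path {
        Some(p) => fs::write(p, content),
        None => writeln!(io::stdout().lock(), "{content}"),
    }
}
```

Using the ? operator at each stage means the first failure, whether a bad URL or a fetch error, short-circuits the rest of the pipeline and reaches the caller as a single error value.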

Dependencies

Crate                 Purpose
clap                  CLI argument parsing with derive macros
youtube-transcript    YouTube transcript fetching
serde / serde_json    JSON serialization for output
thiserror             Error handling
regex                 URL pattern matching
tokio                 Async runtime

Development

Build

cargo build

Run

cargo run -- <URL>

Test

cargo test

Lint

cargo clippy

Format

cargo fmt

System TODOs

Phase 1: Project Setup ✅

  • Initialize Rust project
  • Configure dependencies
  • Set up CI/CD (GitHub Actions)
  • Add README

Phase 2: Core Functionality ✅

  • YouTube URL parser/validator
  • Transcript fetching integration
  • Error handling
  • Language selection support (partial; see Known Limitations)

Phase 3: CLI Interface ✅

  • Argument parsing with clap
  • Output format options (text, json, srt)
  • File output option
  • Verbose/debug mode

Phase 4: Cross-Platform Distribution ⏳

  • Configure cross-compilation with cross-rs
  • GitHub Actions for multi-platform builds
  • Release automation with binary uploads

Phase 5: Testing & Polish ⏳

  • Unit tests for URL parsing
  • Integration tests for transcript fetching
  • Man page documentation
  • Performance optimization

Known Limitations

  1. Language Support: Version 0.3 of the youtube-transcript crate, which this tool uses, extracts only English transcripts, so the --lang flag is accepted but ignored. Full multi-language support requires either switching to a different library or implementing custom YouTube API calls.

  2. Auto-generated vs Manual Captions: The tool currently fetches the first available English caption track. It does not distinguish between auto-generated and manually created captions.
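
If both limitations were lifted, track selection might look something like the sketch below. It is purely hypothetical: the tool does not currently expose a track list, CaptionTrack and select_track are invented names, and preferring manual captions over auto-generated ones is an assumed policy.

```rust
/// Hypothetical description of one available caption track.
#[derive(Debug, Clone, PartialEq)]
pub struct CaptionTrack {
    pub lang: String,
    pub auto_generated: bool,
}

/// Picks a track for `lang`, preferring manually created captions and
/// falling back to auto-generated ones. Returns None if no track matches.
pub fn select_track<'a>(tracks: &'a [CaptionTrack], lang: &str) -> Option<&'a CaptionTrack> {
    tracks
        .iter()
        .filter(|t| t.lang == lang)
        // `false < true`, so manual tracks (auto_generated == false) win.
        .min_by_key(|t| t.auto_generated)
}
```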

Future Enhancements

  • Batch processing multiple URLs
  • Selecting between auto-generated and manual captions
  • Proxy support
  • Caching layer
  • Progress indicators for long transcripts
  • Translation support
  • Full multi-language transcript support

License

MIT
