[Examples] Enhance real-time transcription with VAD, word timestamps, and CLI options #2701
+1,000
−0
This PR builds upon #2696 and significantly enhances the real-time transcription example with production-ready features.
## New Features

### Voice Activity Detection (VAD)
- … `--energy-threshold`)

### Word-Level Timestamps
- `--word-timestamps` flag shows timing for each word

### Speaker Change Detection (Experimental)
- `--detect-speakers` provides hints when speaker changes are detected

### Audio Device Selection
- `--list-devices` to show available microphones
- `--device-id` to select a specific input device

### Enhanced User Experience
- … `--output`, `--timestamps`)

## Usage Examples
Basic usage
python examples/real_time_transcription.py
With word timestamps
python examples/real_time_transcription.py --word-timestamps
Save transcript with timestamps
python examples/real_time_transcription.py --output notes.txt --timestamps
Use specific model and language
python examples/real_time_transcription.py --model small --language es

## Changes
- `examples/real_time_transcription.py`: Complete rewrite with new features
- `README.md`: Updated documentation with usage examples
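
For reviewers unfamiliar with energy-based VAD, here is a minimal sketch of how an energy threshold can gate audio chunks before transcription. It is not the code from this PR: the function names `rms_energy` and `is_speech`, the default threshold of `0.01`, and the assumption that samples are floats in `[-1.0, 1.0]` are all illustrative. The value passed via `--energy-threshold` would presumably play the role of `energy_threshold` here.

```python
import math
from typing import Sequence


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square energy of one audio chunk (hypothetical helper)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples: Sequence[float], energy_threshold: float = 0.01) -> bool:
    """Treat a chunk as speech when its RMS energy exceeds the threshold.

    The default threshold is an assumed value, not the PR's default.
    """
    return rms_energy(samples) > energy_threshold


# Synthetic 100 ms chunks at 16 kHz: near-silence and a 440 Hz tone.
sample_rate = 16000
silence = [0.001] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]

for name, chunk in (("silence", silence), ("tone", tone)):
    label = "speech" if is_speech(chunk) else "skip"
    print(f"{name}: rms={rms_energy(chunk):.4f} -> {label}")
```

A fixed threshold like this is simple but sensitive to microphone gain and background noise, which is presumably why the threshold is exposed as a CLI option.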