Is it possible to avoid losing all progress when it fails? #137

@jerkstorecaller

Description

I am trying to get subtitles for a long audio file, using the Medium model. It's the audio from a compilation of a Spanish-language TV series.

My command was this:

pwcpp \
  --model "medium-q8_0" \
  --language "es" \
  --temperature 0.1 \
  --print_realtime true \
  --output-srt \
  ./input.wav

I ran it, saw it was doing great, so I left for a few hours. I came back to find it had gone crazy at around the 2 hour mark:

[02:03:58.320 --> 02:03:59.320]   ¿Qué pasa?
[02:03:59.320 --> 02:04:00.320]   ¿Qué pasa?
[02:04:00.320 --> 02:04:01.320]   ¿Qué pasa?
[02:04:01.320 --> 02:04:02.320]   ¿Qué pasa?
Progress:  37%
[02:04:02.320 --> 02:04:14.320]   ¿Qué pasa?
Progress:  38%
[02:04:14.320 --> 02:04:39.320]   ¿Qué pasa?
Progress:  38%

...repeat ad nauseam for 2.5 hours

[04:34:56.320 --> 04:35:11.320]   ¿Qué pasa?
Progress:  84%
[04:35:11.320 --> 04:35:26.320]   ¿Qué pasa?
Progress:  84%
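
A loop like the one above can be spotted mechanically from the realtime output, so a long run could be stopped early instead of burning hours. The following is a minimal sketch, not part of pwcpp: it assumes the printed lines have the `[start --> end]   text` shape shown above, and the function name `find_repetition_start` and the `threshold` parameter are hypothetical.

```python
import re

# Matches realtime lines such as "[02:03:58.320 --> 02:03:59.320]   ¿Qué pasa?"
SEGMENT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$"
)


def find_repetition_start(lines, threshold=4):
    """Return the start timestamp of the first run of `threshold` or more
    consecutive segments with identical text, or None if there is no such run.

    `threshold` is an arbitrary illustrative value, not a tuned one.
    """
    run_text = None   # text of the current run of identical segments
    run_start = None  # start timestamp of the first segment in that run
    run_len = 0
    for line in lines:
        match = SEGMENT_RE.match(line.strip())
        if not match:
            continue  # skip non-segment lines such as "Progress:  37%"
        start, _end, text = match.groups()
        text = text.strip()
        if text == run_text:
            run_len += 1
        else:
            run_text, run_start, run_len = text, start, 1
        if run_len >= threshold:
            return run_start
    return None
```

Fed the lines printed above, this would report `02:03:58.320` as the point where the visible run of identical segments begins. Genuine dialogue can also repeat a short line a few times, so a higher threshold may be safer in practice.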

I had previously tried another whisper.cpp-based transcription tool (Memo, using the non-quantized Medium model) and the same thing happened after 30 minutes, with a different repeating line.

Is whisper.cpp inherently unreliable for long-running tasks? And could pwcpp provide a way to mitigate this? For example:

  • I didn't even get a partial srt file when I killed pwcpp. Always write the srt even if incomplete.
  • Provide the user a way to resume. Let us specify the timestamp to start at and a file to resume from, e.g. --begin-at-timestamp "02:04:39.320" --use-existing-srt "./previousoutput.srt"
    (or at least just --begin-at-timestamp; I could manually combine the two srt files.)
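
The manual combination mentioned in the last point can be sketched in plain Python. This is only an illustration under stated assumptions: the second SRT is assumed to come from audio trimmed to start at the resume timestamp, so its timestamps begin at zero and need shifting. The names `merge_srt`, `parse_srt`, `to_ms` and `from_ms` are hypothetical helpers, not pwcpp functions.

```python
import re

# Accepts both SRT-style "HH:MM:SS,mmm" and the "HH:MM:SS.mmm" form used above.
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")


def to_ms(timestamp):
    """Convert a timestamp string to integer milliseconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(timestamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def from_ms(total_ms):
    """Convert integer milliseconds to an SRT timestamp (comma separator)."""
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


def parse_srt(text):
    """Parse SRT text into a list of (start_ms, end_ms, cue_text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip anything that is not a well-formed cue
        start, end = lines[1].split("-->")
        cues.append((to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues


def merge_srt(first, second, resume_at):
    """Keep cues from `first` that end at or before `resume_at`, then append
    the cues from `second` shifted forward by `resume_at`, renumbering all."""
    offset = to_ms(resume_at)
    kept = [cue for cue in parse_srt(first) if cue[1] <= offset]
    shifted = [(s + offset, e + offset, t) for s, e, t in parse_srt(second)]
    blocks = [
        f"{i}\n{from_ms(s)} --> {from_ms(e)}\n{t}\n"
        for i, (s, e, t) in enumerate(kept + shifted, start=1)
    ]
    return "\n".join(blocks)
```

Dropping every cue from the first file that ends after the resume point also discards the bad repeated lines, provided the resume timestamp is chosen just before the repetition starts.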
