Transcribing locally can be slow if the user's GPU does not support CUDA. In that case it may be preferable to use the API instead: https://platform.openai.com/docs/guides/speech-to-text
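
A minimal sketch of that fallback, under several assumptions not stated above: that local transcription uses the `openai-whisper` package on top of PyTorch, that the hosted path uses the `openai` Python client (v1 or later) with `OPENAI_API_KEY` set in the environment, and that the model names `"base"` and `"whisper-1"` are acceptable choices. The helper names `cuda_available`, `choose_backend`, and `transcribe` are hypothetical.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # assumption: the local backend runs on PyTorch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_backend(has_cuda: bool) -> str:
    """Pick "local" when CUDA is usable, otherwise "api"."""
    return "local" if has_cuda else "api"


def transcribe(path: str) -> str:
    """Transcribe the audio file at `path` with whichever backend was chosen."""
    if choose_backend(cuda_available()) == "local":
        import whisper  # assumption: the openai-whisper package

        model = whisper.load_model("base")  # assumption: "base" model size
        return model.transcribe(path)["text"]

    from openai import OpenAI  # assumption: openai>=1.0, OPENAI_API_KEY is set

    client = OpenAI()
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text
```

Keeping the decision in `choose_backend` separate from the detection makes it easy to override, for example with a user setting that forces the API even when CUDA is present.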