Add option to use the speech-to-text API rather than transcribing locally

Transcribing locally can be slow if the user's GPU does not support CUDA. In that case, it might be preferable to use the OpenAI speech-to-text API instead: https://platform.openai.com/docs/guides/speech-to-text
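
A rough sketch of how such an option could look, to make the request concrete. Everything here is an assumption rather than the project's actual code: the function names (`transcribe_locally`, `transcribe_with_api`, `choose_backend`, `transcribe`) and the `"auto"` / `"local"` / `"api"` values are hypothetical, the local path assumes the `openai-whisper` package, and the API path assumes the official `openai` Python client with `OPENAI_API_KEY` set in the environment.

```python
from typing import Callable, Dict, Optional


def transcribe_locally(audio_path: str) -> str:
    """Transcribe on this machine (assumes the openai-whisper package)."""
    import whisper  # imported lazily so API-only users do not need it

    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]


def transcribe_with_api(audio_path: str) -> str:
    """Transcribe via the OpenAI speech-to-text API (assumes the openai client)."""
    from openai import OpenAI  # imported lazily; reads OPENAI_API_KEY

    client = OpenAI()
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file
        )
    return result.text


# Hypothetical option values mapped to the function that handles them.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "local": transcribe_locally,
    "api": transcribe_with_api,
}


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_backend(requested: str = "auto", has_cuda: Optional[bool] = None) -> str:
    """Resolve the user's choice; "auto" falls back to the API without CUDA."""
    if requested in BACKENDS:
        return requested
    if requested != "auto":
        raise ValueError(f"unknown backend: {requested!r}")
    if has_cuda is None:
        has_cuda = cuda_available()
    return "local" if has_cuda else "api"


def transcribe(audio_path: str, backend: str = "auto") -> str:
    """Transcribe audio_path with the requested (or auto-selected) backend."""
    return BACKENDS[choose_backend(backend)](audio_path)
```

With something like this, `transcribe("meeting.mp3", backend="api")` would force the API, while the default `"auto"` would only use it when no CUDA device is detected. An explicit choice always wins, so users who prefer to keep audio on their machine can still pass `"local"`.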