API examples using 8-mile for audio
The codebase relies primarily on 8-mile (mead-layers) for its modeling and optimization code.
What's left is pretty much just training and inference code.
The code depends on:
- editdistance (for error evaluation)
- numpy
- six
- soundfile
- mead-baseline
- pytorch
There are a few optional dependencies:
- scipy (for on-the-fly resampling of wav files)
- ctcdecode (for prefix beam decoding with optional LM)
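
Since the optional packages only matter for specific features, it can help to check which of them are available before running. The following is a minimal sketch using only the standard library, not part of this codebase. The helper name `missing_optional` is hypothetical, and the sketch assumes the import names match the package names listed above.

```python
import importlib.util

# Assumption: each module is importable under the same name as the package
# listed above. Adjust the keys if your installation differs.
OPTIONAL_MODULES = {
    "scipy": "on-the-fly resampling of wav files",
    "ctcdecode": "prefix beam decoding with optional LM",
}


def missing_optional(modules=OPTIONAL_MODULES):
    """Return {module_name: purpose} for modules that cannot be found."""
    return {
        name: purpose
        for name, purpose in modules.items()
        if importlib.util.find_spec(name) is None
    }


if __name__ == "__main__":
    # Report each optional feature that would be unavailable.
    for name, purpose in missing_optional().items():
        print(f"optional dependency '{name}' not found ({purpose})")
```

Running the script prints one line per missing optional package and nothing if both are installed.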