Marcel Timm, RhinoDevel, 2025
mt_tts is a C++ library for Linux and Windows that offers a pure C interface to the awesome text-to-speech system called Piper by Michael Hansen.
mt_tts supports:
- Text to wave file.
- Text to raw audio samples.
- Text to raw audio stream-to-stream conversion.
Take a look at the example showing a simple Speech-To-Text, Large-Language-Model, Text-To-Speech pipeline via mt_stt, mt_llm and mt_tts!
Clone the mt_tts repository:
git clone https://github.com/RhinoDevel/mt_tts.git
Enter the created folder:
cd mt_tts
Get the Piper submodule content:
git submodule update --init --recursive
If you want to build Piper in debug instead of release mode, edit the file CMakeLists.txt:
Change fmt and spdlog under target_link_libraries( (two occurrences) to fmtd and spdlogd.
No details for Linux here, yet, but you can take a look at the Windows instructions below and at the Makefile.
Build Piper
- Open Visual Studio developer commandline.
- Go to the Piper submodule folder:
cd mt_tts\piper
- Create build folder:
mkdir build
- Enter build folder:
cd build
- Prepare the build:
cmake ..
- Start the build (this will also download stuff Piper needs from the internet):
cmake --build . --config Release
(or cmake --build . for debug mode)
Test Piper (without mt_tts)
Copy the following (from different output directories in the build folder) into a new folder:
- build\pi\share\espeak-ng-data (the whole folder)
- build\pi\bin\espeak-ng.dll
- build\pi\bin\piper_phonemize.dll
- build\pi\lib\onnxruntime.dll
- build\Release\piper.exe (or build\Debug\piper.exe)
Download a voice and its configuration, e.g. one for speech output in German language by Thorsten Müller:
Store these files in the same, new folder.
echo "Ich bin ein Mensch, Du auch?" | piper --model de_DE-thorsten-high.onnx --config de_DE-thorsten-high.onnx.json --debug --output_file test.wav
Play back the WAV file:
ffplay.exe test.wav
Test 2: Output directly to the speakers with ffplay from ffmpeg (the parameters in this example may not be optimal):
echo "Hallo, ich bin kein Mensch, was man auch einigermaßen leicht heraushören kann, meinst Du nicht auch? Trotzdem ein tolles TTS-System!" | piper --model de_DE-thorsten-high.onnx --config de_DE-thorsten-high.onnx.json --output_raw | ffplay.exe -f s16le -ar 22050 -
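The ffplay parameters above describe the raw stream: signed 16-bit little-endian samples (s16le) at 22050 Hz. As a minimal sketch, the following C helpers show what that means in numbers: how long a given amount of raw bytes plays, and what a standard 44-byte PCM WAV header for that data looks like. This is not part of Piper or mt_tts. The function names are hypothetical, and mono output is an assumption that is not stated above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Format of the raw stream, taken from the ffplay parameters above
 * (-f s16le -ar 22050). One channel (mono) is an ASSUMPTION. */
enum { SAMPLE_RATE = 22050, BYTES_PER_SAMPLE = 2, CHANNELS = 1 };

/* Store a 16-bit value in little-endian byte order. */
static void put_u16_le(uint8_t* dst, uint16_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
}

/* Store a 32-bit value in little-endian byte order. */
static void put_u32_le(uint8_t* dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Playback duration in milliseconds of data_bytes of raw audio. */
static uint32_t raw_duration_ms(uint32_t data_bytes)
{
    uint64_t frames = data_bytes / (BYTES_PER_SAMPLE * CHANNELS);

    return (uint32_t)(frames * 1000 / SAMPLE_RATE);
}

/* Fill a canonical 44-byte PCM WAV header for data_bytes of raw audio. */
static void fill_wav_header(uint8_t header[44], uint32_t data_bytes)
{
    assert(header != NULL);

    memcpy(header, "RIFF", 4);
    put_u32_le(header + 4, 36 + data_bytes); /* File size minus 8 bytes. */
    memcpy(header + 8, "WAVE", 4);
    memcpy(header + 12, "fmt ", 4);
    put_u32_le(header + 16, 16);             /* Size of the fmt chunk. */
    put_u16_le(header + 20, 1);              /* Audio format 1 = PCM. */
    put_u16_le(header + 22, CHANNELS);
    put_u32_le(header + 24, SAMPLE_RATE);
    put_u32_le(header + 28, SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE); /* Byte rate. */
    put_u16_le(header + 32, CHANNELS * BYTES_PER_SAMPLE); /* Block align. */
    put_u16_le(header + 34, 8 * BYTES_PER_SAMPLE);        /* Bits per sample. */
    memcpy(header + 36, "data", 4);
    put_u32_le(header + 40, data_bytes);
}
```

Writing the filled header to a file, followed by the raw bytes, should give a file that WAV players accept, provided the assumed format matches the voice in use. If a voice uses a different sample rate, SAMPLE_RATE and the -ar parameter both have to change.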
- Open solution mt_tts.sln with Visual Studio (tested with 2022).
- Compile in release or debug mode.
- Get the DLL and LIB files resulting from the build, e.g. for release mode x64\Release\mt_tts.dll and x64\Release\mt_tts.lib, copy them to a new folder.
- Also copy the file mt_tts\mt_tts.h to that new folder.
- Copy the following stuff from Piper to the new folder, too:
  - build\pi\share\espeak-ng-data (the whole folder)
  - build\pi\bin\espeak-ng.dll
  - build\pi\bin\piper_phonemize.dll
  - build\pi\lib\onnxruntime.dll
- Also copy a voice model file and its configuration file to the same new folder.
- Open x64 Native Tools Command Prompt for VS 2022 commandline.
- Go to the new folder and create a file main.c with the following code:
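#include "mt_tts.h"

int main()
{
    // Load the voice model and its configuration.
    mt_tts_reinit("de_DE-thorsten-high.onnx", "de_DE-thorsten-high.onnx.json");

    // Convert the text to speech and write the result to a WAV file.
    mt_tts_to_wav_file(
        "Hallo, nun testen wir dieses kleine Hilfsmodul.", "output.wav");

    // Shut the library down again.
    mt_tts_deinit();

    return 0;
}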
- Compile via:
cl main.c mt_tts.lib
- Run main.exe, which will create the WAV file output.wav.
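The steps above require several files to sit next to main.exe, and a missing DLL or voice file is easy to overlook. The following is a minimal sketch of a check that could run before mt_tts is used. It is not part of mt_tts, the helper names are hypothetical, and the voice file names are just the ones from the example. The espeak-ng-data folder is left out, because opening a folder with fopen behaves differently across platforms.

```c
#include <assert.h>
#include <stdio.h>

/* Files from the steps above that must be in the folder of main.exe.
 * The two voice files are the example ones; adjust them to your voice. */
static const char* const required_files[] = {
    "espeak-ng.dll",
    "piper_phonemize.dll",
    "onnxruntime.dll",
    "mt_tts.dll",
    "de_DE-thorsten-high.onnx",
    "de_DE-thorsten-high.onnx.json"
};

/* Return 1, if the file can be opened for reading. Otherwise return 0. */
static int file_is_readable(const char* path)
{
    FILE* f = fopen(path, "rb");

    if (f == NULL)
    {
        return 0;
    }
    fclose(f);
    return 1;
}

/* Print every file that is missing and return how many are missing. */
static int count_missing_files(const char* const* paths, int count)
{
    int missing = 0;

    assert(paths != NULL && count >= 0);

    for (int i = 0; i < count; ++i)
    {
        if (!file_is_readable(paths[i]))
        {
            fprintf(stderr, "Missing file: %s\n", paths[i]);
            ++missing;
        }
    }
    return missing;
}
```

A caller could pass required_files together with its element count and stop early, if the returned count is not zero.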