Still in development
AquaTK is short for "Audio QUality Assessment Toolkit". It contains metrics commonly used to evaluate neural audio synthesizers, such as FAD and kernel distances. It also contains a pure Python port of PEAQb, a C implementation of the Basic PEAQ algorithm.
Currently implemented metrics:
| Metric | Description |
|---|---|
| FAD | Frechet Audio Distance |
| KID | Kernel Inception Distance |
| PEAQb | Basic PEAQ |
| NDB/k | Number of statistically Different Bins, divided by the number of bins k |
| SISDR | Scale-Invariant Signal-to-Distortion Ratio |
| SNR | Signal-to-Noise Ratio |
| MAE | Mean Absolute Error |
| MSE | Mean Squared Error |
| KL | Kullback-Leibler Divergence |
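As a rough illustration of what the simpler signal-level metrics in the table measure, here is a minimal pure-Python sketch of MSE, MAE, SNR, SI-SDR and KL divergence. It is not AquaTK's implementation or API: the function names (`mse`, `mae`, `snr_db`, `si_sdr_db`, `kl_divergence`) and the toy sine-wave signals are assumptions made for this example.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equal-length sequences."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def mae(reference, estimate):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def snr_db(reference, estimate):
    """SNR in dB, treating (reference - estimate) as the noise.

    Assumes the two signals are not identical (noise power > 0).
    """
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


def si_sdr_db(reference, estimate):
    """Scale-invariant SDR in dB.

    The estimate is first projected onto the reference, so rescaling
    the estimate does not change the score.
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10(target_energy / residual_energy)


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same bins.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy data (an assumption for this sketch): a sine wave plus a small disturbance.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.1 * math.cos(2 * math.pi * 13 * n / 100) for n in range(100)]
estimate = [r + v for r, v in zip(reference, noise)]

print("MSE:", mse(reference, estimate))
print("MAE:", mae(reference, estimate))
print("SNR (dB):", snr_db(reference, estimate))
print("SI-SDR (dB):", si_sdr_db(reference, estimate))
print("KL:", kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

Doubling the amplitude of `estimate` leaves `si_sdr_db` unchanged but changes `snr_db`, which is the practical difference between the two metrics.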
For now, you can install this package directly from the Git repository. PyPI support is coming soon.
Using UV (recommended):
uv add git+https://github.com/ashvala/AQUA-tk.git
Using pip:
pip install git+https://github.com/ashvala/AQUA-tk.git
AquaTK has optional extras for different embedding extractors and features:
| Extra | Description |
|---|---|
| `vggish` | VGGish embedding extractor (TensorFlow) |
| `panns` | PANNs embedding extractor |
| `openl3` | OpenL3 embedding extractor (requires Python <3.12) |
| `ui` | Streamlit web interface |
| `plotting` | Matplotlib for visualizations |
| `runner` | Librosa for audio processing |
| `dev` | Development dependencies (pytest) |
| `all` | All optional dependencies |
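Each extra pulls in a third-party package, so a quick look at what is importable in the current environment can tell you which extras are usable. The sketch below is only an illustration: the `EXTRA_MODULES` mapping and the `available_extras` helper are hypothetical and not part of AquaTK, and the import names are inferred from the descriptions in the table (`panns` is left out because the table does not name the package behind it).

```python
import importlib.util
import sys

# Assumed mapping from extra name to the import name of the package it relies on.
EXTRA_MODULES = {
    "vggish": "tensorflow",
    "openl3": "openl3",
    "ui": "streamlit",
    "plotting": "matplotlib",
    "runner": "librosa",
    "dev": "pytest",
}


def available_extras(mapping=EXTRA_MODULES):
    """Return {extra: bool}, True when the backing module can be located."""
    result = {}
    for extra, module in mapping.items():
        try:
            result[extra] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # find_spec can raise for unusual module states; treat as unavailable.
            result[extra] = False
    return result


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:10s} {'available' if ok else 'missing'}")
    # The table notes that the openl3 extra requires Python < 3.12.
    if sys.version_info >= (3, 12):
        print("Note: this interpreter is 3.12 or newer; the openl3 extra needs Python < 3.12.")
```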
Install with extras:
uv add "git+https://github.com/ashvala/AQUA-tk.git[vggish,panns]"
# or install everything
uv add "git+https://github.com/ashvala/AQUA-tk.git[all]"
To work from a local clone instead:
git clone https://github.com/ashvala/AQUA-tk.git
cd AQUA-tk
uv sync --all-extras
For the VGGish extractor, download the model checkpoint and the PCA parameters file.
Put them in the embedding_extractors/models/vggish folder. Alternatively, if the files live elsewhere, you can pass their paths explicitly when initializing the VGGish extractor:
from aquatk.embedding_extractors import VGGish
vggish_extractor = VGGish(path_to_checkpoint=PATH_TO_CHECKPOINT, path_to_pca_params=PATH_TO_PARAMS)
Contributions are welcome! Please feel free to open an issue or a pull request. This repository will only improve with your involvement!
This repo is indexed on DeepWiki. It doesn't suck and should serve as a useful stopgap until more robust (and less verbose) documentation is written.
This package would not exist if it weren't for the following:
- Frechet Audio Distance on Google Research
- Frechet Audio Distance by @gudgud96
- The Kernel Inception Distance implementation on TorchMetrics
- PEAQb & PEAQb-Fast
- NDB/k by eitanrich
- VGGish model from TensorFlow's repo
License acknowledgements:
This program is generally released under the GPL license. However, some of the models and implementations it builds on carry their own licenses: FAD and VGGish from Google are provided under the Apache 2.0 license, @gudgud96's implementation of FAD is released under the MIT license, and PEAQb is released under the GPL license.