A toolkit for binning / categorisation optimisation with respect to signal significance in HEP analyses, using gradient-descent methods. gatohep relies on TensorFlow together with TensorFlow Probability.
The categorisation can be performed directly in a multidimensional discriminant space, e.g. the output of a multiclassifier with softmax activation. Bins are defined either by learnable multidimensional Gaussians forming a Gaussian Mixture Model (GMM) or, when working in 1D, by bin boundaries approximated as steep sigmoid functions with learnable positions.
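The GMM-based categorisation can be pictured as soft, differentiable bin assignments given by the responsibilities of the mixture components. The following is a minimal NumPy sketch of that idea in 1D, for illustration only; it is not the gatohep API, and all names here are hypothetical:

```python
import numpy as np

def gmm_responsibilities(x, means, sigmas, weights):
    """Soft bin assignment: responsibility of each Gaussian component
    for each event. Rows sum to 1, so assignments stay differentiable
    with respect to the component parameters (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    # log N(x | mu_k, sigma_k), shape (n_events, n_components)
    log_pdf = (
        -0.5 * ((x[:, None] - means[None, :]) / sigmas[None, :]) ** 2
        - np.log(sigmas[None, :])
        - 0.5 * np.log(2.0 * np.pi)
    )
    log_post = log_pdf + np.log(weights)[None, :]
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# three events shared softly among three "bins"
resp = gmm_responsibilities(
    x=np.array([-1.0, 0.1, 2.0]),
    means=np.array([-1.0, 0.0, 2.0]),
    sigmas=np.array([0.5, 0.5, 0.5]),
    weights=np.array([1 / 3, 1 / 3, 1 / 3]),
)
```

Because the responsibilities are smooth functions of the means and widths, bin yields built from them can be optimised with ordinary gradient descent; a hard categorisation is recovered at the end by taking the argmax per event.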
See the full documentation at https://gato-hep.readthedocs.io/.
git clone https://github.com/FloMau/gato-hep.git
cd gato-hep
python3 -m venv gato_env # or use conda
source gato_env/bin/activate
pip install -e .

Dependencies are declared in pyproject.toml. Note: the only tricky part is finding matching versions of tensorflow, tensorflow-probability and ml-dtypes. The requirements mentioned here should work, but other combinations may work as well.
python examples/1D_example/run_toy_example.py
python examples/three_class_softmax_example/run_example.py

Each script writes plots and a significance comparison table.
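A common figure of merit for such significance tables in HEP is the per-bin Asimov significance combined in quadrature. The exact definition used by the example scripts may differ; this is a hedged, self-contained sketch of that standard formula:

```python
import math

def asimov_z(s, b):
    """Asimov significance Z_A = sqrt(2*((s+b)*ln(1 + s/b) - s)) for one bin,
    with expected signal yield s and background yield b."""
    if b <= 0.0:
        raise ValueError("background yield must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

def combined_z(bins):
    """Combine per-bin significances in quadrature over (s, b) pairs."""
    return math.sqrt(sum(asimov_z(s, b) ** 2 for s, b in bins))

# two categories: a fairly pure one and a diluted one
z = combined_z([(10.0, 20.0), (5.0, 200.0)])
```

In the limit s << b the per-bin formula reduces to the familiar s / sqrt(b), which is why finer, purer categories tend to increase the combined significance.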
# standard GMM model for ND optimisation
from gatohep.models import gato_gmm_model
# more to be included here later on
# see ./examples for a full workflow!

gato-hep/                              project root
│
├─ pyproject.toml                      metadata + dependencies
├─ src/gatohep/                        installable Python package
│  ├─ __init__.py
│  ├─ models.py                        trainable model class
│  ├─ losses.py                        custom loss / penalty terms
│  ├─ utils.py                         misc helpers
│  ├─ plotting_utils.py                helper plots (stacked hists, bin boundaries, ...)
│  └─ data_generation.py               toy data generators (1D / 3-class softmax)
│
└─ examples/                           runnable demos
   ├─ 1D_example/run_example.py
   └─ three_class_softmax_example/run_example.py
git checkout -b feature/xyz

- Put code under src/gatohep/ and add tests under tests/.
- Update version in pyproject.toml.
- Run black, isort and pytest, then open a PR.