Tokenizers with highly compressed latent spaces -- such as TiTok, which compresses 256x256 px images into just 32 discrete tokens -- can be used to perform various image generative tasks without training a dedicated generative model at all. In particular, we show that simple test-time optimization of tokens according to arbitrary user-specified objective functions can be used for tasks such as text-guided editing or inpainting.
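To make the idea concrete, here is a toy, self-contained sketch of test-time optimization over discrete tokens. A hypothetical decoder maps a handful of token indices to a flat "image", and greedy coordinate descent over a small codebook minimizes a user-specified objective. All names, shapes, and the decoder itself are illustrative assumptions, not the real TiTok API or the optimizer used in the paper:

```python
import random

# Toy stand-in for a TiTok-style setup: a few discrete tokens drawn
# from a small codebook decode to a flat "image". Everything here is
# a hypothetical illustration, not the real TiTok decoder or codebook.
CODEBOOK_SIZE = 8
NUM_TOKENS = 4

def decode(tokens):
    # Hypothetical decoder: each token index paints two "pixels".
    image = []
    for t in tokens:
        image.extend([0.5 * t, float(t % 3)])
    return image

def mse(image, target):
    # A user-specified objective: squared error to a target image.
    return sum((a - b) ** 2 for a, b in zip(image, target))

def optimize_tokens(objective, steps=5, seed=0):
    # Greedy coordinate descent over token indices: for each position,
    # try every codebook entry and keep whichever lowers the loss.
    # (A gradient-free stand-in for the paper's test-time optimization.)
    rng = random.Random(seed)
    tokens = [rng.randrange(CODEBOOK_SIZE) for _ in range(NUM_TOKENS)]
    best = objective(decode(tokens))
    for _ in range(steps):
        for pos in range(NUM_TOKENS):
            for cand in range(CODEBOOK_SIZE):
                trial = tokens[:pos] + [cand] + tokens[pos + 1:]
                loss = objective(decode(trial))
                if loss < best:
                    tokens, best = trial, loss
    return tokens, best

target = decode([3, 1, 4, 1])  # a target that is exactly reachable
tokens, loss = optimize_tokens(lambda img: mse(img, target))
```

Because the objective is just a Python callable, swapping in a different loss (a CLIP score, a masked reconstruction error) changes the task without changing the optimization loop.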
This repo includes the simple test-time optimization algorithm used in our ICML 2025 paper, "Highly Compressed Tokenizer Can Generate Without Training", under the `tto/` directory.
For convenience, we also include the TiTok implementation, copied from the official code release, under `titok/`.
- Text-guided image editing: `notebooks/clip_opt.ipynb` for test-time optimization with a CLIP objective.
- Inpainting: `notebooks/inpainting.ipynb` for inpainting via a reconstruction objective.
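For intuition on the inpainting setup, a reconstruction objective can simply be restricted to the known pixels: masked-out regions contribute nothing to the loss, leaving the optimizer free to fill them in. A minimal sketch, assuming a hypothetical flat-image representation rather than the notebook's actual code:

```python
def masked_mse(image, reference, mask):
    # Reconstruction loss over known pixels only (mask[i] == 1).
    # Masked-out pixels are unconstrained, so the token optimization
    # is free to fill them in. Hypothetical flat-image layout.
    terms = [(a - b) ** 2 for a, b, m in zip(image, reference, mask) if m]
    return sum(terms) / max(len(terms), 1)

# Pixels where mask == 0 do not affect the objective at all:
ref = [1.0, 2.0, 3.0, 4.0]
mask = [1, 1, 0, 0]
```

Here `masked_mse([1.0, 2.0, 9.0, 9.0], ref, mask)` is exactly zero, however the masked pixels are filled.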
Running Locally: If you use Nix, you can enter a shell with all dependencies via `nix develop`.