Tokenizers with highly compressed latent spaces -- such as TiTok, which compresses 256x256 px images into just 32 discrete tokens -- can be used to perform various image generative tasks without training a dedicated generative model at all. In particular, we show that simple test-time optimization of tokens according to arbitrary user-specified objective functions can be used for tasks such as text-guided editing or inpainting.
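To make the idea concrete, here is a toy, self-contained sketch of test-time optimization over discrete tokens. A hypothetical decoder maps a handful of token indices to a flat "image", and greedy coordinate descent over a small codebook minimizes a user-specified objective. All names, shapes, and the decoder itself are illustrative assumptions, not the real TiTok API or the optimizer used in the paper:

```python
import random

# Toy stand-in for a TiTok-style setup: a few discrete tokens drawn
# from a small codebook decode to a flat "image". Everything here is
# a hypothetical illustration, not the real TiTok decoder or codebook.
CODEBOOK_SIZE = 8
NUM_TOKENS = 4

def decode(tokens):
    # Hypothetical decoder: each token index paints two "pixels".
    image = []
    for t in tokens:
        image.extend([0.5 * t, float(t % 3)])
    return image

def mse(image, target):
    # A user-specified objective: squared error to a target image.
    return sum((a - b) ** 2 for a, b in zip(image, target))

def optimize_tokens(objective, steps=5, seed=0):
    # Greedy coordinate descent over token indices: for each position,
    # try every codebook entry and keep whichever lowers the loss.
    # (A gradient-free stand-in for the paper's test-time optimization.)
    rng = random.Random(seed)
    tokens = [rng.randrange(CODEBOOK_SIZE) for _ in range(NUM_TOKENS)]
    best = objective(decode(tokens))
    for _ in range(steps):
        for pos in range(NUM_TOKENS):
            for cand in range(CODEBOOK_SIZE):
                trial = tokens[:pos] + [cand] + tokens[pos + 1:]
                loss = objective(decode(trial))
                if loss < best:
                    tokens, best = trial, loss
    return tokens, best

target = decode([3, 1, 4, 1])  # a target that is exactly reachable
tokens, loss = optimize_tokens(lambda img: mse(img, target))
```

Because the objective is just a Python callable, swapping in a different loss (a CLIP score, a masked reconstruction error) changes the task without changing the optimization loop.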
This repo includes the simple test-time optimization algorithm used in our ICML 2025 paper, "Highly Compressed Tokenizer Can Generate Without Training", under the `tto/` directory.
For convenience, we also include the TiTok implementation, copied from the official code release, under `titok/`.
- Text-guided image editing: `notebooks/clip_opt.ipynb` for test-time optimization with a CLIP objective.
- Inpainting: `notebooks/inpainting.ipynb` for inpainting via a reconstruction objective.
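For intuition on the inpainting setup, a reconstruction objective can simply be restricted to the known pixels: masked-out regions contribute nothing to the loss, leaving the optimizer free to fill them in. A minimal sketch, assuming a hypothetical flat-image representation rather than the notebook's actual code:

```python
def masked_mse(image, reference, mask):
    # Reconstruction loss over known pixels only (mask[i] == 1).
    # Masked-out pixels are unconstrained, so the token optimization
    # is free to fill them in. Hypothetical flat-image layout.
    terms = [(a - b) ** 2 for a, b, m in zip(image, reference, mask) if m]
    return sum(terms) / max(len(terms), 1)

# Pixels where mask == 0 do not affect the objective at all:
ref = [1.0, 2.0, 3.0, 4.0]
mask = [1, 1, 0, 0]
```

Here `masked_mse([1.0, 2.0, 9.0, 9.0], ref, mask)` is exactly zero, however the masked pixels are filled.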
Running Locally: If you use Nix, you can enter a shell with all dependencies via `nix develop`.