Subham Sekhar Sahoo*1, Zhihan Yang*2, Yash Akhauri†1, Johnna Liu†1, Deepansha Singh†1, Zhoujun Cheng†3, Zhengzhong Liu3, Eric Xing3, John Thickstun2, Arash Vahdat4
1Cornell Tech 2Cornell University 3MBZUAI 4NVIDIA
*Joint first authors †Joint second authors
Pre-print 2025
We propose Esoteric Language Models (Eso-LMs), a new framework for language modeling that fuses the autoregressive (AR) and masked diffusion model (MDM) paradigms and outperforms the previous hybrid approach, BD3-LMs. Our model uses a revised attention mechanism to support both paradigms and is trained with a hybrid loss, a combination of the AR and MDM objectives, that lets it interpolate smoothly between the two paradigms in terms of perplexity and sample quality. Further, ours is the first approach to enable KV caching for MDMs while preserving parallel generation, achieving up to 65× faster inference than standard MDMs and 4× faster inference than KV-cached semi-autoregressive baselines.
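At a high level (this is a simplified schematic, not the exact objective derived in the paper), the hybrid loss splits each training sequence into two random subsets of positions and scores one subset with the MDM objective and the other with the AR objective:

$$\mathcal{L}_{\text{hybrid}}(x) \;=\; \mathcal{L}_{\text{MDM}}\!\left(x_{\mathcal{M}}\right) \;+\; \mathcal{L}_{\text{AR}}\!\left(x_{\setminus \mathcal{M}}\right),$$

where $\mathcal{M}$ is a randomly sampled subset of positions whose expected size is governed by the interpolation hyperparameter (exposed as alpha_0 in the scripts below); sweeping it moves the model between the pure-MDM and pure-AR regimes.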
In this repository, we release both variants of Eso-LMs: Eso-LM (A) and Eso-LM (B).
main.py: Routines for training, evaluation, and generating samples
trainer_base.py: Base classes for AR and all kinds of discrete diffusion in algo.py
algo.py: Classes for AR, MDLM, and EsoLM
dataloader.py: Dataloaders
utils.py: LR scheduler, logging, fsspec handling, etc.
models/: Denoising network architectures; supports DiT, EsoLM-DiT, and the AR transformer
configs/: Config files for algorithms, datasets, denoising networks, noise schedules, and LR schedules
scripts/: Shell scripts for training, evaluation, and generating samples
To get started, create a conda environment containing the required dependencies.
conda create -n esolm python=3.9
conda activate esolm
pip install -r requirements.txt
Create the following directories to store saved models and slurm logs:
mkdir outputs
mkdir watch_folder
Run training as a batch job:
sbatch scripts/esolm/train_owt_esolmb_alpha0_1.sh
Modify DATA_DIR and CHECKPOINT_DIR accordingly within the bash script.
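Inside that script, DATA_DIR and CHECKPOINT_DIR are expected to point at local directories; a minimal sketch of the edit (the paths below are placeholders, not defaults shipped with the repository):

DATA_DIR=/path/to/openwebtext_cache      # placeholder: where the OpenWebText data is stored/cached
CHECKPOINT_DIR=/path/to/checkpoints      # placeholder: where training checkpoints are written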
Logging is done with wandb. Set the entity and project fields in configs/config.yaml to your own.
Download our Eso-LM (B) checkpoints trained on OpenWebText from this Google Drive folder.
Run evaluation as a batch job:
sbatch scripts/esolm/eval_owt_esolmb.sh \
--alpha_0 1 \
--batch_split 1 \
--ckpt_path folder/esolmb-alpha0-1-250k.ckpt
By default, this bash script occupies 4 GPUs.
The values of alpha_0 and batch_split used for evaluation must be the same as the ones used for training.
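For instance, a run trained with a different alpha_0 would be evaluated with that same value; the command below is a hypothetical illustration (the alpha_0 value and checkpoint filename are placeholders, not files we release):

# hypothetical: evaluating a run trained with alpha_0 = 0.25 and batch_split = 1
sbatch scripts/esolm/eval_owt_esolmb.sh \
  --alpha_0 0.25 \
  --batch_split 1 \
  --ckpt_path folder/esolmb-alpha0-0.25-250k.ckpt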
Download our Eso-LM (B) checkpoints trained on OpenWebText from this Google Drive folder.
Run sampling as a batch job (generate 8 samples):
sbatch scripts/esolm/gen_ppl_owt_esolmb.sh \
--alpha_0 1 \
--T 1024 \
--batch_size 8 \
--num_batches 1 \
--ckpt_path folder/esolmb-alpha0-1-250k.ckpt
By default, this bash script occupies a single GPU and uses a fixed seed.
The value of alpha_0 used for sampling can be different from the one used for training.
Adjust batch_size (must fit on your GPU) and num_batches to generate the desired total number of samples.
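For example, to generate 64 samples in total you could run something like the following (batch_size 16 is only an illustration; pick whatever fits in your GPU memory):

sbatch scripts/esolm/gen_ppl_owt_esolmb.sh \
  --alpha_0 1 \
  --T 1024 \
  --batch_size 16 \
  --num_batches 4 \
  --ckpt_path folder/esolmb-alpha0-1-250k.ckpt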
This repository was built off of DUO.
@misc{sahoo2025esotericlanguagemodels,
title={Esoteric Language Models},
author={Subham Sekhar Sahoo and Zhihan Yang and Yash Akhauri and Johnna Liu and Deepansha Singh and Zhoujun Cheng and Zhengzhong Liu and Eric Xing and John Thickstun and Arash Vahdat},
year={2025},
eprint={2506.01928},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.01928},
}