This repository contains a clean, complete PyTorch implementation of the NoProp-DT (discrete-time NoProp) model described in the paper by researchers from the University of Oxford. Paper link: https://arxiv.org/pdf/2503.24322
The goal of this project was to build a from-scratch, paper-accurate implementation of the NoProp learning algorithm: a backpropagation-free method for training deep models using local layer-wise targets and diffusion-style denoising.
This implementation specifically focuses on MNIST to replicate the experimental setup shown in the original paper and validate that the model can reach high accuracy using purely local updates.
Traditional neural networks use backpropagation to update weights layer by layer using gradients. While powerful, it's:
- Biologically implausible
- Not memory efficient
- Hard to parallelize layer-wise
NoPropDT replaces backprop with a stack of denoising blocks. Each block performs a denoising operation that refines a noisy class embedding toward the correct label embedding. Every block is trained with its own local MSE loss, so no gradients are passed through the whole network.
The intuition:
- Start with a noisy guess for class embedding.
- Use CNN + MLP blocks to clean (denoise) it toward the correct label.
- Repeat this T times.
- At the end, classify the denoised embedding with a linear head.
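To make that loop concrete, here is a minimal inference-time sketch. The attribute names (`blocks`, `classifier`, `embedding_dim`) and the simplified update are illustrative assumptions, not this repo's exact API:

```python
import torch

@torch.no_grad()
def predict(model, x, T):
    """Illustrative NoProp-DT inference: denoise a random class embedding T times."""
    z = torch.randn(x.size(0), model.embedding_dim, device=x.device)  # noisy initial guess
    for t in range(T):
        # Each block refines the embedding toward the correct label embedding.
        # (The paper's full update also re-injects noise according to the schedule.)
        z = model.blocks[t](x, z)
    return model.classifier(z).argmax(dim=-1)  # linear head on the denoised embedding
```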
The code is modular and broken into clear parts:
`models/no_prop_dt.py` defines the NoPropDT model:
- `DenoiseBlock`: a CNN + MLP that processes the image together with a noisy class embedding
- `NoPropDT`: a stack of denoising blocks, a classifier head, and a cosine noise schedule
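A rough skeleton of what one such block could look like; layer widths and names here are illustrative, not the repo's exact definitions:

```python
import torch
import torch.nn as nn

class DenoiseBlock(nn.Module):
    """Sketch: a CNN encodes the image, an MLP fuses it with the noisy class embedding."""

    def __init__(self, embedding_dim: int, num_input_channels: int = 1):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(num_input_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),           # -> (B, 32) image features
        )
        self.mlp = nn.Sequential(
            nn.Linear(32 + embedding_dim, 256), nn.ReLU(),
            nn.Linear(256, embedding_dim),                   # -> denoised embedding
        )

    def forward(self, x, z_noisy):
        feats = self.cnn(x)
        return self.mlp(torch.cat([feats, z_noisy], dim=-1))
```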
`trainer/train_nopropdt.py` implements the training loop:
- Trains the model using layer-wise local losses (no backprop through the stack)
- Uses cross-entropy + KL at the final step for stability
- Evaluates accuracy after each epoch
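A hedged sketch of one local training step; `label_embedding`, `alpha`, `blocks`, and the per-block optimizers are assumptions about the structure, following the paper's description rather than this repo's exact code:

```python
import torch
import torch.nn.functional as F

def local_training_step(model, x, y, optimizers, T):
    """Illustrative NoProp-DT update: each block minimizes its own MSE target.

    Gradients never cross block boundaries; every block independently learns to
    recover the clean label embedding from a freshly noised copy of it.
    """
    u_y = model.label_embedding(y).detach()                  # clean class embedding target
    for t in range(T):
        alpha_t = model.alpha[t]                             # cosine noise schedule value
        z_noisy = alpha_t.sqrt() * u_y + (1 - alpha_t).sqrt() * torch.randn_like(u_y)
        pred = model.blocks[t](x, z_noisy)                   # block t denoises toward u_y
        loss = F.mse_loss(pred, u_y)                         # purely local loss
        optimizers[t].zero_grad()
        loss.backward()                                      # gradient stays inside block t
        optimizers[t].step()
```

The cross-entropy + KL terms mentioned above apply to the final step and classifier head; they are omitted from this sketch for brevity.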
`data/mnist_loader.py` handles the data:
- Loads the MNIST dataset with `ToTensor()`
- Returns train/test DataLoaders
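A loader along these lines might look like the following; the function name matches the call shown in the flow diagram below, while the batch size and data path are illustrative defaults:

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def get_mnist_loaders(batch_size=128, data_dir="./data"):
    """Sketch: MNIST train/test DataLoaders using a plain ToTensor() transform."""
    tfm = transforms.ToTensor()
    train = datasets.MNIST(data_dir, train=True, download=True, transform=tfm)
    test = datasets.MNIST(data_dir, train=False, download=True, transform=tfm)
    return (DataLoader(train, batch_size=batch_size, shuffle=True),
            DataLoader(test, batch_size=batch_size, shuffle=False))
```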
`experiments/run_mnist_dt.py` wires the experiment together:
- Loads the data, builds the model, sets hyperparameters, and calls the trainer
- Cleanly separated to make future dataset extensions easy
`main.py` is the entrypoint script that runs the MNIST experiment. The call flow:
```
main.py
└─→ run_mnist_dt.py
    ├─→ get_mnist_loaders()   → data/mnist_loader.py
    ├─→ model = NoPropDT(...) → models/no_prop_dt.py
    └─→ train_nopropdt(...)   → trainer/train_nopropdt.py
```
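Put together, the experiment script is roughly shaped like this; hyperparameter values and keyword names (`T`, `epochs`, `lr`) are illustrative assumptions, so check `run_mnist_dt.py` for the real signatures:

```python
from data.mnist_loader import get_mnist_loaders
from models.no_prop_dt import NoPropDT
from trainer.train_nopropdt import train_nopropdt

def run():
    # Build data, model, and trainer in the order shown in the flow diagram above.
    train_loader, test_loader = get_mnist_loaders(batch_size=128)
    model = NoPropDT(num_input_channels=1, num_classes=10, embedding_dim=64, T=10)
    train_nopropdt(model, train_loader, test_loader, epochs=10, lr=1e-3)

if __name__ == "__main__":
    run()
```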
On MNIST, the model achieves ~99% validation accuracy by epoch 7, matching or exceeding what the paper reports.
On CIFAR-10, it reaches ~76% validation accuracy by epoch 50, outperforming earlier linear-only setups.
Result plots: MNIST (no decoder), MNIST (with nonlinear decoder), CIFAR-10 (no decoder), CIFAR-10 (with nonlinear decoder).
And all of this is done without backpropagating gradients through the full network.
🤝 How to Contribute
Want to experiment with this or try it on your own dataset? Here's how to get started:

- Clone the repo:

  ```bash
  git clone https://github.com/ANKITSANJYAL/NoPropagation.git
  cd NoPropagation
  ```

- Add a custom dataset loader under `data/` (see `mnist_loader.py` or `cifar_loader.py` as a reference)

- Write a training script under `experiments/` (copy `run_mnist_dt.py` or `run_cifar_dt.py`)

- Adjust the model config (e.g. `num_input_channels`, `num_classes`, `embedding_dim`) in the script; a hedged example is sketched just after this list

- Run your experiment:

  ```bash
  PYTHONPATH=. python experiments/your_script.py
  ```
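To make the config step concrete, here is a hypothetical tweak for a 3-channel dataset; the keyword names come from the list above, but the exact constructor signature should be checked in `models/no_prop_dt.py`:

```python
from models.no_prop_dt import NoPropDT

# Hypothetical config for an RGB dataset such as SVHN (values are illustrative, not tuned)
model = NoPropDT(
    num_input_channels=3,   # RGB images instead of grayscale MNIST
    num_classes=10,         # number of target classes in your dataset
    embedding_dim=128,      # size of the class embedding being denoised
)
```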
This codebase is modular by design — you can plug in new datasets, decoders, schedulers, or compare against backprop baselines.
Open a PR if you add a new experiment, optimizer, or decoder variant. Let's keep improving this together.
Some ideas to get started:

- Add support for SVHN and TinyImageNet
- Add backprop baselines for direct comparison
- Integrate WandB tracking for better logs and visualizations
- Try alternative denoising blocks or noise schedules
Thanks for checking out the project — if you're into optimization, bio-inspired learning, or just curious about alternatives to backprop, you’ll enjoy digging into this.
PRs and stars are always welcome :)
This implementation was restructured and extended from a notebook version shared by the community. Special thanks for the inspiration! https://github.com/ashishbamania/Tutorials-On-Artificial-Intelligence/blob/main/Training%20Without%20Backpropagation/NoPropDT_on_MNIST.ipynb