An attempt at replicating Lifelong Neural Developmental Programs (LNDP)[^1] in PyTorch.
Visualization of a rollout in CartPole (preceded by a spontaneous-activity phase):
visualization.mp4
... not quite there yet, performance-wise.[^2]
What I tried to fix it, my suspicions, and how I would debug it further
- tried a bunch of hyperparameter configs and ablations of config values that were unclear from the paper & original implementation (which partly conflict)
- logged a lot of intermediate values (but found nothing too suspicious, except an unusually high number of edges with some settings)
- vibe debugging

As for why it doesn't learn properly, I have some guesses:
- RNG goofed up somewhere
- misinterpretation of hyperparameters from the original implementation & paper
- subtle bugs around masking, timing (esp. add/prune), indexing, initialization, etc.
- goofed up something specific to the cartpole env?
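The RNG guess is cheap to rule out by routing every seed through one helper and checking that repeated runs are bitwise identical. A minimal sketch; `seed_everything` is a hypothetical helper, not the repo's API:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed every RNG source a torch project typically touches.

    Hypothetical helper: the repo may centralize seeding differently.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds CUDA generators if available


seed_everything(0)
a = torch.randn(3)
seed_everything(0)
b = torch.randn(3)
assert torch.equal(a, b)  # identical draws => the seed path is actually hit
```

If two seeded runs still diverge, some code path draws from an unseeded generator (or from a nondeterministic op), which narrows the search considerably.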
How I would debug it further:
- actually think a bit harder about the behavior of the model (visualization, statistics)
- train in a different env from the paper (e.g. with discrete actions)
- but first, I would rewrite it in JAX, since my understanding has come a long way since I wrote this (ecosystem, parallelism, explicit RNG handling, speed, etc.; harder to goof up)
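On the statistics front, a per-generation summary of the connectivity graph would make anomalies like the unusually high edge count mentioned above easy to spot in logs. A sketch; the function name, mask layout, and dict keys are my assumptions, not the repo's API:

```python
import torch


def graph_stats(adj: torch.Tensor) -> dict:
    """Summarize a boolean adjacency mask (adj[i, j] = edge i -> j).

    Hypothetical helper for logging; log these per generation and watch
    for runaway edge counts after add/prune steps.
    """
    n = adj.shape[0]
    a = adj.float()
    edges = int(a.sum())
    out_deg = a.sum(dim=1)
    return {
        "edges": edges,
        "density": edges / (n * n),  # self-loops counted; adjust if masked out
        "max_out_degree": int(out_deg.max()),
        "mean_out_degree": float(out_deg.mean()),
    }


# Tiny 3-node graph with edges 0->1, 0->2, 1->2:
adj = torch.tensor([[0, 1, 1], [0, 0, 1], [0, 0, 0]], dtype=torch.bool)
stats = graph_stats(adj)
assert stats["edges"] == 3 and stats["max_out_degree"] == 2
```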
Mean fitness is around 50 while max fitness reaches 350 on CartPole. That gap seems surprisingly large, and may point to a bug in the optimization rather than in the architecture implementation.
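One way to test that hypothesis is to log a fitness summary each generation: if the mean stays far below the max across generations, good genomes exist but selection/mutation isn't propagating them, which is an optimization-side symptom. A sketch with assumed names and shapes:

```python
import numpy as np


def fitness_report(fitnesses: np.ndarray) -> dict:
    """Per-generation fitness summary (hypothetical helper, assumed shapes).

    A persistently large mean-to-max gap suggests the best individuals
    are found but not exploited by the evolutionary loop.
    """
    best = float(fitnesses.max())
    return {
        "mean": float(fitnesses.mean()),
        "max": best,
        "std": float(fitnesses.std()),
        # fraction of the population within 10% of the best individual
        "frac_near_max": float((fitnesses >= 0.9 * best).mean()),
    }


pop = np.array([50.0, 40.0, 60.0, 350.0])
report = fitness_report(pop)
assert report["max"] == 350.0 and report["frac_near_max"] == 0.25
```

If `frac_near_max` never grows over generations, I'd look at selection pressure and elite handling before touching the architecture code.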
Some lessons learned:
- no premature optimization: first make the code run, then make it pretty/efficient/modular/thoroughly typed
- don't try to be smart: add your own ideas after you have a working baseline
- put your (own) mind to it, or let it be: even the best LLMs [2025] fail hard at vibe-debugging nontrivial ML code (hallucinations, getting lost in dead ends, etc.)
Setup:

```shell
git clone [email protected]:MaxWolf-01/LNDP.git
cd LNDP
uv sync --all-extras
source .venv/bin/activate
```

Run cartpole with default hparams and wandb logging:

```shell
python experiments/cartpole_paper.py --wandb.enabled true
```

Footnotes
[^1]: In case you don't know what it is, here's a TLDR. For the conceptual picture, I recommend reading the paper, as my note mostly focuses on implementation details.
[^2]: Ultimately, I left it here since I had obtained a deeper understanding by getting my hands dirty, and I've since been busy collecting stepping stones towards successors that break with some assumptions of LNDP; getting this to work wouldn't be worth the effort. What's more, I worked on this during conscription, with limited time/attention/patience.