MaxWolf-01/LNDP

An attempt at replicating Lifelong Neural Developmental Programs [1] in PyTorch.

Visualization of a rollout in cartpole (preceded by spontaneous activity phase):

visualization.mp4

... not quite there yet, regarding performance. [2]

What I tried to fix it, my suspicions, and how I would debug it further

What I tried:

  • tried a bunch of hyperparameter configs and ablations of config values which were unclear from the paper & original implementation (partly conflicting)
  • logged a lot of intermediate values (but couldn't find anything too suspicious, except an unusually high number of edges with some settings)
  • vibe debugging

As for why it doesn't learn properly, I have some guesses:

  • rng goofed up somewhere
  • misinterpretation of hyperparameters from original implementation & paper
  • subtle bugs around masking, timing (esp add/prune), indexing, initialization, etc.
  • goofed up something specific to the cartpole env?
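
A cheap way to probe the first guess (RNG handling) is a determinism check. The sketch below is not code from this repo: `rollout` is a hypothetical stand-in for one episode, and the two helper names are made up for illustration. It checks that the same seed reproduces the same result, and that different seeds actually produce different results (which catches accidentally reusing one seed for every individual).

```python
import random


def rollout(seed: int, steps: int = 100) -> float:
    """Hypothetical stand-in for one episode; NOT the repo's rollout code."""
    rng = random.Random(seed)  # explicit local RNG instead of global state
    fitness = 0.0
    for _ in range(steps):
        fitness += rng.random()
    return fitness


def is_deterministic(fn, seed: int, repeats: int = 3) -> bool:
    """True if fn(seed) returns the same value on every repeat."""
    results = [fn(seed) for _ in range(repeats)]
    return all(r == results[0] for r in results)


def seeds_differ(fn, seeds) -> bool:
    """True if every seed yields a distinct result."""
    results = [fn(s) for s in seeds]
    return len(set(results)) == len(results)


if __name__ == "__main__":
    print("same seed reproduces:", is_deterministic(rollout, seed=0))
    print("different seeds differ:", seeds_differ(rollout, [0, 1, 2]))
```

Swapping the real rollout function in for `rollout` should be enough to see whether hidden global RNG state leaks into the evaluation.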

How I would debug it further:

  • actually think a bit harder about the behavior of the model (visualization, statistics)
  • train in a different env from the paper (e.g. with discrete actions)
  • but first, I would rewrite it in JAX, since my understanding has made leaps since I wrote this (ecosystem, parallelism, RNG handling, speed, etc.; harder to goof up)

Mean fitness 50, max fitness 350 on cartpole. Seems like a big discrepancy, actually? Maybe this points to a bug with the optimization, as opposed to the architecture implementation?
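
One way to look into this gap would be to re-evaluate the best individual on several fresh seeds: if its score collapses towards the population mean, the max was probably a lucky rollout and not a genuinely good solution. This is a minimal sketch under assumptions, not code from this repo: `evaluate`, the `params` dict, and its `skill`/`noise` keys are hypothetical placeholders for the real evaluation function and parameters.

```python
import random
from statistics import mean, pstdev


def evaluate(params: dict, seed: int) -> float:
    """Hypothetical evaluation: a base score plus seed-dependent noise."""
    rng = random.Random(seed)
    return max(0.0, params["skill"] + rng.gauss(0.0, params["noise"]))


def reevaluate(params: dict, seeds) -> dict:
    """Score one individual on several seeds and summarize the spread."""
    scores = [evaluate(params, s) for s in seeds]
    return {
        "mean": mean(scores),
        "std": pstdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


if __name__ == "__main__":
    best = {"skill": 50.0, "noise": 100.0}  # made-up numbers for illustration
    print(reevaluate(best, seeds=range(20)))
```

A large spread between `min` and `max` for a single individual would point at evaluation noise, while a stable high score would point back at the optimization loop failing to propagate good individuals.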

Some lessons learned:

  • no premature optimization: first make the code run, then make it pretty/efficient/modular/thoroughly typed
  • don't try to be smart: add your own ideas after you have a working baseline
  • put your (own) mind to it, or let it be: even the best LLMs [2025] fail hard at vibe-debugging nontrivial ML code (hallucinations, getting lost in dead ends, etc.)

Setup:

git clone git@github.com:MaxWolf-01/LNDP.git
cd LNDP
uv sync --all-extras
source .venv/bin/activate

Run cartpole with default hparams and wandb logging:

python experiments/cartpole_paper.py --wandb.enabled true

Footnotes

  1. In case you don't know what it is, here's a TLDR. For the conceptual picture, I recommend reading the paper, as my note mostly focuses on implementation details.

  2. Ultimately, I left it here since I had obtained a deeper understanding by getting my hands dirty, and I've since been busy collecting stepping stones towards successors which break with some assumptions of LNDP; getting this to work wouldn't be worth the effort. What's more, I worked on this during conscription - limited time/attention/patience.
