Was able to get things nominally up and running based on the following:
To install: clone the repo, then cd cortex/ec3 (ec = explore-compress, the intellectual lineage of DreamCoder, which was ec2 in their repo).
./install-deps.sh
This should build the ocaml executable. Run it via
./run.sh -b 512 -g -p
-b : batch size (change based on your gpu memory)
-g : (optional) debug logging
-p : parallel. Defaults to assuming there are ~16 cores; I should make that a parameter. Turn it off when debugging.

Training: in a separate terminal,
cd cortex/ec3
python ec33.py -b 512
This will start training. Batch size needs to be the same.

Dreaming: once it writes out a model, you can start dreaming in yet another terminal:
python ec33.py -b 512 -d
where -d : dreaming
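Since the trainer and the dreamer only communicate through the model file written to disk, the dreaming process has to wait until a checkpoint exists before it can load one. A minimal sketch of that handoff, assuming a file-based checkpoint (the path, the poll interval, and the `wait_for_checkpoint` helper are all hypothetical, not names from this repo):

```python
import os
import time

def wait_for_checkpoint(path, poll_s=2.0, timeout_s=None):
    """Block until a file exists at `path`, polling every `poll_s` seconds.

    Raises TimeoutError if `timeout_s` elapses first (None = wait forever).
    """
    start = time.monotonic()
    while not os.path.exists(path):
        if timeout_s is not None and time.monotonic() - start > timeout_s:
            raise TimeoutError(f"no checkpoint appeared at {path}")
        time.sleep(poll_s)
    return path
```

The dreamer side would then call something like `wait_for_checkpoint("model.pt")` before loading the model; polling the file's mtime instead of bare existence would additionally let it pick up refreshed checkpoints as training progresses.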
You can monitor training progress in yet another terminal:
python plot_losslog.py -b 512
(Window output: assumes you're running locally.)

At present, the dreams don't directly feed back into the training; that's what I'm working on now. But this is enough for you to poke around!
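For the part that isn't wired up yet, one common shape for feeding dreams back into training is a bounded replay buffer that the dreaming process appends to and the trainer samples from when building batches. A rough sketch of that idea only (the `DreamBuffer` name, capacity, and mixing fraction are all hypothetical, not anything from ec33.py):

```python
import random
from collections import deque

class DreamBuffer:
    """Hypothetical buffer of dreamed samples; oldest entries are evicted."""

    def __init__(self, capacity=10000):
        # deque with maxlen drops the oldest sample once capacity is hit
        self.dreams = deque(maxlen=capacity)

    def add(self, sample):
        self.dreams.append(sample)

    def mix_batch(self, real_batch, dream_frac=0.25, rng=random):
        """Return a batch where up to `dream_frac` of the real samples
        are replaced by randomly chosen dreamed samples."""
        if not self.dreams:
            return list(real_batch)
        n_dream = min(int(len(real_batch) * dream_frac), len(self.dreams))
        kept = list(real_batch)[: len(real_batch) - n_dream]
        return kept + rng.sample(list(self.dreams), n_dream)
```

With something like this, the trainer's batch-construction step could call `mix_batch` on each real batch, keeping the two processes decoupled: the dreamer only ever appends, the trainer only ever samples.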
Probably going to have some high-level questions about how data is getting passed around between the processes here, but I want to poke at it a little first...
Two quick ones to help orient me:
- What kind of hardware setup have you been using to train on this so far? I see a comment in `run.sh` that reads `# use the first 4090 (Second one for python)`. Does this imply two GPUs, with one running the ocaml stuff and the other doing pytorch?
- Anything to think about in terms of setting up the python environment? I'm working from an image with Ubuntu 22.04 + CUDA 12.0 (had to go to datacrunch to get a GPU instance, as AWS is being weirdly stingy with my personal account)... I was able to get `ec33.py` running just by doing a naked pip install of `torch` and `matplotlib`, though that's not best practice, obviously. Mainly asking because ocaml is a black box to me for now, and I don't really understand what (if any) dependencies might be getting shared between it and a pytorch installation.
- Once I get my bearings a bit, I'd be down to maybe try to dockerize the setup procedure here if you think that makes sense.