Fix/priority search #37
base: experimental
Conversation
Remove mailing list
…into fix/priority_search
…nTrace into fix/priority_search
…emory Add compress_candidate_memory method to PrioritySearch
…o recently added compress_candidate_memory.
…nTrace into fix/priority_search_pickling
add a saving method to pre-save source code
- Make the copied modules' parameter nodes the same as the original ones, so that the optimizer's memory works.
- Add a flag to allow using the same optimizer instance across searches.
- Remove commented code.
…nd multi algo BENCH
added GEPA in examples/priority_search_on_convex_fn.py
A few files that need to be removed (including one that I accidentally committed/pushed). No immediate implementation problem spotted.
Can we rename this file to gepa_on_convex_fn.py? @doxav
Why is save/load removed here? Is it because this solution doesn't work and needs more testing, or a more customized implementation?
The base Optimizer class now implements __getstate__ and __setstate__, which skip parameters, so the optimizers can be pickled directly. There are save and load methods implemented there. What's done manually in the original code is basically the same thing.
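As a rough illustration of that approach (class and attribute names here are my assumptions for the sketch, not the actual Trace code), a base class can drop its live parameter nodes in __getstate__ so the remaining optimizer state pickles cleanly:

```python
import pickle


class Optimizer:
    """Sketch: skip unpicklable parameter nodes when pickling, so the
    optimizer object itself can be saved and restored directly.
    (Illustrative only; not the real Trace Optimizer API.)"""

    def __init__(self, parameters):
        self.parameters = parameters  # live parameter nodes (not pickled)
        self.memory = []              # picklable optimizer state

    def __getstate__(self):
        state = self.__dict__.copy()
        state["parameters"] = None    # drop the parameter nodes
        return state

    def __setstate__(self, state):
        # Caller is responsible for re-attaching parameter nodes after load.
        self.__dict__.update(state)

    def save(self, path):
        with open(path, "wb") as f:
            pickle.dump(self, f)

    @classmethod
    def load(cls, path):
        with open(path, "rb") as f:
            return pickle.load(f)
```

With this in place, save/load reduce to a plain pickle.dump/pickle.load of the optimizer object, which matches what the removed manual code was doing by hand.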
Is the separation between Model and Module mostly for pickling?
Module is quite high level. It's also the parent of FunModule created by bundle. Model is what the user should use when building a traceable agent.
Should we leave resume for a future feature/implementation?
yes
GEPA should not be part of the Fix/priority search branch -- can it be a different one? @doxav any chance you can remove this from this PR? You can commit the convex_fn_BENCH to that PR too if you want.
I guess I don't know if @chinganc wants them in this PR lol -- I'll leave it to him to decide :)
Let's do it in a different PR. This PR is already quite large. Let's follow the principle of keeping each PR targeted.
@doxav sorry for the trouble.
Maybe another, simpler resolution is to move GEPA under features in this PR? That way, the intention that the code is not fully tested is clear.
```python
# Reset the index for the next epoch
self._i = 0
self.n_epochs += 1
```
```python
"""Get the next batch of data, always of batch_size. If the dataset is smaller or at the end, the batch will include data from the next epoch after shuffling."""
```
Wait, is this the logic we want? This will blur the step vs. epoch distinction. I think a lot of trainers do this... so I'm OK if this is the route we want to go with... but just curious if there's any special rationale for it.
This is to guarantee that the sampled batch is always of the same size.
In many applications of generative optimization, I found that the training dataset is much smaller than what we typically see in DL, so we see the boundary effects more often. A special degenerate case is single-node optimization, where the original sampler won't honor the batch size at all (even if you specify batch_size>0, the optimizer will only see one feedback as opposed to batch_size feedbacks).
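A minimal sketch of a sampler with that guarantee (illustrative only; the attribute names `_i` and `n_epochs` follow the diff snippet, everything else is assumed). When the index runs past the end of the data, it resets, bumps the epoch counter, reshuffles, and keeps filling the batch:

```python
import random


class BatchSampler:
    """Sketch: next_batch always returns exactly batch_size items; when
    the dataset runs out mid-batch, it reshuffles and continues into the
    next epoch. (Illustrative, not the PR's actual sampler.)"""

    def __init__(self, data, batch_size, seed=0):
        self.data = list(data)
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.rng.shuffle(self.data)
        self._i = 0
        self.n_epochs = 0

    def next_batch(self):
        batch = []
        while len(batch) < self.batch_size:
            if self._i >= len(self.data):
                # Reset the index for the next epoch
                self._i = 0
                self.n_epochs += 1
                self.rng.shuffle(self.data)
            batch.append(self.data[self._i])
            self._i += 1
        return batch
```

In the degenerate single-node case (a dataset of one example with batch_size=3), this yields the same example three times, so the optimizer still sees batch_size feedbacks rather than one.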
I see -- that makes sense!
I have the same concern as before -- GEPA should be its own PR (even for future references/code checks)
This PR finishes the PrioritySearch.
It's been tested on the convex optimization and some prompt optimization problems.
It supports
Changes to core in this PR:
- @trace.model can be pickled.
NOTE: Resuming experiments will be done in future PRs.