feat: openai wandb sync #64
    entity=args.entity,
    force=args.force,
)
print(resp)
Why is a print statement required after the openai logger sync is called?
It will show "wandb log completed successfully" to confirm the command ran successfully.
It was just to follow the pattern of the other commands, but it could be done directly in the logger if you prefer.
Thanks - makes sense as is!
Thanks for adding this! LGTM, besides a few nitty comments.
openai/cli.py
Outdated
sub.add_argument("-i", "--id", help="The id of the fine-tune job (optional)")
sub.add_argument(
    "-n",
    "--n_jobs",
We generally refer to these in our documentation / everywhere as "fine-tunes" instead of "fine-tune jobs". Can you remove references to jobs everywhere in this PR?
openai/logger.py
Outdated
@@ -0,0 +1,277 @@
try:
Should we name this module and class wandb_logger / WandbLogger respectively, to make them more distinct from more generic logging libraries?
openai/logger.py
Outdated
]

if not show_warnings and not any(fine_tune_logged):
    print("No new successful fine-tune were found")
fine-tunes
openai/logger.py
Outdated
fine_tunes = fine_tunes["data"][-n_jobs if n_jobs is not None else None :]

# log starting from oldest fine_tune
show_warnings = False if id is None and n_jobs is None else True
Nit: Either [1] add a comment to briefly explain the show_warnings logic here and in L81 ("Show individual warnings if the user specifies a fine-tune or a specific number of fine-tunes, otherwise only warn if there are no new successful fine-tunes to sync."), or [2] rename this to show_individual_warnings. I think it'll make the warning logic slightly easier to grok quickly.
Good idea!
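For readers following this thread, here is a minimal sketch of the warning logic being discussed, written with the suggested show_individual_warnings name. It is not the code from this PR: the function names, the fine_tune_id and n_fine_tunes parameters, and the return-a-message design are assumptions made for illustration. Only the slice expression and the two conditions are taken from the diff above.

```python
from typing import List, Optional


def select_latest(fine_tunes: List[dict], n_fine_tunes: Optional[int]) -> List[dict]:
    # Same slice as in the diff: keep the last n entries, or all of them when n is None.
    return fine_tunes[-n_fine_tunes if n_fine_tunes is not None else None:]


def show_individual_warnings(fine_tune_id: Optional[str], n_fine_tunes: Optional[int]) -> bool:
    # Warn per fine-tune only when the user asked for a specific fine-tune
    # or for a specific number of fine-tunes.
    return fine_tune_id is not None or n_fine_tunes is not None


def summary_message(show_individual: bool, fine_tune_logged: List[bool]) -> Optional[str]:
    # Otherwise, emit a single message only if nothing new was synced.
    # (Hypothetical helper: the diff prints the message directly.)
    if not show_individual and not any(fine_tune_logged):
        return "No new successful fine-tunes were found"
    return None


if __name__ == "__main__":
    flag = show_individual_warnings(fine_tune_id=None, n_fine_tunes=None)
    print(flag, summary_message(flag, [False, False]))
```

The boolean expression is equivalent to the original `False if id is None and n_jobs is None else True`; it just states the positive condition directly.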
openai/logger.py
Outdated

# start a wandb run
wandb.init(
    job_type="finetune",
Nit: Should this be fine_tune or fine-tune to be consistent with our API? (Does finetune mean something special within your system?)
Git was able to solve the conflicts without my help!
* feat: log fine_tune with wandb
* feat: ensure we are logged in
* feat: cli wandb namespace
* feat: add fine_tuned_model to summary
* feat: log training & validation files
* feat: re-log if was not successful or force
* doc: add docstring
* feat: set wandb api only when needed
* fix: train/validation files are inputs
* feat: rename artifact type
* feat: improve config logging
* feat: log all jobs by default
* feat: log job details
* feat: log -> sync
* feat: cli wandb log -> sync
* fix: validation_files not always present
* feat: format created_at + style
* feat: log number of training/validation samples
* feat(wandb): avoid download if file already synced
* feat(wandb): add number of items to metadata
* fix(wandb): allow force sync
* feat(wandb): job -> fine-tune
* refactor(wandb): use show_individual_warnings
* feat(wandb): Logger -> WandbLogger
* feat(wandb): retrive number of items from artifact
* doc(wandb): add link to documentation
Usage - CLI
Usage - Python
Reference