
feat: openai wandb sync #64


Merged · 28 commits · Feb 1, 2022

Conversation

borisdayma (Contributor)

Usage - CLI

$ openai wandb sync --help
usage: openai wandb sync [-h] [-i ID] [-n N_JOBS] [--project PROJECT] [--entity ENTITY] [--force]

optional arguments:
  -h, --help            show this help message and exit
  -i ID, --id ID        The id of the fine-tune job
  -n N_JOBS, --n_jobs N_JOBS
                        Number of most recent fine-tune jobs to log when an id is not provided
  --project PROJECT     Name of the project where you're sending runs. By default, it is "GPT-3".
  --entity ENTITY       Username or team name where you're sending runs. By default, your default entity is used, which is usually your username.
  --force               Forces logging and overwrites an existing wandb run of the same fine-tune job.

Usage - Python

from openai.logger import Logger

Logger.sync(
    id=None,
    n_jobs=10,
    project='GPT-3',
    entity=None,
    force=False,
    **kwargs_wandb_init
)
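The trailing `**kwargs_wandb_init` are forwarded to `wandb.init`. A minimal, self-contained sketch of that forwarding pattern, using a stub in place of the real `wandb.init` (the stub and its return value are illustrative, not part of the library):

```python
def fake_wandb_init(**kwargs):
    # Stand-in for wandb.init: just record the keyword arguments it received.
    return kwargs

def sync(id=None, n_jobs=10, project="GPT-3", entity=None, force=False, **kwargs_wandb_init):
    # Arguments not consumed by sync() itself are passed straight through
    # to the wandb.init call that starts the run.
    return fake_wandb_init(project=project, entity=entity, **kwargs_wandb_init)

cfg = sync(tags=["gpt-3"], notes="demo")
print(cfg)  # {'project': 'GPT-3', 'entity': None, 'tags': ['gpt-3'], 'notes': 'demo'}
```

This is why extra options such as `tags` or `notes` can be supplied to `sync` without the method declaring them explicitly.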

Reference

entity=args.entity,
force=args.force,
)
print(resp)
Collaborator


Why is a print statement required after openai logger sync is called?

Contributor Author


It prints `wandb log completed successfully` to confirm the command ran successfully.
It just follows the pattern of the other commands, but the message could be printed directly in the logger if you prefer.

Collaborator


Thanks - makes sense as is!

@rachellim (Collaborator) left a comment


Thanks for adding this! LGTM, besides a few nitty comments

openai/cli.py Outdated
sub.add_argument("-i", "--id", help="The id of the fine-tune job (optional)")
sub.add_argument(
"-n",
"--n_jobs",
Collaborator


We generally refer to these in our documentation / everywhere as "fine-tunes" instead of "fine-tune jobs". Can you remove references to jobs everywhere in this PR?

openai/logger.py Outdated
@@ -0,0 +1,277 @@
try:
Collaborator


Should we name this module and class wandb_logger / WandbLogger respectively, to make it more distinct from more generic logging libraries?

openai/logger.py Outdated
]

if not show_warnings and not any(fine_tune_logged):
print("No new successful fine-tune were found")
Collaborator


fine-tunes

openai/logger.py Outdated
fine_tunes = fine_tunes["data"][-n_jobs if n_jobs is not None else None :]

# log starting from oldest fine_tune
show_warnings = False if id is None and n_jobs is None else True
Collaborator


Nit: Either [1] add a comment to briefly explain the show_warnings logic here and in L81 ("Show individual warnings if the user specifies a fine-tune or a specific number of fine-tunes, otherwise only warn if there are no new successful fine-tunes to sync."), or [2] rename this to show_individual_warnings. I think it'll make the warning logic slightly easier to grok quickly.

Contributor Author


Good idea!
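The slice in the diff above selects the `n_jobs` most recent fine-tunes, or all of them when `n_jobs` is `None` (since `data[None:]` is the whole list). A standalone sketch of that behavior, with illustrative job ids:

```python
def most_recent(data, n_jobs=None):
    # data is ordered oldest-first; keep the last n_jobs entries,
    # or everything when n_jobs is None.
    return data[-n_jobs if n_jobs is not None else None:]

jobs = ["ft-a", "ft-b", "ft-c", "ft-d"]
print(most_recent(jobs, 2))  # ['ft-c', 'ft-d']
print(most_recent(jobs))     # ['ft-a', 'ft-b', 'ft-c', 'ft-d']
```

Because the result stays oldest-first, the caller can iterate it directly and log runs in chronological order.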

openai/logger.py Outdated

# start a wandb run
wandb.init(
job_type="finetune",
Collaborator


Nit: Should this be fine_tune or fine-tune to be consistent with our API? (Does finetune mean something special within your system?)

@borisdayma borisdayma requested a review from rachellim January 20, 2022 01:45
@borisdayma (Contributor Author)

Git was able to solve the conflicts without my help!

@christinakim christinakim merged commit 62b51ca into openai:main Feb 1, 2022
@stainless-bot stainless-bot mentioned this pull request Nov 6, 2023
baseprime pushed a commit to breezerfp/breeze-openai that referenced this pull request Mar 20, 2024
* Add support for search_indices (openai#64)

* Add support for search_indices

* Updated with Schnurr's comments

* Add version to search (openai#65)

* Make search query required (openai#67)
cgayapr pushed a commit to cgayapr/openai-python that referenced this pull request Dec 14, 2024
* feat: log fine_tune with wandb

* feat: ensure we are logged in

* feat: cli wandb namespace

* feat: add fine_tuned_model to summary

* feat: log training & validation files

* feat: re-log if was not successful or force

* doc: add docstring

* feat: set wandb api only when needed

* fix: train/validation files are inputs

* feat: rename artifact type

* feat: improve config logging

* feat: log all jobs by default

* feat: log job details

* feat: log -> sync

* feat: cli wandb log -> sync

* fix: validation_files not always present

* feat: format created_at + style

* feat: log number of training/validation samples

* feat(wandb): avoid download if file already synced

* feat(wandb): add number of items to metadata

* fix(wandb): allow force sync

* feat(wandb): job -> fine-tune

* refactor(wandb): use show_individual_warnings

* feat(wandb): Logger -> WandbLogger

* feat(wandb): retrive number of items from artifact

* doc(wandb): add link to documentation