
feat: openai wandb sync #64


Merged · 28 commits · Feb 1, 2022

Conversation

borisdayma (Contributor)

Usage - CLI

$ openai wandb sync --help
usage: openai wandb sync [-h] [-i ID] [-n N_JOBS] [--project PROJECT] [--entity ENTITY] [--force]

optional arguments:
  -h, --help            show this help message and exit
  -i ID, --id ID        The id of the fine-tune job
  -n N_JOBS, --n_jobs N_JOBS
                        Number of most recent fine-tune jobs to log when an id is not provided
  --project PROJECT     Name of the project where you're sending runs. By default, it is "GPT-3".
  --entity ENTITY       Username or team name where you're sending runs. By default, your default entity is used, which is usually your username.
  --force               Forces logging and overwrites an existing wandb run of the same fine-tune job.

Usage - Python

from openai.logger import Logger

Logger.sync(
    id=None,
    n_jobs=10,
    project='GPT-3',
    entity=None,
    force=False,
    **kwargs_wandb_init
)
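The trailing `**kwargs_wandb_init` are forwarded to `wandb.init`. A minimal, self-contained sketch of that forwarding pattern, using a stub in place of the real `wandb.init` (the stub and its return value are illustrative, not part of the library):

```python
def fake_wandb_init(**kwargs):
    # Stand-in for wandb.init: just record the keyword arguments it received.
    return kwargs

def sync(id=None, n_jobs=10, project="GPT-3", entity=None, force=False, **kwargs_wandb_init):
    # Arguments not consumed by sync() itself are passed straight through
    # to the wandb.init call that starts the run.
    return fake_wandb_init(project=project, entity=entity, **kwargs_wandb_init)

cfg = sync(tags=["gpt-3"], notes="demo")
print(cfg)  # {'project': 'GPT-3', 'entity': None, 'tags': ['gpt-3'], 'notes': 'demo'}
```

This is why extra options such as `tags` or `notes` can be supplied to `sync` without the method declaring them explicitly.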

Reference

entity=args.entity,
force=args.force,
)
print(resp)
Collaborator


Why is a print statement required after openai logger sync is called?

Contributor Author


It prints `wandb log completed successfully` to confirm the command ran successfully.
It just follows the pattern of the other commands, but the message could be printed directly in the logger if you prefer.

Collaborator


Thanks - makes sense as is!

@rachellim (Collaborator) left a comment


Thanks for adding this! LGTM, besides a few nitty comments

openai/cli.py Outdated
sub.add_argument("-i", "--id", help="The id of the fine-tune job (optional)")
sub.add_argument(
"-n",
"--n_jobs",
Collaborator


We generally refer to these in our documentation / everywhere as "fine-tunes" instead of "fine-tune jobs". Can you remove references to jobs everywhere in this PR?

openai/logger.py Outdated
@@ -0,0 +1,277 @@
try:
Collaborator


Should we name this module and class wandb_logger / WandbLogger respectively, to make it more distinct from more generic logging libraries?

openai/logger.py Outdated
]

if not show_warnings and not any(fine_tune_logged):
print("No new successful fine-tune were found")
Collaborator


fine-tunes

openai/logger.py Outdated
fine_tunes = fine_tunes["data"][-n_jobs if n_jobs is not None else None :]

# log starting from oldest fine_tune
show_warnings = False if id is None and n_jobs is None else True
Collaborator


Nit: Either [1] add a comment to briefly explain the show_warnings logic here and in L81 ("Show individual warnings if the user specifies a fine-tune or a specific number of fine-tunes, otherwise only warn if there are no new successful fine-tunes to sync."), or [2] rename this to show_individual_warnings. I think it'll make the warning logic slightly easier to grok quickly.

Contributor Author


Good idea!
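The slice in the diff above selects the `n_jobs` most recent fine-tunes, or all of them when `n_jobs` is `None` (since `data[None:]` is the whole list). A standalone sketch of that behavior, with illustrative job ids:

```python
def most_recent(data, n_jobs=None):
    # data is ordered oldest-first; keep the last n_jobs entries,
    # or everything when n_jobs is None.
    return data[-n_jobs if n_jobs is not None else None:]

jobs = ["ft-a", "ft-b", "ft-c", "ft-d"]
print(most_recent(jobs, 2))  # ['ft-c', 'ft-d']
print(most_recent(jobs))     # ['ft-a', 'ft-b', 'ft-c', 'ft-d']
```

Because the result stays oldest-first, the caller can iterate it directly and log runs in chronological order.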

openai/logger.py Outdated

# start a wandb run
wandb.init(
job_type="finetune",
Collaborator


Nit: Should this be fine_tune or fine-tune to be consistent with our API? (Does finetune mean something special within your system?)

@borisdayma borisdayma requested a review from rachellim January 20, 2022 01:45
@borisdayma (Contributor Author)

Git was able to solve the conflicts without my help!

@christinakim christinakim merged commit 62b51ca into openai:main Feb 1, 2022
@stainless-bot stainless-bot mentioned this pull request Nov 6, 2023
baseprime pushed a commit to breezerfp/breeze-openai that referenced this pull request Mar 20, 2024
* Add support for search_indices (openai#64)

* Add support for search_indices

* Updated with Schnurr's comments

* Add version to search (openai#65)

* Make search query required (openai#67)
cgayapr pushed a commit to cgayapr/openai-python that referenced this pull request Dec 14, 2024
* feat: log fine_tune with wandb

* feat: ensure we are logged in

* feat: cli wandb namespace

* feat: add fine_tuned_model to summary

* feat: log training & validation files

* feat: re-log if was not successful or force

* doc: add docstring

* feat: set wandb api only when needed

* fix: train/validation files are inputs

* feat: rename artifact type

* feat: improve config logging

* feat: log all jobs by default

* feat: log job details

* feat: log -> sync

* feat: cli wandb log -> sync

* fix: validation_files not always present

* feat: format created_at + style

* feat: log number of training/validation samples

* feat(wandb): avoid download if file already synced

* feat(wandb): add number of items to metadata

* fix(wandb): allow force sync

* feat(wandb): job -> fine-tune

* refactor(wandb): use show_individual_warnings

* feat(wandb): Logger -> WandbLogger

* feat(wandb): retrive number of items from artifact

* doc(wandb): add link to documentation