Tags: AbuSido82/openai-python
Boris/examples and cli (openai#32)
* Add a codex backtranslation example to improve SQL queries (openai#58)
* Add a codex backtranslation example to improve SQL queries
* Boris update ft example (openai#57)
* update fine-tune example to show the new CLI outputs
* model specification for search (openai#60)
* Catch chunked encoding errors and retry (openai#63)
* Add batch suggestion logic to prepare_data for fine_tunes and custom Q&A answers logic (openai#62)
* Add batch suggestion logic to prepare_data for fine_tunes; add an example of how to create a rudimentary answers endpoint with a custom Q&A model
Co-authored-by: Madeleine Thompson <[email protected]>
Co-authored-by: hallacy <[email protected]>

Updates to prepare_data function (openai#29)
* update documentation links to point to the website
* Fix encoding
* Add rough time estimator based on historical stats
* Fix train_test split naming logic; add quiet mode for running inside scripts
* Add a finetuning step by step example for a classification use case.
* add classification params if train and valid set; add length_validator

minor fixes to tools prepare_data validators (openai#47) (openai#26)
* ensure that only a single whitespace is prepended. Ensure the message regarding the prompt separator is displayed only if a prompt separator exists.
* change pandas contains to not use regex, which can trip if the common_suffix is actually a regex
Co-authored-by: Boris Power <[email protected]>

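The regex pitfall mentioned in the last bullet can be illustrated with a minimal sketch. This is not the repository's code: it uses the standard-library `re` module instead of pandas, and the helper names and sample data are hypothetical. In pandas, the corresponding switch is the `regex=False` argument of `Series.str.contains`.

```python
import re


def contains_as_regex(text, suffix):
    """Treat `suffix` as a regular expression (the behaviour that can trip)."""
    return re.search(suffix, text) is not None


def contains_literal(text, suffix):
    """Treat `suffix` as a plain substring."""
    return suffix in text


# Hypothetical data: only the first completion really contains the "." suffix.
completions = [" yes.", " no"]
suffix = "."

# As a regex, "." matches any character, so both rows appear to match.
regex_hits = [contains_as_regex(c, suffix) for c in completions]

# As a literal, only the row that actually contains "." matches.
literal_hits = [contains_literal(c, suffix) for c in completions]

print(regex_hits)
print(literal_hits)
```

A suffix made of characters such as `+++` would be worse still under regex matching, since it is not a valid pattern at all and raises an error instead of returning a wrong answer.
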
Cli fixes and improvements (openai#25)
* Revamp cli args (openai#45)
* Rachel/follow (openai#46)
* Add fine_tunes.follow. Add better error handling for disconnected streams
* return early
* fix an oops
* lint
* Nicer strings
* ensure end token is not applied to classification (openai#44)
* ensure end token is not applied to classification
* black
Co-authored-by: Boris Power <[email protected]>

bugfix
* ensure that pandas empty values are read as empty string, rather than a float
Co-authored-by: Boris Power <[email protected]>

Lots of CLI changes (openai#22)
* Add CLI option to download files (openai#34)
* Option to check if file has been uploaded in the past before uploading (openai#33)
  The check is done based on filename, file purpose and file size
* Add fine-tuning hparams directly into the fine-tunes CLI (openai#35)
* update fine_tunes cli use_packing argument (openai#38)
* A file verification and remediation tool. It applies the following validations:
  - prints the number of examples, and warns if it's lower than 100
  - ensures prompt and completion columns are present
  - optionally removes any additional columns
  - ensures all completions are non-empty
  - infers which type of fine-tuning the data is most likely in (classification, conditional generation and open-ended generation)
  - optionally removes duplicate rows
  - infers the existence of a common suffix, and if there is none, suggests one for classification and conditional generation
  - optionally prepends a space to each completion, to make tokenization better
  - optionally splits into training and validation set for the classification use case
  - optionally ensures there's an ending string for all completions
  - optionally lowercases completions or prompts if more than a 1/3 of alphanumeric characters are upper case
  It interactively asks the user to accept or reject recommendations. If the user is happy, then it saves the modified output file as a jsonl, which is ready for being used in fine-tuning with the printed command.
* Completion: remove from kwargs before passing to EngineAPI (openai#37)
* Version bump before pushing to external
Co-authored-by: Todor Markov <[email protected]>
Co-authored-by: Boris Power <[email protected]>
Co-authored-by: Dave Cummings <[email protected]>

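A few of the validations listed for the file verification tool can be sketched as follows. This is a rough, non-interactive illustration, not the tool's implementation: the function names `common_suffix` and `remediate` are hypothetical, rows are plain dicts instead of a pandas DataFrame, and two behaviours are assumptions (empty completions are dropped, and the common suffix is inferred from prompts).

```python
import os


def common_suffix(strings):
    """Longest suffix shared by every string ('' if there is none)."""
    reversed_strings = [s[::-1] for s in strings]
    return os.path.commonprefix(reversed_strings)[::-1]


def remediate(rows, min_examples=100):
    """Apply a subset of the listed checks; return (cleaned rows, messages)."""
    messages = []
    if len(rows) < min_examples:
        messages.append(
            f"only {len(rows)} examples; at least {min_examples} recommended"
        )

    cleaned, seen = [], set()
    for row in rows:
        # Both columns must be present.
        if "prompt" not in row or "completion" not in row:
            raise ValueError("each example needs 'prompt' and 'completion'")
        prompt, completion = row["prompt"], row["completion"]
        if not completion:
            continue  # assumption: examples with empty completions are dropped
        if not completion.startswith(" "):
            completion = " " + completion  # prepend exactly one space
        key = (prompt, completion)
        if key in seen:
            continue  # remove duplicate rows
        seen.add(key)
        # Keeping only these two keys removes any additional columns.
        cleaned.append({"prompt": prompt, "completion": completion})

    # Assumption: the suffix is inferred from the prompts.
    suffix = common_suffix([r["prompt"] for r in cleaned]) if cleaned else ""
    if not suffix:
        messages.append("no common prompt suffix found; consider adding one")
    return cleaned, messages


# Hypothetical sample data.
rows = [
    {"prompt": "Is it sunny? ->", "completion": "yes", "id": 1},
    {"prompt": "Is it sunny? ->", "completion": "yes", "id": 2},
    {"prompt": "Is it raining? ->", "completion": " no"},
    {"prompt": "Is it windy? ->", "completion": ""},
]
cleaned, messages = remediate(rows)
print(cleaned)
print(messages)
```

The real tool asks the user to accept or reject each recommendation and writes a jsonl file; this sketch applies every step unconditionally to keep the example short.
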