Adding transaction classification notebooks #26

Merged
merged 8 commits into from
Oct 27, 2022

Conversation

colin-jarvis
Contributor

I've made two examples of transaction classification using GPT-3, one with multiclass classification and one using clustering on an unlabelled dataset.

I've also included the source dataset used in the multiclass classification notebook, plus a set of labelled examples I made based on it.

I also added a .gitignore to the repo.

.gitignore Outdated
@@ -127,3 +127,9 @@ dmypy.json

# Pyre type checker
.pyre/

# helpers
*helpers.py
Collaborator

Could you remove this?

Contributor Author

This has been removed and the file checked out of Git.

"import pandas as pd\n",
"import numpy as np\n",
"\n",
"from helpers import OPENAI_API_KEY\n",
Collaborator

Remove this import; you can replace it with:

import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")
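
One caveat with `os.getenv`: it returns `None` when the variable is unset, so the notebook would only fail later, at the first API call. A minimal sketch of a stricter lookup follows; the helper name `load_api_key` and its parameters are hypothetical and not part of this PR.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or empty.

    `load_api_key` is a hypothetical helper used for illustration only.
    """
    # Allow a plain dict to be passed in so the lookup is easy to test.
    env = os.environ if env is None else env
    key = env.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the notebook")
    return key


# In the notebook this would be used as:
#   openai.api_key = load_api_key()
```

Failing at the top of the notebook keeps a missing key from surfacing as an authentication error several cells later.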

"import json\n",
"\n",
"def check_finetune_classes(train_file,valid_file):\n",
"\n",
Collaborator

Maybe add a very short docstring explaining what this code does.

"outputs": [],
"source": [
"zero_shot_prompt = '''You are a data expert working for the National Library of Scotland. \n",
" You are analysing all transactions over £25,000 in value and classifying them into one of five categories.\n",
Collaborator

You have a bunch of leading whitespace on every line. Each line of the multiline string should start at the beginning of the line (not indented).
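
To make the point concrete, here is a small sketch using a shortened stand-in for the prompt (not the notebook's exact text). Inside a triple-quoted string, indentation on continuation lines becomes part of the string and is sent to the model; `textwrap.dedent` from the standard library is one alternative if the indented layout is preferred in the source.

```python
import textwrap

# Continuation line is indented: the four spaces are part of the string.
indented_prompt = '''You are a data expert.
    Classify each transaction into one of five categories.'''

# Continuation line starts at column 0, as suggested in the review.
flush_left_prompt = '''You are a data expert.
Classify each transaction into one of five categories.'''

# Alternative: keep the source indented and strip the common indent at runtime.
# The backslash after the opening quotes suppresses the initial newline.
dedented_prompt = textwrap.dedent('''\
    You are a data expert.
    Classify each transaction into one of five categories.''')

print(repr(indented_prompt.splitlines()[1][:6]))
print(dedented_prompt == flush_left_prompt)
```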

"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import classification_report, accuracy_score\n",
"\n",
"fs_df = pd.read_csv(embedding_path)\n",
Collaborator

You can add the parameter index_col=0 (to get rid of the "Unnamed: 0" column).
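
A minimal sketch of why the extra column appears, assuming the embeddings CSV was written with the default `DataFrame.to_csv` settings (which save the index as an unnamed first column). The tiny frame below is made-up illustration data, not the notebook's dataset.

```python
import io

import pandas as pd

# Made-up stand-in for the saved embeddings file.
df = pd.DataFrame({"label": ["a", "b"]})
csv_text = df.to_csv()  # the index is written as a first column with no header

# Default read: the saved index comes back as a regular column.
with_unnamed = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use that first column as the index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_unnamed.columns))
print(list(clean.columns))
```

Writing the file with `to_csv(path, index=False)` would avoid the column at the source, if the index carries no information.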

"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"from helpers import OPENAI_API_KEY\n",
Collaborator

Could you replace this with the same as for the previous notebook?

Collaborator

BorisPower left a comment

Could you make those small changes first, then it's good to go! Thanks

BorisPower merged commit fe60d7f into openai:main on Oct 27, 2022
syusuke9999 pushed a commit to syusuke9999/openai-cookbook that referenced this pull request May 12, 2023
Adding transaction classification notebooks
katia-openai pushed a commit that referenced this pull request Feb 29, 2024
Adding transaction classification notebooks

2 participants