Releases: liamdugan/raid
v0.2.0
v0.2.0 Release for RAID
This release adds:
- Support for multiple FPR values & AUROC in `run_evaluation`
- Leaderboard toggle to select one of three metrics: TPR@FPR=5%, TPR@FPR=1%, and AUROC
- Recalculation of all existing detector evaluation scores to include these new metrics
- Logic to remove detector predictions that do not meet a particular FPR threshold
- Warnings on submission if your detector fails to meet the FPR threshold
Non-breaking interface changes
- `evaluate_cli.py` now takes in multiple arguments for `target_fpr`
- The default value for `evaluate_cli.py` is both 0.05 FPR and 0.01 FPR (see the sketch below)
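As a minimal sketch of the Python-API equivalent, following the package's usual `load_data` / `run_detection` pattern and assuming `target_fpr` accepts a list of values as of this release (`my_detector` and the split name are placeholders):

```python
from raid import run_detection, run_evaluation
from raid.utils import load_data

# Placeholder detector: maps a list of texts to AI-likelihood scores
def my_detector(texts):
    return [0.5 for _ in texts]

# Download and load the RAID data ("test" split name assumed)
df = load_data(split="test")

predictions = run_detection(my_detector, df)

# target_fpr is assumed to accept a list of FPR values as of v0.2.0
results = run_evaluation(predictions, df, target_fpr=[0.05, 0.01])
```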
Potentially breaking changes
The output format of `results.json` now has a slightly altered structure. Instead of the `accuracy` field pointing directly to the TPR@FPR=5%, it now points to a dictionary indexed by FPR, containing the true positives, false negatives, and TPR for that particular FPR value. When a detector is unable to achieve a given target FPR, the corresponding field in the accuracy dictionary will be given a `null` value (as seen below).
Old
```json
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "tp": 200,
  "fn": 0,
  "accuracy": 1.0
},
```
New
```json
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "accuracy": {
    "0.05": {
      "tp": 200,
      "fn": 0,
      "accuracy": 1.0
    },
    "0.01": null
  },
  "auroc": 0.9989833333333333
},
```
This is a BREAKING CHANGE for any code built on `evaluate_cli` or `run_evaluation` that directly accesses `results.json`. Please take care to catch these `null` values and index `accuracy` correctly (see the sketch below).
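As a minimal sketch (assuming `results.json` holds a list of records shaped like the example above), code reading the new format might guard against `null` entries like so:

```python
import json

with open("results.json") as f:
    records = json.load(f)

for record in records:
    # "accuracy" is now a dict keyed by FPR string; JSON null loads as None
    for fpr, scores in record["accuracy"].items():
        if scores is None:
            print(f"Detector did not achieve target FPR={fpr}; skipping")
        else:
            print(f"FPR={fpr}: tp={scores['tp']}, fn={scores['fn']}, "
                  f"tpr={scores['accuracy']}")
```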
v0.1.0
Marking this as a new minor release because it's been a while. Please update if you can.
Changes in this release
- Bumped the numpy and pandas dependencies to 1.26.4 and 2.2.2 respectively to match Google Colab's versions, due to installation issues on Python 3.12+
v0.0.9
v0.0.8
v0.0.7
v0.0.6
v0.0.5
v0.0.4
Added new features to `evaluate` to accommodate varied usage of the PyPI scripts:
- Added option to toggle per-domain threshold tuning off/on in `run_evaluation` (see the sketch after this list)
- Added option to toggle whether all predictions for a given dataset split must be present in order to return the score in `run_evaluation`
- Fixed threshold search to never return a threshold that corresponds to an FPR of 0.0
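A minimal sketch of these toggles, reusing `predictions` and `df` from an earlier evaluation setup; the keyword names below are illustrative guesses, not confirmed parameter names:

```python
from raid import run_evaluation

# NOTE: both keyword names are assumptions for illustration; check the
# run_evaluation signature in your installed version for the real names.
result = run_evaluation(
    predictions,
    df,
    per_domain_tuning=False,        # tune one global threshold rather than per-domain
    require_all_predictions=False,  # score even if some predictions are missing
)
```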
v0.0.3
This release includes important bug fixes for both `run_evaluation` and `evaluate_cli.py`, as well as a few feature improvements.
- Fix (`evaluate.py`): Target FPR threshold calculation was being done on adversarially attacked human data
- Fix: Lowered the PyPI package's minimum version for numpy to 1.24.x, as 1.25.x doesn't support Python 3.8
- Feature: Added argument `include_adversarial` to the `load_data` function to allow users to specify whether they want to include adversarial attacks in the downloaded dataframe (see the sketch below)
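For example, a minimal sketch of downloading the dataset without adversarial attacks (the split name is an assumption):

```python
from raid.utils import load_data

# Exclude adversarially attacked generations from the downloaded dataframe
# ("train" split name assumed for illustration)
df = load_data(split="train", include_adversarial=False)
```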