Releases: liamdugan/raid
v0.2.0
v0.2.0 Release for RAID
This release adds:
- Support for multiple FPR values & AUROC in `run_evaluation`
- Leaderboard toggle to select one of three metrics: TPR@FPR=5%, TPR@FPR=1%, and AUROC
- Recalculation of all existing detector evaluation scores to include these new metrics
- Logic to remove detector predictions that do not meet a particular FPR threshold
- Warnings on submission if your detector fails to meet the FPR threshold
Non-breaking interface changes
- `evaluate_cli.py` now takes in multiple arguments for `target_fpr`
- The default value for `evaluate_cli.py` is both 0.05 FPR and 0.01 FPR (see the sketch below)
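As a minimal sketch of the Python-API equivalent, following the package's usual `load_data` / `run_detection` pattern and assuming `target_fpr` accepts a list of values as of this release (`my_detector` and the split name are placeholders):

```python
from raid import run_detection, run_evaluation
from raid.utils import load_data

# Placeholder detector: maps a list of texts to AI-likelihood scores
def my_detector(texts):
    return [0.5 for _ in texts]

# Download and load the RAID data ("test" split name assumed)
df = load_data(split="test")

predictions = run_detection(my_detector, df)

# target_fpr is assumed to accept a list of FPR values as of v0.2.0
results = run_evaluation(predictions, df, target_fpr=[0.05, 0.01])
```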
Potentially breaking changes
The output format of `results.json` now has a slightly altered structure. Instead of the `accuracy` field pointing directly to the TPR@FPR=5%, it now points to a dictionary indexed by FPR, containing the true positives, false negatives, and TPR for that particular FPR value. When a detector is unable to achieve a given target FPR, the corresponding field in the accuracy dictionary will be given a `null` value (as seen below).
Old
```json
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "tp": 200,
  "fn": 0,
  "accuracy": 1.0
},
```
New
```json
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "accuracy": {
    "0.05": {
      "tp": 200,
      "fn": 0,
      "accuracy": 1.0
    },
    "0.01": null
  },
  "auroc": 0.9989833333333333
},
```
This is a BREAKING CHANGE for any code built on `evaluate_cli` or `run_evaluation` that directly accesses `results.json`. Please take care to catch these `null` values and index `accuracy` correctly (see the sketch below).
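As a minimal sketch (assuming `results.json` holds a list of records shaped like the example above), code reading the new format might guard against `null` entries like so:

```python
import json

with open("results.json") as f:
    records = json.load(f)

for record in records:
    # "accuracy" is now a dict keyed by FPR string; JSON null loads as None
    for fpr, scores in record["accuracy"].items():
        if scores is None:
            print(f"Detector did not achieve target FPR={fpr}; skipping")
        else:
            print(f"FPR={fpr}: tp={scores['tp']}, fn={scores['fn']}, "
                  f"tpr={scores['accuracy']}")
```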
v0.1.0
Marking this as a new minor release because it's been a while. Please update if you can.
Changes in this release
- Bumped the numpy and pandas dependencies to 1.26.4 and 2.2.2 respectively to match Google Colab's versions, due to installation issues on Python 3.12+
v0.0.9
v0.0.8
v0.0.7
v0.0.6
v0.0.5
v0.0.4
Added new features to `evaluate` to accommodate varied usage of the PyPI scripts:
- Added option to toggle per-domain threshold tuning off/on in `run_evaluation` (see the sketch after this list)
- Added option to toggle whether all predictions for a given dataset split must be present in order to return the score in `run_evaluation`
- Fixed threshold search to never return a threshold that corresponds to an FPR of 0.0
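A minimal sketch of these toggles, reusing `predictions` and `df` from an earlier evaluation setup; the keyword names below are illustrative guesses, not confirmed parameter names:

```python
from raid import run_evaluation

# NOTE: both keyword names are assumptions for illustration; check the
# run_evaluation signature in your installed version for the real names.
result = run_evaluation(
    predictions,
    df,
    per_domain_tuning=False,        # tune one global threshold rather than per-domain
    require_all_predictions=False,  # score even if some predictions are missing
)
```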
v0.0.3
This release includes important bug fixes for both `run_evaluation` and `evaluate_cli.py`, as well as a few feature improvements.
- Fix (`evaluate.py`): Target FPR threshold calculation was being done on adversarially attacked human data
- Fix: Lowered the PyPI package's minimum version for numpy to 1.24.x, as 1.25.x doesn't support Python 3.8
- Feature: Added argument `include_adversarial` to the `load_data` function to allow users to specify whether they want to include adversarial attacks in the downloaded dataframe (see the sketch below)
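For example, a minimal sketch of downloading the dataset without adversarial attacks (the split name is an assumption):

```python
from raid.utils import load_data

# Exclude adversarially attacked generations from the downloaded dataframe
# ("train" split name assumed for illustration)
df = load_data(split="train", include_adversarial=False)
```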