Releases: liamdugan/raid

v0.2.0

25 Jul 16:48
28e96e8

v0.2.0 Release for RAID

This release adds:

  • Support for multiple FPR values & AUROC in run_evaluation
  • Leaderboard toggle to select one of three metrics: TPR@FPR=5%, TPR@FPR=1%, and AUROC
  • Recalculation of all existing detector evaluation scores to include these new metrics
  • Logic to remove detector predictions that do not meet a particular FPR threshold
  • Warnings on submission if your detector fails to meet the FPR threshold

Non-breaking interface changes

  • evaluate_cli.py now takes multiple arguments for target_fpr.
  • The default for target_fpr in evaluate_cli.py is now both 0.05 and 0.01 (see the sketch below).
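
A rough sketch of the updated evaluation flow. The load_data/run_detection/run_evaluation calls follow the package README; passing a list for target_fpr is an assumption based on the multi-value target_fpr option described above, not a confirmed signature.

from raid import run_detection, run_evaluation
from raid.utils import load_data

# Load the RAID train split and score it with your own detector function
train_df = load_data(split="train")
predictions = run_detection(my_detector, train_df)  # my_detector: your scoring function

# Assumption: run_evaluation mirrors evaluate_cli.py and accepts multiple
# target FPR values; 0.05 and 0.01 are the new defaults
results = run_evaluation(predictions, train_df, target_fpr=[0.05, 0.01])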

Potentially breaking changes

The output format of results.json has a slightly altered structure. Instead of the accuracy field pointing directly to the TPR@FPR=5%, it now points to a dictionary indexed by FPR value, containing the true positives, false negatives, and TPR for that particular FPR.

When detectors are unable to achieve a given target FPR, the resulting field in the accuracy dictionary will be given a null value (as seen below).

Old
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "tp": 200,
  "fn": 0,
  "accuracy": 1.0
},
New
{
  "domain": "abstracts",
  "model": "llama-chat",
  "decoding": "greedy",
  "repetition_penalty": "no",
  "attack": "none",
  "accuracy": {
    "0.05": {
      "tp": 200,
      "fn": 0,
      "accuracy": 1.0
    },
    "0.01": null
  },
  "auroc": 0.9989833333333333
},

This is a BREAKING CHANGE for any code built on evaluate_cli or run_evaluation that directly accesses results.json. Please take care to catch these null values and index accuracy correctly; a defensive parsing sketch follows.
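
A minimal sketch of reading the new format defensively, assuming results.json is a flat list of records like the ones above (the file path and printed fields are illustrative):

import json

with open("results.json") as f:
    records = json.load(f)

for record in records:
    # "accuracy" is now a dict keyed by FPR string; an entry is null
    # (None in Python) when the detector could not achieve that FPR
    for fpr, scores in record["accuracy"].items():
        if scores is None:
            print(f"{record['domain']}/{record['model']}: FPR={fpr} not achieved")
        else:
            print(f"{record['domain']}/{record['model']}: TPR@FPR={fpr} = {scores['accuracy']}")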

v0.1.0

13 Jan 17:24
71958c1

Marking this as a new minor release because it's been a while. Please update if you can.

Changes in this release

  • Bumped the numpy and pandas dependencies to 1.26.4 and 2.2.2 respectively, matching Google Colab's versions, to fix installation issues on Python 3.12+

v0.0.9

18 Sep 20:10
f644ff8

Minor Bug Fixes

  • run_detection now passes arguments of the correct type to the detector function

v0.0.8

12 Sep 17:25
47105b9

Added better error handling for cases where run_evaluation is given predictions that do not contain enough human-written data.

v0.0.7

11 Sep 20:17
6625fc6

  • Fixed a bug where run_evaluation failed if the input dataframe contained a scores column
  • Fixed run_detection mutating passed-in dataframes instead of operating on a local copy (see the sketch below)
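
The fix follows the standard pandas defensive-copy pattern; a generic illustration only (not the actual raid code, and the column names are placeholders):

import pandas as pd

def run_detection_sketch(detector, df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()  # copy first so the caller's dataframe is never mutated
    df["scores"] = [detector(text) for text in df["generation"]]
    return df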

v0.0.6

04 Sep 23:05
1925e93

Fixed a bug where using the fp argument caused the load_data function to error.

v0.0.5

05 Jun 03:11

  • Added an include_all toggle to run_evaluation, letting users include or exclude aggregate scores in the output evaluation JSON (example below)
  • Fixed require_complete so that null scores are removed when it is set to false
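
A hedged usage example; both parameter names come from these release notes, but the exact signature and defaults are assumptions:

from raid import run_evaluation

# predictions and train_df as in the v0.2.0 sketch above; exclude
# aggregate rows and tolerate incomplete predictions
results = run_evaluation(predictions, train_df, include_all=False, require_complete=False)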

v0.0.4

05 Jun 02:36

Added new features to evaluate to accommodate varied uses of the PyPI scripts

  • Added an option to toggle per-domain threshold tuning on/off in run_evaluation
  • Added an option to run_evaluation controlling whether all predictions for a given dataset split must be present before a score is returned
  • Fixed threshold search to never return a threshold corresponding to an FPR of 0.0 (a sketch of the idea follows this list)
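
For intuition, a generic sketch of FPR-targeted threshold search (not the raid implementation): pick the lowest threshold whose empirical FPR on human-written text stays at or below the target; drawing candidates from observed scores keeps the FPR strictly above 0.0.

import numpy as np

def find_threshold(human_scores: np.ndarray, target_fpr: float):
    # Candidate thresholds are the observed human scores, highest first,
    # so any returned threshold has FPR >= 1/n (never exactly 0.0)
    candidates = np.sort(np.unique(human_scores))[::-1]
    best = None
    for t in candidates:
        fpr = np.mean(human_scores >= t)
        if fpr > target_fpr:
            break  # lowering the threshold further only raises the FPR
        best = t
    return best  # None if even the strictest usable threshold exceeds target_fpr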

v0.0.3

04 Jun 20:42

This release includes important bug fixes for both run_evaluation and evaluate_cli.py, as well as a few feature improvements.

  • Fix (evaluate.py): the target FPR threshold calculation was being performed on adversarially attacked human data
  • Fix: Lowered the PyPI package's minimum numpy version to 1.24.x, as 1.25.x doesn't support Python 3.8
  • Feature: Added an include_adversarial argument to the load_data function, letting users specify whether adversarial attacks are included in the downloaded dataframe (example below)
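
A hedged example; the split argument follows the package README and include_adversarial is named above, but treat the exact signature as an assumption:

from raid.utils import load_data

# Download the train split without adversarially attacked generations
train_df = load_data(split="train", include_adversarial=False)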

v0.0.2

03 Jun 19:50

  • Downgraded the numpy version from 1.26.x to 1.25.x to fix errors with scikit-learn during pip installation