Description
By default, Heretic runs 200 trials to ablate a model. When these trials fail to find a configuration that minimizes refusals while keeping KL divergence low, we are often left with 20–30 refusals per 100 prompts, which is suboptimal.
Currently, once the 200 trials are done, Heretic simply presents a menu for selecting a trial. If the results are unsatisfactory, there is no straightforward way to keep improving them without restarting the entire process. I suggest adding an option to extend the ablation run with additional trials. For example, the user could ask to continue with 50 more trials, letting the system explore the parameter space further in search of a better configuration, without repeating the ineffective configurations from earlier trials.
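To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It is not Heretic's actual code: `Study`, `Trial`, `toy_objective`, and the `weight` parameter are all hypothetical names, and plain random sampling stands in for whatever optimizer Heretic really uses. The only point is that a second call appends to the existing trial history instead of starting over.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Trial:
    params: dict
    refusals: int          # refusals per 100 prompts (lower is better)
    kl_divergence: float   # lower is better


@dataclass
class Study:
    trials: list = field(default_factory=list)

    def run(self, objective, n_trials, rng):
        """Append n_trials new trials; earlier trials are kept, not re-run."""
        for _ in range(n_trials):
            # Hypothetical single parameter; the real search space is larger.
            params = {"weight": rng.uniform(0.0, 2.0)}
            refusals, kl = objective(params)
            self.trials.append(Trial(params, refusals, kl))

    def best(self):
        # Prefer fewer refusals, then lower KL divergence.
        return min(self.trials, key=lambda t: (t.refusals, t.kl_divergence))


def toy_objective(params):
    # Stand-in for the real "ablate the model, then evaluate it" step.
    w = params["weight"]
    refusals = round(100 * abs(w - 1.3) / 2)
    kl = 0.1 * w
    return refusals, kl


rng = random.Random(0)
study = Study()
study.run(toy_objective, 200, rng)   # the default 200 trials
best_before = study.best()

study.run(toy_objective, 50, rng)    # "continue with 50 more trials"
best_after = study.best()
```

Because the history is kept, the best result after extending can never be worse than before. A history-aware sampler would additionally use the earlier trials to guide where the new ones are placed, which this random-search sketch does not do.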
This feature would make it possible to improve an ablation incrementally instead of paying for a full rerun each time.
Should we work on this simple feature as a PR?