
FEAT: Add default implementations of GCG extension protocols #1902

Merged
romanlutz merged 2 commits into microsoft:main from romanlutz:romanlutz/romanlutz-gcg-default-implementations
Jun 22, 2026

@romanlutz


This PR adds default concrete implementations for the four GCG extension protocols introduced in #1861.

The new module pyrit/auxiliary_attacks/gcg/default_implementations.py contains four classes that byte-identically reproduce the existing GCG attack code paths:

  • StandardGCGSampling — reproduces GCGPromptManager.sample_control.
  • CrossEntropyLoss(target_weight, control_weight) — reproduces the weighted sum of AttackPrompt.target_loss and AttackPrompt.control_loss applied inside GCGMultiPromptAttack.step.
  • LengthPreservingFilter(filter) — reproduces MultiPromptAttack.get_filtered_cands.
  • LiteralStringInit(suffix) — reproduces the literal-string control_init parameter threaded through the attack constructors.
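As a rough illustration of the "weighted sum" that CrossEntropyLoss is described as reproducing, the sketch below combines per-candidate target and control losses using two weights. It is not the PyRIT implementation: the class name `WeightedLossSketch`, the `combine` method, and the use of plain Python floats instead of tensors are all assumptions made so the example runs without dependencies.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class WeightedLossSketch:
    """Hypothetical stand-in for a loss object configured with two weights."""

    target_weight: float
    control_weight: float

    def combine(
        self,
        target_losses: Sequence[float],
        control_losses: Sequence[float],
    ) -> List[float]:
        """Return target_weight * target + control_weight * control per candidate."""
        if len(target_losses) != len(control_losses):
            raise ValueError("loss sequences must have the same length")
        return [
            self.target_weight * t + self.control_weight * c
            for t, c in zip(target_losses, control_losses)
        ]


# Illustrative values only; the real inputs would be per-candidate loss tensors.
loss = WeightedLossSketch(target_weight=1.0, control_weight=0.5)
combined = loss.combine([2.0, 4.0], [1.0, 3.0])
```

Setting one weight to zero reduces the result to the other term alone, which is the kind of "individual-weight-zero" path a parity test would want to exercise.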

The four classes are exported from the package root via the existing PEP 562 lazy-import machinery in pyrit/auxiliary_attacks/gcg/__init__.py.
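For readers unfamiliar with PEP 562, the following minimal sketch shows the general pattern of a module-level `__getattr__` that imports a name only on first access. It does not reproduce the contents of pyrit/auxiliary_attacks/gcg/__init__.py; the helper `make_lazy_package`, the package name `demo_pkg`, and the export table are hypothetical, and the module is built in memory with `types.ModuleType` only so the example is runnable on its own. In a real `__init__.py` the `__getattr__` function would simply be defined at the top level of the module.

```python
import importlib
import types
from typing import Dict


def make_lazy_package(name: str, lazy_exports: Dict[str, str]) -> types.ModuleType:
    """Build a module whose exports are imported on first attribute access.

    lazy_exports maps an exported attribute name to the module that defines it.
    """
    pkg = types.ModuleType(name)

    def __getattr__(attr: str):
        # Called only when normal attribute lookup on the module fails (PEP 562).
        if attr in lazy_exports:
            source = importlib.import_module(lazy_exports[attr])
            value = getattr(source, attr)
            setattr(pkg, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    pkg.__getattr__ = __getattr__
    pkg.__all__ = sorted(lazy_exports)
    return pkg


# Hypothetical export table; stdlib names stand in for the four default classes.
demo = make_lazy_package("demo_pkg", {"dumps": "json", "OrderedDict": "collections"})
not_loaded_yet = "dumps" not in vars(demo)  # nothing is resolved before first access
encoded = demo.dumps([1, 2])                # triggers the lazy import of json.dumps
```

The benefit of this pattern is that importing the package root stays cheap: heavy dependencies are only pulled in when one of the exported names is actually used.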

What this PR does NOT do

This PR is purely additive. The existing methods (GCGPromptManager.sample_control, AttackPrompt.target_loss / control_loss, MultiPromptAttack.get_filtered_cands) and the literal control_init parameter plumbing all remain the production code path — the defaults are extracted alongside the existing code, not replacing it.

The follow-up PR will wire these defaults into GCGAlgorithmConfig and GCGMultiPromptAttack.step. That is the PR where the existing per-step logic is replaced by dispatch through these protocol objects (with the defaults preserving today's behavior when no custom implementation is configured).

Acceptance gate: golden-input parity

The accompanying tests in tests/unit/auxiliary_attacks/gcg/test_default_implementations.py are the acceptance gate. Each default has at least one parity test that:

  1. Constructs the default and the existing code path with a fixed torch.manual_seed(...) and the same deterministic inputs.
  2. Runs both.
  3. Asserts byte-identical equality (torch.equal(...) for tensors, == for lists/strings).
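The three steps above can be sketched as a small seeded parity check. This is an illustration of the testing pattern only, not code from the test file: `legacy_sample` and `ExtractedSampler` are hypothetical stand-ins for an existing code path and its extracted default, and Python's `random` module replaces torch so the example has no dependencies. In the actual tests, the PR describes torch.manual_seed(...) and torch.equal(...) as playing the roles of `random.seed` and `==` here.

```python
import random
from typing import List


def legacy_sample(tokens: List[int], vocab_size: int, batch_size: int) -> List[List[int]]:
    """Hypothetical 'existing code path': swap one random position per candidate."""
    candidates = []
    for _ in range(batch_size):
        cand = list(tokens)
        pos = random.randrange(len(cand))
        cand[pos] = random.randrange(vocab_size)
        candidates.append(cand)
    return candidates


class ExtractedSampler:
    """Hypothetical extracted default that must match legacy_sample exactly."""

    def __call__(self, tokens: List[int], vocab_size: int, batch_size: int) -> List[List[int]]:
        candidates = []
        for _ in range(batch_size):
            cand = list(tokens)
            pos = random.randrange(len(cand))
            cand[pos] = random.randrange(vocab_size)
            candidates.append(cand)
        return candidates


def check_parity(seed: int = 1234) -> bool:
    """Run both paths from the same seed and inputs, then compare exactly."""
    inputs = dict(tokens=[5, 6, 7, 8], vocab_size=100, batch_size=16)

    random.seed(seed)  # step 1: fixed seed, same deterministic inputs
    expected = legacy_sample(**inputs)

    random.seed(seed)  # reseed so both paths consume identical random draws
    actual = ExtractedSampler()(**inputs)

    return actual == expected  # step 3: exact equality, no tolerance


parity_ok = check_parity()
```

Reseeding before each path matters: without it, the second call would draw from a different point in the random stream and the comparison would fail even if the logic were identical.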

Branch and edge cases are also covered (allow_non_ascii=True/False, filter=True/False, individual-weight-zero loss paths, out-of-vocab clamping, constructor validation).

Verification

  • 23/23 new parity tests pass.
  • The full GCG unit suite passes (152/152).
  • Pre-commit clean.

Adds a new module `pyrit/auxiliary_attacks/gcg/default_implementations.py`
containing four concrete classes that byte-identically reproduce the legacy
GCG attack code paths:

- `StandardGCGSampling` — reproduces `GCGPromptManager.sample_control`
- `CrossEntropyLoss` — reproduces the weighted sum of
  `AttackPrompt.target_loss` and `AttackPrompt.control_loss` applied
  inside `GCGMultiPromptAttack.step`
- `LengthPreservingFilter` — reproduces
  `MultiPromptAttack.get_filtered_cands`
- `LiteralStringInit` — reproduces the literal-string `control_init`
  parameter threaded through the attack constructors

The defaults are exported from the package root via the existing PEP 562
lazy-import machinery in `pyrit/auxiliary_attacks/gcg/__init__.py`. They
are not yet wired into `GCGMultiPromptAttack` — the legacy code paths
remain the production path until a follow-up change.

Acceptance gate is golden-input parity. New file
`tests/unit/auxiliary_attacks/gcg/test_default_implementations.py` contains
one `torch.equal` parity test per default (plus branch and edge-case
coverage), comparing the default's output against the legacy code path
called with the same seeded inputs.

Co-authored-by: Copilot <[email protected]>
@adrian-gavrila adrian-gavrila self-assigned this Jun 18, 2026

@adrian-gavrila adrian-gavrila left a comment


Looks great!

@romanlutz romanlutz enabled auto-merge June 22, 2026 18:26
@romanlutz romanlutz added this pull request to the merge queue Jun 22, 2026
Merged via the queue into microsoft:main with commit a0677a1 Jun 22, 2026
53 checks passed
@romanlutz romanlutz deleted the romanlutz/romanlutz-gcg-default-implementations branch June 22, 2026 18:59