E2E tests: qwen2.5-coder:0.5b non-determinism with receiver method instantiation #197

@cweill

Description

Problem

The E2E tests for calculator_multiply and calculator_divide fail all 10 retry attempts due to non-deterministic receiver method instantiation patterns generated by qwen2.5-coder:0.5b.

Details

Even with temperature=0 and seed=42, the model's output varies between runs, switching between two valid ways of obtaining the receiver:

Pattern 1 (in golden files):

for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
        c := &Calculator{}
        if got := c.Multiply(tt.args.n, tt.args.d); got != tt.want {
            t.Errorf("Calculator.Multiply() = %v, want %v", got, tt.want)
        }
    })
}

Pattern 2 (sometimes generated):

for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
        if got := tt.c.Multiply(tt.args.n, tt.args.d); got != tt.want {
            t.Errorf("Calculator.Multiply() = %v, want %v", got, tt.want)
        }
    })
}

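Both patterns are valid Go, but they produce different output text. Pattern 1 constructs the receiver inside the subtest, while Pattern 2 reads it from a c field on each test case. Because the E2E tests compare the generated output against the golden files by exact string match, a Pattern 2 run fails the test.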
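To tell from a failing run which pattern was generated, one option is to parse the output and look at the receiver expression of the method call. The following is a minimal sketch, not code from the repository: the function name receiverPattern and the labels "local" and "field" are hypothetical, and it assumes the generated output is a complete Go file.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
)

// receiverPattern (hypothetical helper) parses Go source and reports how the
// first call to the named method obtains its receiver:
//
//	"local" for a plain identifier, e.g. c.Multiply(...)     (Pattern 1)
//	"field" for a selector, e.g. tt.c.Multiply(...)          (Pattern 2)
//	""      if no such call is found
func receiverPattern(src, method string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "generated_test.go", src, 0)
	if err != nil {
		return "", err
	}
	pattern := ""
	ast.Inspect(file, func(n ast.Node) bool {
		if pattern != "" {
			return false // already classified, stop descending
		}
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != method {
			return true
		}
		switch sel.X.(type) {
		case *ast.Ident:
			pattern = "local"
		case *ast.SelectorExpr:
			pattern = "field"
		}
		return pattern == ""
	})
	return pattern, nil
}
```

Calling receiverPattern(generated, "Multiply") on a failing run's output would then show whether the mismatch is this receiver difference or something else.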

Current Status

  • Temporarily disabled calculator_multiply and calculator_divide E2E tests in internal/ai/e2e_test.go
  • 9/11 E2E tests passing consistently on first attempt
  • 2/11 tests disabled with TODO comment referencing this issue

Possible Solutions

  1. Add normalization logic: Convert Pattern 2 → Pattern 1 before comparison
  2. Strengthen prompt: Add explicit instruction to prefer Pattern 1
  3. Try different LLM: Test with larger/different models (e.g., qwen2.5-coder:1.5b)
  4. Relax matching: Use AST comparison instead of exact string matching (loses determinism validation)
  5. Accept both patterns: Update golden files to include both valid patterns (complex to implement)
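As an illustration of option 5, the comparison step alone could be sketched as below. This covers only the matching, not how the variant golden files would be produced or stored. The name matchGolden is hypothetical and is not taken from internal/ai/e2e_test.go.

```go
package main

import "bytes"

// matchGolden (hypothetical helper) returns the index of the first golden
// variant that is byte-for-byte identical to got, or -1 if none matches.
// Each variant is still compared exactly, so any output outside the known
// set of variants is reported as a mismatch.
func matchGolden(got []byte, goldens [][]byte) int {
	for i, want := range goldens {
		if bytes.Equal(got, want) {
			return i
		}
	}
	return -1
}
```

A test could fail when the result is -1 and log the returned index otherwise, which would also record how often each pattern shows up across runs.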

References

  • PR #194 (feat: AI-powered test case generation): Add AI-powered test generation
  • Test failure logs: /tmp/full_e2e.txt
  • E2E test code: internal/ai/e2e_test.go:116-258
  • Golden files: testdata/goldens/calculator_{multiply,divide}_ai.go
