fix(py-ext-lightning): log unexpected kernel failures with traceback in step #2169

Merged
imran-siddique merged 1 commit into microsoft:main from aegis-initiative:fix/py-ext-lightning-runner-log-exception
May 12, 2026

Conversation

@finnoybu

Contributor

Summary

GovernedRunner.step caught unexpected (non-policy) kernel exceptions in a broad except Exception as e: branch and logged them with:

except Exception as e:
    result = None
    success = False
    logger.error(f"Execution failed: {e}")

Called without exc_info, logger.error(msg) attaches no traceback to the record: the operator-visible log shows only the exception's str(e), dropping the stack frame information needed to localise where the kernel actually failed.

Change

Switch to logger.exception("Execution failed") so the active traceback travels with the record (exc_info is automatically set).

The policy-violation branch immediately above is unaffected: it deliberately logs only the violation description because that branch catches an expected control-flow exception (PolicyViolationError), not an unexpected failure — those are two different log shapes for two different events.
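
The difference between the two calls is visible on the log record itself. The following is a minimal sketch using only the standard logging module; the logger name, the ListHandler class, and failing_kernel are hypothetical stand-ins for illustration, not code from runner.py.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


# Hypothetical logger name; the real module's logger is not shown in the PR.
logger = logging.getLogger("exc_info_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def failing_kernel():
    """Hypothetical kernel call that fails unexpectedly."""
    raise RuntimeError("kernel boom")


try:
    failing_kernel()
except Exception as e:
    logger.error(f"Execution failed: {e}")  # old call: message only
    logger.exception("Execution failed")    # new call: message plus traceback

old_record, new_record = handler.records

# The default Formatter appends the formatted traceback when exc_info is set.
formatted = logging.Formatter().format(new_record)
```

Both records are emitted at ERROR level; only the second carries the (type, value, traceback) tuple that a formatter can render.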

Tests

New regression test TestGovernedRunnerStep::test_step_unexpected_kernel_failure_logs_traceback exercises the fallback self.agent(input) path with an agent that raises RuntimeError("kernel boom") and asserts that the resulting log record's exc_info is not None.

Confirmed the test fails against unfixed source (exc_info is None from logger.error) and passes with the fix.

$ PYTHONPATH=src python -m pytest tests/ -q
99 passed, 1 skipped in 0.34s
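
The shape of such a regression test might look roughly like the sketch below. It is not the test from this PR: SketchRunner is a hypothetical stand-in for GovernedRunner that models only the fallback self.agent(input) path, the (result, success) return value and the logger name are assumptions, and unittest's assertLogs is used here in place of whatever capture mechanism the real suite uses.

```python
import logging
import unittest

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("runner_sketch")


class SketchRunner:
    """Hypothetical stand-in for GovernedRunner (fallback path only)."""

    def __init__(self, agent):
        self.agent = agent

    def step(self, input):
        try:
            result = self.agent(input)
            success = True
        except Exception:
            result = None
            success = False
            # The fixed call: attaches the active traceback to the record.
            logger.exception("Execution failed")
        # Assumed return shape; the real step() may return something else.
        return result, success


def boom_agent(_input):
    """Agent that always fails with an unexpected error."""
    raise RuntimeError("kernel boom")


class TestSketchRunnerStep(unittest.TestCase):
    def test_step_unexpected_kernel_failure_logs_traceback(self):
        runner = SketchRunner(boom_agent)
        with self.assertLogs("runner_sketch", level="ERROR") as captured:
            result, success = runner.step("anything")
        self.assertIsNone(result)
        self.assertFalse(success)
        record = captured.records[0]
        # With logger.error(f"...") in step(), exc_info would be None here.
        self.assertIsNotNone(record.exc_info)
        self.assertIs(record.exc_info[0], RuntimeError)


def run_sketch_tests():
    """Run the sketch test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSketchRunnerStep)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Swapping logger.exception back to logger.error(f"Execution failed: {e}") inside step() makes the assertIsNotNone check fail, which is the property the PR's regression test relies on.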

Test plan

  • Existing lightning test suite green (98 → 99 with new regression).
  • Regression test fails against unfixed source and passes with fix.
  • Policy-violation branch behaviour unchanged.

Surfaced during independent audit conducted by @finnoybu (Ken Tannenbaum, AEGIS Initiative); [LOW, Python Extensions].

…in step

`GovernedRunner.step` caught unexpected (non-policy) kernel exceptions in
a bare ``except Exception as e:`` branch and logged them with
``logger.error(f"Execution failed: {e}")``. ``logger.error(msg)``
discards ``exc_info`` — the operator-visible log record shows only the
exception's ``str(e)``, dropping the stack frame information needed to
localise where the kernel failed.

Switch to ``logger.exception("Execution failed")`` so the active
traceback travels with the record (``exc_info`` is automatically set).
The policy-violation branch immediately above is unaffected: it
deliberately logs only the violation description because that branch
catches an *expected* control-flow exception, not an unexpected failure.

Regression test exercises the fallback ``self.agent(input)`` path with
an agent that raises ``RuntimeError`` and asserts that the resulting
log record carries ``exc_info is not None`` (the contract that
``logger.error`` previously violated). Confirmed the test fails against
unfixed source and passes with the fix.

Surfaced during independent audit conducted by @finnoybu (Ken Tannenbaum, AEGIS Initiative); [LOW, Python Extensions].
github-actions bot added the tests label May 12, 2026
@github-actions

🤖 AI Agent: test-generator — `runner.py`


  • test_step_unexpected_kernel_failure_logs_traceback -- validates that unexpected kernel failures log the traceback correctly.

@github-actions

🤖 AI Agent: breaking-change-detector — API Compatibility


Severity: Potentially Breaking
Change: Changed logging from logger.error(f"Execution failed: {e}") to logger.exception("Execution failed")
Impact: The change in logging behavior may affect users relying on the specific format of the error message and the absence of stack trace information in logs.
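
To make the impact noted above concrete: after the change the record's message is the constant string "Execution failed", and the exception text is available through exc_info rather than the message. A minimal sketch, assuming a log consumer that wants the old combined text back; describe_failure, ListHandler, and the logger name are hypothetical and not part of this PR.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def describe_failure(record):
    """Hypothetical helper: rebuild "<message>: <error>" from a log record."""
    if record.exc_info and record.exc_info[1] is not None:
        return f"{record.getMessage()}: {record.exc_info[1]}"
    return record.getMessage()


logger = logging.getLogger("impact_demo")  # hypothetical logger name
logger.setLevel(logging.ERROR)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

try:
    raise RuntimeError("kernel boom")
except Exception:
    logger.exception("Execution failed")

record = handler.records[0]
new_message = record.getMessage()         # constant text, no exception detail
legacy_style = describe_failure(record)   # old-style text rebuilt from exc_info
```

Consumers that match on the message text alone would need an adjustment of this kind; consumers that only check the log level are unaffected, since logger.exception also logs at ERROR.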

@github-actions

🤖 AI Agent: security-scanner — View details

No security issues found.

@github-actions

🤖 AI Agent: docs-sync-checker — Docs Sync


  • GovernedRunner.step in runner.py -- missing docstring
  • README.md -- section on error handling needs update
  • CHANGELOG.md -- missing entry for behavioral change in logging unexpected kernel failures

github-actions bot added the size/S (Small PR, < 50 lines) label May 12, 2026
@github-actions

🤖 AI Agent: code-reviewer — View details

TL;DR: 0 blockers, 0 warnings. No issues found. Clean change.

@github-actions


🟡 Contributor Check: MEDIUM

Check Result
Profile MEDIUM
Credential NONE
Overall MEDIUM

Automated check by AGT Contributor Check.

github-actions bot added the needs-review:MEDIUM (Contributor check flagged MEDIUM risk) label May 12, 2026
@github-actions


PR Review Summary

Check Status Details
🔍 Code Review ✅ Passed No issues found
🛡️ Security Scan ✅ Passed No issues found
🔄 Breaking Changes ✅ Completed Analysis complete
📝 Docs Sync ✅ Completed Analysis complete
🧪 Test Coverage ✅ Completed Analysis complete

Verdict: ✅ Ready for human review

@imran-siddique imran-siddique merged commit 60aedc9 into microsoft:main May 12, 2026
13 of 14 checks passed
MohammadHaroonAbuomar pushed a commit to MohammadHaroonAbuomar/agt-acs that referenced this pull request Jun 1, 2026
…in step (microsoft#2169)


Labels

  • needs-review:MEDIUM: Contributor check flagged MEDIUM risk
  • size/S: Small PR (< 50 lines)
  • tests
