fix: pre-announcement security hardening, CI fixes, and demo improvements #296
🤖 AI Agent: docs-sync-checker
📝 Documentation Sync Report
Issues Found
Suggestions
Additional Notes
Action Required
Please address the issues and suggestions above to ensure documentation remains in sync with the codebase.
🤖 AI Agent: breaking-change-detector
🔍 API Compatibility Report
Summary
This pull request introduces several changes across multiple files and modules, including security enhancements, demo improvements, and CI workflow updates. While most changes are additive or internal, there are a few modifications that may impact downstream users. Below is a detailed analysis of API compatibility.
Findings
Migration Guide
For
🤖 AI Agent: test-generator
🧪 Test Coverage Analysis
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces a mix of security hardening, CI improvements, and demo enhancements. The changes address several critical security issues, improve thread safety, and add adversarial testing scenarios to the demo. While the updates are generally positive, there are a few areas that need further attention to ensure robustness and backward compatibility.
🔴 CRITICAL: Security Issues
- AES-256-GCM Implementation in `dmz.py`
  - The PR mentions replacing an XOR placeholder with AES-256-GCM in `dmz.py`, but the actual implementation is missing from the provided diff. Without reviewing the implementation, I cannot verify if the cryptographic operations are secure. Ensure the AES-256-GCM implementation:
    - Uses a secure key derivation function (e.g., PBKDF2, Argon2, or HKDF) for key generation.
    - Properly handles nonce/IV uniqueness to prevent vulnerabilities like nonce reuse.
    - Includes authentication tag verification to ensure data integrity.
  - Action: Provide the full implementation for review or confirm adherence to cryptographic best practices.
- Adversarial Testing in Demo
  - The adversarial scenarios (`scenario_adversarial_attacks`) are a great addition, but the policy engine's behavior when attacks "pass through" is unclear. If a policy bypass occurs, it could indicate a critical gap in the policy engine.
  - Action: Ensure that all "PASSED THROUGH" cases are logged as high-severity audit entries and trigger alerts for further investigation.
- `pull_request_target` Workflow Security
  - Switching CI workflows to `pull_request_target` allows workflows to run on forked PRs, which is useful for community contributions. However, this can introduce security risks if untrusted code is executed in the CI environment.
    - Mitigation: Ensure that no untrusted code (e.g., from forked PRs) is executed directly in the CI environment. Use sandboxing or restrict sensitive operations in these workflows.
  - Action: Audit all workflows to confirm they do not execute untrusted code or expose sensitive secrets.
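To make the key-derivation and nonce points in the AES-256-GCM item above concrete, here is a minimal sketch using only the Python standard library. It is not the code from `dmz.py`, which is not shown in the diff; the function names `derive_key` and `new_nonce`, the salt length, and the iteration count are all assumptions chosen for illustration.

```python
import hashlib
import secrets
from typing import Optional, Tuple

KEY_LEN = 32           # AES-256 requires a 32-byte key
SALT_LEN = 16          # assumed salt size for this sketch
NONCE_LEN = 12         # 96-bit nonce, the usual size for GCM
ITERATIONS = 600_000   # assumed work factor; tune for your hardware


def derive_key(
    passphrase: str,
    salt: Optional[bytes] = None,
    iterations: int = ITERATIONS,
) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256; returns (key, salt).

    A fresh random salt is generated when none is supplied. The salt is not
    secret and must be stored so the same key can be derived again.
    """
    if salt is None:
        salt = secrets.token_bytes(SALT_LEN)
    key = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=KEY_LEN
    )
    return key, salt


def new_nonce() -> bytes:
    """Return a fresh random nonce; never derive it from the plaintext."""
    return secrets.token_bytes(NONCE_LEN)
```

The salt and the nonce can both be stored next to the ciphertext; only the passphrase (or the derived key) has to stay secret.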
🟡 WARNING: Potential Breaking Changes
- `MaxRetries` to `MaxAttempts` in .NET SagaOrchestrator
  - The `MaxRetries` property in `SagaStep` has been replaced with `MaxAttempts`, and `MaxRetries` is now marked as `[Obsolete]`. While this change is backward-compatible for now, it may break existing integrations if users rely on `MaxRetries`.
  - Action: Clearly document this change in the release notes and provide a migration guide for users. Consider maintaining `MaxRetries` as an alias for `MaxAttempts` for at least one major version to ensure a smooth transition.
- Checksum Verification Guidance
  - Adding checksum verification guidance to the README is a positive step, but it may break workflows for users who are not familiar with checksum verification or lack the necessary tooling.
  - Action: Provide a simple script or tool to automate checksum verification for users.
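The checksum action above could be covered by a very small helper. The sketch below assumes SHA-256 digests published as hex strings; the function names `sha256_of` and `verify_checksum` are hypothetical and not part of the toolkit.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

A user would call `verify_checksum(Path("package.whl"), "<published digest>")` after downloading and refuse to install when it returns `False`.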
💡 Suggestions for Improvement
- Thread Safety Enhancements
  - The thread safety fixes for `CostGuard`, `VectorClock`, and `ErrorBudget._events` are well-documented in the CHANGELOG and SECURITY.md. However, consider adding unit tests to explicitly verify thread safety under concurrent access.
- Policy Engine Coverage
  - The adversarial scenarios in the demo are a good start, but they only cover four attack types. Consider expanding the test suite to include additional OWASP Agentic Top 10 risks, such as:
    - Resource exhaustion (e.g., infinite loops or excessive API calls).
    - Unauthorized data exfiltration.
    - Privilege escalation via indirect methods.
- Sandbox Escape Prevention
  - The PR mentions blocking `importlib` dynamic imports in the sandbox, but this is not reflected in the provided diff. Ensure that the sandbox implementation is robust against common escape techniques, such as:
    - Arbitrary code execution via `eval` or `exec`.
    - Accessing restricted modules using `__import__` or `getattr`.
- Demo Warnings
  - The new warnings in the demo (`in-memory storage` and `sample policy`) are helpful. Consider adding a runtime check to detect whether the demo is running in a production environment and display a stronger warning if so.
- CI Improvements
  - While making security scans non-blocking (`continue-on-error: true`) is useful for avoiding disruptions, it may lead to unaddressed vulnerabilities. Consider implementing a mechanism to ensure that critical findings are still flagged and tracked for resolution.
- Type Safety
  - The use of `Any` in the `scenario_adversarial_attacks` function for the `client` parameter reduces type safety.
  - Action: Replace `Any` with a more specific type hint to improve code clarity and prevent runtime errors.
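For the sandbox escape item above, a static pre-check is one possible first layer. The sketch below walks the AST of submitted code and reports the constructs the review names (`importlib`, `eval`, `exec`, `__import__`, `getattr`). The function `find_violations` and both deny-lists are assumptions for illustration, not the toolkit's sandbox, and a static scan alone cannot stop every escape (for example, attribute chains that reach builtins indirectly).

```python
import ast
from typing import List

# Assumed deny-lists, taken from the techniques named in the review.
FORBIDDEN_CALLS = {"eval", "exec", "__import__", "getattr"}
FORBIDDEN_MODULES = {"importlib"}


def find_violations(source: str) -> List[str]:
    """Return a description of each forbidden import or call in `source`."""
    violations: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    violations.append(f"line {node.lineno}: import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x"
            if (node.module or "").split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"line {node.lineno}: from {node.module} import")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"line {node.lineno}: call to {node.func.id}()")
    return violations
```

Code that produces a non-empty list would be rejected before execution; OS-level isolation is still needed for untrusted input.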
Additional Notes
- The addition of the `Security Model & Limitations` section in the README is a significant improvement for transparency. It sets clear expectations for users regarding the toolkit's capabilities and limitations.
- The cleanup of unused imports (e.g., `defaultdict` in `behavior_monitor.py`) and issue triage are good housekeeping practices that improve maintainability.
Summary of Actions
🔴 CRITICAL
- Provide the AES-256-GCM implementation for review or confirm adherence to cryptographic best practices.
- Ensure adversarial test failures are logged and trigger alerts.
- Audit `pull_request_target` workflows for security risks.
🟡 WARNING
- Document the `MaxRetries` to `MaxAttempts` change and provide a migration guide.
- Provide a script or tool for checksum verification to ease adoption.
💡 SUGGESTIONS
- Add unit tests to verify thread safety fixes.
- Expand adversarial test scenarios to cover more OWASP Agentic Top 10 risks.
- Ensure the sandbox implementation blocks common escape techniques.
- Add stronger runtime warnings for demo usage in production environments.
- Implement a mechanism to track and address critical security scan findings.
- Improve type safety in `scenario_adversarial_attacks`.
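As an illustration of the type-safety suggestion, a structural type can replace `Any` without tying the demo to one concrete client class. The real signature of `scenario_adversarial_attacks` is not shown in the diff, so everything below is hypothetical: the `GovernedClient` protocol, its `send` method, and the `run_payloads` helper are stand-ins.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class GovernedClient(Protocol):
    """Hypothetical minimal interface the demo needs from a client."""

    def send(self, prompt: str) -> str:
        ...


def run_payloads(client: GovernedClient, payloads: List[str]) -> List[str]:
    """Send each payload through the client and collect the responses."""
    return [client.send(payload) for payload in payloads]


class EchoClient:
    """Toy client that satisfies the protocol without inheriting from it."""

    def send(self, prompt: str) -> str:
        return prompt.upper()
```

A static type checker would then flag a call such as `run_payloads(object(), [...])`, which `Any` silently accepts.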
Let me know if you need further clarification or additional reviews!
🤖 AI Agent: security-scanner
Security Review of Pull Request
This pull request introduces several changes, including security hardening, CI/CD workflow updates, and demo improvements. Below is a detailed security analysis of the changes, categorized by potential vulnerabilities and their severity.
1. Prompt Injection Defense Bypass
Findings:
Severity: 🟠 HIGH
2. Policy Engine Circumvention
Findings:
Severity: 🔵 LOW
3. Trust Chain Weaknesses
Findings:
Severity: 🔵 LOW
4. Credential Exposure
Findings:
Severity: 🔵 LOW
5. Sandbox Escape
Findings:
Severity: 🟠 HIGH
6. Deserialization Attacks
Findings:
Severity: 🔴 CRITICAL
7. Race Conditions
Findings:
Severity: 🟡 MEDIUM
8. Supply Chain Risks
Findings:
Severity: 🟡 MEDIUM
9. Other Observations
Findings:
Severity: 🟠 HIGH
Summary of Findings
Recommended Actions
By addressing these issues, the repository can maintain its integrity as a security-focused governance toolkit.
🤖 AI Agent: breaking-change-detector
🔍 API Compatibility Report
Summary
This pull request introduces several changes across the repository, including security fixes, demo improvements, and CI updates. After analyzing the diff, no breaking changes were identified in the Python packages published to PyPI. However, there are notable changes in the .NET package (
Findings
Migration Guide
For .NET Users:
For Python Users:
Additional Notes
Conclusion
✅ No breaking changes were found in the Python packages. Ensure proper communication of the changes to .NET users and provide clear migration instructions in the release notes.
🤖 AI Agent: docs-sync-checker
📝 Documentation Sync Report
Issues Found
Suggestions
Additional Notes
Action Items
Once these issues are addressed, the documentation will be in sync with the code changes.
🤖 AI Agent: test-generator
🧪 Test Coverage Analysis
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces several security hardening measures, CI/CD workflow improvements, and demo enhancements. The changes address critical security issues, improve thread safety, and add adversarial testing scenarios to the demo. Additionally, the CI workflows are updated to support forked PRs and improve dependency management.
Below is a detailed review of the changes, categorized into critical issues, warnings, and suggestions.
🔴 CRITICAL
- AES-256-GCM Implementation in DMZ Module
  - The replacement of the XOR placeholder with AES-256-GCM is a significant improvement. However, the implementation of AES-256-GCM in `nexus/dmz.py` is not shown in the diff. Ensure that:
    - A secure key management strategy is in place.
    - Nonces are unique for every encryption operation to prevent nonce-reuse attacks.
    - The cryptographic library used is well-maintained and up-to-date.
    - Proper error handling is implemented for encryption/decryption failures.
  - Action: Provide the implementation details for review to ensure correctness and security.
- `pull_request_target` in CI Workflows
  - Switching from `pull_request` to `pull_request_target` enables workflows to run on forked PRs, but it also introduces a potential security risk. Malicious actors could exploit this to inject harmful code into the workflow.
  - Action: Ensure that all scripts and actions executed in these workflows are read-only and cannot modify the repository or access sensitive secrets. Consider using a combination of `pull_request_target` and job conditions to limit the scope of execution.
- Adversarial Scenarios in Demo
  - The new adversarial scenarios are a great addition to test the robustness of the governance middleware. However:
    - The `scenario_adversarial_attacks` function uses hardcoded attack payloads. While this is acceptable for a demo, it may not cover all possible attack vectors.
    - The `MiddlewareTermination` exception is used to detect blocked attacks. Ensure that this mechanism is robust and cannot be bypassed by an attacker.
  - Action: Consider adding a mechanism to dynamically load attack scenarios from a configuration file or external source to make the testing more comprehensive. Also, review the `MiddlewareTermination` handling for potential bypass vectors.
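The suggestion to load attack scenarios from configuration, combined with detecting blocks through `MiddlewareTermination`, might look roughly like the sketch below. The JSON layout, the `AttackScenario` dataclass, and the `send` callable are assumptions; `MiddlewareTermination` is redefined here as a stand-in because the real class is not shown in the diff.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


class MiddlewareTermination(Exception):
    """Stand-in for the demo's exception of the same name (assumed shape)."""


@dataclass(frozen=True)
class AttackScenario:
    name: str
    payload: str


def load_scenarios(raw_json: str) -> List[AttackScenario]:
    """Parse a JSON array of {"name": ..., "payload": ...} objects."""
    return [
        AttackScenario(name=item["name"], payload=item["payload"])
        for item in json.loads(raw_json)
    ]


def run_scenarios(
    scenarios: List[AttackScenario],
    send: Callable[[str], object],
) -> Dict[str, List[str]]:
    """Run every scenario and sort its name into blocked or passed_through."""
    report: Dict[str, List[str]] = {"blocked": [], "passed_through": []}
    for scenario in scenarios:
        try:
            send(scenario.payload)
        except MiddlewareTermination:
            report["blocked"].append(scenario.name)
        else:
            report["passed_through"].append(scenario.name)
    return report
```

Any name that ends up under `passed_through` is a candidate for a high-severity audit entry.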
🟡 WARNING
- Breaking Change in .NET SDK
  - The `MaxRetries` property in `SagaStep` is marked as obsolete and replaced with `MaxAttempts`. While this is a backward-compatible change (due to the mapping), it may cause issues for users relying on the old property.
  - Action: Clearly document this change in the release notes and provide a migration guide for users to update their code.
- Security Model & Limitations Documentation
  - The addition of the "Security Model & Limitations" section in the README is valuable. However, the statement that the toolkit provides "application-level (Python middleware) governance" might lead to a false sense of security for users unfamiliar with the limitations of Python-based isolation.
  - Action: Emphasize that this toolkit is not suitable for untrusted code execution without additional OS-level isolation (e.g., containers or VMs).
💡 SUGGESTIONS
- Checksum Verification Guidance
  - The README now advises verifying package checksums. Consider providing a script or command example to make this process easier for users.
- Thread Safety Improvements
  - The thread safety fixes (e.g., `deque(maxlen=N)` and locking mechanisms) are well-implemented. However, ensure that these changes are thoroughly tested under high concurrency scenarios to avoid race conditions or deadlocks.
- Adversarial Mode in Demo
  - The `--include-attacks` flag is a great addition. Consider adding a summary report at the end of the demo that categorizes the attacks into "Blocked" and "Passed Through" for better clarity.
- CI Workflow Improvements
  - The addition of `pyyaml` to the security-scan workflow is good, but consider pinning the dependency version to avoid unexpected issues with future releases.
- Audit Log Storage
  - The demo warns about in-memory storage for audit logs. Consider providing an example implementation of an external storage backend (e.g., a database or file system) to help users transition to a production-ready setup.
- Policy Coverage
  - The adversarial scenarios highlight potential gaps in policy coverage. Consider adding a tool to analyze and suggest improvements to user-defined policies.
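For the audit log storage suggestion above, a file-backed sink is probably the smallest step away from in-memory storage. The sketch below appends one JSON object per line; the class name `JsonlAuditSink` and its two methods are hypothetical and do not reflect the toolkit's actual audit API.

```python
import json
import threading
from pathlib import Path
from typing import Any, Dict, List


class JsonlAuditSink:
    """Append-only audit storage: one JSON object per line in a file."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()  # serializes writers within one process

    def append(self, entry: Dict[str, Any]) -> None:
        """Serialize the entry and append it as a single line."""
        line = json.dumps(entry, sort_keys=True)
        with self._lock, self._path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")

    def read_all(self) -> List[Dict[str, Any]]:
        """Return every stored entry in the order it was written."""
        if not self._path.exists():
            return []
        with self._lock, self._path.open("r", encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
```

The lock only protects threads inside one process; multiple processes writing the same file would need a database or OS-level file locking instead.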
Final Recommendations
- Address the critical issues related to AES-256-GCM implementation and `pull_request_target` workflows immediately.
- Document breaking changes clearly and provide migration guidance.
- Consider the suggested improvements to enhance usability and security further.
Once the critical issues are resolved, this PR will significantly improve the security posture and usability of the toolkit.
🤖 AI Agent: security-scanner
Security Analysis of Changes
1. Prompt Injection Defense Bypass
2. Policy Engine Circumvention
3. Trust Chain Weaknesses
4. Credential Exposure
5. Sandbox Escape
6. Deserialization Attacks
7. Race Conditions
8. Supply Chain Risks
Summary of Findings
General Recommendations
Would you like me to assist with implementing any of these recommendations?
… Model section
Address 3 findings from security review:
1. Replace insecure XOR placeholder encryption in DMZ module with real AES-256-GCM via cryptography library (was: 'NOT SECURE - placeholder only' comment in nexus/dmz.py)
2. Add 'Security Model & Limitations' section to root README making clear this is application-level middleware, not OS kernel isolation. Includes table of what each layer provides vs. does not provide.
3. Add checksum verification guidance to community preview disclaimer.
Co-authored-by: Copilot <[email protected]>
…sories
- Add in-memory storage warning to demo startup
- Add sample policy disclaimer to demo startup
- Add --include-attacks flag for adversarial demo scenarios (prompt injection, tool alias bypass, SQL policy bypass)
- Add security advisories to SECURITY.md for CostGuard org kill bypass (#272) and thread safety fixes (v2.1.0)
Co-authored-by: Copilot <[email protected]>
…CHANGELOG
Move CostGuard org kill bypass (#272), CostGuard thread safety (#253), ErrorBudget unbounded deque (#172), and VectorClock race condition (#243) from 'Fixed' to 'Security' section in v2.1.0 CHANGELOG: these are security fixes affecting concurrent governance enforcement.
Co-authored-by: Copilot <[email protected]>
- Add docstring to scenario_adversarial_attacks
- Document --include-attacks flag in README
- Pin pyyaml version in security-scan workflow
- Audit and fix unsafe yaml.load() calls (if any)
- Add unreleased changelog entries
Co-authored-by: Copilot <[email protected]>
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces several critical security fixes, demo improvements, and CI/CD workflow updates. The changes address previously identified vulnerabilities, enhance the security posture of the project, and improve the overall developer experience. While the changes are generally positive, there are a few areas that require attention to ensure correctness, security, and maintainability.
🔴 CRITICAL: Security Issues
- AES-256-GCM Implementation in `dmz.py`:
  - The `_encrypt_data` and `_decrypt_data` methods use a derived key and nonce generated from the input data. This approach is problematic because:
    - Using a deterministic nonce derived from the plaintext (e.g., `hashlib.sha256(data[:16] + key).digest()[:12]`) violates the security guarantees of AES-GCM. Nonces must be unique for every encryption operation to prevent nonce reuse attacks, which can completely compromise the encryption.
    - The derived key is based on the SHA-256 hash of the provided key. This is unnecessary and could lead to potential security issues if the input key is not of sufficient entropy.
  - Recommendation:
    - Use a cryptographically secure random number generator (e.g., `os.urandom` or `secrets.token_bytes`) to generate a unique nonce for each encryption operation.
    - Store the nonce alongside the ciphertext (as is common practice with AES-GCM).
    - Do not derive the key using SHA-256 unless absolutely necessary. Instead, require the user to provide a properly sized key (32 bytes for AES-256).

  ```python
  def _encrypt_data(self, data: bytes, key: bytes) -> bytes:
      """Encrypt data with AES-256-GCM."""
      from cryptography.hazmat.primitives.ciphers.aead import AESGCM
      import os

      if len(key) != 32:
          raise ValueError("Key must be 32 bytes for AES-256-GCM.")

      nonce = os.urandom(12)  # Generate a unique 96-bit nonce
      aesgcm = AESGCM(key)
      ciphertext = aesgcm.encrypt(nonce, data, None)
      return nonce + ciphertext

  def _decrypt_data(self, encrypted: bytes, key: bytes) -> bytes:
      """Decrypt data encrypted with AES-256-GCM."""
      from cryptography.hazmat.primitives.ciphers.aead import AESGCM

      if len(key) != 32:
          raise ValueError("Key must be 32 bytes for AES-256-GCM.")

      nonce = encrypted[:12]
      ciphertext = encrypted[12:]
      aesgcm = AESGCM(key)
      return aesgcm.decrypt(nonce, ciphertext, None)
  ```
- Adversarial Scenarios in Demo (`maf_governance_demo.py`):
  - The adversarial scenarios introduced in the demo are a great addition for testing governance resilience. However:
    - The `Tool Alias Bypass` scenario relies on a hardcoded list of allowed and denied tools. This approach may not cover all possible aliases or edge cases.
    - The `SQL Policy Bypass` scenario does not validate whether the SQL injection is actually blocked by the policy engine.
  - Recommendation:
    - Expand the test cases to include more realistic and diverse attack vectors.
    - Ensure that the policy engine's behavior is validated for each scenario (e.g., by asserting specific audit log entries or middleware responses).
🟡 WARNING: Potential Breaking Changes
- AES-256-GCM Dependency:
  - The new encryption implementation in `dmz.py` introduces a dependency on the `cryptography` library. While this is a necessary and justified change, it may break existing environments where the library is not installed.
  - Recommendation: Clearly document this new dependency in the README and installation guides. Consider adding a fallback mechanism or a warning for environments where `cryptography` is not available.
- Security Model & Limitations Section in README:
  - The new "Security Model & Limitations" section explicitly states that the toolkit provides application-level governance, not OS-level isolation. While this is an important clarification, it may cause confusion or concern for users who were unaware of this limitation.
  - Recommendation: Ensure that this section is prominently highlighted in release notes and documentation updates to manage user expectations.
💡 Suggestions for Improvement
- Thread Safety in Audit Log (`maf_governance_demo.py`):
  - The `audit_log._chain._entries` attribute is accessed directly in multiple places. This could lead to race conditions in concurrent environments.
  - Recommendation: Use thread-safe data structures (e.g., `collections.deque` with a lock) or provide a thread-safe API for accessing audit log entries.
- Policy Coverage for Adversarial Scenarios:
  - The adversarial scenarios in the demo are a good start, but they should be integrated into the main test suite to ensure continuous validation.
  - Recommendation: Add pytest-based tests for these scenarios, with assertions for expected outcomes (e.g., audit log entries, middleware responses).
- CI Workflow Improvements:
  - The `continue-on-error: true` setting in the security scan workflow allows PRs to pass even if critical vulnerabilities are detected.
  - Recommendation: Use `continue-on-error` only for non-critical issues. For critical findings, block the PR and require resolution before merging.
- Documentation Updates:
  - The new features and security fixes are well-documented in the README and CHANGELOG. However, the "Security Advisories" section in `SECURITY.md` could benefit from more detailed mitigation steps and links to relevant documentation.
  - Recommendation: Provide step-by-step guidance for upgrading to the fixed versions and verifying the fixes.
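The audit log recommendation above (a `collections.deque` guarded by a lock, exposed through a thread-safe API) can be sketched as follows, together with a small concurrency check of the kind the earlier reviews ask for. `ThreadSafeAuditLog`, `record`, `snapshot`, and `hammer` are hypothetical names; the demo's real `audit_log` object is not shown in the diff.

```python
import threading
from collections import deque
from typing import Deque, Dict, List


class ThreadSafeAuditLog:
    """Bounded audit log whose entries are only touched under a lock."""

    def __init__(self, max_entries: int = 10_000) -> None:
        # deque(maxlen=N) silently drops the oldest entry once it is full
        self._entries: Deque[Dict[str, str]] = deque(maxlen=max_entries)
        self._lock = threading.Lock()

    def record(self, entry: Dict[str, str]) -> None:
        with self._lock:
            self._entries.append(entry)

    def snapshot(self) -> List[Dict[str, str]]:
        """Return a copy so callers never iterate the live deque."""
        with self._lock:
            return list(self._entries)


def hammer(log: ThreadSafeAuditLog, workers: int, per_worker: int) -> int:
    """Write from several threads at once and return the final entry count."""

    def work(worker_id: int) -> None:
        for seq in range(per_worker):
            log.record({"worker": str(worker_id), "seq": str(seq)})

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(log.snapshot())
```

Callers would use `snapshot()` instead of reaching into a private attribute such as `_entries`, which keeps iteration and mutation from overlapping.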
Final Assessment
- Security: 🔴 CRITICAL — Fix the AES-256-GCM implementation to ensure proper nonce handling and key management.
- Backward Compatibility: 🟡 WARNING — Document new dependencies and limitations to avoid breaking existing deployments.
- Code Quality: 💡 SUGGESTION — Improve thread safety, expand adversarial scenarios, and enhance CI workflows.
Once the critical issues are addressed, this PR will significantly improve the security and robustness of the project.
🤖 AI Agent: security-scanner
Security Analysis of the Pull Request
1. Prompt Injection Defense Bypass
2. Policy Engine Circumvention
3. Trust Chain Weaknesses
4. Credential Exposure
5. Sandbox Escape
6. Deserialization Attacks
7. Race Conditions
8. Supply Chain
Summary of Findings
Final Recommendation
The PR introduces several important security improvements, such as replacing the insecure XOR encryption with AES-256-GCM and addressing thread safety issues. However, there are critical gaps in the adversarial testing and CI configuration that need to be addressed before merging.
* feat(dotnet): add MCP security namespace — completes cross-language MCP parity * fix(ci): add path filters and concurrency; announce v3.1.0 release CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 Co-authored-by: Copilot <[email protected]> * docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps - Kill switch is no longer placeholder: now implements saga handoff with handoff_success_count tracking (kill_switch.py:69-178) - DeltaEngine verify_chain() is no longer a stub: now performs SHA-256 chain verification (delta.py:67-127) - Move both from Critical/High gaps to new 'Resolved' section - Update Processing Integrity coverage (2 of 4 defects, not 3 of 4) - Update evidence table with current line ranges Co-authored-by: Copilot <[email protected]> * feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer Add AgentGovernance.Mcp namespace implementing full MCP security parity with TypeScript and Rust SDKs: - McpSecurityScanner: tool poisoning, typosquatting, hidden instructions, rug pull, schema abuse, cross-server attack, and description injection detection - McpCredentialRedactor: regex-based redaction of API keys, bearer tokens, connection strings, and secret assignments - McpResponseSanitizer: response scanning for prompt injection tags, imperative phrasing, credential leakage, and exfiltration URLs - McpGateway: policy enforcement pipeline with deny/allow lists, payload sanitization, rate limiting, and human approval gates Includes 46 xUnit tests covering all threat categories. Updates SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅. 
Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs: add Entra Agent ID bridge tutorial (Tutorial 31) (#10) * fix(pipeline): run NuGet ESRP signing on Windows agent (#1022) The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile, ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'. Changes: - Add per-job pool override: PublishNuGet runs on windows-latest - Convert FolderPath and all shell commands to Windows paths - Replace bash scripts with PowerShell for the Windows agent - PyPI and npm stages remain on ubuntu-latest (unchanged) - Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable Co-authored-by: Copilot <[email protected]> * docs: reland empty-merge changes from PRs #1017 and #1020 (#1125) PRs #1017 and #1020 were squash-merged as empty commits (0 file changes). This commit re-applies the intended documentation updates. 
From PR #1017 (critic gaps): - LIMITATIONS.md: add sections 7 (knowledge governance gap), 8 (credential persistence gap), 9 (initialization bypass risk) - LIMITATIONS.md: add knowledge governance and enforcement infra rows to 'What AGT Is Not' table - THREAT_MODEL.md: add knowledge flow and credential persistence to residual risks, add configuration bypass vectors table, remove stale '10/10' qualifier From PR #1020 (SOC2 resolved gaps): - soc2-mapping.md: mark kill switch as resolved (saga handoff implemented in kill_switch.py:69-178) - soc2-mapping.md: mark DeltaEngine verify_chain() as resolved (SHA-256 chain verification in delta.py:67-127) - soc2-mapping.md: add Resolved section to gaps summary, update Processing Integrity to 2 of 4 defects (was 3 of 4) Co-authored-by: Copilot <[email protected]> * feat(dotnet): add MCP security namespace — completes cross-language MCP parity (#1021) * fix(ci): add path filters and concurrency; announce v3.1.0 release CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 * docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps - Kill switch is no longer placeholder: now implements saga handoff with handoff_success_count tracking (kill_switch.py:69-178) - DeltaEngine verify_chain() is no longer a stub: now performs SHA-256 chain verification (delta.py:67-127) - Move both from Critical/High gaps to new 'Resolved' section - Update Processing Integrity coverage (2 of 4 defects, not 3 of 4) - Update evidence table with current line ranges * feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer Add AgentGovernance.Mcp namespace implementing full MCP security parity with TypeScript and Rust SDKs: - McpSecurityScanner: 
tool poisoning, typosquatting, hidden instructions, rug pull, schema abuse, cross-server attack, and description injection detection - McpCredentialRedactor: regex-based redaction of API keys, bearer tokens, connection strings, and secret assignments - McpResponseSanitizer: response scanning for prompt injection tags, imperative phrasing, credential leakage, and exfiltration URLs - McpGateway: policy enforcement pipeline with deny/allow lists, payload sanitization, rate limiting, and human approval gates Includes 46 xUnit tests covering all threat categories. Updates SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅. --------- Co-authored-by: Copilot <[email protected]> * docs: address external critic gaps (#1025) * feat(dotnet): add kill switch and lifecycle management to .NET SDK (#5) - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add 26 xUnit tests - Update README Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK (#6) * feat(dotnet): add kill switch and lifecycle management to .NET SDK - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK Add two new modules to the agentmesh Rust crate: - rings.rs: Four-level execution privilege ring model (Admin/Standard/ Restricted/Sandboxed) with per-agent assignment and per-ring action permissions, ported from the Python hypervisor enforcer. 
- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through Decommissioned) with validated state transitions and event history, matching the lifecycle model used across other SDK languages. Both modules include comprehensive unit tests and are re-exported from the crate root. README updated with API tables and usage examples. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#7) * feat(openshell): add governance skill package and runnable example (#942) Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK - mcp.go: MCP security scanner detecting tool poisoning, typosquatting, hidden instructions (zero-width chars, homoglyphs), and rug pulls - rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed) with default-deny access control - lifecycle.go: Eight-state agent lifecycle manager with validated transitions - Full test coverage for all three modules - Updated README with API docs and examples Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs: sync audit redaction status and framing with current code (#8) * feat(openshell): add governance skill package and runnable example (#942) Co-authored-by: Copilot <[email protected]> * feat(typescript): add MCP security scanner and lifecycle management to TS SDK (#947) Co-authored-by: Copilot <[email protected]> * docs: update SDK feature matrix after parity pass (#950) Reflects new capabilities added in PRs #947 (TS), .NET, Rust, Go: - TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14) - .NET: Kill switch + lifecycle management (was 8/14, now 10/14) - Rust: Execution rings + lifecycle management (was 6/14, now 8/14) - Go: MCP security + rings + lifecycle (was 4/14, now 7/14) All SDKs now have lifecycle 
management. Core governance (policy, identity, trust, audit) + lifecycle = 5 primitives shared across all 5 languages. Co-authored-by: Copilot <[email protected]> * docs: add LIMITATIONS.md - honest design boundaries and layered defense (#953) Addresses valid external critique of AGT's architectural blind spots: 1. Action vs Intent: AGT governs individual actions, not reasoning or action sequences. Documents the compound-action gap explicitly and recommends content policies + model safety layers. 2. Audit logs record attempts, not outcomes: Documents that post-action state verification is the user's responsibility today, with hooks planned. 3. Performance honesty: README now notes that <0.1ms is policy-eval only; distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md. 4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator, no mesh/crypto) vs full enterprise stack. 5. Vendor independence: Documents zero cloud dependencies in core, standard formats for all state, migration path. 6. Recommended layered defense architecture diagram showing AGT as one layer alongside model safety, application logic, and infrastructure. Co-authored-by: Copilot <[email protected]> * fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (#954) Closes #952 Co-authored-by: Copilot <[email protected]> * feat: reversibility checker, trust calibration guide, escalation tests (#955) ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests. Co-authored-by: Copilot <[email protected]> * feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (#956) agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules(). 
Co-authored-by: Copilot <[email protected]> * fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (#957) - Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0 - Bump version= in 6 setup.py files from 3.0.2 to 3.1.0 - Bump meter version strings in _mcp_metrics.py - Bump 9 package.json files from 3.0.2 to 3.1.0 - Bump .NET csproj Version from 3.0.2 to 3.1.0 - Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0 - Create Go sdk doc.go with version marker 3.1.0 - Fix ruff W292 (missing newline at EOF) in data_classification.py - Fix CLI init regex to allow dots in agent names (test_init_special_characters) Co-authored-by: Copilot <[email protected]> * fix(openclaw): critical honesty pass — document what works vs what's planned (#958) Co-authored-by: Copilot <[email protected]> * fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (#959) * fix(openclaw): critical honesty pass — document what works vs what's planned Server (__main__.py): - Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080) Dockerfile.sidecar: - Copy modules/ directory (was missing, causing build failure) - Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers) - Remove phantom port 9091 (no separate metrics listener exists) openclaw-sidecar.md — full honesty rewrite: - Add status banner: transparent interception is NOT yet implemented - Document actual sidecar API endpoints (health, detect/injection, execute, metrics) - Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile) - Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this) - Replace fictional SLO/Grafana sections with real /api/v1/metrics docs - Add Roadmap section listing what's planned vs shipped openshell.md: - Remove references to non-existent shell scripts - Fix python -m agentmesh.server to python -m agent_os.server - Add note that sidecar doesn't transparently intercept (must call API) - Replace pip install 
agentmesh-platform with Python skill library usage Co-authored-by: Copilot <[email protected]> * fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh cargo package in a workspace writes .crate files to the workspace root's target/package/, not the individual crate's directory. The pipeline was running from the crate subdirectory and couldn't find the output. Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh to all cargo commands to target the specific crate. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (#948) Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from #772. Co-authored-by: kevinkaylie <[email protected]> * blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (#899) Co-authored-by: aymenhmaidiwastaken <[email protected]> * feat: add LotL prevention policy for security measures (#949) YAML policy template for Living-off-the-Land detection and prevention. * feat(examples): add ATR community security rules for PolicyEvaluator (#908) 15 curated ATR detection rules + sync script. Closes #901. 
* fix(docs): correct npm package name and stale version refs across 21 files (#960) - Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files (README, QUICKSTART, tutorials, SDK docs, i18n, changelog) - Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py) - Remove stale v1.0.0 labels from extension status table - Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and ATF conformance assessment Co-authored-by: Copilot <[email protected]> * fix(ci): use ESRP Release for NuGet signing (#961) Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing (#962) * fix(ci): add missing packages to ESRP pipeline and fix Go version tag Three gaps found during publish verification: 1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix) 2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp) - Changed from single-crate to workspace build (--workspace) - Package loop builds both .crate files - Renamed artifact from 'rust-agentmesh' to 'rust-crates' 3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0) - Go module tags require semver with v prefix - Pipeline grep expects '// Version: v...' format Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget' ESRP Release rejected 'Nuget' with: 'The value provided for ReleaseContentType property is invalid.' ErrorCode 2254. ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(ci): add missing packages to ESRP pipeline and fix Go version tag (#963) * fix(ci): add missing packages to ESRP pipeline and fix Go version tag Three gaps found during publish verification: 1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix) 2. 
Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp) - Changed from single-crate to workspace build (--workspace) - Package loop builds both .crate files - Renamed artifact from 'rust-agentmesh' to 'rust-crates' 3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0) - Go module tags require semver with v prefix - Pipeline grep expects '// Version: v...' format Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget' ESRP Release rejected 'Nuget' with: 'The value provided for ReleaseContentType property is invalid.' ErrorCode 2254. ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(ci): use EsrpCodeSigning + dotnet push for NuGet (#965) EsrpRelease@11 does not support NuGet as a contenttype — it's for PyPI/npm/Maven/crates.io package distribution. NuGet packages must be signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push. New flow: 1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405) 2. dotnet nuget push with the signed .nupkg to nuget.org This matches the standard Microsoft NuGet ESRP signing pattern used by azure-sdk, dotnet runtime, and other Microsoft OSS projects. Co-authored-by: Copilot <[email protected]> * fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (#966) Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities. CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration via Header Injection Chain — prototype pollution gadget enables CRLF injection in HTTP headers, bypassing AWS IMDSv2 session tokens. CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through attacker-controlled proxy. 
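To make the NO_PROXY bypass class described above concrete, here is a minimal Python sketch of hostname normalization before NO_PROXY matching. It is not the axios code or its fix; `normalize_host` and `bypasses_proxy` are hypothetical names, and the matching rules are a simplified assumption about how NO_PROXY entries are commonly interpreted.

```python
def normalize_host(host: str) -> str:
    """Canonicalize a hostname so equivalent spellings compare equal."""
    host = host.strip().lower()
    # "[::1]" and "::1" name the same IPv6 literal.
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    # "localhost." (fully qualified form) is the same host as "localhost".
    return host.rstrip(".")


def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Return True if host matches an entry in a comma-separated NO_PROXY list.

    Hypothetical helper: both sides are normalized first, so a trailing dot
    or bracketed IPv6 literal cannot dodge the comparison.
    """
    target = normalize_host(host)
    for raw in no_proxy.split(","):
        entry = normalize_host(raw)
        if not entry:
            continue
        if entry == "*":
            return True
        suffix = entry.lstrip(".")
        # Exact match, or a subdomain of the entry (never a bare string suffix).
        if target == suffix or target.endswith("." + suffix):
            return True
    return False
```

Without the normalization step, `localhost.` and `[::1]` would compare unequal to `localhost` and `::1`, so the request would be routed through the proxy even though the operator excluded those hosts.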
Upgraded in 3 packages: - extensions/copilot: 1.14.0 → 1.15.0 - extensions/cursor: 1.13.5 → 1.15.0 - agent-os-vscode: 1.13.6 → 1.15.0 Co-authored-by: Copilot <[email protected]> * fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (#967) The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference, preventing ESRP authentication across ALL publishing stages (PyPI, npm, NuGet, crates.io). Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the well-known Microsoft corporate tenant ID (72f988bf-..., same default used by ESRP Release action.yml). This is a public value, not a secret. Also: NuGet publishing requires Microsoft as co-owner of the package on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance Co-authored-by: Copilot <[email protected]> * docs: sync audit redaction status and framing with current code - Update SOC2 mapping to reflect CredentialRedactor now redacts credential-like secrets before audit persistence (API keys, tokens, JWTs, connection strings, etc.). Remaining gap: non-credential PII (email, phone, addresses) not yet redacted in audit entries. - Replace 'kernel-level enforcement' with 'policy-layer enforcement' in README, OWASP compliance, and architecture overview to match the existing 'application-level governance' framing in README Security section and LIMITATIONS.md. - Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote clarifying this means mitigation components exist per risk category, not full elimination. - Update owasp-llm-top10-mapping.md LLM06 row for credential redaction. Addresses doc/code inconsistencies identified in external review. 
Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> * fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (#1028) - Remove unused variable profiling_override (F841) - Remove f-string without placeholders (F541) - Fix whitespace in docstrings (W293) Co-authored-by: Copilot <[email protected]> * fix(ci): add path filters and concurrency; announce v3.1.0 release (#1039) CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 Co-authored-by: Copilot <[email protected]> * docs: add ADOPTERS.md and make deployment guides multi-cloud (#1040) - New ADOPTERS.md following Backstage/Flatcar pattern with Production, Evaluation, and Academic tables + instructions for adding your org - Rewrite docs/deployment/README.md from Azure-only to multi-cloud: Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE), Docker Compose, self-hosted. Updated architecture diagram to show cloud-agnostic deployment patterns. 
- Fix broken AWS/GCP links (pointed to non-existent paths) - README now links to 'Deployment Guides' (multi-cloud) instead of 'Azure Deployment' - README Contributing section invites adopters to add their org Co-authored-by: Copilot <[email protected]> * feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (#1044) Addresses the #1 developer experience criticism: AGT is too complex to start. New: agent_os.lite — lightweight governance module - govern() factory: one line to create a governance gate - check(action): one line to enforce — raises GovernanceViolation or returns True - check.is_allowed(action): non-raising bool version - Allow lists, deny lists, regex patterns, content filtering, rate limiting - Built-in audit trail and stats - Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms) - Zero dependencies beyond stdlib (re, time, datetime) - 16 tests passing Fix: govern_in_60_seconds.py quickstart - BROKEN: was calling PolicyEvaluator.add_rules() which does not exist - FIXED: now uses agent_os.lite.govern() which actually works - Verified end-to-end: script runs and produces correct output The lite module is for developers who just want basic governance without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc. Upgrade to the full stack when you need it. 
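The gate behaviour described above (a factory that returns a callable `check`, which raises `GovernanceViolation` or returns True, plus a non-raising `is_allowed`) can be sketched in a few lines of standard-library Python. This is a stand-in for illustration, not the `agent_os.lite` implementation: `govern_sketch`, the `Gate` class, and the `allow`, `deny`, and `deny_patterns` parameters are assumptions, and rate limiting and content filtering are left out.

```python
import re
from typing import Iterable, List, Optional, Tuple


class GovernanceViolation(Exception):
    """Raised when an action is rejected by the gate."""


class Gate:
    """Hypothetical allow/deny gate with a simple in-memory audit trail."""

    def __init__(
        self,
        allow: Optional[Iterable[str]] = None,
        deny: Iterable[str] = (),
        deny_patterns: Iterable[str] = (),
    ) -> None:
        self._allow = set(allow) if allow is not None else None
        self._deny = set(deny)
        self._patterns = [re.compile(p) for p in deny_patterns]
        self.audit: List[Tuple[str, bool]] = []

    def is_allowed(self, action: str) -> bool:
        """Non-raising check; deny rules win over allow rules."""
        if action in self._deny:
            allowed = False
        elif any(p.search(action) for p in self._patterns):
            allowed = False
        elif self._allow is not None:
            allowed = action in self._allow
        else:
            allowed = True
        self.audit.append((action, allowed))
        return allowed

    def __call__(self, action: str) -> bool:
        """Raising check: returns True or raises GovernanceViolation."""
        if not self.is_allowed(action):
            raise GovernanceViolation(f"action not permitted: {action}")
        return True


def govern_sketch(**rules) -> Gate:
    """Factory mirroring the one-line setup described in the commit."""
    return Gate(**rules)
```

A caller would create the gate once, for example `check = govern_sketch(allow=["read_file"], deny=["delete_file"])`, and then call `check("read_file")` before each action.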
Co-authored-by: Copilot <[email protected]> * feat(ci): enhance weekly security audit with 7 new scan jobs (#1051) Add comprehensive security checks based on issues found during the MSRC-111178 security audit and ongoing post-merge reviews: - Workflow security regression (MSRC-111178 pull_request_target check) - Expression injection scan (github.event.* in run: blocks) - Docker security (root containers, wildcard CORS, hardcoded passwords, 0.0.0.0 bindings) - XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True) - Action SHA pinning compliance - Version pinning (pyproject.toml upper bounds, Docker :latest tags, license field format) - Dependency confusion with --strict mode (pyproject.toml + package.json) - Retention days updated to 180 (EU AI Act Art. 26(6)) Co-authored-by: Copilot <[email protected]> * fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (#1057) - Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.) - Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md - Fix policy validation: replace starts_with (invalid) with matches + regex Co-authored-by: Copilot <[email protected]> * feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (#1061) Addresses critical review feedback: 1. Rollback/reversibility (agent_os.reversibility) - ReversibilityChecker: pre-execution assessment of action reversibility - 4 levels: fully_reversible, partially_reversible, irreversible, unknown - CompensatingAction: structured undo plans for each action type - Built-in rules for 12 common actions (write, deploy, delete, email, etc.) - block_irreversible mode for strict environments 2. 
Trust score calibration guide (docs/security/trust-score-calibration.md) - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%) - Decay functions with tier floors - Initial score assignments by agent origin - Threshold recommendations (conservative/moderate/permissive) - Anti-gaming measures and operational playbook 3. Tests: 19 passing (10 escalation + 9 reversibility) Co-authored-by: Copilot <[email protected]> * feat: deployment runtime (Docker/AKS) and shared trust core types (#1062) agent-runtime: Evolve from thin re-export shim to deployment runtime - DockerDeployer: container deployment with security hardening (cap-drop ALL, no-new-privileges, read-only rootfs) - KubernetesDeployer: AKS pod deployment with governance sidecars (runAsNonRoot, seccompProfile, resource limits) - GovernanceConfig: policy/trust/audit config injected as env vars - DeploymentTarget protocol for extensibility (ADC, nono, etc.) - 24 tests (all subprocess calls mocked) agent-mesh: Extract shared trust types into agentmesh.trust_types - TrustScore, AgentProfile, TrustRecord, TrustTracker - Canonical implementations replacing ~800 lines of duplicated code across 6+ integration packages - 25 tests covering clamping, scoring, history, capabilities Co-authored-by: Copilot <[email protected]> * feat(dotnet): add kill switch and lifecycle management to .NET SDK (#1065) - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (#1066) - mcp.go: MCP security scanner detecting tool poisoning, typosquatting, hidden instructions (zero-width chars, homoglyphs), and rug pulls - rings.go: Execution privilege ring model 
(Admin/Standard/Restricted/Sandboxed) with default-deny access control - lifecycle.go: Eight-state agent lifecycle manager with validated transitions - Full test coverage for all three modules - Updated README with API docs and examples Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK (#1067) * feat(dotnet): add kill switch and lifecycle management to .NET SDK - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK Add two new modules to the agentmesh Rust crate: - rings.rs: Four-level execution privilege ring model (Admin/Standard/ Restricted/Sandboxed) with per-agent assignment and per-ring action permissions, ported from the Python hypervisor enforcer. - lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through Decommissioned) with validated state transitions and event history, matching the lifecycle model used across other SDK languages. Both modules include comprehensive unit tests and are re-exported from the crate root. README updated with API tables and usage examples. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix: align lotl_prevention_policy.yaml with PolicyDocument schema The policy file used an incompatible schema format (id, parameter, regex_match, effect) instead of the expected PolicyDocument fields (name, condition.field, operator, action). This caused the validate-policies CI check to fail for all PRs. 
Changes: - id → name - condition.parameter → condition.field - operator: regex_match → operator: matches - action at rule level (shell_exec/file_read) → action: deny - effect: DENY → removed (redundant with action: deny) - Added version, name, description, disclaimer at top level Co-authored-by: Copilot <[email protected]> * fix: resolve .NET ESRP signing issues blocking NuGet publish GitHub Actions (publish.yml): - Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in step-level env (invisible to if-expressions). Replace with job-level ESRP_CONFIGURED env derived from secrets. - Add missing ESRP_CERT_IDENTIFIER to signing step env blocks. - Gate the publish step on ESRP_CONFIGURED so unsigned packages are never pushed to NuGet.org under the Microsoft.* prefix. - Make stub signing steps fail-fast (exit 1) instead of silently succeeding, preventing unsigned packages from reaching NuGet push. ADO Pipeline (esrp-publish.yml): - Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push has a guaranteed SDK version on the Windows agent. Co-authored-by: Copilot <[email protected]> * fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1163) The docker-compose example in openclaw-sidecar.md was illustrative only and did not work — it referenced a non-existent OpenClaw image and lacked healthchecks. Users were hitting this and getting confused. 
Changes: - Add working demo at demo/openclaw-governed/ with docker-compose.yaml that builds and runs the governance sidecar from source - Replace the inline docker-compose in the doc with a link to the demo plus a clearly-labeled reference template for custom deployments - Add prominent WARNING callout listing known limitations (no native OpenClaw integration, no published images, explicit API required) - Remove stale orphaned curl snippet after the docker-compose block - Add healthcheck to docker-compose governance-sidecar service - Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest to a placeholder users must replace with their own image Co-authored-by: Copilot <[email protected]> * fix(docs): fix OpenClaw sidecar demo and add limitations callout (#1164) The docker-compose example in openclaw-sidecar.md was illustrative only and did not work — it referenced a non-existent OpenClaw image and lacked healthchecks. Users were hitting this and getting confused. Changes: - Add working demo at demo/openclaw-governed/ with docker-compose.yaml that builds and runs the governance sidecar from source - Replace the inline docker-compose in the doc with a link to the demo plus a clearly-labeled reference template for custom deployments - Add prominent WARNING callout listing known limitations (no native OpenClaw integration, no published images, explicit API required) - Remove stale orphaned curl snippet after the docker-compose block - Add healthcheck to docker-compose governance-sidecar service - Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest to a placeholder users must replace with their own image Co-authored-by: Copilot <[email protected]> * fix(ci): use PME tenant ID for ESRP cert signing The ESRP signing cert lives in the PME (Partner Managed Engineering) tenant (975f013f), not the Microsoft corporate tenant (72f988bf). Using the wrong tenant ID causes ESRP signing to fail when looking up the cert. 
Co-authored-by: Copilot <[email protected]> * docs: Add Scaling AI Agents article to COMMUNITY.md (#857) Co-authored-by: deepsearch <[email protected]> * Add runtime evidence mode to agt verify (#969) * Track agt verify evidence plan * Add runtime evidence mode to agt verify * Add runtime evidence verifier tests * Add CLI tests for agt verify evidence mode * Document evidence mode for compliance verification * Remove local implementation notes * Document agt verify evidence mode * Harden evidence path handling in verify --------- Co-authored-by: T. Smith <[email protected]> * docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix - Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID - Detailed roles & responsibilities between AGT and Entra/Agent365 - Architecture diagram showing the identity bridge - Step-by-step: DID creation, Entra binding, AKS workload identity, token validation, lifecycle sync, access verification - Known gaps and limitations table - Platform independence note (AWS, GCP, Okta patterns) - Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh for consistency with Python reference implementation and .NET SDK) - Update tutorials README with Enterprise Identity section Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> Co-authored-by: Jack Batzner <[email protected]> Co-authored-by: lawcontinue <[email protected]> Co-authored-by: deepsearch <[email protected]> Co-authored-by: ewmh <[email protected]> Co-authored-by: T. 
Smith <[email protected]> * docs: address external critic gaps in limitations and threat model (#11) Add three new sections to LIMITATIONS.md addressing gaps identified in public criticism and external security analysis: - §10 Physical AI and Embodied Agent Governance: documents that AGT governs software agents not physical actuators, with mitigations - §11 Streaming Data and Real-Time Assurance: documents that AGT evaluates per-action not continuously over data streams - §12 DID Method Inconsistency Across SDKs: documents the did:mesh vs did:agentmesh split with migration plan for v4.0 Update THREAT_MODEL.md residual risks to reference all three new limitation sections. Co-authored-by: Copilot <[email protected]> * fix!: standardize DID method to did:agentmesh across all SDKs (#12) * fix!: standardize DID method to did:agentmesh across all SDKs BREAKING CHANGE: All agent DIDs now use the did:agentmesh: prefix. The legacy did:mesh: prefix used by Python and .NET has been migrated to match the did:agentmesh: convention already used by TypeScript, Rust, and Go SDKs. Changes: - Python: agent_id.py, delegation.py, entra.py, all integrations - .NET: AgentIdentity.cs, Jwk.cs, GovernanceKernel.cs, all tests - Docs: README, tutorials, identity docs, FAQ, compliance docs - Tests: all test fixtures updated across Python, .NET, TS, VSCode - Version bump: 3.1.0 → 3.2.0 (.NET, Python agent-mesh, TypeScript) Migration: replace did:mesh: with did:agentmesh: in your policies, identity registries, and agent configurations. Co-authored-by: Copilot <[email protected]> * docs: add Q11-Q13 to FAQ — AGT scope, Agent 365, and DLP comparison Adds three new customer Q&As: - Q11: Is AGT for Foundry agents or any agent type? 
(any) - Q12: Relationship between AGT and Agent 365 (different layers) - Q13: How is AGT different from DLP/communication compliance (content vs action governance) Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(security): address all 14 open code scanning alerts (#13) * fix: address 6 Dependabot security vulnerabilities - python-multipart 0.0.22 → 0.0.26 (DoS via large preamble/epilogue) - pytest 8.4.1 → 9.0.3 (tmpdir handling vulnerability) - langchain-core 1.2.11 → 1.2.28 (SSRF, path traversal, f-string validation) - langchain-core >=0.2.0,<1.0 → >=1.2.28 in langchain-agentmesh pyproject.toml - tsup 8.0.0 → 8.5.1 (DOM clobbering vulnerability) - rand 0.8.5: dismissed #176 as inaccurate (vuln affects rand::rng() 0.9.x API only) Fixes Dependabot alerts: #177, #175, #166, #164, #157, #156 Dismissed: #176 (not applicable to rand 0.8.x) Co-authored-by: Copilot <[email protected]> * fix(security): address all 14 open code scanning alerts Scorecard HIGH: - publish-containers.yml: scope packages:write to job level (#316) Scorecard MEDIUM (pinned dependencies): - docs.yml: pin 4 GitHub Actions by SHA hash (#311-314) - docs.yml: use requirements.txt for pip install (#315) - agent-mesh Dockerfile: pin python:3.11-slim by SHA (#317,#318) - agent-os Dockerfile.sidecar: pin python:3.14-slim by SHA (#295,#296) - dashboard Dockerfile: pin python:3.12-slim by SHA (#291,#293) CodeQL: - test_time_decay.py: timedelta(days=365) -> 366 for leap safety (#289,#290) Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> Co-authored-by: Jack Batzner <[email protected]> Co-authored-by: lawcontinue <[email protected]> Co-authored-by: 
deepsearch <[email protected]> Co-authored-by: ewmh <[email protected]> Co-authored-by: T. Smith <[email protected]>
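The CodeQL item in the commit above changes `timedelta(days=365)` to 366 "for leap safety". The assertion in `test_time_decay.py` is not shown here, so the following only illustrates the general pitfall: a 365-day offset falls one day short of a calendar year when the span crosses a leap day. The dates are arbitrary examples.

```python
from datetime import date, timedelta

start = date(2024, 1, 1)  # 2024 is a leap year, so it has 366 days

plus_365 = start + timedelta(days=365)
plus_366 = start + timedelta(days=366)

# 365 days after 1 Jan 2024 is still inside 2024.
print(plus_365)  # 2024-12-31
# 366 days are needed to reach the same calendar date one year later.
print(plus_366)  # 2025-01-01
```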
…ents (microsoft#296)

* fix(security): replace XOR placeholder with AES-256-GCM, add Security Model section

Address 3 findings from security review:

1. Replace insecure XOR placeholder encryption in DMZ module with real AES-256-GCM via cryptography library (was: 'NOT SECURE - placeholder only' comment in nexus/dmz.py)
2. Add 'Security Model & Limitations' section to root README making clear this is application-level middleware, not OS kernel isolation. Includes table of what each layer provides vs. does not provide.
3. Add checksum verification guidance to community preview disclaimer.

Co-authored-by: Copilot <[email protected]>

* fix(security): add demo warnings, adversarial mode, and security advisories

- Add in-memory storage warning to demo startup
- Add sample policy disclaimer to demo startup
- Add --include-attacks flag for adversarial demo scenarios (prompt injection, tool alias bypass, SQL policy bypass)
- Add security advisories to SECURITY.md for CostGuard org kill bypass (microsoft#272) and thread safety fixes (v2.1.0)

Co-authored-by: Copilot <[email protected]>

* docs: relabel CostGuard and thread safety fixes as security items in CHANGELOG

Move CostGuard org kill bypass (microsoft#272), CostGuard thread safety (microsoft#253), ErrorBudget unbounded deque (microsoft#172), and VectorClock race condition (microsoft#243) from 'Fixed' to 'Security' section in v2.1.0 CHANGELOG; these are security fixes affecting concurrent governance enforcement.

Co-authored-by: Copilot <[email protected]>

* fix: address PR review feedback: docstrings, changelog, yaml safety

- Add docstring to scenario_adversarial_attacks
- Document --include-attacks flag in README
- Pin pyyaml version in security-scan workflow
- Audit and fix unsafe yaml.load() calls (if any)
- Add unreleased changelog entries

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
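The CHANGELOG relabeling above treats an unbounded deque and several thread-safety bugs as security issues, since both can degrade concurrent governance enforcement. The actual ErrorBudget and CostGuard fixes are not shown in this log, so the following is only a sketch of the general remedy under assumed names: `BoundedErrorWindow` and `max_events` are hypothetical, and the class simply combines a capped `collections.deque` with a lock.

```python
import threading
from collections import deque


class BoundedErrorWindow:
    """Hypothetical sliding window of success/failure outcomes.

    maxlen caps memory use: once the window is full, the oldest entry is
    discarded on each append instead of the deque growing without limit.
    """

    def __init__(self, max_events: int = 1000) -> None:
        self._events = deque(maxlen=max_events)
        self._lock = threading.Lock()

    def record(self, ok: bool) -> None:
        # The lock keeps appends and reads consistent across threads.
        with self._lock:
            self._events.append(ok)

    def error_rate(self) -> float:
        with self._lock:
            if not self._events:
                return 0.0
            return self._events.count(False) / len(self._events)

    def __len__(self) -> int:
        with self._lock:
            return len(self._events)
```

With `max_events=3`, recording four outcomes leaves only the three most recent in the window, so a long-running agent cannot exhaust memory by generating events.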
* feat(dotnet): add MCP security namespace — completes cross-language MCP parity * fix(ci): add path filters and concurrency; announce v3.1.0 release CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 Co-authored-by: Copilot <[email protected]> * docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps - Kill switch is no longer placeholder: now implements saga handoff with handoff_success_count tracking (kill_switch.py:69-178) - DeltaEngine verify_chain() is no longer a stub: now performs SHA-256 chain verification (delta.py:67-127) - Move both from Critical/High gaps to new 'Resolved' section - Update Processing Integrity coverage (2 of 4 defects, not 3 of 4) - Update evidence table with current line ranges Co-authored-by: Copilot <[email protected]> * feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer Add AgentGovernance.Mcp namespace implementing full MCP security parity with TypeScript and Rust SDKs: - McpSecurityScanner: tool poisoning, typosquatting, hidden instructions, rug pull, schema abuse, cross-server attack, and description injection detection - McpCredentialRedactor: regex-based redaction of API keys, bearer tokens, connection strings, and secret assignments - McpResponseSanitizer: response scanning for prompt injection tags, imperative phrasing, credential leakage, and exfiltration URLs - McpGateway: policy enforcement pipeline with deny/allow lists, payload sanitization, rate limiting, and human approval gates Includes 46 xUnit tests covering all threat categories. Updates SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅. 
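The regex-based credential redaction described for McpCredentialRedactor can be illustrated with a short Python sketch. This is not the .NET implementation: the `redact` function, the pattern names, and the three regular expressions are assumptions chosen for illustration, and a production redactor would need broader and more carefully tested patterns.

```python
import re

# Hypothetical patterns, applied in order; each match is replaced by a label.
_PATTERNS = [
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*")),
    (
        "secret_assignment",
        re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\s*[=:]\s*\S+"),
    ),
    ("connection_string", re.compile(r"(?i)\b\w+://[^\s:/@]+:[^\s@]+@\S+")),
]


def redact(text: str) -> str:
    """Replace credential-like substrings with a [REDACTED:<kind>] marker."""
    for name, pattern in _PATTERNS:
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```

Running the redactor before text reaches an audit log or a model response means that a leaked value is replaced at the boundary rather than persisted.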
Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs: add Entra Agent ID bridge tutorial (Tutorial 31) (microsoft#10) * fix(pipeline): run NuGet ESRP signing on Windows agent (microsoft#1022) The EsrpCodeSigning@5 task constructs internal paths (batchSignPolicyFile, ciPolicyFile) using Windows-style backslashes. Running on ubuntu-latest produced garbled mixed paths like '/home/vsts/work/1/s/src\myapp\'. Changes: - Add per-job pool override: PublishNuGet runs on windows-latest - Convert FolderPath and all shell commands to Windows paths - Replace bash scripts with PowerShell for the Windows agent - PyPI and npm stages remain on ubuntu-latest (unchanged) - Add comment to delete orphaned ESRP_DOMAIN_TENANT_ID ADO variable Co-authored-by: Copilot <[email protected]> * docs: reland empty-merge changes from PRs microsoft#1017 and microsoft#1020 (microsoft#1125) PRs microsoft#1017 and microsoft#1020 were squash-merged as empty commits (0 file changes). This commit re-applies the intended documentation updates. 
From PR microsoft#1017 (critic gaps): - LIMITATIONS.md: add sections 7 (knowledge governance gap), 8 (credential persistence gap), 9 (initialization bypass risk) - LIMITATIONS.md: add knowledge governance and enforcement infra rows to 'What AGT Is Not' table - THREAT_MODEL.md: add knowledge flow and credential persistence to residual risks, add configuration bypass vectors table, remove stale '10/10' qualifier From PR microsoft#1020 (SOC2 resolved gaps): - soc2-mapping.md: mark kill switch as resolved (saga handoff implemented in kill_switch.py:69-178) - soc2-mapping.md: mark DeltaEngine verify_chain() as resolved (SHA-256 chain verification in delta.py:67-127) - soc2-mapping.md: add Resolved section to gaps summary, update Processing Integrity to 2 of 4 defects (was 3 of 4) Co-authored-by: Copilot <[email protected]> * feat(dotnet): add MCP security namespace — completes cross-language MCP parity (microsoft#1021) * fix(ci): add path filters and concurrency; announce v3.1.0 release CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 * docs: update SOC2 mapping for resolved kill switch and DeltaEngine gaps - Kill switch is no longer placeholder: now implements saga handoff with handoff_success_count tracking (kill_switch.py:69-178) - DeltaEngine verify_chain() is no longer a stub: now performs SHA-256 chain verification (delta.py:67-127) - Move both from Critical/High gaps to new 'Resolved' section - Update Processing Integrity coverage (2 of 4 defects, not 3 of 4) - Update evidence table with current line ranges * feat(dotnet): add MCP security namespace with scanner, gateway, redactor, and sanitizer Add AgentGovernance.Mcp namespace implementing full MCP security parity with TypeScript and Rust 
SDKs: - McpSecurityScanner: tool poisoning, typosquatting, hidden instructions, rug pull, schema abuse, cross-server attack, and description injection detection - McpCredentialRedactor: regex-based redaction of API keys, bearer tokens, connection strings, and secret assignments - McpResponseSanitizer: response scanning for prompt injection tags, imperative phrasing, credential leakage, and exfiltration URLs - McpGateway: policy enforcement pipeline with deny/allow lists, payload sanitization, rate limiting, and human approval gates Includes 46 xUnit tests covering all threat categories. Updates SDK-FEATURE-MATRIX.md to flip .NET MCP Security from — to ✅. --------- Co-authored-by: Copilot <[email protected]> * docs: address external critic gaps (microsoft#1025) * feat(dotnet): add kill switch and lifecycle management to .NET SDK (microsoft#5) - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add 26 xUnit tests - Update README Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK (microsoft#6) * feat(dotnet): add kill switch and lifecycle management to .NET SDK - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK Add two new modules to the agentmesh Rust crate: - rings.rs: Four-level execution privilege ring model (Admin/Standard/ Restricted/Sandboxed) with per-agent assignment and per-ring action permissions, ported from the Python hypervisor enforcer. 
- lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through Decommissioned) with validated state transitions and event history, matching the lifecycle model used across other SDK languages. Both modules include comprehensive unit tests and are re-exported from the crate root. README updated with API tables and usage examples. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (microsoft#7) * feat(openshell): add governance skill package and runnable example (microsoft#942) Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK - mcp.go: MCP security scanner detecting tool poisoning, typosquatting, hidden instructions (zero-width chars, homoglyphs), and rug pulls - rings.go: Execution privilege ring model (Admin/Standard/Restricted/Sandboxed) with default-deny access control - lifecycle.go: Eight-state agent lifecycle manager with validated transitions - Full test coverage for all three modules - Updated README with API docs and examples Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs: sync audit redaction status and framing with current code (microsoft#8) * feat(openshell): add governance skill package and runnable example (microsoft#942) Co-authored-by: Copilot <[email protected]> * feat(typescript): add MCP security scanner and lifecycle management to TS SDK (microsoft#947) Co-authored-by: Copilot <[email protected]> * docs: update SDK feature matrix after parity pass (microsoft#950) Reflects new capabilities added in PRs microsoft#947 (TS), .NET, Rust, Go: - TypeScript: MCP security scanner + lifecycle management (was 5/14, now 7/14) - .NET: Kill switch + lifecycle management (was 8/14, now 10/14) - Rust: Execution rings + lifecycle management (was 6/14, now 8/14) - Go: MCP security + rings + 
lifecycle (was 4/14, now 7/14) All SDKs now have lifecycle management. Core governance (policy, identity, trust, audit) + lifecycle = 5 primitives shared across all 5 languages. Co-authored-by: Copilot <[email protected]> * docs: add LIMITATIONS.md - honest design boundaries and layered defense (microsoft#953) Addresses valid external critique of AGT's architectural blind spots: 1. Action vs Intent: AGT governs individual actions, not reasoning or action sequences. Documents the compound-action gap explicitly and recommends content policies + model safety layers. 2. Audit logs record attempts, not outcomes: Documents that post-action state verification is the user's responsibility today, with hooks planned. 3. Performance honesty: README now notes that <0.1ms is policy-eval only; distributed mesh adds 5-50ms. Full breakdown in LIMITATIONS.md. 4. Complexity spectrum: Documents the minimal path (just PolicyEvaluator, no mesh/crypto) vs full enterprise stack. 5. Vendor independence: Documents zero cloud dependencies in core, standard formats for all state, migration path. 6. Recommended layered defense architecture diagram showing AGT as one layer alongside model safety, application logic, and infrastructure. Co-authored-by: Copilot <[email protected]> * fix(docs): rewrite OpenClaw sidecar deployment with working K8s manifests (microsoft#954) Closes microsoft#952 Co-authored-by: Copilot <[email protected]> * feat: reversibility checker, trust calibration guide, escalation tests (microsoft#955) ReversibilityChecker with 4 levels and compensation plans. Trust score calibration guide with weights, decay, thresholds. 19 tests. Co-authored-by: Copilot <[email protected]> * feat: AGT Lite — zero-config governance in 3 lines + fix broken quickstart (microsoft#956) agent_os.lite: govern() factory, sub-ms enforcement, 16 tests. Fixed quickstart that called nonexistent add_rules(). 
Co-authored-by: Copilot <[email protected]> * fix: bump all runtime versions to 3.1.0 and fix CI lint/test failures (microsoft#957) - Bump __version__ in 29 Python __init__.py files from 3.0.2 to 3.1.0 - Bump version= in 6 setup.py files from 3.0.2 to 3.1.0 - Bump meter version strings in _mcp_metrics.py - Bump 9 package.json files from 3.0.2 to 3.1.0 - Bump .NET csproj Version from 3.0.2 to 3.1.0 - Bump Rust workspace Cargo.toml from 3.0.2 to 3.1.0 - Create Go sdk doc.go with version marker 3.1.0 - Fix ruff W292 (missing newline at EOF) in data_classification.py - Fix CLI init regex to allow dots in agent names (test_init_special_characters) Co-authored-by: Copilot <[email protected]> * fix(openclaw): critical honesty pass — document what works vs what's planned (microsoft#958) Co-authored-by: Copilot <[email protected]> * fix(ci): fix Rust crate packaging - use workspace root with -p agentmesh (microsoft#959) * fix(openclaw): critical honesty pass — document what works vs what's planned Server (__main__.py): - Add --host/--port argparse + env var support (was hardcoded 127.0.0.1:8080) Dockerfile.sidecar: - Copy modules/ directory (was missing, causing build failure) - Use 0.0.0.0 for container binding (127.0.0.1 is wrong inside containers) - Remove phantom port 9091 (no separate metrics listener exists) openclaw-sidecar.md — full honesty rewrite: - Add status banner: transparent interception is NOT yet implemented - Document actual sidecar API endpoints (health, detect/injection, execute, metrics) - Fix Docker Compose to use Dockerfile.sidecar (was using wrong Dockerfile) - Remove GOVERNANCE_PROXY claim (OpenClaw doesn't natively read this) - Replace fictional SLO/Grafana sections with real /api/v1/metrics docs - Add Roadmap section listing what's planned vs shipped openshell.md: - Remove references to non-existent shell scripts - Fix python -m agentmesh.server to python -m agent_os.server - Add note that sidecar doesn't transparently intercept (must call API) - 
Replace pip install agentmesh-platform with Python skill library usage Co-authored-by: Copilot <[email protected]> * fix(ci): fix Rust crate packaging — use workspace root with -p agentmesh cargo package in a workspace writes .crate files to the workspace root's target/package/, not the individual crate's directory. The pipeline was running from the crate subdirectory and couldn't find the output. Fix: change workingDirectory from packages/agent-mesh/sdks/rust/agentmesh to packages/agent-mesh/sdks/rust (workspace root) and add -p agentmesh to all cargo commands to target the specific crate. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * docs(adr): ADR 0005 — Liveness attestation extension for TrustHandshake (microsoft#948) Proposes liveness attestation as opt-in gate for TrustHandshake. Addresses ghost-agent and ungraceful-handoff gaps from microsoft#772. Co-authored-by: kevinkaylie <[email protected]> * blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall (microsoft#899) Co-authored-by: aymenhmaidiwastaken <[email protected]> * feat: add LotL prevention policy for security measures (microsoft#949) YAML policy template for Living-off-the-Land detection and prevention. * feat(examples): add ATR community security rules for PolicyEvaluator (microsoft#908) 15 curated ATR detection rules + sync script. Closes microsoft#901. 
* fix(docs): correct npm package name and stale version refs across 21 files (microsoft#960) - Fix @agentmesh/sdk → @microsoft/agentmesh-sdk in 13 markdown files (README, QUICKSTART, tutorials, SDK docs, i18n, changelog) - Fix broken demo path in agent-os README (agent-os/demo.py → demo/maf_governance_demo.py) - Remove stale v1.0.0 labels from extension status table - Bump AGT Version refs 3.0.2 → 3.1.0 in case study templates and ATF conformance assessment Co-authored-by: Copilot <[email protected]> * fix(ci): use ESRP Release for NuGet signing (microsoft#961) Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing (microsoft#962) * fix(ci): add missing packages to ESRP pipeline and fix Go version tag Three gaps found during publish verification: 1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix) 2. Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp) - Changed from single-crate to workspace build (--workspace) - Package loop builds both .crate files - Renamed artifact from 'rust-agentmesh' to 'rust-crates' 3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0) - Go module tags require semver with v prefix - Pipeline grep expects '// Version: v...' format Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget' ESRP Release rejected 'Nuget' with: 'The value provided for ReleaseContentType property is invalid.' ErrorCode 2254. ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(ci): add missing packages to ESRP pipeline and fix Go version tag (microsoft#963) * fix(ci): add missing packages to ESRP pipeline and fix Go version tag Three gaps found during publish verification: 1. PyPI: add agentmesh-marketplace (8th package, was missing from matrix) 2. 
Rust: build+publish both workspace crates (agentmesh + agentmesh-mcp) - Changed from single-crate to workspace build (--workspace) - Package loop builds both .crate files - Renamed artifact from 'rust-agentmesh' to 'rust-crates' 3. Go: add 'v' prefix to version in doc.go (3.1.0 → v3.1.0) - Go module tags require semver with v prefix - Pipeline grep expects '// Version: v...' format Co-authored-by: Copilot <[email protected]> * fix(ci): correct ESRP NuGet contenttype casing — 'NuGet' not 'Nuget' ESRP Release rejected 'Nuget' with: 'The value provided for ReleaseContentType property is invalid.' ErrorCode 2254. ESRP content types are case-sensitive. Fix: 'Nuget' -> 'NuGet'. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(ci): use EsrpCodeSigning + dotnet push for NuGet (microsoft#965) EsrpRelease@11 does not support NuGet as a contenttype — it's for PyPI/npm/Maven/crates.io package distribution. NuGet packages must be signed with EsrpCodeSigning@5 first, then pushed with dotnet nuget push. New flow: 1. EsrpCodeSigning@5 with NuGetSign + NuGetVerify operations (CP-401405) 2. dotnet nuget push with the signed .nupkg to nuget.org This matches the standard Microsoft NuGet ESRP signing pattern used by azure-sdk, dotnet runtime, and other Microsoft OSS projects. Co-authored-by: Copilot <[email protected]> * fix(security): upgrade axios to 1.15.0 - CVE-2026-40175, CVE-2025-62718 (microsoft#966) Critical S360 action items for SFI-ES5.2 1ES Open Source Vulnerabilities. CVE-2026-40175 (CVSS 9.9): Unrestricted Cloud Metadata Exfiltration via Header Injection Chain — prototype pollution gadget enables CRLF injection in HTTP headers, bypassing AWS IMDSv2 session tokens. CVE-2025-62718: NO_PROXY Bypass via Hostname Normalization — trailing dots and IPv6 literals skip NO_PROXY matching, enabling SSRF through attacker-controlled proxy. 
Upgraded in 3 packages: - extensions/copilot: 1.14.0 → 1.15.0 - extensions/cursor: 1.13.5 → 1.15.0 - agent-os-vscode: 1.13.6 → 1.15.0 Co-authored-by: Copilot <[email protected]> * fix(ci): resolve ESRP_DOMAIN_TENANT_ID cyclical reference (microsoft#967) The ADO variable ESRP_DOMAIN_TENANT_ID had a cyclical self-reference, preventing ESRP authentication across ALL publishing stages (PyPI, npm, NuGet, crates.io). Fix: Define MICROSOFT_TENANT_ID as a pipeline-level variable with the well-known Microsoft corporate tenant ID (72f988bf-..., same default used by ESRP Release action.yml). This is a public value, not a secret. Also: NuGet publishing requires Microsoft as co-owner of the package on NuGet.org. See https://aka.ms/Microsoft-NuGet-Compliance Co-authored-by: Copilot <[email protected]> * docs: sync audit redaction status and framing with current code - Update SOC2 mapping to reflect CredentialRedactor now redacts credential-like secrets before audit persistence (API keys, tokens, JWTs, connection strings, etc.). Remaining gap: non-credential PII (email, phone, addresses) not yet redacted in audit entries. - Replace 'kernel-level enforcement' with 'policy-layer enforcement' in README, OWASP compliance, and architecture overview to match the existing 'application-level governance' framing in README Security section and LIMITATIONS.md. - Qualify 10/10 OWASP coverage claim in COMPARISON.md with footnote clarifying this means mitigation components exist per risk category, not full elimination. - Update owasp-llm-top10-mapping.md LLM06 row for credential redaction. Addresses doc/code inconsistencies identified in external review. 
Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> * fix(lint): resolve agent-mesh lint errors in eu_ai_act.py (microsoft#1028) - Remove unused variable profiling_override (F841) - Remove f-string without placeholders (F541) - Fix whitespace in docstrings (W293) Co-authored-by: Copilot <[email protected]> * fix(ci): add path filters and concurrency; announce v3.1.0 release (microsoft#1039) CI optimization: - Add paths-ignore for docs to 5 code-only workflows - Add paths filter to Link Check (only run on docs changes) - Add concurrency groups to 7 heavy workflows - Docs-only PRs drop from ~14 checks to ~4 README: - Add v3.1.0 release announcement callout - Add PyPI version badge - Update tutorial count to 31 Co-authored-by: Copilot <[email protected]> * docs: add ADOPTERS.md and make deployment guides multi-cloud (microsoft#1040) - New ADOPTERS.md following Backstage/Flatcar pattern with Production, Evaluation, and Academic tables + instructions for adding your org - Rewrite docs/deployment/README.md from Azure-only to multi-cloud: Azure (AKS, Foundry, Container Apps), AWS (ECS/Fargate), GCP (GKE), Docker Compose, self-hosted. Updated architecture diagram to show cloud-agnostic deployment patterns. 
- Fix broken AWS/GCP links (pointed to non-existent paths) - README now links to 'Deployment Guides' (multi-cloud) instead of 'Azure Deployment' - README Contributing section invites adopters to add their org Co-authored-by: Copilot <[email protected]> * feat: add AGT Lite — zero-config governance in 3 lines, fix broken quickstart (microsoft#1044) Addresses the microsoft#1 developer experience criticism: AGT is too complex to start. New: agent_os.lite — lightweight governance module - govern() factory: one line to create a governance gate - check(action): one line to enforce — raises GovernanceViolation or returns True - check.is_allowed(action): non-raising bool version - Allow lists, deny lists, regex patterns, content filtering, rate limiting - Built-in audit trail and stats - Sub-millisecond evaluation (0.003ms avg, 1000 evals in <100ms) - Zero dependencies beyond stdlib (re, time, datetime) - 16 tests passing Fix: govern_in_60_seconds.py quickstart - BROKEN: was calling PolicyEvaluator.add_rules() which does not exist - FIXED: now uses agent_os.lite.govern() which actually works - Verified end-to-end: script runs and produces correct output The lite module is for developers who just want basic governance without learning PolicyEvaluator, YAML, OPA/Rego, trust mesh, etc. Upgrade to the full stack when you need it. 
Co-authored-by: Copilot <[email protected]> * feat(ci): enhance weekly security audit with 7 new scan jobs (microsoft#1051) Add comprehensive security checks based on issues found during the MSRC-111178 security audit and ongoing post-merge reviews: - Workflow security regression (MSRC-111178 pull_request_target check) - Expression injection scan (github.event.* in run: blocks) - Docker security (root containers, wildcard CORS, hardcoded passwords, 0.0.0.0 bindings) - XSS and unsafe DOM (innerHTML, eval, yaml.load, shell=True) - Action SHA pinning compliance - Version pinning (pyproject.toml upper bounds, Docker :latest tags, license field format) - Dependency confusion with --strict mode (pyproject.toml + package.json) - Retention days updated to 180 (EU AI Act Art. 26(6)) Co-authored-by: Copilot <[email protected]> * fix(ci): fix OpenShell integration CI — spelling, link check, policy validation (microsoft#1057) - Add OpenShell/NVIDIA terms to cspell dictionary (Landlock, seccomp, syscall, etc.) - Fix broken link: openclaw-skill -> openshell-skill in docs/integrations/openshell.md - Fix policy validation: replace starts_with (invalid) with matches + regex Co-authored-by: Copilot <[email protected]> * feat: add reversibility checker, trust calibration guide, and escalation/reversibility tests (microsoft#1061) Addresses critical review feedback: 1. Rollback/reversibility (agent_os.reversibility) - ReversibilityChecker: pre-execution assessment of action reversibility - 4 levels: fully_reversible, partially_reversible, irreversible, unknown - CompensatingAction: structured undo plans for each action type - Built-in rules for 12 common actions (write, deploy, delete, email, etc.) - block_irreversible mode for strict environments 2. 
Trust score calibration guide (docs/security/trust-score-calibration.md) - Score component weights (compliance 35%, task 25%, behavior 25%, identity 15%) - Decay functions with tier floors - Initial score assignments by agent origin - Threshold recommendations (conservative/moderate/permissive) - Anti-gaming measures and operational playbook 3. Tests: 19 passing (10 escalation + 9 reversibility) Co-authored-by: Copilot <[email protected]> * feat: deployment runtime (Docker/AKS) and shared trust core types (microsoft#1062) agent-runtime: Evolve from thin re-export shim to deployment runtime - DockerDeployer: container deployment with security hardening (cap-drop ALL, no-new-privileges, read-only rootfs) - KubernetesDeployer: AKS pod deployment with governance sidecars (runAsNonRoot, seccompProfile, resource limits) - GovernanceConfig: policy/trust/audit config injected as env vars - DeploymentTarget protocol for extensibility (ADC, nono, etc.) - 24 tests (all subprocess calls mocked) agent-mesh: Extract shared trust types into agentmesh.trust_types - TrustScore, AgentProfile, TrustRecord, TrustTracker - Canonical implementations replacing ~800 lines of duplicated code across 6+ integration packages - 25 tests covering clamping, scoring, history, capabilities Co-authored-by: Copilot <[email protected]> * feat(dotnet): add kill switch and lifecycle management to .NET SDK (microsoft#1065) - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(go): add MCP security, execution rings, and lifecycle management to Go SDK (microsoft#1066) - mcp.go: MCP security scanner detecting tool poisoning, typosquatting, hidden instructions (zero-width chars, homoglyphs), and rug pulls - rings.go: Execution privilege ring model 
(Admin/Standard/Restricted/Sandboxed) with default-deny access control - lifecycle.go: Eight-state agent lifecycle manager with validated transitions - Full test coverage for all three modules - Updated README with API docs and examples Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK (microsoft#1067) * feat(dotnet): add kill switch and lifecycle management to .NET SDK - Add KillSwitch with arm/disarm, event history, and subscriber notifications - Add LifecycleManager with 8-state machine and validated transitions - Add comprehensive xUnit tests for both components (26 tests) - Update .NET SDK README with usage documentation Co-authored-by: Copilot <[email protected]> * feat(rust): add execution rings and lifecycle management to Rust SDK Add two new modules to the agentmesh Rust crate: - rings.rs: Four-level execution privilege ring model (Admin/Standard/ Restricted/Sandboxed) with per-agent assignment and per-ring action permissions, ported from the Python hypervisor enforcer. - lifecycle.rs: Eight-state agent lifecycle manager (Provisioning through Decommissioned) with validated state transitions and event history, matching the lifecycle model used across other SDK languages. Both modules include comprehensive unit tests and are re-exported from the crate root. README updated with API tables and usage examples. Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix: align lotl_prevention_policy.yaml with PolicyDocument schema The policy file used an incompatible schema format (id, parameter, regex_match, effect) instead of the expected PolicyDocument fields (name, condition.field, operator, action). This caused the validate-policies CI check to fail for all PRs. 
Changes: - id → name - condition.parameter → condition.field - operator: regex_match → operator: matches - action at rule level (shell_exec/file_read) → action: deny - effect: DENY → removed (redundant with action: deny) - Added version, name, description, disclaimer at top level Co-authored-by: Copilot <[email protected]> * fix: resolve .NET ESRP signing issues blocking NuGet publish GitHub Actions (publish.yml): - Fix broken if-guards on signing steps: env.ESRP_AAD_ID was set in step-level env (invisible to if-expressions). Replace with job-level ESRP_CONFIGURED env derived from secrets. - Add missing ESRP_CERT_IDENTIFIER to signing step env blocks. - Gate the publish step on ESRP_CONFIGURED so unsigned packages are never pushed to NuGet.org under the Microsoft.* prefix. - Make stub signing steps fail-fast (exit 1) instead of silently succeeding, preventing unsigned packages from reaching NuGet push. ADO Pipeline (esrp-publish.yml): - Add UseDotNet@2 task to Publish_NuGet stage so dotnet nuget push has a guaranteed SDK version on the Windows agent. Co-authored-by: Copilot <[email protected]> * fix(docs): fix OpenClaw sidecar demo and add limitations callout (microsoft#1163) The docker-compose example in openclaw-sidecar.md was illustrative only and did not work — it referenced a non-existent OpenClaw image and lacked healthchecks. Users were hitting this and getting confused. 
Changes: - Add working demo at demo/openclaw-governed/ with docker-compose.yaml that builds and runs the governance sidecar from source - Replace the inline docker-compose in the doc with a link to the demo plus a clearly-labeled reference template for custom deployments - Add prominent WARNING callout listing known limitations (no native OpenClaw integration, no published images, explicit API required) - Remove stale orphaned curl snippet after the docker-compose block - Add healthcheck to docker-compose governance-sidecar service - Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest to a placeholder users must replace with their own image Co-authored-by: Copilot <[email protected]> * fix(docs): fix OpenClaw sidecar demo and add limitations callout (microsoft#1164) The docker-compose example in openclaw-sidecar.md was illustrative only and did not work — it referenced a non-existent OpenClaw image and lacked healthchecks. Users were hitting this and getting confused. Changes: - Add working demo at demo/openclaw-governed/ with docker-compose.yaml that builds and runs the governance sidecar from source - Replace the inline docker-compose in the doc with a link to the demo plus a clearly-labeled reference template for custom deployments - Add prominent WARNING callout listing known limitations (no native OpenClaw integration, no published images, explicit API required) - Remove stale orphaned curl snippet after the docker-compose block - Add healthcheck to docker-compose governance-sidecar service - Fix OpenClaw image reference from ghcr.io/openclaw/openclaw:latest to a placeholder users must replace with their own image Co-authored-by: Copilot <[email protected]> * fix(ci): use PME tenant ID for ESRP cert signing The ESRP signing cert lives in the PME (Partner Managed Engineering) tenant (975f013f), not the Microsoft corporate tenant (72f988bf). Using the wrong tenant ID causes ESRP signing to fail when looking up the cert. 
Co-authored-by: Copilot <[email protected]> * docs: Add Scaling AI Agents article to COMMUNITY.md (microsoft#857) Co-authored-by: deepsearch <[email protected]> * Add runtime evidence mode to agt verify (microsoft#969) * Track agt verify evidence plan * Add runtime evidence mode to agt verify * Add runtime evidence verifier tests * Add CLI tests for agt verify evidence mode * Document evidence mode for compliance verification * Remove local implementation notes * Document agt verify evidence mode * Harden evidence path handling in verify --------- Co-authored-by: T. Smith <[email protected]> * docs: add Entra Agent ID bridge tutorial with R&R matrix and DID fix - Add Tutorial 31: Bridging AGT Identity with Microsoft Entra Agent ID - Detailed roles & responsibilities between AGT and Entra/Agent365 - Architecture diagram showing the identity bridge - Step-by-step: DID creation, Entra binding, AKS workload identity, token validation, lifecycle sync, access verification - Known gaps and limitations table - Platform independence note (AWS, GCP, Okta patterns) - Fix DID prefix in .NET MCP gateway tests (did:agentmesh → did:mesh for consistency with Python reference implementation and .NET SDK) - Update tutorials README with Enterprise Identity section Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email protected]> Co-authored-by: Adamthereal <[email protected]> Co-authored-by: Jack Batzner <[email protected]> Co-authored-by: lawcontinue <[email protected]> Co-authored-by: deepsearch <[email protected]> Co-authored-by: ewmh <[email protected]> Co-authored-by: T. 
Smith <[email protected]> * docs: address external critic gaps in limitations and threat model (microsoft#11) Add three new sections to LIMITATIONS.md addressing gaps identified in public criticism and external security analysis: - §10 Physical AI and Embodied Agent Governance: documents that AGT governs software agents not physical actuators, with mitigations - §11 Streaming Data and Real-Time Assurance: documents that AGT evaluates per-action not continuously over data streams - §12 DID Method Inconsistency Across SDKs: documents the did:mesh vs did:agentmesh split with migration plan for v4.0 Update THREAT_MODEL.md residual risks to reference all three new limitation sections. Co-authored-by: Copilot <[email protected]> * fix!: standardize DID method to did:agentmesh across all SDKs (microsoft#12) * fix!: standardize DID method to did:agentmesh across all SDKs BREAKING CHANGE: All agent DIDs now use the did:agentmesh: prefix. The legacy did:mesh: prefix used by Python and .NET has been migrated to match the did:agentmesh: convention already used by TypeScript, Rust, and Go SDKs. Changes: - Python: agent_id.py, delegation.py, entra.py, all integrations - .NET: AgentIdentity.cs, Jwk.cs, GovernanceKernel.cs, all tests - Docs: README, tutorials, identity docs, FAQ, compliance docs - Tests: all test fixtures updated across Python, .NET, TS, VSCode - Version bump: 3.1.0 → 3.2.0 (.NET, Python agent-mesh, TypeScript) Migration: replace did:mesh: with did:agentmesh: in your policies, identity registries, and agent configurations. Co-authored-by: Copilot <[email protected]> * docs: add Q11-Q13 to FAQ — AGT scope, Agent 365, and DLP comparison Adds three new customer Q&As: - Q11: Is AGT for Foundry agents or any agent type? 
(any) - Q12: Relationship between AGT and Agent 365 (different layers) - Q13: How is AGT different from DLP/communication compliance (content vs action governance) Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> * fix(security): address all 14 open code scanning alerts (microsoft#13) * fix: address 6 Dependabot security vulnerabilities - python-multipart 0.0.22 → 0.0.26 (DoS via large preamble/epilogue) - pytest 8.4.1 → 9.0.3 (tmpdir handling vulnerability) - langchain-core 1.2.11 → 1.2.28 (SSRF, path traversal, f-string validation) - langchain-core >=0.2.0,<1.0 → >=1.2.28 in langchain-agentmesh pyproject.toml - tsup 8.0.0 → 8.5.1 (DOM clobbering vulnerability) - rand 0.8.5: dismissed microsoft#176 as inaccurate (vuln affects rand::rng() 0.9.x API only) Fixes Dependabot alerts: microsoft#177, microsoft#175, microsoft#166, microsoft#164, microsoft#157, microsoft#156 Dismissed: microsoft#176 (not applicable to rand 0.8.x) Co-authored-by: Copilot <[email protected]> * fix(security): address all 14 open code scanning alerts Scorecard HIGH: - publish-containers.yml: scope packages:write to job level (microsoft#316) Scorecard MEDIUM (pinned dependencies): - docs.yml: pin 4 GitHub Actions by SHA hash (microsoft#311-314) - docs.yml: use requirements.txt for pip install (microsoft#315) - agent-mesh Dockerfile: pin python:3.11-slim by SHA (microsoft#317,microsoft#318) - agent-os Dockerfile.sidecar: pin python:3.14-slim by SHA (microsoft#295,microsoft#296) - dashboard Dockerfile: pin python:3.12-slim by SHA (microsoft#291,microsoft#293) CodeQL: - test_time_decay.py: timedelta(days=365) -> 366 for leap safety (microsoft#289,microsoft#290) Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: kevinkaylie <[email protected]> Co-authored-by: Aymen Hmaidi <[email protected]> Co-authored-by: harshnair75567-cloud <[email 
protected]> Co-authored-by: Adamthereal <[email protected]> Co-authored-by: Jack Batzner <[email protected]> Co-authored-by: lawcontinue <[email protected]> Co-authored-by: deepsearch <[email protected]> Co-authored-by: ewmh <[email protected]> Co-authored-by: T. Smith <[email protected]>
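The commit log above asks users to replace `did:mesh:` with `did:agentmesh:` in policies, identity registries, and agent configurations. A minimal sketch of that rewrite over an already-loaded config follows. The helper names `migrate_did` and `migrate_config` and the sample registry are hypothetical and are not part of the AGT SDKs; only the two prefixes come from the commit message.

```python
from typing import Any

LEGACY_PREFIX = "did:mesh:"        # prefix named as legacy in the commit log
NEW_PREFIX = "did:agentmesh:"      # prefix the commit log standardizes on


def migrate_did(value: str) -> str:
    """Rewrite a legacy DID; return any other string unchanged."""
    if value.startswith(LEGACY_PREFIX):
        return NEW_PREFIX + value[len(LEGACY_PREFIX):]
    return value


def migrate_config(node: Any) -> Any:
    """Recursively migrate DIDs found in dicts (keys and values), lists, and strings."""
    if isinstance(node, str):
        return migrate_did(node)
    if isinstance(node, list):
        return [migrate_config(item) for item in node]
    if isinstance(node, dict):
        return {migrate_config(k): migrate_config(v) for k, v in node.items()}
    return node  # numbers, booleans, None, etc. pass through untouched


# Hypothetical registry used only to exercise the helpers.
registry = {
    "agents": ["did:mesh:agent-1", "did:agentmesh:agent-2"],
    "trust": {"did:mesh:root": 0.9},
}
print(migrate_config(registry))
```

Because already-migrated identifiers do not start with the legacy prefix, running the helper twice gives the same result as running it once, which makes it safe to apply to partially migrated files.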
Summary
Pre-announcement security hardening, CI fixes, and demo improvements addressing findings from security review.
Security Fixes
exus/dmz.py), previously marked `NOT SECURE - placeholder only`
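The review comments on this PR describe the replaced code as an XOR placeholder. The original function is not shown here, so the sketch below uses a generic repeating-key XOR (the name `xor_bytes`, the key, and the message are all hypothetical) to illustrate why such a placeholder deserves the `NOT SECURE` label: it offers no integrity protection, and a single known plaintext leaks the key.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Hypothetical repeating-key XOR 'cipher'; encrypting and decrypting are the same call."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"secret-key"
plaintext = b"amount=0100;to=alice"
ciphertext = xor_bytes(plaintext, key)

# Weakness 1: no integrity. An attacker who can guess part of the plaintext
# can flip ciphertext bits to change that part without knowing the key.
tampered = bytearray(ciphertext)
offset = plaintext.index(b"0100")
for i, (old, new) in enumerate(zip(b"0100", b"9999")):
    tampered[offset + i] ^= old ^ new
forged = xor_bytes(bytes(tampered), key)
print(forged)

# Weakness 2: one known plaintext/ciphertext pair reveals the key bytes.
recovered = bytes(c ^ p for c, p in zip(ciphertext, plaintext))[: len(key)]
print(recovered == key)
```

An authenticated mode such as AES-256-GCM addresses the first weakness by attaching an authentication tag that fails verification when the ciphertext is modified, provided each nonce is used only once per key.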
Demo Improvements
CI Fixes
Issue Triage