Small clarification after looking more closely at the current repo state: I see that
The distinction I'm trying to explore here is narrower: not "should dangerous actions ask for confirmation?", but "can the confirmation be bound to the exact mutation payload that will later be executed?"
In other words, I see the existing confirmation rules as a strong foundation. The gap I'm proposing to discuss is whether high-risk mutation tools should optionally support a plan/challenge/execute lifecycle where:
So this is meant as a complement to the existing elicitation/confirmation system, not a claim that the repo has no HITL support today.
Hi!
I've been building a 'production-ready' gateway for AI-assisted Kubernetes operations (Kubernetes-MCP-Guard) and I'd like to raise a security concern and propose a pattern that I think deserves standardization across MCP infrastructure servers.
The Problem: Consent Fatigue and TOCTOU-Style Gaps in Simple Approval Flows
Current approaches to human approval for AI-assisted mutations often fall into one of three patterns, each with its own gap:
Boolean prompts ("Do you want me to restart nginx-prod? yes/no"): the human approves based on a description the AI wrote, not the actual payload. A compromised or prompt-injected AI can describe one thing and send another. This creates a Time-of-Check to Time-of-Use (TOCTOU)-style gap.
UI takeover (for example, AWS Nova Act): designed for browser UI automation. When applied to structured API mutations, the approved payload may not be cryptographically bound between the human's approval click and the actual API call. This is a useful contrast, but not a mutation approval protocol.
Stateful workflow engines (for example, Oracle Integration Cloud HITL, BPEL): these can address TOCTOU-style issues when designed with immutable plan storage and hash binding, but they require adopting a heavyweight workflow platform. That is not always a good fit for teams running open-source Kubernetes tooling.
The existing --read-only flags and RBAC in kubernetes-mcp-server are excellent safeguards, but they do not solve the case where an AI-assisted workflow legitimately has write access, a human has approved a specific operation, and then the payload drifts before execution.
The Proposed Solution: Plans, Challenges, and Hashes
In Kubernetes-MCP-Guard I've implemented a pattern I'm calling Plan-Challenge-Execute. Here's how it works:
1. Plan Generation, Not Execution
The AI calls a planning/request tool such as request_apply_manifest or request_restart_deployment.
Instead of mutating the cluster, the server:
hostPath, hostNetwork, dangerous capabilities, and similar risky patterns
No cluster mutation happens during this phase.
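To make the planning phase concrete, here is a minimal sketch of how a server might record a pending plan and derive a digest from it. It is not taken from Kubernetes-MCP-Guard: the in-memory PENDING_PLANS store, the canonical_digest helper, the payload field names, and the choice of SHA-256 over sorted-key JSON are all assumptions made for illustration.

```python
import hashlib
import json
import uuid

# Hypothetical in-memory store; a real gateway would persist plans durably.
PENDING_PLANS: dict[str, dict] = {}


def canonical_digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def request_restart_deployment(namespace: str, name: str, requester_sub: str) -> dict:
    """Record a pending plan and return its ID and digest. Never touches the cluster."""
    # Assumed payload shape; the real tool's schema may differ.
    payload = {"action": "restart_deployment", "namespace": namespace, "name": name}
    plan_id = str(uuid.uuid4())
    digest = canonical_digest(payload)
    PENDING_PLANS[plan_id] = {
        "payload": payload,
        "digest": digest,
        "requester_sub": requester_sub,
        "status": "pending",
    }
    return {"planId": plan_id, "digest": digest}
```

Canonicalizing before hashing matters here: two encodings of the same payload that differ only in key order would otherwise produce different digests and cause spurious mismatches later.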
2. Out-of-Band Approval
The AI/client then requests approval for the plan.
In my implementation, the same apply_approved_plan(planId) tool is stateful: before approval it returns an approval URL; after approval it executes the approved plan. For a portable contract, this could also be modeled as a separate create_approval_challenge(planId) phase.
The gateway:
The human must open the URL in a browser session authenticated through the gateway's own out-of-band OAuth flow, independent of the AI client's session.
The browser renders the exact plan being approved:
The human then clicks Approve or Reject.
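The approval phase described above can be sketched roughly as follows. This is an illustrative model only: the Challenge fields, the decide function, the returned status strings, the 300-second default lifetime, and the gateway.example.invalid host are assumptions, not the gateway's actual API.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Challenge:
    challenge_id: str
    plan_id: str
    plan_digest: str      # digest of the plan as rendered to the human
    requester_sub: str    # subject that requested the plan
    expires_at: float     # Unix timestamp after which approval is refused
    status: str = "pending"


def create_approval_challenge(plan_id: str, plan_digest: str, requester_sub: str,
                              ttl_seconds: float = 300.0,
                              now: Optional[float] = None) -> Challenge:
    """Create a time-bounded ticket bound to one plan digest and one subject."""
    issued = time.time() if now is None else now
    return Challenge(secrets.token_urlsafe(32), plan_id, plan_digest,
                     requester_sub, issued + ttl_seconds)


def approval_url(challenge: Challenge,
                 base_url: str = "https://gateway.example.invalid") -> str:
    """URL the human opens in a browser session that is separate from the AI client."""
    return f"{base_url}/approvals/{challenge.challenge_id}"


def decide(challenge: Challenge, approver_sub: str, approve: bool,
           now: Optional[float] = None) -> str:
    """Apply the human's decision; returns the new status or a refusal reason."""
    current = time.time() if now is None else now
    if challenge.status != "pending":
        return "challenge_already_decided"
    if current >= challenge.expires_at:
        return "challenge_expired"
    if approver_sub != challenge.requester_sub:  # same-subject mode
        return "wrong_subject"
    challenge.status = "approved" if approve else "rejected"
    return challenge.status
```

The unguessable challenge ID keeps the approval URL from being enumerated, while the subject check is what stops a different authenticated user from approving someone else's plan.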
3. Hash-Bound Execution Gate
The AI/client calls the execution tool again after approval.
Before mutating Kubernetes, the server:
sub matches the requesting sub in same-subject mode
If the pending plan was tampered with between challenge creation and execution, the digest comparison fails and the operation is refused with an error such as approval_hash_mismatch.
The AI/client cannot cause a different stored payload to be executed without invalidating the approval.
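A rough sketch of such a gate is shown below, under stated assumptions. Only approval_hash_mismatch comes from the description above; the other error strings, the dictionary field names, the order of the checks, and the in-memory applied_ids set (standing in for a persistent applied marker) are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def execution_gate(plan: dict, challenge: dict, requester_sub: str,
                   now: float, applied_ids: set) -> str:
    """Return "ok" only if every check passes; otherwise return a refusal code."""
    if plan["planId"] in applied_ids:
        return "plan_already_applied"        # single-use: no replay
    if challenge["status"] != "approved":
        return "approval_missing"
    if now >= challenge["expiresAt"]:
        return "approval_expired"
    if challenge["approverSub"] != requester_sub:
        return "approval_subject_mismatch"   # same-subject mode
    # Recompute the digest from the stored payload instead of trusting a cached value.
    recomputed = canonical_digest(plan["payload"])
    if not hmac.compare_digest(recomputed, challenge["planDigest"]):
        return "approval_hash_mismatch"
    applied_ids.add(plan["planId"])          # mark as applied before mutating
    return "ok"
```

Recomputing the digest at execution time is the step that closes the gap: an approval obtained for one payload cannot authorize a payload that was modified afterwards.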
Key Security Properties
sub must match requester's sub in same-subject mode
applied/<planId>.json marker blocks a second execution
audit.jsonl with typed payloads
Proof: SafetyE2E Coverage Against Real Keycloak + Kubernetes Paths
These properties are not just documented; they are covered by targeted SafetyE2E regression tests that run against a real Keycloak instance via Testcontainers and a real Kubernetes cluster:
PlanHashMismatchTests
ModifiedPendingPlanTests
WrongUserApprovalTests.ApproveChallengeEndpoint_ByDifferentSubject_IsRefused
ExpiredApprovalTests
AlreadyAppliedPlanTests
DryRunFailureTests
DangerousManifestTests
FullApprovalFlowTests.RestartDeployment_ApprovedThroughBrowser_AppliesExactPlanAndAudits
RbacMatrixTests
The main branch includes SafetyE2E coverage for digest mismatch, modified pending plans, expired approvals, wrong-user approvals, already-applied plans, dangerous manifests, dry-run failure, and RBAC boundaries.
The suite exercises real gateway, MCP, and Kubernetes paths and uses real Keycloak JWTs for MCP bearer authentication. Browser approval OAuth is simulated at the callback/backchannel boundary in tests, with separate service-level coverage for real-JWT wrong-user rejection.
A GitHub Actions workflow (safety-e2e.yml) runs the suite against an ephemeral KinD cluster on demand.
Reference implementation and tests:
Why This Needs a Portable Standard
MCP already has useful primitives:
readOnlyHint and destructiveHint
Those are good foundations.
What MCP does not currently standardize is a portable mutation-approval profile:
planId
Servers must implement these details themselves today.
Your repository is one of the most visible Kubernetes/OpenShift MCP implementations, so it seems like a good place to incubate this pattern. Without a shared profile, infrastructure MCP servers may diverge:
needs_approval flag, a URL, a digest, or nothing
destructiveHint annotations for real cluster mutations
I'm proposing an optional MCP mutation-approval profile that defines a minimal contract for structured mutation approval:
plan phase that returns a plan ID and digest, not a mutation
challenge phase that creates a time-bounded, identity-bound approval ticket
execute phase that verifies both the digest and the challenge before mutating
This does not require adoption of my implementation. It requires agreement on the interface contract so that MCP clients and security reviewers can reason about approval state portably.
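As a starting point for discussion, the three phases could be described by result shapes along these lines. Every field name here (planId, digest, approvalUrl, expiresAt, status, error) and the same_plan helper are assumptions for illustration; nothing below is defined by MCP or by this repository today.

```python
from typing import Literal, Optional, TypedDict


class PlanResult(TypedDict):
    """Returned by the plan phase: identifies the plan, performs no mutation."""
    planId: str
    digest: str


class ChallengeResult(TypedDict):
    """Returned by the challenge phase: a time-bounded, identity-bound ticket."""
    planId: str
    digest: str
    approvalUrl: str
    expiresAt: str  # assumed to be an RFC 3339 timestamp


class ExecuteResult(TypedDict):
    """Returned by the execute phase after digest and challenge verification."""
    planId: str
    digest: str
    status: Literal["applied", "refused"]
    error: Optional[str]  # for example "approval_hash_mismatch" when refused


def same_plan(plan: PlanResult, challenge: ChallengeResult,
              result: ExecuteResult) -> bool:
    """Client-side sanity check: all three phases must refer to one plan and digest."""
    ids = {plan["planId"], challenge["planId"], result["planId"]}
    digests = {plan["digest"], challenge["digest"], result["digest"]}
    return len(ids) == 1 and len(digests) == 1
```

Carrying the plan ID and digest through every phase would let a client or security reviewer confirm, without server internals, that what was approved is what was reported as applied.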
Threat Model
This profile treats the mutation server or gateway as the trusted policy enforcement point.
The goal is to prevent the model, MCP client, or intermediate workflow from substituting, replaying, or executing a different payload after human review.
This profile is intended to protect against:
This profile is not intended to prove correctness of a malicious server implementation. A malicious server could still render one plan and execute another. That is outside the trust boundary of this pattern.
The Ask
propose → challenge → execute tool contract for this repository or as an MCP mutation-approval profile?
For reference, the architecture rationale is documented here:
Short version:
Thanks for the great work on this project.
-- @mirusser