Commit 1e51e46

docs: foundation of SECURITY.md
1 parent 056040a commit 1e51e46

1 file changed

Lines changed: 133 additions & 0 deletions

SECURITY.md

@@ -0,0 +1,133 @@
# DryRun Security Overview

## What DryRun is

**Don't let AI touch your production database.** That is the whole
reason DryRun exists.

DryRun captures a PostgreSQL database's schema, planner statistics, and
activity counters into a local file (`.dryrun/history.db`). AI
assistants, developers, and tooling read that file instead of opening a
connection to production.

The single binary has two modes, and the security model follows that
split:

| Mode | Who runs it | Touches the database? | Needs credentials? |
|------|-------------|-----------------------|--------------------|
| **CLI** (`init`, `probe`, `dump-schema`, `drift`, `snapshot take/diff`, `stats apply`) | DBA / CI | Yes | Yes |
| **MCP + offline tools** (`mcp-serve`, `lint`, `import`, `snapshot list/push/pull`) | Agent / developer | No | No |

The captured file is the boundary. The CLI writes it against
production. Everything else only reads it. Production credentials stay
with the DBA.

## MCP and offline tools

This is the side an agent or teammate actually interacts with.

- No database connection, no credentials. Only reads
  `.dryrun/history.db` and snapshot files.
- SQL submitted by an agent through MCP is parsed locally with
  libpg_query and validated against the captured schema. It is never
  executed against a real database.
- An agent sees exactly what the snapshot contains, nothing more.

Rule of thumb: **if you would not hand a value to your AI assistant, it
must not be in the snapshot.**

## CLI, the side that touches production

The CLI is the only place real credentials live. Run it as a DBA, not
as an agent.

- **Prod-reading**: `init`, `probe`, `dump-schema`, `drift`,
  `snapshot take`, `snapshot diff`. Connect read-only. Use the included
  `dryrun-readonly-role.sql`. Never SUPERUSER, never the app role. Pass
  `DATABASE_URL` via the environment, not flags.
- **Local-writing**: `stats apply` mutates planner stats on a local or
  dev database so `EXPLAIN` matches production shape. Never point it at
  production.

Of the prod-reading commands, `init` and `snapshot take` write the
shared `history.db`. Both apply the masking policy in-process *before*
anything is written to disk. DryRun does not persist the connection
string. The snapshot records a logical `database_id`, not the URL.
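
One common reason to keep `DATABASE_URL` out of flags is that command-line arguments tend to be visible in process listings and shell history. The sketch below shows a wrapper that builds a capture invocation with the URL only in the child environment. The executable name `dryrun`, the helper `capture_invocation`, and the example URL are assumptions for illustration; none of them is specified by this document.

```python
import os


def capture_invocation(database_url, base_env=None):
    """Build argv and env for a capture run without putting the URL in argv.

    The executable name `dryrun` is an assumption; `snapshot take` is one
    of the prod-reading commands listed above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DATABASE_URL"] = database_url  # credentials travel via the environment
    argv = ["dryrun", "snapshot", "take"]  # no connection string on the command line
    return argv, env


argv, env = capture_invocation(
    "postgresql://dryrun_ro@db.example.internal/app",  # hypothetical read-only URL
    base_env={"PATH": "/usr/bin"},
)
print(argv)
```

A caller would then pass both values to `subprocess.run(argv, env=env, check=True)`.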

## Data masking

Masking runs **once**, at capture time, inside `init` or
`snapshot take`. The masked form is what lands in `history.db`. There is
no re-masking later, and `snapshot push`/`pull` move bytes without
transforming them.

Two consequences worth knowing:

- A missing or wrong policy at capture is a **permanent leak**.
  Recapture is the only fix. Old snapshots in shared storage stay leaky
  until you delete them.
- For projects with real PII, set `require_masks = true` in
  `dryrun.toml`. That turns "no policy resolved" into a hard error and
  refuses `--no-masks`.
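
The effect of `require_masks` described above can be pictured as a small gate that runs before capture. This is a sketch of the decision logic only, not DryRun's implementation: `resolve_masking`, `MaskingError`, and the dict-shaped policy are hypothetical, and the behavior without `require_masks` (capture proceeds unmasked) is inferred from the text rather than confirmed.

```python
class MaskingError(Exception):
    """Raised when capture must not proceed without a masking policy."""


def resolve_masking(policy, require_masks=False, no_masks=False):
    """Return the policy to apply at capture, or None for an unmasked capture.

    policy        -- the resolved masking policy (a dict here), or None
    require_masks -- mirrors `require_masks = true` in dryrun.toml
    no_masks      -- mirrors the `--no-masks` flag
    """
    if require_masks and no_masks:
        raise MaskingError("--no-masks is refused when require_masks is set")
    if no_masks:
        return None  # caller explicitly asked for an unmasked capture
    if policy is None:
        if require_masks:
            raise MaskingError("no masking policy resolved")
        return None  # assumed default: capture proceeds unmasked
    return policy


print(resolve_masking({"users.email": "hash"}, require_masks=True))
```

With `require_masks=True`, the two unsafe paths raise before anything could reach disk, which matters because a snapshot written without masks cannot be fixed afterwards.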

Independent backstop, always on: for `jsonb` columns, `most_common_vals`
and `most_common_freqs` are stripped at capture regardless of policy.
`histogram_bounds` is *not* auto-stripped. List sensitive jsonb columns
in the policy explicitly.
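
To make the backstop concrete, here is a minimal sketch of the stripping rule applied to per-column statistics. The three statistic names are `pg_stats` columns; the record layout, the `data_type` key, and the function `strip_jsonb_stats` are assumptions for illustration and do not reflect DryRun's internal format.

```python
def strip_jsonb_stats(column_stats):
    """Return copies of per-column stats with jsonb MCV data removed."""
    cleaned = []
    for col in column_stats:
        col = dict(col)  # copy so the input is left untouched
        if col.get("data_type") == "jsonb":
            col.pop("most_common_vals", None)
            col.pop("most_common_freqs", None)
            # histogram_bounds is deliberately left in place, as described above
        cleaned.append(col)
    return cleaned


stats = [
    {
        "column": "profile",  # hypothetical jsonb column
        "data_type": "jsonb",
        "most_common_vals": ['{"plan": "pro"}'],
        "most_common_freqs": [0.4],
        "histogram_bounds": ['{"plan": "free"}', '{"plan": "trial"}'],
    },
    {"column": "status", "data_type": "text", "most_common_vals": ["active"]},
]
for col in strip_jsonb_stats(stats):
    print(sorted(col))
```

The surviving `histogram_bounds` entry on the jsonb column is the gap the policy has to cover.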

Full workflow and examples: see [`MASKING-TUTORIAL.md`](MASKING-TUTORIAL.md).

## Telemetry

DryRun sends telemetry from the **MCP server only**. The CLI, capture,
and snapshot commands send nothing.

The MCP server emits two events per session: a `dryrun_start` when an
agent connects, and a `dryrun_summary` when the session ends. No
snapshot contents, no DDL, no SQL text, no database identifiers, no
column or table names.

Example `dryrun_start` payload:

```json
{
  "event": "dryrun_start",
  "session_id": "00000000-0000-0000-0000-000000000001",
  "app_version": "0.8.0",
  "timestamp": "2026-05-25T21:00:00Z",
  "transport": "stdio",
  "schema_loaded": true,
  "table_count": 12,
  "planner_loaded": true,
  "activity_nodes": 4
}
```
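
Anyone auditing outbound events, for example from a proxy log, could compare them against the fields shown in the example above. The sketch below assumes events are available as JSON text and that the example payload lists every `dryrun_start` field, which this document does not state outright; `unexpected_fields` is a hypothetical helper, not part of DryRun.

```python
import json

# Field names copied from the example `dryrun_start` payload above.
DOCUMENTED_START_FIELDS = {
    "event", "session_id", "app_version", "timestamp", "transport",
    "schema_loaded", "table_count", "planner_loaded", "activity_nodes",
}


def unexpected_fields(raw_event):
    """Return the sorted keys of a JSON event that are not documented."""
    payload = json.loads(raw_event)
    return sorted(set(payload) - DOCUMENTED_START_FIELDS)


sample = '{"event": "dryrun_start", "transport": "stdio", "table_count": 12}'
print(unexpected_fields(sample))
```

An empty list means the event carries nothing beyond the documented fields; any other result names the keys worth investigating.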

`dryrun_summary` adds session duration and a `{tool_name: count}` map
of which MCP tools the agent invoked. Tool names are the built-in
DryRun tool identifiers, not anything the agent typed.

**Opt out** by setting either of these in the MCP server's
environment:

```sh
export DO_NOT_TRACK=1        # industry-standard, also honored
export DRYRUN_TELEMETRY=off  # dryrun-specific
```

Or persist it in `dryrun.toml`:

```toml
[telemetry]
enabled = false
```

Any of these disables telemetry entirely. No event leaves the
process. The environment variables win over the config file, so CI runs
can stay silent without editing checked-in TOML.
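
The precedence described above can be sketched as a single check. This is an illustration of the stated rules, not DryRun's code: `telemetry_enabled` is a hypothetical function, the config is assumed to be `dryrun.toml` already parsed into nested dicts, and only the documented values `1` and `off` are treated as opt-outs.

```python
def telemetry_enabled(env, config):
    """Return True only if no opt-out is present.

    env    -- mapping of environment variables
    config -- parsed dryrun.toml as nested dicts
    """
    # Environment opt-outs are checked first, so they win over the file.
    if env.get("DO_NOT_TRACK") == "1":
        return False
    if env.get("DRYRUN_TELEMETRY") == "off":
        return False
    # Fall back to `[telemetry] enabled`; on by default since this is opt-out.
    return bool(config.get("telemetry", {}).get("enabled", True))


# CI sets the variable; the checked-in TOML still says enabled = true.
print(telemetry_enabled({"DO_NOT_TRACK": "1"}, {"telemetry": {"enabled": True}}))
```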

## Related documents

- [`MASKING-TUTORIAL.md`](MASKING-TUTORIAL.md): masking policy
  workflow.
- `dryrun-readonly-role.sql`: minimum-privilege role for capture.
