Python
Agent-First Data (AFDATA) — Suffix-driven output formatting and protocol templates for AI agents.
The field name is the schema. An agent reads latency_ms and knows the value is in milliseconds; it reads api_key_secret and knows to redact it. No external schema is needed.
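To make the idea concrete, here is a minimal sketch of how a reader (human or agent) might infer meaning from a field name alone. The `SUFFIX_MEANINGS` table and `describe_field` function are hypothetical names invented for this illustration; they are not part of the package, and the authoritative suffix rules are in the AFDATA spec.

```python
# Hypothetical suffix table for illustration only; not the library's own data.
SUFFIX_MEANINGS = {
    "_epoch_ms": "timestamp, Unix milliseconds",
    "_ms": "duration, milliseconds",
    "_bytes": "size, bytes",
    "_secret": "sensitive, redact before output",
}


def describe_field(name: str) -> str:
    """Return what can be inferred from the field name alone."""
    lowered = name.lower()
    # Check longer suffixes first so "_epoch_ms" wins over "_ms".
    for suffix in sorted(SUFFIX_MEANINGS, key=len, reverse=True):
        if lowered.endswith(suffix):
            return SUFFIX_MEANINGS[suffix]
    return "no unit implied"


print(describe_field("latency_ms"))      # duration, milliseconds
print(describe_field("api_key_secret"))  # sensitive, redact before output
```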
Installation
pip install agent-first-data
Quick Example
A backup tool invoked from the CLI — flags, env vars, and config all use the same suffixes:
API_KEY_SECRET=sk-1234 cloudback --timeout-s 30 --max-file-size-bytes 10737418240 /data/backup.tar.gz
For CLI diagnostics, enable log categories explicitly:
--log startup,request,progress,retry,redirect
--verbose # shorthand for all categories
Without these flags, startup diagnostics should stay off by default.
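One way a tool could wire these flags up with the standard library's argparse is sketched below. The function name `parse_log_flags` is hypothetical, and treating `--verbose` as "every category listed above" is an assumption drawn from the comment on that flag; the package's own helpers are described under CLI Helpers.

```python
import argparse

# Category names taken from the --log example above.
ALL_CATEGORIES = ["startup", "request", "progress", "retry", "redirect"]


def parse_log_flags(argv: list[str]) -> list[str]:
    """Return the enabled log categories; an empty list keeps diagnostics off."""
    parser = argparse.ArgumentParser(prog="cloudback")
    parser.add_argument("--log", default="")
    parser.add_argument("--verbose", action="store_true")
    ns = parser.parse_args(argv)
    if ns.verbose:
        # --verbose is shorthand for all categories.
        return list(ALL_CATEGORIES)
    return [c for c in ns.log.split(",") if c]


print(parse_log_flags([]))                          # []
print(parse_log_flags(["--log", "startup,retry"]))  # ['startup', 'retry']
```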
The tool reads env vars, flags, and config — all with AFDATA suffixes — and can emit a startup diagnostic event:
from agent_first_data import *
import os
startup = build_json(
"log",
{
"event": "startup",
"config": {"timeout_s": 30, "max_file_size_bytes": 10737418240},
"args": {"input_path": "/data/backup.tar.gz"},
"env": {"API_KEY_SECRET": os.environ.get("API_KEY_SECRET")},
},
trace=None,
)
Three output formats, same data:
JSON:
{"code":"log","event":"startup","args":{"input_path":"/data/backup.tar.gz"},"config":{"max_file_size_bytes":10737418240,"timeout_s":30},"env":{"API_KEY_SECRET":"***"}}
YAML:
code: "log"
event: "startup"
args:
  input_path: "/data/backup.tar.gz"
config:
  max_file_size: "10.0GB"
  timeout: "30s"
env:
  API_KEY: "***"
Plain:
args.input_path=/data/backup.tar.gz code=log event=startup config.max_file_size=10.0GB config.timeout=30s env.API_KEY=***
--timeout-s → timeout_s → timeout: 30s. API_KEY_SECRET → API_KEY: "***". The suffix is the schema.
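The chain from flag to field to display key can be sketched in a few lines. Both helper names (`flag_to_field`, `display_pair`) are hypothetical and cover only the two suffixes shown in this example; the library's real formatting rules are broader.

```python
def flag_to_field(flag: str) -> str:
    """Convert a CLI flag such as '--timeout-s' to the field name 'timeout_s'."""
    return flag.lstrip("-").replace("-", "_")


def display_pair(field: str, value) -> tuple[str, str]:
    """Strip a known suffix from the key and format the value for display."""
    lowered = field.lower()
    if lowered.endswith("_secret"):
        # Secrets keep their key stem but never show their value.
        return field[: -len("_secret")], "***"
    if lowered.endswith("_s"):
        return field[: -len("_s")], f"{value}s"
    return field, str(value)


field = flag_to_field("--timeout-s")
print(field)                                      # timeout_s
print(display_pair(field, 30))                    # ('timeout', '30s')
print(display_pair("API_KEY_SECRET", "sk-1234"))  # ('API_KEY', '***')
```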
API Reference
Total: 15 public APIs and 2 types + AFDATA logging (3 protocol builders + 2 redacted value helpers + 4 output functions + 1 internal + 1 utility + 4 CLI helpers + OutputFormat + RedactionPolicy)
Protocol Builders (returns dict)
Build AFDATA protocol structures. Return dict objects for transport payloads.
# Success (result)
build_json_ok(result: Any, trace: Any = None) -> dict
# Error (simple message, optional hint)
build_json_error(message: str, hint: str = None, trace: Any = None) -> dict
# Generic (any code + fields)
build_json(code: str, fields: Any, trace: Any = None) -> dict
Redacted Values (returns Any)
Use these before raw HTTP/MCP/SSE serializers that do not call output_json.
redacted_value(value: Any) -> Any
redacted_value_with(value: Any, redaction_policy: RedactionPolicy) -> Any
Use case: structured protocol payloads that the framework serializes automatically.
Example:
from agent_first_data import *
# Startup
startup = build_json(
"log",
{
"event": "startup",
"config": {"api_key_secret": "sk-123", "timeout_s": 30},
"args": {"config_path": "config.yml"},
"env": {"RUST_LOG": "info"},
},
trace=None,
)
# Success (always include trace)
response = build_json_ok(
{"user_id": 123},
trace={"duration_ms": 150, "source": "db"},
)
# Error
err = build_json_error("user not found", trace={"duration_ms": 5})
# Error with hint
err_hint = build_json_error("wallet not found", hint="list wallets with: afpay wallet list", trace={"duration_ms": 5})
# Specific error code
not_found = build_json(
"not_found",
{"resource": "user", "id": 123},
trace={"duration_ms": 8},
)
CLI/Log Output (returns str)
Format values for CLI output and logs. output_json uses full _secret redaction by default. output_json_with supports explicit scoped policies. YAML and Plain always redact _secret and apply human-readable formatting.
output_json(value: Any) -> str # Single-line JSON, original keys, for programs/logs
output_json_with(value: Any, redaction_policy: RedactionPolicy) -> str
output_yaml(value: Any) -> str # Multi-line YAML, keys stripped, values formatted
output_plain(value: Any) -> str # Single-line logfmt, keys stripped, values formatted
class RedactionPolicy(enum.Enum):
    RedactionTraceOnly = "RedactionTraceOnly"
    RedactionNone = "RedactionNone"
    RedactionStrict = "RedactionStrict"
Example:
from agent_first_data import *
data = {
"user_id": 123,
"api_key_secret": "sk-1234567890abcdef",
"created_at_epoch_ms": 1738886400000,
"file_size_bytes": 5242880,
}
# JSON (secrets redacted, original keys, raw values)
print(output_json(data))
# {"api_key_secret":"***","created_at_epoch_ms":1738886400000,"file_size_bytes":5242880,"user_id":123}
# YAML (keys stripped, values formatted, secrets redacted)
print(output_yaml(data))
# ---
# api_key: "***"
# created_at: "2025-02-07T00:00:00.000Z"
# file_size: "5.0MB"
# user_id: 123
# Plain logfmt (keys stripped, values formatted, secrets redacted)
print(output_plain(data))
# api_key=*** created_at=2025-02-07T00:00:00.000Z file_size=5.0MB user_id=123
Internal Tools
internal_redact_secrets(value: Any) -> None # Manually redact secrets in-place
Most users don’t need this. Output functions automatically protect secrets.
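For readers curious what in-place redaction involves, the following is a minimal sketch, not the library's implementation. The name `redact_in_place` is hypothetical, and the case-insensitive suffix match is an assumption based on API_KEY_SECRET being redacted in the Quick Example.

```python
from typing import Any


def redact_in_place(value: Any) -> None:
    """Walk dicts and lists, replacing any '*_secret' value with '***'."""
    if isinstance(value, dict):
        for key in value:
            if isinstance(key, str) and key.lower().endswith("_secret"):
                value[key] = "***"
            else:
                # Recurse into nested containers; scalars are left untouched.
                redact_in_place(value[key])
    elif isinstance(value, list):
        for item in value:
            redact_in_place(item)


payload = {"config": {"api_key_secret": "sk-123", "timeout_s": 30}}
redact_in_place(payload)
print(payload)  # {'config': {'api_key_secret': '***', 'timeout_s': 30}}
```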
Utility Functions
parse_size(s: str) -> int | None # Parse "10M" → bytes
Returns None for invalid, negative, or overflow input.
Example:
from agent_first_data import *
assert parse_size("10M") == 10485760
assert parse_size("1.5K") == 1536
assert parse_size("512") == 512
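The sketch below shows one way such a parser could behave. It is not the library's code: `parse_size_sketch` is a hypothetical name, the binary multipliers are inferred from "10M" mapping to 10485760, and the 64-bit overflow bound is an assumption.

```python
from typing import Optional

# Assumption: binary multipliers, consistent with "10M" == 10485760 above.
_UNITS = {"": 1, "B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
_MAX_BYTES = 2**64 - 1  # assumed overflow bound


def parse_size_sketch(s: str) -> Optional[int]:
    """Parse strings like '10M' or '1.5K' into bytes; None on bad input."""
    text = s.strip().upper()
    number, unit = text, ""
    if text and text[-1].isalpha():
        number, unit = text[:-1], text[-1]
    if unit not in _UNITS:
        return None
    try:
        quantity = float(number)
    except ValueError:
        return None
    if quantity < 0:
        return None
    total = quantity * _UNITS[unit]
    if total > _MAX_BYTES:
        return None
    return int(total)


print(parse_size_sketch("10M"))   # 10485760
print(parse_size_sketch("-1"))    # None
```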
CLI Helpers (for tools built on AFDATA)
Shared helpers that prevent flag-parsing drift between CLI tools. Use these instead of reimplementing --output and --log handling in each tool.
class OutputFormat(enum.Enum): # JSON="json", YAML="yaml", PLAIN="plain"
cli_parse_output(s: str) -> OutputFormat # Parse --output flag; raises ValueError on unknown
cli_parse_log_filters(entries: list[str]) -> list[str] # Normalize --log: trim, lowercase, dedup, remove empty
cli_output(value: Any, format: OutputFormat) -> str # Dispatch to output_json/yaml/plain
build_cli_error(message: str, hint: str = None) -> dict # {code:"error", error_code:"invalid_request", hint?, retryable:False, trace:{duration_ms:0}}
Canonical pattern — parse all flags before doing work, emit JSONL errors to stdout:
import sys
from agent_first_data import (
OutputFormat, cli_parse_output, cli_parse_log_filters,
cli_output, build_cli_error, output_json,
)
try:
    fmt = cli_parse_output(args.output)
except ValueError as e:
    print(output_json(build_cli_error(str(e))))
    sys.exit(2)
log = cli_parse_log_filters(args.log.split(",") if args.log else [])
# ... do work ...
print(cli_output(result, fmt))
See examples/agent_cli.py for the complete working example (pytest examples/agent_cli.py).
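The normalization that cli_parse_log_filters is described as performing (trim, lowercase, dedup, remove empty) could look roughly like this. `normalize_log_filters` is a hypothetical stand-in written for illustration, and keeping first-seen order is an assumption.

```python
def normalize_log_filters(entries: list[str]) -> list[str]:
    """Trim, lowercase, drop empties, and de-duplicate while keeping order."""
    seen: set[str] = set()
    result: list[str] = []
    for entry in entries:
        name = entry.strip().lower()
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


print(normalize_log_filters([" Startup", "retry", "", "STARTUP"]))
# ['startup', 'retry']
```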
Usage Examples
Example 1: REST API
from agent_first_data import *
from fastapi import FastAPI
app = FastAPI()
@app.get("/users/{user_id}")
async def get_user(user_id: int):
    response = build_json_ok(
        {"user_id": user_id, "name": "alice"},
        trace={"duration_ms": 150, "source": "db"},
    )
    # API returns raw JSON: no output processing, no key stripping
    return response
Example 2: CLI Tool (Complete Lifecycle)
from agent_first_data import *
# 1. Startup
startup = build_json(
"log",
{
"event": "startup",
"config": {"api_key_secret": "sk-sensitive-key", "timeout_s": 30},
"args": {"input_path": "data.json"},
"env": {"RUST_LOG": "info"},
},
trace=None,
)
print(output_yaml(startup))
# ---
# code: "log"
# event: "startup"
# args:
#   input_path: "data.json"
# config:
#   api_key: "***"
#   timeout: "30s"
# env:
#   RUST_LOG: "info"
# 2. Progress
progress = build_json(
"progress",
{"current": 3, "total": 10, "message": "processing"},
trace={"duration_ms": 1500},
)
print(output_plain(progress))
# code=progress current=3 message=processing total=10 trace.duration=1.5s
# 3. Result
result = build_json_ok(
{
"records_processed": 10,
"file_size_bytes": 5242880,
"created_at_epoch_ms": 1738886400000,
},
trace={"duration_ms": 3500, "source": "file"},
)
print(output_yaml(result))
# ---
# code: "ok"
# result:
#   created_at: "2025-02-07T00:00:00.000Z"
#   file_size: "5.0MB"
#   records_processed: 10
# trace:
#   duration: "3.5s"
#   source: "file"
Example 3: JSONL Output
from agent_first_data import *
result = build_json_ok(
{"status": "success"},
trace={"duration_ms": 250, "api_key_secret": "sk-123"},
)
# Print JSONL to stdout (secrets redacted, one JSON object per line)
# Channel policy: machine-readable protocol/log events must not use stderr.
print(output_json(result))
# {"code":"ok","result":{"status":"success"},"trace":{"api_key_secret":"***","duration_ms":250}}
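The JSONL convention itself (one compact JSON object per line, written to stdout) can be reproduced with the standard library alone. In this sketch `write_jsonl` is a hypothetical helper, sorting keys is an assumption chosen for deterministic output, and no redaction is performed, so it should only be given data that is already safe to print.

```python
import io
import json
import sys


def write_jsonl(event: dict, stream=None) -> None:
    """Write one compact JSON object followed by a newline (no redaction here)."""
    out = stream if stream is not None else sys.stdout
    line = json.dumps(event, sort_keys=True, separators=(",", ":"))
    out.write(line + "\n")


buffer = io.StringIO()
write_jsonl({"trace": {"duration_ms": 250}, "code": "ok", "result": {"status": "success"}}, buffer)
print(buffer.getvalue(), end="")
# {"code":"ok","result":{"status":"success"},"trace":{"duration_ms":250}}
```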
Complete Suffix Example
from agent_first_data import *
data = {
"created_at_epoch_ms": 1738886400000,
"request_timeout_ms": 5000,
"cache_ttl_s": 3600,
"file_size_bytes": 5242880,
"payment_msats": 50000000,
"price_usd_cents": 9999,
"success_rate_percent": 95.5,
"api_key_secret": "sk-1234567890abcdef",
"user_name": "alice",
"count": 42,
}
# YAML output (keys stripped, values formatted, secrets redacted)
print(output_yaml(data))
# ---
# api_key: "***"
# cache_ttl: "3600s"
# count: 42
# created_at: "2025-02-07T00:00:00.000Z"
# file_size: "5.0MB"
# payment: "50000000msats"
# price: "$99.99"
# request_timeout: "5.0s"
# success_rate: "95.5%"
# user_name: "alice"
# Plain logfmt output (same transformations, single line)
print(output_plain(data))
# api_key=*** cache_ttl=3600s count=42 created_at=2025-02-07T00:00:00.000Z file_size=5.0MB payment=50000000msats price=$99.99 request_timeout=5.0s success_rate=95.5% user_name=alice
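To show how suffix-driven formatting might produce values like those above, here is a rough sketch covering five suffixes. Every function name is hypothetical, the rounding and unit thresholds are assumptions chosen to reproduce the sample output, and the library may differ for values not shown in this README (for example durations under one second).

```python
from datetime import datetime, timezone


def format_bytes(n: int) -> str:
    """Scale by 1024 with one decimal place, e.g. 5242880 -> '5.0MB'."""
    value = float(n)
    for unit in ("B", "KB", "MB", "GB"):
        if value < 1024:
            return f"{value:.1f}{unit}"
        value /= 1024
    return f"{value:.1f}TB"


def format_ms(ms: int) -> str:
    """Assumption: show seconds from 1000 ms upward, raw milliseconds below."""
    return f"{ms / 1000:.1f}s" if ms >= 1000 else f"{ms}ms"


def format_epoch_ms(ms: int) -> str:
    """Render Unix milliseconds as an RFC 3339 UTC timestamp."""
    moment = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{ms % 1000:03d}Z"


def format_usd_cents(cents: int) -> str:
    """Assumes a non-negative integer number of cents."""
    return f"${cents // 100}.{cents % 100:02d}"


# Longer suffixes come first so "_epoch_ms" is matched before "_ms".
_FORMATTERS = [
    ("_epoch_ms", format_epoch_ms),
    ("_usd_cents", format_usd_cents),
    ("_percent", lambda v: f"{v}%"),
    ("_bytes", format_bytes),
    ("_ms", format_ms),
]


def format_field(key: str, value):
    """Return (stripped_key, formatted_value); unknown keys pass through."""
    for suffix, formatter in _FORMATTERS:
        if key.endswith(suffix):
            return key[: -len(suffix)], formatter(value)
    return key, value


print(format_field("file_size_bytes", 5242880))            # ('file_size', '5.0MB')
print(format_field("created_at_epoch_ms", 1738886400000))  # ('created_at', '2025-02-07T00:00:00.000Z')
```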
AFDATA Logging
AFDATA-compliant structured logging via Python’s logging module. Every log line is formatted using the library’s own output_json/output_plain/output_yaml functions. Span fields are carried via contextvars (async-safe) and are automatically flattened into each log line.
API
from agent_first_data import init_logging_json, init_logging_plain, init_logging_yaml
from agent_first_data.afdata_logging import AfdataHandler, get_logger, span
# Convenience initializers — set up the root logger with AFDATA output to stdout
init_logging_json(level="INFO") # Single-line JSONL (secrets redacted, original keys)
init_logging_plain(level="INFO") # Single-line logfmt (keys stripped, values formatted)
init_logging_yaml(level="INFO") # Multi-line YAML (keys stripped, values formatted)
# Low-level — create a handler for custom logger stacks
AfdataHandler(format="json") # format: "json" | "plain" | "yaml"
# Logger with default fields (returns logging.LoggerAdapter)
get_logger(name, **fields)
# Span context manager — adds fields to all log events within the block
span(**fields)
Setup
from agent_first_data import init_logging_json, init_logging_plain, init_logging_yaml
# JSON output for production (one JSONL line per event, secrets redacted)
init_logging_json("INFO")
# Plain logfmt for development (keys stripped, values formatted)
init_logging_plain("DEBUG")
# YAML for detailed inspection (multi-line, keys stripped, values formatted)
init_logging_yaml("DEBUG")
Log Output
Standard logging calls work unchanged. Output format depends on the init function used.
import logging
logger = logging.getLogger("myapp")
logger.info("Server started")
# JSON: {"timestamp_epoch_ms":1739000000000,"message":"Server started","target":"myapp","code":"info"}
# Plain: code=info message="Server started" target=myapp timestamp_epoch_ms=1739000000000
# YAML: ---
# code: "info"
# message: "Server started"
# target: "myapp"
# timestamp_epoch_ms: 1739000000000
logger.warning("DNS lookup failed")
# JSON: {"timestamp_epoch_ms":...,"message":"DNS lookup failed","target":"myapp","code":"warn"}
Span Support
Use the span context manager to add fields to all log events within the block. Spans nest and work with both sync and async code.
from agent_first_data import span
with span(request_id="abc-123"):
    logger.info("Processing")
    # {"timestamp_epoch_ms":...,"message":"Processing","target":"myapp","request_id":"abc-123","code":"info"}
    with span(step="validate"):
        logger.info("Validating input")
        # {"timestamp_epoch_ms":...,"message":"Validating input","target":"myapp","request_id":"abc-123","step":"validate","code":"info"}
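The nesting behaviour shown above can be approximated with contextvars, which the logging overview names as the carrier for span fields. This is an illustrative sketch only: `span_sketch` and `current_fields` are hypothetical names, and the library's actual span implementation may differ.

```python
import contextvars
from contextlib import contextmanager

# Holds the fields of all active spans for the current context.
_span_fields = contextvars.ContextVar("span_fields", default={})


@contextmanager
def span_sketch(**fields):
    """Merge fields into the active span for the duration of the block."""
    merged = {**_span_fields.get(), **fields}
    token = _span_fields.set(merged)
    try:
        yield
    finally:
        # Restore the outer span's fields when the block exits.
        _span_fields.reset(token)


def current_fields() -> dict:
    """Return a copy of the fields a log line would inherit right now."""
    return dict(_span_fields.get())


with span_sketch(request_id="abc-123"):
    with span_sketch(step="validate"):
        print(current_fields())  # {'request_id': 'abc-123', 'step': 'validate'}
    print(current_fields())      # {'request_id': 'abc-123'}
```

Because each context (thread or asyncio task) sees its own value of the variable, fields set in one request do not leak into another.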
Logger with Default Fields
Use get_logger for per-component fields that appear on every log line:
from agent_first_data import get_logger
logger = get_logger("myapp.auth", component="auth")
logger.info("Token verified")
# {"timestamp_epoch_ms":...,"message":"Token verified","target":"myapp.auth","component":"auth","code":"info"}
Custom Code Override
The code field defaults to the log level. Override with an explicit field:
from agent_first_data import get_logger
logger = get_logger("myapp")
logger.info("Server ready", extra={"code": "log", "event": "startup"})
# {"timestamp_epoch_ms":...,"message":"Server ready","target":"myapp","code":"log","event":"startup"}
Output Fields
Every log line contains:
| Field | Type | Description |
|---|---|---|
| timestamp_epoch_ms | number | Unix milliseconds |
| message | string | Log message |
| target | string | Logger name |
| code | string | Level (debug/info/warn/error) or explicit override |
| span fields | any | From span() context manager |
| event fields | any | From extra= or get_logger fields |
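As a rough picture of how the four core fields map onto a standard logging record, the sketch below builds them in a custom handler. `JsonLineHandler` and the level table are hypothetical and written for illustration; the handler shipped with the package is AfdataHandler, and this sketch performs no suffix processing or redaction.

```python
import io
import json
import logging

# Assumed mapping from stdlib level names to the codes listed in the table.
_LEVEL_CODES = {"DEBUG": "debug", "INFO": "info", "WARNING": "warn", "ERROR": "error"}


class JsonLineHandler(logging.Handler):
    """Write each record as one JSON line with the four core fields."""

    def __init__(self, stream):
        super().__init__()
        self.stream = stream

    def emit(self, record: logging.LogRecord) -> None:
        default_code = _LEVEL_CODES.get(record.levelname, record.levelname.lower())
        event = {
            "timestamp_epoch_ms": int(record.created * 1000),
            "message": record.getMessage(),
            "target": record.name,
            # An explicit extra={"code": ...} overrides the level-derived code.
            "code": getattr(record, "code", None) or default_code,
        }
        self.stream.write(json.dumps(event, separators=(",", ":")) + "\n")


buffer = io.StringIO()
demo_logger = logging.getLogger("myapp.sketch")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(JsonLineHandler(buffer))

demo_logger.warning("DNS lookup failed")
first = json.loads(buffer.getvalue().splitlines()[0])
print(first["code"], first["target"])  # warn myapp.sketch
```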
Log Output Formats
All three formats use the library’s own output functions, so AFDATA suffix processing applies to log fields too:
| Format | Function | Keys | Values | Use case |
|---|---|---|---|---|
| JSON | init_logging_json | original (with suffix) | raw | production, log aggregation |
| Plain | init_logging_plain | stripped | formatted | development, compact scanning |
| YAML | init_logging_yaml | stripped | formatted | debugging, detailed inspection |
All formats automatically redact _secret fields in log output.
Output Formats
Three output formats for different use cases:
| Format | Structure | Keys | Values | Use case |
|---|---|---|---|---|
| JSON | single-line | original (with suffix) | raw | programs, logs |
| YAML | multi-line | stripped | formatted | human inspection |
| Plain | single-line logfmt | stripped | formatted | compact scanning |
All formats automatically redact _secret fields.
Supported Suffixes
- Duration: _ms, _s, _ns, _us, _minutes, _hours, _days
- Timestamps: _epoch_ms, _epoch_s, _epoch_ns, _rfc3339
- Size: _bytes (auto-scales to KB/MB/GB/TB), _size (config input, pass through)
- Currency: _msats, _sats, _btc, _usd_cents, _eur_cents, _jpy, _{code}_cents
- Other: _percent, _secret (auto-redacted in all formats)
Repository
This package is part of the agent-first-data repository, which also contains:
- spec/: Full AFDATA specification with suffix definitions, protocol format rules, and cross-language test fixtures
- skills/: AI coding agent skill for working with AFDATA conventions
To run tests, clone the full repository (tests use shared cross-language fixtures from spec/fixtures/):
git clone https://github.com/agentfirstkit/agent-first-data
cd agent-first-data/python
python -m pytest
License
MIT