Open Source
Open source infrastructure for verifiable AI systems
CertifiedData publishes the components that external systems need to integrate with, verify against, and build on — without requiring access to the core platform.
Eight public repositories spanning verification, Agent Commerce, data safety, governance examples, templates, manifest examples, the public OpenAPI, and EU AI Act mapping.
How the layers fit together
Each public repo exposes one layer of the stack. The core platform is private.
| Layer | Repository |
|---|---|
| Trust & verification | certifieddata-public |
| Agent Commerce | certifieddata-agent-commerce-public |
| Data safety | pii-scan |
| Core platform | certifieddata-platform (private) |
Verification layer
certifieddata-public
Verify certificates. Trust artifacts. Prove provenance.
The public verification and trust layer for the CertifiedData platform. Contains the verification SDK, certificate schema tools, and PII scanning integration. Use it to verify that a dataset or AI artifact was certified by CertifiedData — without needing a platform account.
What it includes
- Ed25519 certificate signature verification
- SHA-256 dataset hash validation
- Certificate schema tools and type definitions
- PII scanning integration (references pii-scan)
- Published public key at /.well-known/certifieddata-public-key.pem
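The hash-validation step in the list above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the SDK's own code: the function names are hypothetical, and it assumes the certificate records the dataset digest as a hex string. Ed25519 signature verification is a separate step and is not shown here.

```python
import hashlib
import hmac
from pathlib import Path


def dataset_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks and return its SHA-256 digest as lowercase hex."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matches(path: Path, expected_hex: str) -> bool:
    """Compare the recomputed digest with the value recorded in a certificate."""
    actual = dataset_sha256(path)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Streaming in chunks keeps memory use flat for large datasets; a verifier would call `hash_matches` only after the certificate's signature has been checked.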
Agent Commerce layer
Execute. Settle. Prove it happened.
The public execution and settlement layer for Agent Commerce. Contains the OpenAPI contract, AsyncAPI event schema, payment SDKs, a mock server for local development, and test vectors. Use it to integrate AI agents with policy-governed, receipt-issued payment infrastructure.
What it includes
- OpenAPI contract for all Agent Commerce endpoints
- AsyncAPI event schema (authorization, settlement, receipt events)
- Python and TypeScript SDKs
- Mock server for local development without real payments
- Test vectors for Ed25519 receipt verification
- llms.txt for AI system discoverability
Data safety utility
pii-scan
Scan before you train. Know what's in your data.
A lightweight, standalone PII detection utility for datasets used in AI training pipelines. Identifies personal information across common data formats before datasets are certified or published. Integrates with the CertifiedData certification workflow.
What it includes
- PII pattern detection across CSV, JSON, and Parquet
- Configurable rule sets for GDPR, HIPAA, and custom patterns
- Pre-certification scan integration
- Zero external dependencies in core scanner
- Suitable for CI/CD pipeline integration
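To make the idea of rule-based PII detection concrete, here is a small sketch for CSV input using only the standard library. It does not reproduce pii-scan's API or rule sets: the rule names, regular expressions, and the `scan_csv` function are assumptions chosen for illustration.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical rule set: names and patterns are illustrative, not pii-scan's own.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_csv(text: str, rules=RULES) -> Counter:
    """Count rule hits per (column, rule) pair across all rows of a CSV document."""
    hits = Counter()
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            for name, pattern in rules.items():
                # Skip missing cells and overflow fields, which are not strings.
                if isinstance(value, str) and pattern.search(value):
                    hits[(column, name)] += 1
    return hits


sample = "name,contact,note\nAda,ada@example.com,ok\nBob,none,ssn 123-45-6789\n"
findings = scan_csv(sample)
```

In a CI pipeline, a job along these lines could fail the build whenever `findings` is non-empty, so flagged columns are reviewed before a dataset is certified or published.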
Decision Ledger examples
Sign and verify AI decision records — zero accounts, zero vendor trust.
Runnable examples for the Decision Ledger. Includes a TypeScript verifier, sample payloads for loan approval / claim triage / agent action, and a GitHub Action that re-verifies every committed record on push. RFC 8785 JCS + Ed25519 throughout.
What it includes
- TypeScript verifier using the public /.well-known/signing-keys.json registry
- CLI to sign a demo decision against the public endpoint; no API key required
- Three realistic sample payloads ready to fork
- GitHub Action: re-verify every committed record in CI
- RFC 8785 canonicalization + SHA-256 + Ed25519, independently reproducible
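The canonicalize-then-hash part of that pipeline can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a conforming RFC 8785 implementation: it matches JCS output only for records built from ASCII keys, strings, small integers, booleans, and null, because RFC 8785 has its own rules for number serialization and key ordering that `json.dumps` does not fully follow. The payload fields are hypothetical, and the Ed25519 check (including exactly which bytes are signed) is not shown.

```python
import hashlib
import json


def canonicalize(record: dict) -> bytes:
    """Approximate RFC 8785 output for simple records (see the caveats above)."""
    return json.dumps(
        record,
        sort_keys=True,          # deterministic member order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit non-ASCII characters as UTF-8, not \u escapes
    ).encode("utf-8")


def record_digest(record: dict) -> str:
    """SHA-256 over the canonical bytes, as lowercase hex."""
    return hashlib.sha256(canonicalize(record)).hexdigest()


# Hypothetical payload: field names are illustrative only.
first = {"decision": "approve", "amount": 1200, "model": "demo-v1"}
second = {"model": "demo-v1", "amount": 1200, "decision": "approve"}
same_digest = record_digest(first) == record_digest(second)
```

Because both dictionaries serialize to the same bytes regardless of insertion order, any party can recompute the digest independently, which is what makes the record reproducible.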
Dataset Generation templates
Forkable dataset templates used by /generate.
Public template definitions consumed by the CertifiedData synthetic generator. JSON-Schema-validated, semver-versioned, domain-tagged. Fork to propose new templates or adapt existing ones for your own workloads — certificates record the template ID and version at generation time.
What it includes
- JSON Schema contract for the template format
- Reference templates: fraud detection, healthcare claims, HR workforce, retail lending
- Zero-dependency validator (runs in CI with no installs)
- PII classification per column (direct / quasi / sensitive / none)
- Target row caps aligned to the generator's enforced limits
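A zero-dependency check of the kind listed above might look like the following sketch. The template field names (`version`, `columns`, `pii`, `target_rows`), the `max_rows` parameter, and the simplified MAJOR.MINOR.PATCH pattern are all assumptions for illustration; the real format is defined by the repository's JSON Schema.

```python
import re

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # simplified: no pre-release or build metadata
PII_CLASSES = {"direct", "quasi", "sensitive", "none"}  # the four classes named above


def validate_template(template: dict, max_rows: int) -> list[str]:
    """Return human-readable problems; an empty list means the template passed."""
    errors = []
    if not SEMVER.match(str(template.get("version", ""))):
        errors.append("version must look like MAJOR.MINOR.PATCH")
    columns = template.get("columns") or []
    if not columns:
        errors.append("at least one column is required")
    for column in columns:
        if column.get("pii") not in PII_CLASSES:
            errors.append(f"column {column.get('name')!r}: unknown pii class {column.get('pii')!r}")
    rows = template.get("target_rows")
    if type(rows) is not int or not 0 < rows <= max_rows:
        errors.append(f"target_rows must be an integer between 1 and {max_rows}")
    return errors


# Hypothetical template, not one of the published reference templates.
template = {
    "id": "example-template", "version": "1.2.0", "target_rows": 5000,
    "columns": [{"name": "email", "pii": "direct"}, {"name": "zip", "pii": "quasi"}],
}
problems = validate_template(template, max_rows=10_000)
```

Returning a list of problems rather than raising on the first one lets a CI job report every issue in a proposed template in a single run.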
Manifest certification
Certify on every release — structured batch intake for CI/CD.
Example manifests, a JSON Schema, a cURL submitter, and a GitHub Action that certifies release artifacts on every tag push. Drop-in for teams who want automatic, verifiable provenance without building a custom client.
What it includes
- JSON Schema for the manifest format (Draft 2020-12)
- Examples: single dataset, batch artifacts, pipeline output with lineage
- Shell submitter using curl + bearer token
- GitHub Action: certify release artifacts on tag push
- Policy-context field links manifests to Agent Commerce policy decisions
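For teams that prefer a script to curl, the same flow can be sketched in Python: describe each artifact by digest, then prepare an authenticated POST. Everything specific here is an assumption made for illustration, including the manifest field names (`artifacts`, `sha256`, `bytes`, `policy_context`), the placeholder URL, and the token; the actual manifest shape is the one defined by the published JSON Schema.

```python
import hashlib
import json
import urllib.request


def build_manifest(artifacts: dict, policy_context: str = "") -> dict:
    """Describe each artifact by name, size, and SHA-256 digest."""
    entries = [
        {"name": name, "sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
        for name, data in sorted(artifacts.items())
    ]
    manifest = {"artifacts": entries}
    if policy_context:
        manifest["policy_context"] = policy_context
    return manifest


def build_request(manifest: dict, url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST carrying a bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(manifest).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


manifest = build_manifest({"model.bin": b"\x00\x01", "train.csv": b"id\n1\n"})
request = build_request(manifest, "https://example.invalid/manifests", token="TOKEN")
# urllib.request.urlopen(request) would submit it; omitted so the sketch has no side effects.
```

Sorting artifacts by name keeps the manifest stable across runs, which makes diffs between releases easier to read.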
API contract
certifieddata-openapi
OpenAPI 3.1 spec + Postman collection. Unlock SDK codegen.
Machine-readable API contract for the CertifiedData public surface. Covers verification, decision ledger, agent commerce, dataset generation, and manifest certification. Use it for SDK codegen, contract tests, Prism mock servers, and integrations with Zapier / n8n / Retool.
What it includes
- OpenAPI 3.1 spec with all stable public paths
- Postman collection with pre-built requests for every endpoint
- Spectral lint in CI so the spec stays accurate
- Schema components aligned with real API response shapes
- Bearer-auth + anonymous endpoints clearly separated
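One way to see the split between bearer-auth and anonymous endpoints is to read it from the spec itself. The sketch below relies only on standard OpenAPI semantics, where an operation-level `security` field overrides the document-level default and an empty list removes the requirement. The sample paths and the `bearerAuth` scheme name are hypothetical and are not taken from the published spec.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def split_by_auth(spec: dict) -> tuple[list[str], list[str]]:
    """Return (authenticated, anonymous) operations as 'METHOD /path' strings."""
    default_security = spec.get("security", [])
    authenticated, anonymous = [], []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip 'parameters', 'summary', and other non-operation keys
            # An operation-level 'security' overrides the document-level default.
            security = operation.get("security", default_security)
            # No requirements, or an empty requirement object {}, permits anonymous access.
            is_anonymous = not security or any(req == {} for req in security)
            label = f"{method.upper()} {path}"
            (anonymous if is_anonymous else authenticated).append(label)
    return authenticated, anonymous


# Hypothetical fragment of an OpenAPI document, already parsed into a dict.
spec = {
    "openapi": "3.1.0",
    "security": [{"bearerAuth": []}],
    "paths": {
        "/verify": {"get": {"security": []}},
        "/manifests": {"post": {}},
    },
}
authenticated, anonymous = split_by_auth(spec)
```

A contract test could assert that the anonymous list contains only the endpoints meant to work without an account.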
Compliance mapping
EU AI Act articles mapped to the primitives that satisfy them.
Public, machine-readable mapping from EU AI Act articles (9, 10, 12, 13, 15, 19, 50) to CertifiedData primitives. Intended as procurement evidence, auditor reference, and a living document the community can correct. Not legal advice.
What it includes
- mapping.json: stable, versioned, machine-readable
- One markdown file per covered article with plain-language obligation text
- Primitives taxonomy: signed-certificate, decision-record, signed-receipt, manifest-certification, public-registry, signing-key-registry, canonicalization
- Deployer-responsibility notes on every article, honest about what we don't cover
- PR-friendly structure for corrections and additions
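Because the mapping is machine-readable, corrections can be checked automatically. The sketch below validates entries against the article list and primitives taxonomy given above. The entry shape (`article`, `primitives`) is an assumption about how mapping.json might be laid out, and the sample entries are placeholders rather than content from the real file.

```python
COVERED_ARTICLES = {9, 10, 12, 13, 15, 19, 50}
PRIMITIVES = {
    "signed-certificate", "decision-record", "signed-receipt", "manifest-certification",
    "public-registry", "signing-key-registry", "canonicalization",
}


def check_mapping(entries: list[dict]) -> list[str]:
    """Flag articles outside the covered set and primitives outside the taxonomy."""
    problems = []
    for entry in entries:
        article = entry.get("article")
        if article not in COVERED_ARTICLES:
            problems.append(f"article {article!r} is outside the covered set")
        unknown = sorted(set(entry.get("primitives", [])) - PRIMITIVES)
        if unknown:
            problems.append(f"article {article!r}: unknown primitives {unknown}")
    return problems


# Placeholder entries for illustration; they are not taken from the real mapping.json.
entries = [
    {"article": 12, "primitives": ["decision-record"]},
    {"article": 99, "primitives": ["blockchain"]},
]
problems = check_mapping(entries)
```

Running a check like this on pull requests would catch typos in primitive names before a correction is merged.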
Why these components are public
Verification must be independent
A certificate is only trustworthy if it can be verified without relying on the platform that issued it. The verification SDK and public key are open so any system can confirm certificate integrity without a CertifiedData account.
Integration requires a contract
Developers building AI agents need a stable, versioned API contract and event schema before they can integrate. The Agent Commerce public repo provides that without requiring platform access or early sign-up.
Trust is built incrementally
Publishing the components that touch external systems — verification, execution, data safety — lets the community inspect and build confidence in the primitives before adopting the full platform.
Start building
Use the public repos to integrate, verify, and extend — then bring the full platform in when you need generation, certification, and the registry.