A declarative language for describing and generating realistic data. Vague treats ambiguity as a first-class primitive — declare the shape of valid data and let the runtime figure out how to populate it.
Vague is a data description model for APIs, not just a fake data tool.
Think of it as OpenAPI meets property-based testing: you describe what valid data looks like — its structure, constraints, distributions, and edge cases — and Vague handles generation. The same schema that generates test data can validate production data.
| What You Need | Traditional Tools | Vague |
|---|---|---|
| Intent: "80% of users are active" | Random selection | status: 0.8: "active" \| 0.2: "inactive" |
| Constraints: "due date ≥ issued date" | Manual validation | assume due_date >= issued_date |
| Relationships: "payment references an invoice" | Manual wiring | invoice: any of invoices where .status == "open" |
| Edge cases: "test with Unicode exploits" | Manual creation | name: issuer.homoglyph("admin") |
| Validation: "does this data match the schema?" | Separate tool | Same .vague file with --validate-data |
The question isn't "which fake data library?" — it's "how do we formally describe what valid data looks like for our APIs?"
For a detailed comparison, see COMPARISON.md.
npm install vague-lang
Or install globally for CLI usage:
npm install -g vague-lang
Create a .vague file:
schema Customer {
name: string,
status: 0.8: "active" | 0.2: "inactive"
}
schema Invoice {
customer: any of customers,
amount: decimal in 100..10000,
status: "draft" | "sent" | "paid",
assume amount > 0
}
dataset TestData {
customers: 50 of Customer,
invoices: 200 of Invoice
}
Generate JSON:
node dist/cli.js your-file.vague
For a quick reference of all syntax, see SYNTAX.md.
// Equal probability
status: "draft" | "sent" | "paid"
// Weighted probability
status: 0.6: "paid" | 0.3: "pending" | 0.1: "draft"
// Mixed: unweighted options share remaining probability
status: 0.85: "Active" | "Archived" // "Archived" gets 15%
category: 0.6: "main" | "side" | "dessert" // "side" and "dessert" get 20% each
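The mixed-weight rule above can be modelled in a few lines of TypeScript. This is an illustrative sketch, not Vague's internal code: `resolveWeights` and `pick` are hypothetical names, and it assumes the explicit weights sum to at most 1.

```typescript
// Illustrative sketch only: not Vague's internal implementation.
// Options without a weight share whatever probability the weighted options leave over.
type Option<T> = { value: T; weight?: number };

function resolveWeights<T>(options: Option<T>[]): { value: T; p: number }[] {
  const explicit = options.reduce((sum, o) => sum + (o.weight ?? 0), 0);
  const unweighted = options.filter((o) => o.weight === undefined).length;
  // Assumption: explicit weights sum to at most 1.
  const share = unweighted > 0 ? Math.max(0, 1 - explicit) / unweighted : 0;
  return options.map((o) => ({ value: o.value, p: o.weight ?? share }));
}

// Pick one value by walking the cumulative distribution; `roll` is a number in [0, 1).
function pick<T>(options: Option<T>[], roll: number = Math.random()): T {
  const resolved = resolveWeights(options);
  const total = resolved.reduce((sum, o) => sum + o.p, 0);
  let cumulative = 0;
  for (const o of resolved) {
    cumulative += o.p / total;
    if (roll < cumulative) return o.value;
  }
  return resolved[resolved.length - 1].value; // guards against rounding error
}

// category: 0.6: "main" | "side" | "dessert"
const category: Option<string>[] = [
  { value: "main", weight: 0.6 },
  { value: "side" },
  { value: "dessert" },
];
console.log(resolveWeights(category));
```

Passing `roll` explicitly keeps the sketch deterministic under test; a seeded generator would play the same role as the CLI's `--seed` option.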
age: int in 18..65
price: decimal in 0.01..999.99
founded: date in 2000..2023
// Decimal with explicit precision
score: decimal(1) in 0..10 // 1 decimal place
amount: decimal(2) in 10..100 // 2 decimal places
line_items: 1..5 of LineItem // 1-5 items
employees: 100 of Employee // Exactly 100
schema Invoice {
issued_date: int in 1..28,
due_date: int in 1..90,
status: "draft" | "paid",
amount: int in 0..10000,
// Hard constraint
assume due_date >= issued_date,
// Conditional constraint
assume if status == "paid" {
amount > 0
}
}
Logical operators: and, or, not
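One way to think about `assume` is generate-then-check with a retry limit. The sketch below models that idea in TypeScript; it is an assumption about the general approach, suggested by the "Maximum constraint retries exceeded" error message, not a description of Vague's actual interpreter. `generateWithConstraints`, `intIn`, and the retry limit of 100 are all hypothetical.

```typescript
// Illustrative sketch only: models `assume` as rejection sampling with retries.
type Invoice = { issued_date: number; due_date: number };

// Uniform integer in [min, max], inclusive on both ends.
function intIn(min: number, max: number, rng: () => number): number {
  return min + Math.floor(rng() * (max - min + 1));
}

function generateWithConstraints<T>(
  generate: () => T,
  constraints: Array<(candidate: T) => boolean>,
  maxRetries = 100, // hypothetical limit, not Vague's real setting
): T {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const candidate = generate();
    // Accept the first candidate that satisfies every constraint.
    if (constraints.every((holds) => holds(candidate))) return candidate;
  }
  throw new Error("Maximum constraint retries exceeded");
}

const rng = Math.random;
const invoice = generateWithConstraints<Invoice>(
  () => ({ issued_date: intIn(1, 28, rng), due_date: intIn(1, 90, rng) }),
  [(inv) => inv.due_date >= inv.issued_date], // assume due_date >= issued_date
);
console.log(invoice);
```

Under this model, a constraint that is rarely satisfied by the declared ranges makes generation slow or causes it to fail, which is why narrowing the range itself is usually preferable to a tight `assume`.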
schema Invoice {
// Reference any customer from the collection
customer: any of customers,
// Filtered reference
active_customer: any of customers where .status == "active"
}
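A filtered reference such as `any of customers where .status == "active"` behaves like "filter the collection, then pick one". The following TypeScript sketch illustrates that reading; `anyOf` is a hypothetical helper, and the thrown message simply mirrors the error text quoted in the troubleshooting notes.

```typescript
// Illustrative sketch only: a filtered cross-reference as filter-then-pick.
type Customer = { name: string; status: "active" | "inactive" };

function anyOf<T>(
  collection: T[],
  where: (item: T) => boolean = () => true,
  rng: () => number = Math.random,
): T {
  const matches = collection.filter(where);
  if (matches.length === 0) {
    // Nothing satisfies the filter, or the collection has not been generated yet.
    throw new Error("No matching items found for reference");
  }
  return matches[Math.floor(rng() * matches.length)];
}

const customers: Customer[] = [
  { name: "Ada", status: "active" },
  { name: "Bob", status: "inactive" },
  { name: "Cy", status: "active" },
];

// active_customer: any of customers where .status == "active"
const activeCustomer = anyOf(customers, (c) => c.status === "active");
console.log(activeCustomer.name);
```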
schema LineItem {
// Inherit currency from parent invoice
currency: ^base_currency
}
schema Invoice {
base_currency: "USD" | "GBP" | "EUR",
line_items: 1..5 of LineItem
}
schema Invoice {
line_items: 1..10 of LineItem,
total: sum(line_items.amount),
item_count: count(line_items),
avg_price: avg(line_items.unit_price),
min_price: min(line_items.unit_price),
max_price: max(line_items.unit_price),
median_price: median(line_items.unit_price),
first_item: first(line_items.unit_price),
last_item: last(line_items.unit_price),
price_product: product(line_items.unit_price)
}
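For readers who want to check generated output by hand, the aggregate fields above correspond to ordinary reductions over the child collection. This TypeScript sketch shows one plausible reading; the behaviour for empty collections and the even-length median (mean of the two middle values) are assumptions, not documented Vague semantics.

```typescript
// Illustrative sketch only: aggregates as plain reductions over line items.
type LineItem = { amount: number; unit_price: number };

const sum = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);
const product = (xs: number[]): number => xs.reduce((a, b) => a * b, 1);
// Assumption: an empty collection averages to 0.
const avg = (xs: number[]): number => (xs.length ? sum(xs) / xs.length : 0);

function median(xs: number[]): number {
  if (xs.length === 0) return 0; // assumption for the empty case
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Assumption: even-length input uses the mean of the two middle values.
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const lineItems: LineItem[] = [
  { amount: 20, unit_price: 10 },
  { amount: 15, unit_price: 5 },
  { amount: 60, unit_price: 20 },
  { amount: 40, unit_price: 40 },
];
const prices = lineItems.map((item) => item.unit_price);

const invoiceSummary = {
  total: sum(lineItems.map((item) => item.amount)),
  item_count: lineItems.length,
  avg_price: avg(prices),
  min_price: Math.min(...prices),
  max_price: Math.max(...prices),
  median_price: median(prices),
  first_item: prices[0],
  last_item: prices[prices.length - 1],
  price_product: product(prices),
};
console.log(invoiceSummary);
```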
nickname: string? // Shorthand: sometimes null
notes: string | null // Explicit
status: amount_paid >= total ? "paid" : "pending"
grade: score >= 90 ? "A" : score >= 70 ? "B" : "C"
// Pattern matching for multi-way branching
display: match status {
"pending" => "Awaiting shipment",
"shipped" => "On the way",
"delivered" => "Complete"
}
// Returns null if no pattern matches
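The fall-through behaviour is worth keeping in mind when consuming the output. A rough TypeScript equivalent of the `match` expression above, using a hypothetical `match` helper, looks like this:

```typescript
// Illustrative sketch only: `match` as a lookup that yields null when no arm matches.
function match<V>(value: string, arms: Record<string, V>): V | null {
  // Own-property check avoids accidentally matching inherited keys like "toString".
  return Object.prototype.hasOwnProperty.call(arms, value) ? arms[value] : null;
}

const describeStatus = (status: string): string | null =>
  match(status, {
    pending: "Awaiting shipment",
    shipped: "On the way",
    delivered: "Complete",
  });

console.log(describeStatus("shipped"));
console.log(describeStatus("cancelled"));
```

Consumers of the generated data should therefore treat such a field as nullable unless every possible input value has an arm.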
schema Account {
type: "personal" | "business",
companyNumber: string when type == "business" // Only exists for business accounts
}
schema Order {
size: "small" | "large",
items: (size == "large" ? 5..10 : 1..3) of LineItem
}
schema Payment {
invoice: any of invoices,
amount: int in 10..500
} then {
invoice.amount_paid += amount,
invoice.status = invoice.amount_paid >= invoice.total ? "paid" : "partial"
}
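The `then` block reads as "after generating this record, update the record it references". A minimal TypeScript sketch of that side effect follows; `createPayment` is a hypothetical function, and the `"open"` starting status is an assumption made for the example.

```typescript
// Illustrative sketch only: a `then` block as a mutation applied after generation.
type Invoice = {
  total: number;
  amount_paid: number;
  status: "open" | "partial" | "paid";
};
type Payment = { invoice: Invoice; amount: number };

function createPayment(invoice: Invoice, amount: number): Payment {
  const payment: Payment = { invoice, amount };
  // then { invoice.amount_paid += amount, invoice.status = ... }
  invoice.amount_paid += amount;
  invoice.status = invoice.amount_paid >= invoice.total ? "paid" : "partial";
  return payment;
}

const openInvoice: Invoice = { total: 100, amount_paid: 0, status: "open" };
createPayment(openInvoice, 40);
console.log(openInvoice.status, openInvoice.amount_paid);
createPayment(openInvoice, 60);
console.log(openInvoice.status, openInvoice.amount_paid);
```

Because the update mutates an already generated invoice, the order in which collections are generated matters: invoices have to exist before payments can reference them.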
id: unique int in 1000..9999 // No duplicates in collection
schema Person {
age: private int in 0..105, // Generated but excluded from output
age_bracket: age < 18 ? "minor" : "adult" // Computed from private field
}
// Output: { "age_bracket": "adult" } -- no "age" field
pitch: [48, 52, 55, 60] // Cycles in order: 48, 52, 55, 60, 48...
color: ["red", "green", "blue"]
age: gaussian(35, 10, 18, 65) // mean, stddev, min, max
income: lognormal(10.5, 0.5) // mu, sigma
wait_time: exponential(0.5) // rate
daily_orders: poisson(5) // lambda
conversion: beta(2, 5) // alpha, beta
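To make the four arguments of `gaussian(mean, stddev, min, max)` concrete, here is a small TypeScript sketch using the Box-Muller transform. It clamps out-of-range samples to the bounds; whether Vague clamps or resamples is not stated in this document, so treat that choice as an assumption.

```typescript
// Illustrative sketch only: a bounded normal sample via the Box-Muller transform.
function gaussian(
  mean: number,
  stddev: number,
  min: number,
  max: number,
  rng: () => number = Math.random,
): number {
  const u1 = 1 - rng(); // in (0, 1], so Math.log never receives 0
  const u2 = rng();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  // Assumption: values outside [min, max] are clamped rather than resampled.
  return Math.min(max, Math.max(min, mean + stddev * z));
}

// age: gaussian(35, 10, 18, 65)
console.log(gaussian(35, 10, 18, 65));
```

Clamping piles extra probability mass onto the bounds, while resampling keeps the bell shape inside the range at the cost of retries; either is a reasonable design, and the sketch picks the simpler one.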
created_at: now() // Full ISO 8601 timestamp
today_date: today() // Date only
past: daysAgo(30) // 30 days ago
future: daysFromNow(90) // 90 days from now
random: datetime(2020, 2024) // Random datetime in range
between: dateBetween("2023-01-01", "2023-12-31")
id: sequence("INV-", 1001) // "INV-1001", "INV-1002", ...
order_num: sequenceInt("orders") // 1, 2, 3, ...
prev_value: previous("amount") // Reference previous record
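Sequences are stateful: each generated record advances a counter. The TypeScript sketch below models `sequence` and `sequenceInt` with a closure and a named counter map. `makeSequence` and the `start` parameter of `sequenceInt` are hypothetical; only the outputs shown in the comments above are taken from the document.

```typescript
// Illustrative sketch only: sequences as counters that advance on every call.
function makeSequence(prefix: string, start: number): () => string {
  let next = start;
  return () => `${prefix}${next++}`;
}

// Named integer counters, one per sequence name.
const counters = new Map<string, number>();
function sequenceInt(name: string, start = 1): number {
  const value = counters.get(name) ?? start;
  counters.set(name, value + 1);
  return value;
}

const nextInvoiceId = makeSequence("INV-", 1001);
console.log(nextInvoiceId(), nextInvoiceId());
console.log(sequenceInt("orders"), sequenceInt("orders"));
```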
// Case transformations
upper: uppercase(name) // "HELLO WORLD"
lower: lowercase(name) // "hello world"
capitalized: capitalize(name) // "Hello World"
// Case style conversions
slug: kebabCase(title) // "hello-world"
snake: snakeCase(title) // "hello_world"
camel: camelCase(title) // "helloWorld"
// String manipulation
trimmed: trim(" hello ") // "hello"
combined: concat(first, " ", last) // "John Doe"
part: substring(name, 0, 5) // First 5 characters
replaced: replace(name, "foo", "bar")
len: length(name) // String length
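The case style conversions can be understood as "split into words, then rejoin". The following TypeScript sketch reproduces the example outputs in the comments above for the input "Hello World"; the word-splitting rule is an assumption, and Vague's handling of digits, acronyms, or non-ASCII text may differ.

```typescript
// Illustrative sketch only: case conversions as split-into-words then rejoin.
function words(input: string): string[] {
  return input
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // break camelCase boundaries
    .split(/[^A-Za-z0-9]+/) // split on anything that is not a letter or digit
    .filter((w) => w.length > 0)
    .map((w) => w.toLowerCase());
}

const kebabCase = (s: string): string => words(s).join("-");
const snakeCase = (s: string): string => words(s).join("_");
const camelCase = (s: string): string =>
  words(s)
    .map((w, i) => (i === 0 ? w : w[0].toUpperCase() + w.slice(1)))
    .join("");

const title = "Hello World";
console.log(kebabCase(title), snakeCase(title), camelCase(title));
```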
// Generate data that violates constraints (for testing error handling)
dataset Invalid violating {
bad_invoices: 100 of Invoice
}
Vague includes several plugins for generating realistic domain-specific data. For complete documentation, see SYNTAX.md.
| Plugin | Description | Example |
|---|---|---|
| faker | Realistic personal/business data | email(), fullName(), companyName() |
| issuer | Edge case testing values | issuer.homoglyph("admin"), issuer.maxInt() |
| regex | Pattern-based generation | regex("[A-Z]{3}-[0-9]{4}"), semver() |
| date | Day-of-week filtering | date.weekday(2024, 2025) |
| http | HTTP testing data | http.method(), http.statusCode(), env("API_KEY") |
| sql | SQL test data | sql.tableName(), sql.connectionString("postgres") |
| graphql | GraphQL test data | graphql.query(), graphql.error() |
The examples/ directory contains organized examples for learning and reference:
Getting Started:
- data-description-model/ - Start here: Intent encoding, constraint encoding, edge-case bias
- basics/ - Core language features (schemas, constraints, computed fields, cross-refs)
OpenAPI Integration:
- openapi-importing/ - Import schemas from OpenAPI specs
- openapi-examples-generation/ - Populate OpenAPI specs with generated examples
Real-World API Examples:
- stripe/ - Payment processing (invoices, charges, subscriptions)
- github/ - GitHub API patterns (repos, issues, PRs)
- slack/ - Slack webhook payloads
- shopify/ - E-commerce data
- codat/ - Accounting/fintech APIs
- twilio/ - Communications platform
- graphql/ - GraphQL-specific patterns
Advanced Topics:
- http-testing/ - HTTP request/response patterns
- custom-plugins/ - Creating custom plugins
- vitest-fixtures/ - Test data for Vitest
# Generate JSON to stdout
node dist/cli.js file.vague
# Save to file
node dist/cli.js file.vague -o output.json
# Pretty print
node dist/cli.js file.vague -p
# Reproducible output (seeded random)
node dist/cli.js file.vague --seed 123
# Watch mode - regenerate on file change
node dist/cli.js file.vague -o output.json -w
# CSV output
node dist/cli.js file.vague -f csv -o output.csv
# CSV with options
node dist/cli.js file.vague -f csv --csv-delimiter ";" -o output.csv
# Validate against OpenAPI spec
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}'
# Validate only (exit code 1 on failure, useful for CI)
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}' --validate-only
Generate realistic examples and embed them directly in your OpenAPI spec:
# Populate OpenAPI spec with inline examples
node dist/cli.js data.vague --oas-output api-with-examples.json --oas-source api.json
# Multiple examples per schema
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-example-count 3
# External file references instead of inline
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-external
Auto-detection maps collection names to schema names (e.g., invoices → Invoice).
| Option | Description |
|---|---|
| -o, --output <file> | Write output to file |
| -f, --format <fmt> | Output format: json (default), csv, ndjson |
| -p, --pretty | Pretty-print JSON |
| -s, --seed <number> | Seed for reproducible generation |
| -w, --watch | Watch input file and regenerate on changes |
| -v, --validate <spec> | Validate against OpenAPI spec |
| -m, --mapping <json> | Schema mapping {"collection": "SchemaName"} |
| --validate-only | Only validate, don't output data |
| --validate-data <file> | Validate external JSON data against Vague schema |
| --schema <file> | Schema file for data validation |
| --csv-delimiter <char> | CSV field delimiter (default: ,) |
| --csv-no-header | Omit CSV header row |
| --csv-arrays <mode> | Array handling: json, first, count |
| --csv-nested <mode> | Nested objects: flatten, json |
| --infer <file> | Infer Vague schema from JSON or CSV data |
| --collection-name <name> | Collection name for CSV inference |
| --infer-delimiter <char> | CSV delimiter for inference (default: ,) |
| --dataset-name <name> | Dataset name for inference |
| --typescript | Generate TypeScript definitions alongside output |
| --ts-only | Generate only TypeScript definitions (no .vague) |
| --oas-source <spec> | Source OpenAPI spec to populate with examples |
| --oas-output <file> | Output path for populated OpenAPI spec |
| --oas-example-count <n> | Number of examples per schema (default: 1) |
| --oas-external | Use external file references instead of inline |
| --lint-spec <file> | Lint OpenAPI spec with Spectral |
| --lint-verbose | Show detailed lint results (includes hints) |
| --plugins <dir> | Load plugins from directory (can be used multiple times) |
| --no-auto-plugins | Disable automatic plugin discovery |
| --debug | Enable debug logging |
| --log-level <level> | Set log level: none, error, warn, info, debug |
| --verbose | Show verbose output (e.g., discovered plugins) |
| -h, --help | Show help |
Create a vague.config.js in your project root for persistent settings:
// vague.config.js
export default {
seed: 42, // Reproducible output
format: 'json', // 'json', 'csv', or 'ndjson'
pretty: true, // Pretty-print JSON
plugins: [
'./my-plugin.js', // Local plugin
'vague-plugin-foo', // npm package
],
logging: {
level: 'info', // 'none', 'error', 'warn', 'info', 'debug'
components: ['generator', 'constraint'],
},
};
Config files are auto-discovered by searching up from the current directory.
If generation fails with "Maximum constraint retries exceeded":
- Check constraint compatibility: Ensure your constraints don't conflict
  // BAD: Impossible constraint
  value: int in 1..10,
  assume value > 100
  // GOOD: Compatible constraint
  value: int in 1..100,
  assume value > 50
- Widen ranges: If constraints are too tight, generation may fail frequently
  // BAD: Very narrow valid range
  age: int in 0..100,
  assume age >= 18 and age <= 21 // Only 4 valid values
  // GOOD: Use range directly
  age: int in 18..21
- Use --debug to diagnose: See which constraints are failing
  node dist/cli.js file.vague --debug
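The "only 4 valid values" case can be quantified. Assuming values are drawn uniformly from the declared range and rejected when the constraint fails (an assumption about the mechanism, not something this document confirms), the TypeScript sketch below computes the per-attempt acceptance rate and the chance that a run of attempts all fail. `acceptanceRate` and `failAll` are hypothetical helpers, and the retry count is a free parameter because the real limit is not stated here.

```typescript
// Illustrative sketch only: how tight a constraint is relative to its declared range.
function acceptanceRate(
  min: number,
  max: number,
  holds: (value: number) => boolean,
): number {
  let valid = 0;
  for (let v = min; v <= max; v++) {
    if (holds(v)) valid++;
  }
  return valid / (max - min + 1); // fraction of the integer range that passes
}

// age: int in 0..100, assume age >= 18 and age <= 21
const narrow = acceptanceRate(0, 100, (age) => age >= 18 && age <= 21);

// Probability that every one of `retries` independent attempts is rejected.
const failAll = (rate: number, retries: number): number =>
  Math.pow(1 - rate, retries);

console.log(`acceptance rate: ${(narrow * 100).toFixed(2)}%`);
console.log(`all 20 attempts rejected: ${(failAll(narrow, 20) * 100).toFixed(1)}%`);
```

With 4 valid values out of 101, roughly 96% of attempts are wasted, whereas declaring `int in 18..21` directly accepts every draw.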
If you get "No matching items found for reference":
- Check generation order: Referenced collections must be generated first
  dataset Data {
  customers: 10 of Customer, // Generated first
  invoices: 50 of Invoice // Can reference customers
  }
- Ensure filter matches: Check that where conditions can be satisfied
  // If no customers have status "vip", this will fail
  customer: any of customers where .status == "vip"
If a plugin fails to load:
- Check plugin path is relative to config file location
- For npm packages, ensure they're installed:
  npm install vague-plugin-foo
- Use --verbose to see discovered plugins
# Enable all debug output
node dist/cli.js file.vague --debug
# Filter by component
VAGUE_DEBUG=generator,constraint node dist/cli.js file.vague
npm run build # Compile TypeScript
npm test # Run tests
npm run dev # Watch mode
src/
├── lexer/ # Tokenizer
├── parser/ # Recursive descent parser
├── ast/ # AST node definitions
├── interpreter/ # JSON generator
├── validator/ # Schema validation (Ajv)
├── openapi/ # OpenAPI import support
├── infer/ # Schema inference from data
├── csv/ # CSV input/output formatting
├── ndjson/ # NDJSON (newline-delimited JSON) formatting
├── config/ # Configuration file loading (vague.config.js)
├── logging/ # Debug logging utilities
├── plugins/ # Built-in plugins (faker, issuer, date, regex, http, sql, graphql)
├── spectral/ # OpenAPI linting with Spectral
├── index.ts # Library exports
└── cli.ts # CLI entry point
See TODO.md for planned features:
- Probabilistic constraints (assume X with probability 0.7)
- Conditional schema variants
- Constraint solving (SMT integration)
This project includes Claude Code skills that help Claude assist you more effectively when working with Vague files and OpenAPI specifications.
| Skill | Description |
|---|---|
| vague | Writing Vague (.vague) files - syntax, constraints, cross-references |
| openapi | Working with OpenAPI specs - validation, schemas, best practices |
Install the skills using OpenSkills:
npm i -g openskills
openskills install mcclowes/vague
This installs the skills to your .claude/skills/ directory, making them available when you use Claude Code in this project.
Alternatively, copy the skills directly:
git clone https://github.com/mcclowes/vague.git
cp -r vague/.claude/skills/* ~/.claude/skills/
See CONTRIBUTING.md for guidelines.
MIT License - see LICENSE