mcclowes/vague

Vague

A declarative language for describing and generating realistic data. Vague treats ambiguity as a first-class primitive — declare the shape of valid data and let the runtime figure out how to populate it.

Why Vague?

Vague is a data description model for APIs, not just a fake data tool.

Think of it as OpenAPI meets property-based testing: you describe what valid data looks like — its structure, constraints, distributions, and edge cases — and Vague handles generation. The same schema that generates test data can validate production data.

What You Need                                   Traditional Tools  Vague
Intent: "80% of users are active"               Random selection   status: 0.8: "active" | 0.2: "inactive"
Constraints: "due date ≥ issued date"           Manual validation  assume due_date >= issued_date
Relationships: "payment references an invoice"  Manual wiring      invoice: any of invoices where .status == "open"
Edge cases: "test with Unicode exploits"        Manual creation    name: issuer.homoglyph("admin")
Validation: "does this data match the schema?"  Separate tool      Same .vague file with --validate-data

The question isn't "which fake data library?" — it's "how do we formally describe what valid data looks like for our APIs?"

For a detailed comparison, see COMPARISON.md.

Installation

npm install vague-lang

Or install globally for CLI usage:

npm install -g vague-lang

Quick Start

Create a .vague file:

schema Customer {
  name: string,
  status: 0.8: "active" | 0.2: "inactive"
}

schema Invoice {
  customer: any of customers,
  amount: decimal in 100..10000,
  status: "draft" | "sent" | "paid",

  assume amount > 0
}

dataset TestData {
  customers: 50 of Customer,
  invoices: 200 of Invoice
}

Generate JSON:

node dist/cli.js your-file.vague

Syntax Cheat Sheet

For a quick reference of all syntax, see SYNTAX.md.

Language Features

Superposition (Random Choice)

// Equal probability
status: "draft" | "sent" | "paid"

// Weighted probability
status: 0.6: "paid" | 0.3: "pending" | 0.1: "draft"

// Mixed: unweighted options share remaining probability
status: 0.85: "Active" | "Archived"         // "Archived" gets 15%
category: 0.6: "main" | "side" | "dessert"  // "side" and "dessert" get 20% each
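
One way to read the mixed form is that explicit weights are taken as given and the unweighted options split whatever probability is left. The Python sketch below illustrates that arithmetic only; `resolve_weights` and `pick` are hypothetical names, not part of Vague, and the runtime's actual implementation may differ.

```python
import random


def resolve_weights(options):
    """Turn (weight_or_None, value) pairs into fully weighted pairs.

    Options without a weight share the leftover probability evenly.
    """
    explicit = sum(w for w, _ in options if w is not None)
    unweighted = sum(1 for w, _ in options if w is None)
    if explicit > 1 + 1e-9:
        raise ValueError("explicit weights add up to more than 1")
    share = (1 - explicit) / unweighted if unweighted else 0.0
    return [(share if w is None else w, v) for w, v in options]


def pick(options, rng):
    """Draw one value according to the resolved weights."""
    resolved = resolve_weights(options)
    weights = [w for w, _ in resolved]
    values = [v for _, v in resolved]
    return rng.choices(values, weights=weights, k=1)[0]


# Mirrors: category: 0.6: "main" | "side" | "dessert"
category = [(0.6, "main"), (None, "side"), (None, "dessert")]
resolved_category = resolve_weights(category)  # "side" and "dessert" each get about 0.2
sample = pick(category, random.Random(123))    # seeded for repeatability
```

With no explicit weights at all, the same function gives every option an equal share, which matches the equal-probability form.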

Ranges

age: int in 18..65
price: decimal in 0.01..999.99
founded: date in 2000..2023

// Decimal with explicit precision
score: decimal(1) in 0..10       // 1 decimal place
amount: decimal(2) in 10..100    // 2 decimal places

Collections

line_items: 1..5 of LineItem    // 1-5 items
employees: 100 of Employee       // Exactly 100

Constraints

schema Invoice {
  issued_date: int in 1..28,
  due_date: int in 1..90,
  status: "draft" | "paid",
  amount: int in 0..10000,

  // Hard constraint
  assume due_date >= issued_date,

  // Conditional constraint
  assume if status == "paid" {
    amount == 0
  }
}

Logical operators: and, or, not

Cross-Record References

schema Invoice {
  // Reference any customer from the collection
  customer: any of customers,

  // Filtered reference
  active_customer: any of customers where .status == "active"
}

Parent References

schema LineItem {
  // Inherit currency from parent invoice
  currency: ^base_currency
}

schema Invoice {
  base_currency: "USD" | "GBP" | "EUR",
  line_items: 1..5 of LineItem
}

Computed Fields

schema Invoice {
  line_items: 1..10 of LineItem,

  total: sum(line_items.amount),
  item_count: count(line_items),
  avg_price: avg(line_items.unit_price),
  min_price: min(line_items.unit_price),
  max_price: max(line_items.unit_price),
  median_price: median(line_items.unit_price),
  first_item: first(line_items.unit_price),
  last_item: last(line_items.unit_price),
  price_product: product(line_items.unit_price)
}
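
For readers checking generated output by hand, the aggregates above correspond to ordinary list reductions. The Python sketch below recomputes them for one made-up invoice; the sample line items are invented for illustration, and the mapping to Python functions is an assumption about the intended semantics rather than a description of Vague's interpreter.

```python
import math
import statistics

# Hypothetical line items, standing in for one generated invoice.
line_items = [
    {"amount": 20.0, "unit_price": 10.0},
    {"amount": 15.0, "unit_price": 5.0},
    {"amount": 8.0, "unit_price": 8.0},
]
unit_prices = [item["unit_price"] for item in line_items]

invoice = {
    "total": sum(item["amount"] for item in line_items),  # sum(line_items.amount)
    "item_count": len(line_items),                        # count(line_items)
    "avg_price": statistics.mean(unit_prices),            # avg(...)
    "min_price": min(unit_prices),                        # min(...)
    "max_price": max(unit_prices),                        # max(...)
    "median_price": statistics.median(unit_prices),       # median(...)
    "first_item": unit_prices[0],                         # first(...)
    "last_item": unit_prices[-1],                         # last(...)
    "price_product": math.prod(unit_prices),              # product(...)
}
```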

Nullable Fields

nickname: string?           // Shorthand: sometimes null
notes: string | null        // Explicit

Ternary Expressions

status: amount_paid >= total ? "paid" : "pending"
grade: score >= 90 ? "A" : score >= 70 ? "B" : "C"

Match Expressions

// Pattern matching for multi-way branching
display: match status {
  "pending" => "Awaiting shipment",
  "shipped" => "On the way",
  "delivered" => "Complete"
}

// Returns null if no pattern matches

Conditional Fields

schema Account {
  type: "personal" | "business",
  companyNumber: string when type == "business"  // Only exists for business accounts
}

Dynamic Cardinality

schema Order {
  size: "small" | "large",
  items: (size == "large" ? 5..10 : 1..3) of LineItem
}

Side Effects (then blocks)

schema Payment {
  invoice: any of invoices,
  amount: int in 10..500
} then {
  invoice.amount_paid += amount,
  invoice.status = invoice.amount_paid >= invoice.total ? "paid" : "partial"
}
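
The effect of the then block can be pictured as a mutation applied to the referenced invoice each time a payment is generated. The Python sketch below models that with plain dictionaries; `apply_payment` is a hypothetical helper, and the starting `total`, `amount_paid`, and `status` values are invented for the example.

```python
def apply_payment(invoice, payment):
    """Mirror the then block: add to amount_paid, then recompute status."""
    invoice["amount_paid"] += payment["amount"]
    invoice["status"] = (
        "paid" if invoice["amount_paid"] >= invoice["total"] else "partial"
    )


# Hypothetical invoice that three generated payments happen to reference.
invoice = {"total": 300, "amount_paid": 0, "status": "sent"}
history = []
for amount in (100, 150, 50):
    apply_payment(invoice, {"amount": amount})
    history.append(invoice["status"])
```

The status moves through "partial" until the running total reaches the invoice total, which is why the order in which payments are generated matters for the final data.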

Unique Values

id: unique int in 1000..9999    // No duplicates in collection

Private Fields

schema Person {
  age: private int in 0..105,                    // Generated but excluded from output
  age_bracket: age < 18 ? "minor" : "adult"    // Computed from private field
}
// Output: { "age_bracket": "adult" } -- no "age" field

Ordered Sequences

pitch: [48, 52, 55, 60]   // Cycles in order: 48, 52, 55, 60, 48...
color: ["red", "green", "blue"]
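
The cycling behaviour described in the comment above is the same as repeating a list indefinitely and taking one element per record. A minimal Python illustration, not Vague code:

```python
from itertools import cycle, islice

# Mirrors: pitch: [48, 52, 55, 60]
pitches = cycle([48, 52, 55, 60])

# Six consecutive records wrap around after the fourth value.
first_six = list(islice(pitches, 6))
```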

Statistical Distributions

age: gaussian(35, 10, 18, 65)     // mean, stddev, min, max
income: lognormal(10.5, 0.5)      // mu, sigma
wait_time: exponential(0.5)       // rate
daily_orders: poisson(5)          // lambda
conversion: beta(2, 5)            // alpha, beta

Date Functions

created_at: now()                 // Full ISO 8601 timestamp
today_date: today()               // Date only
past: daysAgo(30)                 // 30 days ago
future: daysFromNow(90)           // 90 days from now
random: datetime(2020, 2024)      // Random datetime in range
between: dateBetween("2023-01-01", "2023-12-31")

Sequential Generation

id: sequence("INV-", 1001)        // "INV-1001", "INV-1002", ...
order_num: sequenceInt("orders")  // 1, 2, 3, ...
prev_value: previous("amount")    // Reference previous record
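
These helpers behave like counters that persist across records. The Python sketch below imitates them under stated assumptions: the function names are hypothetical stand-ins, and returning `None` for the first record's previous value is a guess at the behaviour, not something this README specifies.

```python
from itertools import count


def sequence(prefix, start):
    """Yield prefix + number, counting up from start (like sequence("INV-", 1001))."""
    for n in count(start):
        yield f"{prefix}{n}"


_counters = {}


def sequence_int(name):
    """Named integer counter starting at 1 (like sequenceInt("orders"))."""
    _counters[name] = _counters.get(name, 0) + 1
    return _counters[name]


def with_previous(amounts):
    """Attach the previous record's amount to each record (like previous("amount"))."""
    prev = None  # assumption: no previous value exists for the first record
    for amount in amounts:
        yield {"amount": amount, "prev_value": prev}
        prev = amount


ids = sequence("INV-", 1001)
first_ids = [next(ids) for _ in range(3)]
order_numbers = [sequence_int("orders") for _ in range(3)]
records = list(with_previous([10, 25, 40]))
```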

String Transformations

// Case transformations
upper: uppercase(name)             // "HELLO WORLD"
lower: lowercase(name)             // "hello world"
capitalized: capitalize(name)      // "Hello World"

// Case style conversions
slug: kebabCase(title)             // "hello-world"
snake: snakeCase(title)            // "hello_world"
camel: camelCase(title)            // "helloWorld"

// String manipulation
trimmed: trim("  hello  ")         // "hello"
combined: concat(first, " ", last) // "John Doe"
part: substring(name, 0, 5)        // First 5 characters
replaced: replace(name, "foo", "bar")
len: length(name)                  // String length

Negative Testing

// Generate data that violates constraints (for testing error handling)
dataset Invalid violating {
  bad_invoices: 100 of Invoice
}

Built-in Plugins

Vague includes several plugins for generating realistic domain-specific data. For complete documentation, see SYNTAX.md.

Plugin   Description                       Example
faker    Realistic personal/business data  email(), fullName(), companyName()
issuer   Edge case testing values          issuer.homoglyph("admin"), issuer.maxInt()
regex    Pattern-based generation          regex("[A-Z]{3}-[0-9]{4}"), semver()
date     Day-of-week filtering             date.weekday(2024, 2025)
http     HTTP testing data                 http.method(), http.statusCode(), env("API_KEY")
sql      SQL test data                     sql.tableName(), sql.connectionString("postgres")
graphql  GraphQL test data                 graphql.query(), graphql.error()

Examples

The examples/ directory contains organized examples for learning and reference:

Getting Started:

  • data-description-model/ - Start here: Intent encoding, constraint encoding, edge-case bias
  • basics/ - Core language features (schemas, constraints, computed fields, cross-refs)

OpenAPI Integration:

  • openapi-importing/ - Import schemas from OpenAPI specs
  • openapi-examples-generation/ - Populate OpenAPI specs with generated examples

Real-World API Examples:

  • stripe/ - Payment processing (invoices, charges, subscriptions)
  • github/ - GitHub API patterns (repos, issues, PRs)
  • slack/ - Slack webhook payloads
  • shopify/ - E-commerce data
  • codat/ - Accounting/fintech APIs
  • twilio/ - Communications platform
  • graphql/ - GraphQL-specific patterns

Advanced Topics:

  • http-testing/ - HTTP request/response patterns
  • custom-plugins/ - Creating custom plugins
  • vitest-fixtures/ - Test data for Vitest

CLI Usage

# Generate JSON to stdout
node dist/cli.js file.vague

# Save to file
node dist/cli.js file.vague -o output.json

# Pretty print
node dist/cli.js file.vague -p

# Reproducible output (seeded random)
node dist/cli.js file.vague --seed 123

# Watch mode - regenerate on file change
node dist/cli.js file.vague -o output.json -w

# CSV output
node dist/cli.js file.vague -f csv -o output.csv

# CSV with options
node dist/cli.js file.vague -f csv --csv-delimiter ";" -o output.csv

# Validate against OpenAPI spec
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}'

# Validate only (exit code 1 on failure, useful for CI)
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}' --validate-only

OpenAPI Example Population

Generate realistic examples and embed them directly in your OpenAPI spec:

# Populate OpenAPI spec with inline examples
node dist/cli.js data.vague --oas-output api-with-examples.json --oas-source api.json

# Multiple examples per schema
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-example-count 3

# External file references instead of inline
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-external

Auto-detection maps collection names to schema names (e.g., invoices → Invoice).

CLI Options

Option                    Description
-o, --output <file>       Write output to file
-f, --format <fmt>        Output format: json (default), csv, ndjson
-p, --pretty              Pretty-print JSON
-s, --seed <number>       Seed for reproducible generation
-w, --watch               Watch input file and regenerate on changes
-v, --validate <spec>     Validate against OpenAPI spec
-m, --mapping <json>      Schema mapping {"collection": "SchemaName"}
--validate-only           Only validate, don't output data
--validate-data <file>    Validate external JSON data against Vague schema
--schema <file>           Schema file for data validation
--csv-delimiter <char>    CSV field delimiter (default: ,)
--csv-no-header           Omit CSV header row
--csv-arrays <mode>       Array handling: json, first, count
--csv-nested <mode>       Nested objects: flatten, json
--infer <file>            Infer Vague schema from JSON or CSV data
--collection-name <name>  Collection name for CSV inference
--infer-delimiter <char>  CSV delimiter for inference (default: ,)
--dataset-name <name>     Dataset name for inference
--typescript              Generate TypeScript definitions alongside output
--ts-only                 Generate only TypeScript definitions (no .vague)
--oas-source <spec>       Source OpenAPI spec to populate with examples
--oas-output <file>       Output path for populated OpenAPI spec
--oas-example-count <n>   Number of examples per schema (default: 1)
--oas-external            Use external file references instead of inline
--lint-spec <file>        Lint OpenAPI spec with Spectral
--lint-verbose            Show detailed lint results (includes hints)
--plugins <dir>           Load plugins from directory (can be used multiple times)
--no-auto-plugins         Disable automatic plugin discovery
--debug                   Enable debug logging
--log-level <level>       Set log level: none, error, warn, info, debug
--verbose                 Show verbose output (e.g., discovered plugins)
-h, --help                Show help

Configuration File

Create a vague.config.js in your project root for persistent settings:

// vague.config.js
export default {
  seed: 42,              // Reproducible output
  format: 'json',        // 'json', 'csv', or 'ndjson'
  pretty: true,          // Pretty-print JSON
  plugins: [
    './my-plugin.js',    // Local plugin
    'vague-plugin-foo',  // npm package
  ],
  logging: {
    level: 'info',       // 'none', 'error', 'warn', 'info', 'debug'
    components: ['generator', 'constraint'],
  },
};

Config files are auto-discovered by searching up from the current directory.
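
Searching up from the current directory usually means checking each ancestor in turn and stopping at the first match. The Python sketch below shows that general technique; `find_config` is a hypothetical helper written for illustration, and Vague's own lookup (implemented in the project's TypeScript source) may differ in details such as which file names it accepts.

```python
from pathlib import Path
from typing import Optional


def find_config(start: Path, name: str = "vague.config.js") -> Optional[Path]:
    """Return the nearest config file at or above start, or None if there is none."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_config(Path.cwd())` from a nested folder would return the config in the closest enclosing directory, so a config placed at the project root applies to commands run anywhere beneath it.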

Troubleshooting

Constraint failures (100 retries exceeded)

If generation fails with "Maximum constraint retries exceeded":

  1. Check constraint compatibility: Ensure your constraints don't conflict

    // BAD: Impossible constraint
    value: int in 1..10,
    assume value > 100
    
    // GOOD: Compatible constraint
    value: int in 1..100,
    assume value > 50
    
  2. Widen ranges: If constraints are too tight, generation may fail frequently

    // BAD: Very narrow valid range
    age: int in 0..100,
    assume age >= 18 and age <= 21  // Only 4 valid values
    
    // GOOD: Use range directly
    age: int in 18..21
    
  3. Use --debug to diagnose: See which constraints are failing

    node dist/cli.js file.vague --debug
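
The retry limit in the error message suggests generate-and-check behaviour: produce a candidate, test it against the constraints, and try again on failure. The Python sketch below models that as rejection sampling to show why impossible or very tight constraints exhaust the retries. It is an assumption about the general approach, not Vague's actual code; `generate` and `make_value` are hypothetical names.

```python
import random

MAX_RETRIES = 100  # assumption: taken from the "100 retries exceeded" heading


def generate(make_record, constraints, rng, max_retries=MAX_RETRIES):
    """Draw candidates until one satisfies every constraint, or give up."""
    for _ in range(max_retries):
        record = make_record(rng)
        if all(check(record) for check in constraints):
            return record
    raise RuntimeError("Maximum constraint retries exceeded")


def make_value(lo, hi):
    """Build a generator for {"value": int in lo..hi} (inclusive)."""
    return lambda rng: {"value": rng.randint(lo, hi)}


rng = random.Random(42)

# Compatible: about half of all candidates pass, so a retry or two is enough.
ok = generate(make_value(1, 100), [lambda r: r["value"] > 50], rng)

# Impossible: no value in 1..10 exceeds 100, so every attempt is rejected.
try:
    generate(make_value(1, 10), [lambda r: r["value"] > 100], rng)
    impossible_failed = False
except RuntimeError:
    impossible_failed = True
```

Under this model a constraint that accepts only 4 of 101 values fails on roughly 96% of attempts, which is why declaring the range directly is the more reliable choice.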

Cross-reference "No matching items" errors

If you get "No matching items found for reference":

  1. Check generation order: Referenced collections must be generated first

    dataset Data {
      customers: 10 of Customer,    // Generated first
      invoices: 50 of Invoice       // Can reference customers
    }
    
  2. Ensure filter matches: Check that where conditions can be satisfied

    // If no customers have status "vip", this will fail
    customer: any of customers where .status == "vip"
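
A filtered reference can be thought of as "filter the already generated collection, then pick one survivor", which makes both failure modes above easy to see. The Python sketch below is illustrative only; `any_of` is a hypothetical stand-in for the `any of ... where` expression, and the sample customers are invented.

```python
import random


def any_of(collection, where=None, rng=None):
    """Pick a random item from collection, optionally restricted by a predicate."""
    rng = rng or random
    matches = [item for item in collection if where is None or where(item)]
    if not matches:
        # Happens when the collection is still empty or the filter excludes everything.
        raise LookupError("No matching items found for reference")
    return rng.choice(matches)


customers = [
    {"name": "Ada", "status": "active"},
    {"name": "Bob", "status": "inactive"},
]

active = any_of(customers, lambda c: c["status"] == "active", random.Random(1))

try:
    any_of(customers, lambda c: c["status"] == "vip")
    vip_found = True
except LookupError:
    vip_found = False
```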
    

Plugin not found

  1. Check that the plugin path is relative to the config file's location
  2. For npm packages, ensure they're installed: npm install vague-plugin-foo
  3. Use --verbose to see discovered plugins

Debug logging

# Enable all debug output
node dist/cli.js file.vague --debug

# Filter by component
VAGUE_DEBUG=generator,constraint node dist/cli.js file.vague

Development

npm run build     # Compile TypeScript
npm test          # Run tests
npm run dev       # Watch mode

Project Structure

src/
├── lexer/       # Tokenizer
├── parser/      # Recursive descent parser
├── ast/         # AST node definitions
├── interpreter/ # JSON generator
├── validator/   # Schema validation (Ajv)
├── openapi/     # OpenAPI import support
├── infer/       # Schema inference from data
├── csv/         # CSV input/output formatting
├── ndjson/      # NDJSON (newline-delimited JSON) formatting
├── config/      # Configuration file loading (vague.config.js)
├── logging/     # Debug logging utilities
├── plugins/     # Built-in plugins (faker, issuer, date, regex, http, sql, graphql)
├── spectral/    # OpenAPI linting with Spectral
├── index.ts     # Library exports
└── cli.ts       # CLI entry point

Roadmap

See TODO.md for planned features:

  • Probabilistic constraints (assume X with probability 0.7)
  • Conditional schema variants
  • Constraint solving (SMT integration)

Working with Claude

This project includes Claude Code skills that help Claude assist you more effectively when working with Vague files and OpenAPI specifications.

Available Skills

Skill    Description
vague    Writing Vague (.vague) files - syntax, constraints, cross-references
openapi  Working with OpenAPI specs - validation, schemas, best practices

Installation via OpenSkills

Install the skills using OpenSkills:

npm i -g openskills
openskills install mcclowes/vague

This installs the skills to your .claude/skills/ directory, making them available when you use Claude Code in this project.

Manual Installation

Alternatively, copy the skills directly:

git clone https://github.com/mcclowes/vague.git
cp -r vague/.claude/skills/* ~/.claude/skills/

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE

About

A constraint and probability-based programming language
