ValidateLite

PyPI version | Python 3.8+ | License: MIT | Code Coverage

ValidateLite: A lightweight, scenario-driven data validation tool for modern data practitioners.

Whether you're a data scientist cleaning a messy CSV, a data engineer building robust pipelines, or a developer needing a quick check, ValidateLite provides powerful, focused commands for your use case:

  • vlite check: For quick, ad-hoc data checks. Need to verify if a column is unique or not null right now? The check command gets you an answer in seconds, zero config required.

  • vlite schema: For robust, repeatable, and automated validation. Define your data's contract in a JSON schema and let ValidateLite verify everything from data types and ranges to complex type-conversion feasibility. (Both commands are sketched just below.)
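
At a glance, a minimal sketch using only the flags demonstrated later in this README:

# Ad-hoc: one rule, one command, no config
vlite check --conn "customers.csv" --table customers --rule "unique(email)"

# Repeatable: validate a schema contract you keep in version control
vlite schema --conn "mysql://user:pass@host/db" --rules ./schemas/customers_schema.json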


Who is it for?

For the Data Scientist: Preparing Data for Analysis

You have a messy dataset (legacy_data.csv) where everything is a string. Before you can build a model, you need to clean it up and convert columns to their proper types (integer, float, date). How much work will it be?

Instead of writing complex cleaning scripts first, use vlite schema to assess the feasibility of the cleanup.

1. Define Your Target Schema (rules.json)

Create a schema file that describes the current type and the desired type.

{
  "legacy_users": {
    "rules": [
      {
        "field": "user_id",
        "type": "string",
        "desired_type": "integer",
        "required": true
      },
      {
        "field": "salary",
        "type": "string",
        "desired_type": "float(10,2)",
        "required": true
      },
      {
        "field": "bio",
        "type": "string",
        "desired_type": "string(500)",
        "required": false
      }
    ]
  }
}

2. Run the Validation

vlite schema --conn legacy_data.csv --rules rules.json

ValidateLite will generate a report telling you exactly what can and cannot be converted, saving you hours of guesswork.

FIELD VALIDATION RESULTS
========================

Field: user_id
  ✓ Field exists (string)
  ✓ Not Null constraint
  ✗ Type Conversion Validation (string → integer): 15 incompatible records found

Field: salary
  ✓ Field exists (string)
  ✗ Type Conversion Validation (string → float(10,2)): 8 incompatible records found

Field: bio
  ✓ Field exists (string)
  ✓ Length Constraint Validation (string → string(500)): PASSED

For the Data Engineer: Ensuring Data Integrity in CI/CD

You need to prevent breaking schema changes and bad data from ever reaching production. Embed ValidateLite into your CI/CD pipeline to act as a quality gate.

Example Workflow (.github/workflows/ci.yml)

This workflow automatically validates the database schema on every pull request.

jobs:
  validate-db-schema:
    name: Validate Database Schema
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install ValidateLite
        run: pip install validatelite

      - name: Run Schema Validation
        run: |
          vlite schema --conn "mysql://${{ secrets.DB_USER }}:${{ secrets.DB_PASS }}@${{ secrets.DB_HOST }}/sales" \
                       --rules ./schemas/customers_schema.json \
                       --fail-on-error

This same approach can be used to monitor data quality at every stage of your ETL/ELT pipelines, preventing "garbage in, garbage out."
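
For example, a post-load step in a nightly pipeline might look like this (a sketch: the connection string, table name, and thresholds are placeholders for your own):

# After loading the staging table, verify the batch before promoting it
vlite check --conn "mysql://etl_user:pass@warehouse/staging" --table orders \
  --rule "not_null(order_id)" \
  --rule "unique(order_id)" \
  --rule "range(amount, 0, 1000000)"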


Quick Start: Ad-Hoc Checks with check

For temporary, one-off validation needs, the check command is your best friend. You can run multiple rules on any supported data source (files or databases) directly from the command line.

1. Install (if you haven't already):

pip install validatelite

2. Run a check:

# Check for nulls and uniqueness in a CSV file
vlite check --conn "customers.csv" --table customers \
  --rule "not_null(id)" \
  --rule "unique(email)"

# Check value ranges and formats in a database table
vlite check --conn "mysql://user:pass@host/db" --table customers \
  --rule "range(age, 18, 99)" \
  --rule "enum(status, 'active', 'inactive')"

Learn More


📝 Development Blog

Follow the journey of building ValidateLite through our development blog posts.


📄 License

This project is licensed under the MIT License.