A production-ready data stack implementation combining Meltano for ELT orchestration, DuckDB for high-performance analytics, and dbt Core for data transformations.
CSV Data → Meltano (Extract/Load) → DuckDB → dbt Core (Transform) → Analytics Tables
Data Flow:
- Extract: Meltano's tap-csv reads sample employee data
- Load: target-duckdb loads raw data into DuckDB
- Transform: dbt models create staging views and analytics tables
- Validate: Data quality tests ensure integrity
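
The four stages above map onto three commands that appear later in this README. As a rough sketch, they could be chained from Python as shown here. The function names `build_steps` and `run_pipeline` are hypothetical helpers, not part of the project, and the code assumes `meltano` and `dbt` are on the `PATH` of the active virtual environment.

```python
import os
import subprocess
from pathlib import Path

# Flags copied from the dbt commands used elsewhere in this README.
DBT_ARGS = ["--profile", "data_stack", "--project-dir", "."]


def build_steps(project_root):
    """Return (label, command, working directory) for each pipeline stage."""
    root = Path(project_root)
    transform_dir = root / "transform"
    return [
        ("extract-load", ["meltano", "run", "tap-csv", "target-duckdb"], root),
        ("transform", ["dbt", "run", *DBT_ARGS], transform_dir),
        ("validate", ["dbt", "test", *DBT_ARGS], transform_dir),
    ]


def run_pipeline(project_root="."):
    """Run each stage in order; check=True stops at the first failing command."""
    env = {**os.environ, "DBT_PROFILES_DIR": "./profiles/duckdb"}
    for label, command, cwd in build_steps(project_root):
        print(f"[{label}] {' '.join(command)}")
        subprocess.run(command, cwd=cwd, env=env, check=True)
```

Calling `run_pipeline()` from the project root would run extract/load first, so the dbt steps only start once the raw data is in DuckDB.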
- Meltano 3.8.0 - DataOps platform for ELT pipelines
- DuckDB 1.3.2 - High-performance in-process analytics database
- dbt Core 1.10.4 - Data transformation framework
- dbt-duckdb 1.9.4 - DuckDB adapter for dbt
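
To confirm that an environment matches the versions listed above, a small check along these lines may help. It is only a sketch: `installed_version` and `version_report` are hypothetical helper names, and it assumes all four packages are installed into the same virtual environment under their usual PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README, keyed by PyPI distribution name.
EXPECTED = {
    "meltano": "3.8.0",
    "duckdb": "1.3.2",
    "dbt-core": "1.10.4",
    "dbt-duckdb": "1.9.4",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_report(expected=EXPECTED):
    """Map each package name to a tuple of (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (wanted, found, found == wanted)
    return report
```

Iterating over `version_report().items()` and printing each tuple gives a quick view of which packages are missing or differ from the listed versions.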
# Clone and navigate to project
cd claude-data-stack-mcp
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
# Install all dependencies
pip install -r requirements.txt
# Step 1: Extract and Load with Meltano
meltano run tap-csv target-duckdb
# Step 2: Transform with dbt
cd transform
DBT_PROFILES_DIR=./profiles/duckdb dbt run --profile data_stack --project-dir .
# Step 3: Validate with data quality tests
DBT_PROFILES_DIR=./profiles/duckdb dbt test --profile data_stack --project-dir .
# Check transformed data
python -c "
import duckdb
conn = duckdb.connect('data/warehouse/data_stack.duckdb')
print('=== Department Stats ===')
result = conn.execute('SELECT * FROM main.agg_department_stats').fetchall()
for row in result: print(row)
"
├── data/
│   ├── sample_data.csv          # Sample employee data
│   └── warehouse/               # DuckDB database files
├── transform/                   # dbt project
│   ├── models/
│   │   ├── staging/             # Staging models
│   │   │   ├── stg_employees.sql
│   │   │   └── sources.yml
│   │   └── marts/               # Analytics models
│   │       ├── dim_employees.sql
│   │       └── agg_department_stats.sql
│   ├── profiles/duckdb/         # Project-contained profiles
│   └── dbt_project.yml
├── meltano.yml                  # Meltano configuration
├── requirements.txt             # Python dependencies
└── README.md                    # This file
- stg_employees: Clean, typed employee data from raw CSV
- dim_employees: Employee dimension with salary tiers (Junior/Mid-Level/Senior)
- agg_department_stats: Department-level aggregations (count, avg/min/max salary, total payroll)
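
The logic behind the two marts models can be illustrated in plain Python. This is a sketch under assumptions, not the SQL in the repository: the tier thresholds and the output field names are hypothetical, while the `department` and `annual_salary` column names follow the example model shown in this README.

```python
from collections import defaultdict

# Hypothetical thresholds; the real tier boundaries live in dim_employees.sql.
SENIOR_MIN = 100_000
MID_LEVEL_MIN = 60_000


def salary_tier(annual_salary):
    """Classify a salary into the three tiers named for dim_employees."""
    if annual_salary >= SENIOR_MIN:
        return "Senior"
    if annual_salary >= MID_LEVEL_MIN:
        return "Mid-Level"
    return "Junior"


def department_stats(employees):
    """Aggregate rows shaped like {'department': str, 'annual_salary': number}."""
    by_department = defaultdict(list)
    for row in employees:
        by_department[row["department"]].append(row["annual_salary"])
    return {
        department: {
            "employee_count": len(salaries),
            "avg_salary": sum(salaries) / len(salaries),
            "min_salary": min(salaries),
            "max_salary": max(salaries),
            "total_payroll": sum(salaries),
        }
        for department, salaries in by_department.items()
    }
```

In the dbt project the same grouping is expressed as a `group by department` query, and DuckDB does the aggregation.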
- Unique constraints: Employee IDs must be unique
- Not null constraints: Employee IDs cannot be null
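
What these two tests check can be sketched in plain Python. The helper names below are hypothetical, and `employee_id` in the usage note is an assumed column name. As with dbt's generic `unique` test, null values are left to the not-null check and are not counted as duplicates.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Return the rows whose value in `column` is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Return the non-null values that occur more than once in `column`."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

For example, `unique_failures(rows, "employee_id")` returning an empty list corresponds to a passing uniqueness test; any value it returns is a duplicated ID.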
# Browse available extractors
meltano discover extractors
# Add a new extractor (e.g., PostgreSQL)
meltano add extractor tap-postgres
# Configure in meltano.yml and run
meltano run tap-postgres target-duckdb
-- transform/models/marts/new_model.sql
{{ config(materialized='table') }}
select
    department,
    count(*) as employee_count,
    avg(annual_salary) as avg_salary
from {{ ref('stg_employees') }}
group by department
# Test individual dbt models
cd transform
DBT_PROFILES_DIR=./profiles/duckdb dbt run --models stg_employees --profile data_stack --project-dir .
# Run only marts models
DBT_PROFILES_DIR=./profiles/duckdb dbt run --models marts --profile data_stack --project-dir .
# Generate documentation
DBT_PROFILES_DIR=./profiles/duckdb dbt docs generate --profile data_stack --project-dir .
- Extractor: tap-csv configured for data/sample_data.csv
- Loader: target-duckdb configured for data/warehouse/data_stack.duckdb
- Environments: dev, staging, prod
- Profile: data_stack with project-contained profiles
- Target: DuckDB database in data/warehouse/
- Materializations: Views for staging, tables for marts
"Table does not exist" errors:
- Ensure Meltano ELT step completed successfully
- Check that data/warehouse/data_stack.duckdb exists
dbt profile errors:
- Verify you're in the transform/ directory
- Set the environment variable DBT_PROFILES_DIR=./profiles/duckdb before the dbt command
Python dependency conflicts:
- Use a fresh virtual environment
- Ensure Python 3.13+ compatibility
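
The file and environment checks above can be automated with a short script. This is a sketch under assumptions: `missing_paths` and `in_virtualenv` are hypothetical helper names, and the paths assume the project layout shown in this README.

```python
import sys
from pathlib import Path


def missing_paths(project_root="."):
    """Return human-readable problems; an empty list means both paths exist."""
    root = Path(project_root)
    problems = []
    if not (root / "data" / "warehouse" / "data_stack.duckdb").is_file():
        problems.append("DuckDB file missing: run the Meltano step first")
    if not (root / "transform" / "profiles" / "duckdb").is_dir():
        problems.append("profiles directory missing: transform/profiles/duckdb")
    return problems


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix
```

Running `missing_paths()` from the project root before the dbt step should catch the "Table does not exist" case early, since dbt cannot find the source tables until Meltano has created the database file.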
# Check Meltano configuration
meltano config list
# Validate dbt setup
cd transform
DBT_PROFILES_DIR=./profiles/duckdb dbt debug --profile data_stack
# Inspect DuckDB directly
python -c "import duckdb; conn = duckdb.connect('data/warehouse/data_stack.duckdb'); print(conn.execute('SHOW TABLES').fetchall())"
- Add More Data Sources: Integrate APIs, databases, or files using Meltano's extensive extractor library
- Expand Transformations: Create more sophisticated dbt models for advanced analytics
- Add Orchestration: Integrate with Airflow, Prefect, or other orchestration tools
- Enable Monitoring: Add data quality monitoring and alerting
- Scale Storage: Migrate to cloud data warehouses (Snowflake, BigQuery, etc.)
NEW: Claude Code MCP server for intelligent dbt assistance!
# Start dbt MCP server for Claude Code integration
./scripts/start_dbt_mcp.sh
Capabilities:
- dbt CLI Operations: dbt_run, dbt_test, dbt_compile, dbt_build
- Project Discovery: Model listing, metadata analysis, lineage exploration
- Database Querying: Direct SQL execution against DuckDB warehouse
- Real-time Assistance: Context-aware dbt project support
Documentation:
- 🚀 Quick Start Guide - 5-minute setup
- 📚 API Reference - Complete tool documentation
- 💡 Usage Examples - Practical workflows
- 🔧 Integration Guide - Detailed configuration
This data stack has been implemented and validated through testing. All components use the latest mutually compatible versions and follow best practices. The stack also includes Claude Code MCP integration for dbt development assistance.