Open
LLMSchemaCompareOperator / @task.llm_schema_compare
Cross-system schema drift detection powered by LLM reasoning.
What
Compare schemas across different database systems (PostgreSQL, Snowflake, S3 Parquet, etc.) and identify mismatches that would break data loading. The LLM handles complex cross-system type mapping that simple equality checks miss (e.g., varchar(255) vs string, timestamp vs timestamptz).
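To make the type-mapping point concrete, here is a minimal sketch of why plain string equality falls short. The alias table and the helper names (`TYPE_FAMILIES`, `base_type`, `naive_equal`, `family_equal`) are hypothetical and exist only for illustration; they are not part of the proposed operator.

```python
# Hypothetical, hand-written alias table for illustration only.
# A real cross-system mapping would be far larger and dialect-specific.
TYPE_FAMILIES = {
    "varchar": "text",
    "string": "text",
    "text": "text",
    "timestamp": "timestamp_naive",
    "timestamptz": "timestamp_tz",
}


def base_type(declared: str) -> str:
    """Drop length/precision arguments, e.g. 'varchar(255)' -> 'varchar'."""
    return declared.split("(", 1)[0].strip().lower()


def naive_equal(a: str, b: str) -> bool:
    """Simple equality check on the declared type strings."""
    return a.strip().lower() == b.strip().lower()


def family_equal(a: str, b: str) -> bool:
    """Compare types after mapping each one to a coarse type family."""
    fam_a = TYPE_FAMILIES.get(base_type(a), base_type(a))
    fam_b = TYPE_FAMILIES.get(base_type(b), base_type(b))
    return fam_a == fam_b


# Equality flags a compatible pair as a mismatch ...
print(naive_equal("varchar(255)", "string"))     # False
# ... while a family lookup treats it as compatible.
print(family_equal("varchar(255)", "string"))    # True
# Time-zone awareness is a real difference and stays a mismatch.
print(family_equal("timestamp", "timestamptz"))  # False
```

A static table like this has to be maintained for every pair of systems and says nothing about whether a difference actually breaks loading, which is the gap the LLM-based comparison is meant to cover.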
Design
- Accepts multiple data_sources (or db_conn_ids + table_names) for cross-system comparison
- Schema introspection from each source via the appropriate hook (DbApiHook, S3Hook, etc.)
- System prompt includes schema context from all sources with clear labeling (database name, dialect)
- reasoning_mode=True strongly recommended: complex cross-system type mapping benefits from step-by-step analysis
- context_strategy="full" for thorough analysis (includes constraints, indexes, clustering keys)
- Structured output: list of mismatches, severity, suggested migration actions
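The labeled schema context described above could be assembled roughly as follows. This is a sketch under assumptions: `SourceSchema` and `build_schema_context` are hypothetical names, and the column data is hard-coded here rather than introspected through a hook.

```python
from dataclasses import dataclass


@dataclass
class SourceSchema:
    """Hypothetical container for one introspected source."""
    name: str                # e.g. a connection id
    dialect: str             # e.g. "postgresql", "snowflake"
    table: str
    columns: dict[str, str]  # column name -> declared type


def build_schema_context(sources: list[SourceSchema]) -> str:
    """Render every source as a clearly labeled section for the system prompt."""
    sections = []
    for src in sources:
        lines = [f"### Source: {src.name} (dialect: {src.dialect}), table: {src.table}"]
        for column, declared_type in src.columns.items():
            lines.append(f"- {column}: {declared_type}")
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


context = build_schema_context([
    SourceSchema("postgres_source", "postgresql", "customers",
                 {"id": "integer", "email": "varchar(255)"}),
    SourceSchema("snowflake_target", "snowflake", "customers",
                 {"id": "number(38,0)", "email": "string"}),
])
print(context)
```

Labeling each section with the source name and dialect lets the model attribute every column to its system when reasoning about type compatibility.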
Use Cases
- Detect breaking schema changes before ETL runs
- Generate migration plans during maintenance windows
- Validate schema consistency across data warehouse replicas
- Compare source system schemas against downstream expectations
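For the first use case, a downstream step might gate the ETL run on the structured result. The shape below is an assumption, not the operator's actual output schema: `Severity`, `Mismatch`, and `has_breaking_changes` are hypothetical names that mirror the "mismatches, severity, suggested migration actions" fields from the design.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    BREAKING = "breaking"


@dataclass
class Mismatch:
    """Hypothetical record for one detected schema difference."""
    column: str
    source_type: str
    target_type: str
    severity: Severity
    suggested_action: str


def has_breaking_changes(mismatches: list[Mismatch]) -> bool:
    """Return True if any mismatch would break data loading."""
    return any(m.severity is Severity.BREAKING for m in mismatches)


# Example result, written by hand for illustration.
result = [
    Mismatch("email", "varchar(255)", "string", Severity.INFO,
             "No action needed"),
    Mismatch("created_at", "timestamp", "timestamptz", Severity.BREAKING,
             "Convert source values to an explicit time zone before loading"),
]

if has_breaking_changes(result):
    print("Blocking ETL run: breaking schema drift detected")
```

In a DAG, the same check could raise an exception or short-circuit downstream tasks instead of printing.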
Example
schema_drift = LLMSchemaCompareOperator(
    task_id="detect_schema_drift",
    data_sources=[customer_s3, customer_postgres, customer_snowflake],
    prompt="Identify schema mismatches that would break data loading between systems",
    reasoning_mode=True,
    context_strategy="full",
    llm_conn_id="openai_default",
)

# Decorator version
@task.llm_schema_compare(
    db_conn_ids=["postgres_source", "snowflake_target"],
    table_names=["customers"],
)
def check_migration_readiness():
    is_maintenance = check_migration_window()
    if is_maintenance:
        return "Compare schemas and generate migration plan for maintenance window"
    return "Compare schemas and flag breaking changes - no migrations allowed"

Dependencies
- LLMOperator (merged)
- Multi-datasource support (for cross-database introspection)
Phase
Phase 3
Status
In progress