
Conversation

@nightscape
Contributor

Fixes lineage tracking for CTEs (Common Table Expressions) so they properly trace back to their underlying source tables instead of being treated as opaque tables.

Problem

When using a CTE like:

let employees_usa = (from employees | filter country == "USA")
from employees_usa | select {name, salary}

The lineage would incorrectly show:

inputs:
- name: employees_usa
  table: [employees_usa]  # Wrong - CTE treated as opaque

Solution

Modified lineage_of_table_decl in the resolver to check if a table declaration is a CTE (TableExpr::RelationVar) and, if so, trace back to the underlying source tables.

The lineage now correctly shows:

inputs:
- name: employees_usa
  table: [default_db, employees]  # Correct - traces to source

For CTEs with multiple inputs (UNIONs, JOINs), all source tables are included:

inputs:
- name: combined
  table: [default_db, employees]
- name: combined
  table: [default_db, contractors]
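
The tracing rule described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual prqlc code: `TableKind`, `Path`, `lineage_tables`, `table_path`, and `example_decls` are hypothetical names invented for illustration, and the real `lineage_of_table_decl` works on the resolver's own declaration types. Recursing through nested CTEs is also an assumption of this sketch, and it assumes CTEs do not reference themselves.

```rust
use std::collections::HashMap;

/// A fully qualified table path, e.g. ["default_db", "employees"].
type Path = Vec<String>;

/// Hypothetical, simplified stand-in for a table declaration.
/// The real prqlc types carry far more information.
enum TableKind {
    /// A concrete source table.
    Source,
    /// A CTE (`let` relation), with the paths of the tables it reads from.
    RelationVar { inputs: Vec<Path> },
}

/// Returns the tables that lineage should report for `path`.
fn lineage_tables(decls: &HashMap<Path, TableKind>, path: &Path) -> Vec<Path> {
    match decls.get(path) {
        // CTE: report the tables it was built from instead of the CTE itself.
        Some(TableKind::RelationVar { inputs }) => inputs
            .iter()
            .flat_map(|input| lineage_tables(decls, input))
            .collect(),
        // Source table (or unknown name): the table is its own lineage.
        _ => vec![path.clone()],
    }
}

/// Builds a `Path` from string slices (illustrative helper).
fn table_path(parts: &[&str]) -> Path {
    parts.iter().map(|part| part.to_string()).collect()
}

/// Declarations mirroring the examples in this description.
fn example_decls() -> HashMap<Path, TableKind> {
    let mut decls = HashMap::new();
    decls.insert(table_path(&["default_db", "employees"]), TableKind::Source);
    decls.insert(table_path(&["default_db", "contractors"]), TableKind::Source);
    decls.insert(
        table_path(&["employees_usa"]),
        TableKind::RelationVar {
            inputs: vec![table_path(&["default_db", "employees"])],
        },
    );
    decls.insert(
        table_path(&["combined"]),
        TableKind::RelationVar {
            inputs: vec![
                table_path(&["default_db", "employees"]),
                table_path(&["default_db", "contractors"]),
            ],
        },
    );
    decls
}
```

In this sketch, looking up `employees_usa` yields only `[default_db, employees]`, `combined` yields both source tables in declaration order, and a direct source table yields itself, which corresponds to the non-CTE code path.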

Changes

  • prqlc/prqlc/src/semantic/resolver/inference.rs: Modified lineage_of_table_decl to trace CTE lineage to underlying source tables
  • prqlc/prqlc/src/semantic/resolver/mod.rs: Added unit tests for simple CTEs and CTEs with UNIONs
  • Updated snapshot: integration__queries__debug_lineage__genre_counts.snap reflects the corrected lineage behavior

Testing

  • Added test_cte_lineage_traces_to_source_table - verifies simple CTE lineage
  • Added test_cte_lineage_with_union_traces_to_all_source_tables - verifies UNION CTE lineage
  • All existing tests pass

nightscape and others added 3 commits November 26, 2025 23:46
Previously, CTEs (let statements) were treated as opaque tables in
lineage tracking. The lineage would show `table: [cte_name]` instead of
tracing back to the actual source tables.

Now, lineage properly traces through CTEs to their underlying source
tables. For simple CTEs, this shows the original table. For CTEs with
UNIONs or JOINs, all source tables are included in the lineage.

- Simplify lineage_of_table_decl from nested if-let to two match expressions
- Add UNIO to typos allowlist (prevents UNIONs -> UNIONNs corruption)
- Add test_cte_lineage_traces_to_source_table mentioned in PR description

Co-Authored-By: Claude <[email protected]>
@max-sixty
Member

I tried to simplify this a bit with Claude's help. Feel free to revert if you preferred the earlier version, though.

- Add test_direct_table_lineage_uses_table_itself to exercise the non-CTE
  code path in lineage_of_table_decl (lines 80 and 92-96 in inference.rs)
- Fix UNIONNs typo in test comment

Co-Authored-By: Claude <[email protected]>
@nightscape
Contributor Author

nightscape commented Dec 4, 2025

@max-sixty
I'm fine with your version, thanks for the test 👍
Good to merge?

@max-sixty max-sixty merged commit a6c4404 into PRQL:main Dec 5, 2025
34 of 35 checks passed
@nightscape nightscape deleted the fix-lineage-with-ctes branch December 5, 2025 09:11