Fixes for CTE (de)serialization compatibility with older versions #19393
Merged
Contributor
At least the pain is only temporary 🙈
Mytherin
added a commit
that referenced
this pull request
Oct 17, 2025
Follow-up from #19393. There are a number of issues still caused by serializing CTE nodes - this PR makes it so that we only serialize CTE nodes when MATERIALIZED is explicitly defined, and serialize only the CommonTableExpressionMap otherwise. In addition, we never deserialize CTENodes anymore - and always reconstruct them from the CommonTableExpressionMap.
Y--
pushed a commit
to motherduckdb/public-duckdb
that referenced
this pull request
Oct 17, 2025
…ckdb#19393)

The CTE (de)serialization code in v1.4 has a number of issues:

* It is writing `CTEMaterialize` for CTENodes which is not supported in older DuckDB versions, causing forwards compatibility to break and older versions not to be able to read DuckDB files written by v1.4 when they contain CTEs that are explicitly labeled as `MATERIALIZED` or `NOT MATERIALIZED`
* It is not correctly de-duplicating CTENodes from the CommonTableExpressionMap, causing some old CTEs to not be readable anymore (as explained here - duckdb#19351)

This has all already been fundamentally fixed in main in duckdb#19351. However, in order to also fix this for following v1.4 versions (v1.4.2) - this PR patches the serialization code in a lower-risk manner. Effectively:

* We no longer write `CTEMaterialize` for CTENodes. Instead, we only write the CTENode if it should be materialized. Otherwise, we don't write it to the file.
* When reading a `QueryNode`, we immediately perform "re-duplication" by extracting CTENodes from the `CommonTableExpressionMap`. This fixes an issue where we were not correctly re-duplicating at all levels.
* When deserializing a CTENode, we perform de-duplication of CTEs within its child. This fixes an issue where the above re-duplication could cause the same CTE to appear multiple times depending on from which version we are de-serializing.

All of this is only necessary for v1.4 - and when we merge v1.4 into main this code should be deleted.

(cherry picked from commit d8ae68f)
Y--
pushed a commit
to motherduckdb/public-duckdb
that referenced
this pull request
Oct 17, 2025
This cherry picks the functionality from duckdb#19420.

Follow-up from duckdb#19393. There are a number of issues still caused by serializing CTE nodes - this PR makes it so that we only serialize CTE nodes when MATERIALIZED is explicitly defined, and serialize only the CommonTableExpressionMap otherwise. In addition, we never deserialize CTENodes anymore - and always reconstruct them from the CommonTableExpressionMap.

---------

Co-authored-by: Mark <[email protected]>
github-actions bot
pushed a commit
to duckdb/duckdb-r
that referenced
this pull request
Oct 21, 2025
Fixes for CTE (de)serialization compatibility with older versions (duckdb/duckdb#19393)
BUGFIX: Silent failure to write row groups with large lists (duckdb/duckdb#19376)
Throw if non-`VARCHAR` key is passed to `json_object` (duckdb/duckdb#19365)
add test tag support [vfs integration tests p1] (duckdb/duckdb#19331)
github-actions bot
added a commit
to duckdb/duckdb-r
that referenced
this pull request
Oct 21, 2025
Fixes for CTE (de)serialization compatibility with older versions (duckdb/duckdb#19393)
BUGFIX: Silent failure to write row groups with large lists (duckdb/duckdb#19376)
Throw if non-`VARCHAR` key is passed to `json_object` (duckdb/duckdb#19365)
add test tag support [vfs integration tests p1] (duckdb/duckdb#19331)

Co-authored-by: krlmlr <[email protected]>
The CTE (de)serialization code in v1.4 has a number of issues:

* It is writing `CTEMaterialize` for CTENodes which is not supported in older DuckDB versions, causing forwards compatibility to break and older versions not to be able to read DuckDB files written by v1.4 when they contain CTEs that are explicitly labeled as `MATERIALIZED` or `NOT MATERIALIZED`
* It is not correctly de-duplicating CTENodes from the CommonTableExpressionMap, causing some old CTEs to not be readable anymore (as explained here - #19351)

This has all already been fundamentally fixed in main in #19351. However, in order to also fix this for following v1.4 versions (v1.4.2) - this PR patches the serialization code in a lower-risk manner. Effectively:

* We no longer write `CTEMaterialize` for CTENodes. Instead, we only write the CTENode if it should be materialized. Otherwise, we don't write it to the file.
* When reading a `QueryNode`, we immediately perform "re-duplication" by extracting CTENodes from the `CommonTableExpressionMap`. This fixes an issue where we were not correctly re-duplicating at all levels.
* When deserializing a CTENode, we perform de-duplication of CTEs within its child. This fixes an issue where the above re-duplication could cause the same CTE to appear multiple times depending on from which version we are de-serializing.

All of this is only necessary for v1.4 - and when we merge v1.4 into main this code should be deleted.
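
For readers who want to see the shape of the write rule and the two read-side steps, here is a rough Python model. It is only a sketch under stated assumptions and not DuckDB's C++ code: `CteInfo`, `SelectNode`, `CteNode` and every helper function are hypothetical names invented for illustration, CTE queries are plain strings, and "re-duplication" is assumed to mean wrapping a node in one CTE node per map entry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class CteInfo:
    """Hypothetical stand-in for one entry of the CTE map."""
    query: str
    materialized: bool = False  # True only when the CTE should be materialized


@dataclass
class SelectNode:
    """Hypothetical stand-in for a query node that owns a CTE map."""
    label: str
    cte_map: Dict[str, CteInfo] = field(default_factory=dict)


@dataclass
class CteNode:
    """Hypothetical stand-in for a CTE node wrapping a child node."""
    name: str
    query: str
    child: Union["CteNode", SelectNode]


def should_write_cte_node(info: CteInfo) -> bool:
    # Write rule: emit the CTE node only if it should be materialized.
    return info.materialized


def reduplicate(node: SelectNode) -> Union[CteNode, SelectNode]:
    # Read rule: rebuild one wrapping CTE node per entry in the map,
    # keeping the map's order from outermost to innermost.
    result: Union[CteNode, SelectNode] = node
    for name, info in reversed(list(node.cte_map.items())):
        result = CteNode(name, info.query, result)
    return result


def deduplicate(node: Union[CteNode, SelectNode]) -> Union[CteNode, SelectNode]:
    # Drop any nested CTE node whose name already appeared further out.
    seen = set()
    kept: List[CteNode] = []
    while isinstance(node, CteNode):
        if node.name not in seen:
            seen.add(node.name)
            kept.append(node)
        node = node.child
    result: Union[CteNode, SelectNode] = node
    for cte in reversed(kept):
        result = CteNode(cte.name, cte.query, result)
    return result


def cte_names(node: Union[CteNode, SelectNode]) -> List[str]:
    # Helper for inspection: names of the wrapping CTE nodes, outermost first.
    names: List[str] = []
    while isinstance(node, CteNode):
        names.append(node.name)
        node = node.child
    return names
```

In this model, a file written by a version that stored both the CTE node and the map entry would yield the same name twice after `reduplicate`, and `deduplicate` collapses it back to a single node per name.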