-
Notifications
You must be signed in to change notification settings - Fork 247
fix(postprocess): ensure columns required by sorting are properly redirected #5464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(postprocess): ensure columns required by sorting are properly redirected #5464
Conversation
…irected Signed-off-by: Luka Peschke <[email protected]>
Signed-off-by: Luka Peschke <[email protected]>
Signed-off-by: Luka Peschke <[email protected]>
|
thanks!! I agree this could be done earlier, but also given the constrains this seems quite reasonable? I'm generally up for working in an imperfect local minimum; hopefully later we can refactor. to make it easier to review: can we reduce the size of the diff by using a couple of standard inline tests for the SQL compilation rather than the full integration tests? the full integration tests are not for individual features; otherwise we'd have 100K lines of them :) |
…tegration tests Signed-off-by: Luka Peschke <[email protected]>
|
@max-sixty Thanks for the quick response! I had a second thought about it, and I think you're right, although the code is almost the same as in the anchoring step, it shouldn't change much in the future, so having it like this should be enough for now :) I also updated the tests to be inline tests |
| let mut new_name = old_name; | ||
| if let Some(new) = &mut new_name { | ||
| if used_new_names.contains(new) { | ||
| *new = self.ctx.anchor.col_name.gen(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lukapeschke if you have the patience, can we adjust the test so we hit this line too? I don't think we hit it currently
either way let's merge, feel free to ping either way
Signed-off-by: Luka Peschke <[email protected]>
85407ff to
111f1c5
Compare
|
@max-sixty I've twisted the tests and tried to hit the bug fixed by #2960 but I have not been able to hit the uncovered line, so I've simplified the code to remove the |
|
thank you @lukapeschke ! |
… from the matching table_ref It was done by converting the CTE's table id into an RIId for now, but those can differ in case some CTEs get erased. This resulted in the wrong CId redirect being updated. Fixing this by finding a relation instance whose table_ref's source matches the CTE's table_id instead. Follow up of PRQL#5464 Signed-off-by: Luka Peschke <[email protected]>
This is a tricky one. Basically, what happens on
mainright now is that when a sort step in a CTE uses a column that is not part of the final columns in the main pipeline, CId redirection does not work, which results on theORDER BYclause being done on a column of a table that might not be available in the final statement.For example, the following PRQL
Results in the following SQL right now on main:
This is because
fold_column_sortswhich is called byfold_cteduring the postprocess step will add the required columns sort to intermediateSelectstatements to ensure they are available in the main pipeline (in this case, it addstable2._expr0). However, sincecid_redirectsare not updated, a column from an anterior table is used for the finalORDER BYstatement.I'm not quite happy with the approach though, I feel like this should be done earlier, probably in the anchor step. However that would require pushing of sort columns to also happen during that step, which is a bigger change.
@kgutwin @max-sixty It'd be great to have your input on this one, as you are much more familiar with this code than I am 🙂