Spark 3.5, Arrow: Support for Row lineage when using the Parquet Vectorized reader #12928
Merged: amogh-jahagirdar merged 6 commits into apache:main from amogh-jahagirdar:vectorized-parquet-row-lineage on Jun 5, 2025
Commits
908c14f Spark, Arrow: Support for Row lineage when doing Vectorized Parquet r… (amogh-jahagirdar)
896a93b fixes (amogh-jahagirdar)
11dac99 remove unused methods (amogh-jahagirdar)
2c4f57f bit more cleanup (amogh-jahagirdar)
e63f33e make sure we're closing intermediate batches while reading (amogh-jahagirdar)
544302b Add a test which tests many records, cleanup inline comments (amogh-jahagirdar)
make sure we're closing intermediate batches while reading
commit e63f33eec679ead24b5ce3fd77d3da3e1382060b
Happy to change it if you feel strongly about it, but I mostly just followed the increment pattern of i += 1 already in this class (and this package it looks like). If we do change it, I'd change it for the other instances in this class just to keep things consistent.
I actually didn't realize that we have so many places that do i += 1 in for loops. It's not a big deal and I don't feel strongly about it, but it would be great to fix this throughout the codebase in a separate PR.