Enforce delta.checkpoint.writeStatsAsJson and delta.checkpoint.writeStatsAsStruct option in Delta Lake
#13331
d331f3e to 3eb81a3
3eb81a3 to 28bc1d0
28bc1d0 to fa3662c
69e7bf7 to ed7f6fe
Fixing CI failures.
ed7f6fe to 79af9a1
682e9f2 to fccc7ae
Still a work in progress, but let me push to check the CI results.
ed10956 to 307e863
Found commits that should not be merged: 3 commit(s) that need to be squashed.
9d0e9f4 to 9e91970
Generally looks good to me, but take a look at the test failures and merge conflicts.
e62fa9c to 82e50e6
The job is suspended #13703
There are CI failures in Delta tests.
Let me request review after fixing the failures.
db03308 to f7b48a9
    return (long) floatToRawIntBits((float) (double) jsonValue);
}
if (type == DOUBLE) {
    return jsonValue;
Add a (double) cast here to verify it's a double.
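To illustrate the point of this comment, here is a minimal JDK-only sketch of the two numeric branches. The class and method names (`JsonNumberConversion`, `toRealBits`, `toDouble`) are hypothetical and are not part of the PR; only `Float.floatToRawIntBits` and the cast chain come from the snippet above.

```java
final class JsonNumberConversion
{
    private JsonNumberConversion() {}

    // REAL branch: unbox the JSON number as a double, narrow it to float,
    // and return the raw IEEE 754 bits widened to long (as in the snippet above).
    static long toRealBits(Object jsonValue)
    {
        return (long) Float.floatToRawIntBits((float) (double) jsonValue);
    }

    // DOUBLE branch: the explicit cast makes the type assumption visible and
    // throws ClassCastException right away if the value is not a Double.
    static double toDouble(Object jsonValue)
    {
        return (double) jsonValue;
    }
}
```

With the cast in place, a value of an unexpected type (for example a String) fails at the conversion site rather than somewhere later when the value is written.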
}
if (type instanceof DecimalType) {
    BigDecimal decimal;
    checkArgument(jsonValue instanceof String || jsonValue instanceof Double, "Value must be instance of String or Double");
Why do we need to accept both forms?
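For context on what accepting both forms involves, here is a JDK-only sketch of parsing a decimal from either a String or a Double and scaling it. `JsonDecimalParsing`, `parseDecimal`, and `unscaled` are hypothetical names; the scaling step only approximates what an encoder such as `Decimals.encodeScaledValue` would need as input and is not Trino's implementation.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

final class JsonDecimalParsing
{
    private JsonDecimalParsing() {}

    // Accept the two forms allowed by the checkArgument in the snippet above.
    static BigDecimal parseDecimal(Object jsonValue)
    {
        if (jsonValue instanceof String) {
            return new BigDecimal((String) jsonValue);
        }
        if (jsonValue instanceof Double) {
            // BigDecimal.valueOf goes through Double.toString, which avoids
            // the long binary expansion produced by new BigDecimal(double).
            return BigDecimal.valueOf((Double) jsonValue);
        }
        throw new IllegalArgumentException("Value must be instance of String or Double: " + jsonValue);
    }

    // Assumption of this sketch: reject values that would need rounding
    // to fit the target scale instead of rounding them silently.
    static BigInteger unscaled(BigDecimal decimal, int scale)
    {
        return decimal.setScale(scale, RoundingMode.UNNECESSARY).unscaledValue();
    }
}
```

Both `parseDecimal("12.5")` and `parseDecimal(12.5)` yield the same unscaled value at scale 2 here. A Double cannot represent every decimal exactly, which is one reason to ask whether the Double form is needed at all.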
    return Decimals.encodeScaledValue(decimal, ((DecimalType) type).getScale());
}
if (type instanceof VarcharType) {
    return jsonValue;
jsonValue is likely String and we need a Slice
}

if (isShortDecimal(type)) {
    return Decimals.encodeShortScaledValue(decimal, ((DecimalType) type).getScale());
Use enhanced instanceof instead of a cast here
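A short sketch of the suggestion, assuming Java 16 or later. The nested `Type` and `DecimalType` declarations below are hypothetical stand-ins so the example compiles on its own; they are not Trino's classes.

```java
final class PatternMatchingSketch
{
    private PatternMatchingSketch() {}

    // Hypothetical stand-ins for the real type hierarchy.
    interface Type {}

    record DecimalType(int precision, int scale) implements Type {}

    record VarcharType() implements Type {}

    // Before: type test followed by a separate cast.
    static int scaleWithCast(Type type)
    {
        if (type instanceof DecimalType) {
            return ((DecimalType) type).scale();
        }
        throw new IllegalArgumentException("Not a decimal type: " + type);
    }

    // After: pattern matching binds the narrowed variable in the test itself.
    static int scaleWithPattern(Type type)
    {
        if (type instanceof DecimalType decimalType) {
            return decimalType.scale();
        }
        throw new IllegalArgumentException("Not a decimal type: " + type);
    }
}
```

Both methods behave the same; the second form removes the repeated type name and the possibility of the test and the cast drifting apart.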
BlockBuilder singleRowBlockWriter = blockBuilder.beginBlockEntry();
for (int i = 0; i < values.size(); ++i) {
    Type fieldType = fieldTypes.get(i);
    Object fieldValue = jsonValueToTrinoValue(fieldType, values.get(rowType.getFields().get(i).getName().orElseThrow()));
Add a message in orElseThrow
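One way to do this, sketched with the JDK's `Optional` only. `FieldNames` and `requireFieldName` are hypothetical names, and the exception type and message wording are assumptions rather than what the PR ended up using.

```java
import java.util.Optional;

final class FieldNames
{
    private FieldNames() {}

    // The supplier form of orElseThrow lets the failure say which field was unnamed,
    // instead of the bare NoSuchElementException thrown by the no-argument form.
    static String requireFieldName(Optional<String> name, int index)
    {
        return name.orElseThrow(() ->
                new IllegalArgumentException("Row field at index " + index + " has no name"));
    }
}
```

Calling `requireFieldName(Optional.empty(), 3)` then fails with a message that points at the offending field.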
BlockBuilder singleRowBlockWriter = blockBuilder.beginBlockEntry();
for (int i = 0; i < values.size(); ++i) {
    Type fieldType = fieldTypes.get(i);
    Object fieldValue = jsonValueToTrinoValue(fieldType, values.get(rowType.getFields().get(i).getName().orElseThrow()));
Should we validate that values contains no other entries than the ones we expected?
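Such a check could look roughly like the sketch below, which compares the map's keys against the expected field names. `StatsValidation` and `checkNoUnexpectedEntries` are hypothetical names, and throwing `IllegalArgumentException` is an assumption about the desired failure mode.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class StatsValidation
{
    private StatsValidation() {}

    // Fail if values has any key that is not among the expected field names.
    static void checkNoUnexpectedEntries(Map<String, ?> values, Set<String> expectedFieldNames)
    {
        // TreeSet keeps the reported names in a stable, sorted order.
        Set<String> unexpected = new TreeSet<>(values.keySet());
        unexpected.removeAll(expectedFieldNames);
        if (!unexpected.isEmpty()) {
            throw new IllegalArgumentException("Unexpected entries in values: " + unexpected);
        }
    }
}
```

This only rejects extra keys; missing keys are still allowed, which matters if some fields can legitimately have no statistics.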
a1dfa67 to 8dfcbdd
8dfcbdd to 4f6ee1d
Do we have follow up issues for row type stats, and maybe one for being able to set these properties from Trino?
just skimming
Map<String, Object> jsonValues = new HashMap<>();
for (Map.Entry<String, Object> value : values.entrySet()) {
    Type type = columnTypeMapping.get(value.getKey());
    // TODO: Add support for row type
Description
Enforce delta.checkpoint.writeStatsAsJson and delta.checkpoint.writeStatsAsStruct option in Delta Lake
Fixes #12031
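As a rough illustration of what enforcing these two table properties involves, the sketch below reads them from a table's configuration map. `CheckpointStatsOptions` and `fromConfiguration` are hypothetical names, not the PR's code, and the fallback values are passed in by the caller because this sketch makes no claim about which defaults the connector uses.

```java
import java.util.Map;

record CheckpointStatsOptions(boolean writeStatsAsJson, boolean writeStatsAsStruct)
{
    static final String WRITE_STATS_AS_JSON = "delta.checkpoint.writeStatsAsJson";
    static final String WRITE_STATS_AS_STRUCT = "delta.checkpoint.writeStatsAsStruct";

    // Defaults are parameters: which values apply when a property is absent
    // is an assumption left to the caller.
    static CheckpointStatsOptions fromConfiguration(
            Map<String, String> configuration,
            boolean defaultAsJson,
            boolean defaultAsStruct)
    {
        return new CheckpointStatsOptions(
                parse(configuration, WRITE_STATS_AS_JSON, defaultAsJson),
                parse(configuration, WRITE_STATS_AS_STRUCT, defaultAsStruct));
    }

    private static boolean parse(Map<String, String> configuration, String key, boolean defaultValue)
    {
        String value = configuration.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }
}
```

A checkpoint writer could then consult `writeStatsAsJson()` before emitting the JSON `stats` column and `writeStatsAsStruct()` before emitting the struct `stats_parsed` column.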
Documentation
(x) No documentation is needed.
Release notes
(x) Release notes entries required with the following suggested text: