[SQL][SQLLIB] Use microsecond resolution for TIMESTAMP #5522
Pull request overview
This PR updates SQL TIMESTAMP precision from milliseconds (3 digits) to microseconds (6 digits), representing a breaking change to the timestamp semantics. The change also updates internal representations and renames ambiguous API methods in the sqllib to be more explicit about their time units.
Changes:
- Updated TIMESTAMP precision from TIMESTAMP(3) to TIMESTAMP(6) throughout the codebase
- Renamed sqllib functions to be explicit about time units (e.g., Timestamp::new → Timestamp::from_microseconds / Timestamp::from_milliseconds)
- Converted internal time representations from milliseconds to microseconds across the SQL compiler and runtime library
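The rename described above can be pictured with a minimal sketch. This is not the actual sqllib code: the struct below is a hypothetical stand-in, and only the names `Timestamp::from_microseconds` and `Timestamp::from_milliseconds` come from the PR. The accessor names and the choice of floor division are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the sqllib `Timestamp`; not the real implementation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Timestamp {
    // Microseconds since the Unix epoch.
    microseconds: i64,
}

impl Timestamp {
    /// Builds a timestamp from a microsecond count; the unit is in the name.
    pub fn from_microseconds(microseconds: i64) -> Self {
        Self { microseconds }
    }

    /// Builds a timestamp from a millisecond count by scaling it to microseconds.
    /// Overflow handling is omitted in this sketch.
    pub fn from_milliseconds(milliseconds: i64) -> Self {
        Self { microseconds: milliseconds * 1_000 }
    }

    /// Assumed accessor: microseconds since the Unix epoch.
    pub fn microseconds(&self) -> i64 {
        self.microseconds
    }

    /// Assumed accessor: whole milliseconds since the Unix epoch, rounded
    /// toward negative infinity so pre-epoch values stay consistent.
    pub fn milliseconds(&self) -> i64 {
        self.microseconds.div_euclid(1_000)
    }
}
```

With constructors named this way, a call site such as `Timestamp::from_milliseconds(1_500)` states its unit explicitly, which is the ambiguity the rename is meant to remove.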
Reviewed changes
Copilot reviewed 52 out of 52 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| TemporalFilterTests.java | Updated test expectations for temporal filter intervals to use microseconds (multiplied by 1000) |
| TableParser.java | Updated imports and interval parsing functions to use new Short/Long interval literal classes |
| ProfilingTests.java | Updated timestamp operations to use explicit from_milliseconds constructor |
| VariantTests.java | Updated imports and interval literal constructors to use renamed classes |
| TimeTests.java | Added microsecond unit constant and converted interval/timestamp test expectations to microseconds |
| RegressionTests.java | Added test case for microsecond timestamp addition |
| MiscTests.java | Updated expected output for interval calculations to include microsecond precision |
| PostgresTimestampTests.java | Updated interval-to-microsecond conversions in timestamp diff calculations |
| PostgresIntervalTests.java | Adjusted interval test limits to match smaller range after precision change |
| PostgresDateTests.java | Updated imports and interval literal constructors |
| MetadataTests.java | Updated timestamp precision metadata from 3 to 6 |
| BoundedTypeTests.java | Updated max timestamp value string to include microseconds |
| Utilities.java | Removed deprecated roundMillis helper function |
| DBSPTypeTimestamp.java | Updated precision constant and default value constructor |
| DBSPTypeMonthsInterval.java | Renamed class to DBSPTypeLongInterval |
| DBSPTypeMillisInterval.java | Renamed class to DBSPTypeShortInterval |
| DBSPTimestampLiteral.java | Added microsecond support and factory methods for both milliseconds and microseconds |
| DBSPLiteral.java | Updated interval literal class references |
| DBSPIntervalMonthsLiteral.java | Renamed to DBSPLongIntervalLiteral with factory methods |
| DBSPIntervalMillisLiteral.java | Renamed to DBSPShortIntervalLiteral with microsecond support |
| DBSPTimeAddSub.java | Updated type references for renamed interval types |
| DBSPBinaryExpression.java | Updated type checks for renamed interval types |
| Simplify.java | Removed millisecond rounding call |
| InnerVisitor.java | Updated visitor method names for renamed interval types |
| InnerRewriteVisitor.java | Updated rewrite methods for renamed types and constructors |
| ReduceExpressionsRule.java | Disabled Calcite simplifications for timestamp/time/interval types |
| SqlToRelCompiler.java | Refactored to pass RelBuilder explicitly to optimize method |
| ConvertletTable.java | Changed integer division to standard division for interval calculations |
| TypeCompiler.java | Updated type conversions for renamed interval types |
| ExpressionCompiler.java | Updated interval literal creation to multiply milliseconds by 1000 for microseconds |
| CalciteToDBSPCompiler.java | Updated interval type checks and default interval creation |
| ToRustInnerVisitor.java | Updated Rust code generation to use new factory methods |
| RustSqlRuntimeLibrary.java | Updated window bound type check |
| ToJsonInnerVisitor.java | Updated JSON serialization for renamed types |
| test_illegal_tbl.py | Updated timestamp test values to include microseconds |
| test_date_time_fn.py | Updated timestamp expectations in date/time function tests |
| test_cmp_operators.py | Updated timestamp literals in comparison operator tests |
| test_cast.py | Updated timestamp cast test expectations |
| datetime.md | Updated documentation for timestamp precision |
| casts.md | Added documentation for interval-to-numeric casts |
| storage-test-compat/lib.rs | Updated test data generation to use explicit factory methods |
| tuple_proptest.rs | Updated property test strategies to use new factory methods |
| timestamp.rs | Complete refactor: changed internal representation from milliseconds to microseconds |
| interval.rs | Updated ShortInterval to use microseconds internally, added explicit factory methods |
| casts.rs | Updated cast functions to work with microseconds |
| redis/test.rs | Updated test data to use new factory methods |
| kafka/ft/test.rs | Updated Kafka metadata test timestamps |
| kafka/ft/input.rs | Updated Kafka timestamp conversion |
| datagen.rs | Updated datagen test expectations |
| data.rs | Updated test data construction |
| json/input.rs | Updated JSON test data |
| avro/serializer.rs | Updated Avro test data |
3df7516 to 3f01108
pub struct Timestamp {
    // since unix epoch
-    milliseconds: i64,
+    microseconds: i64,
I don't think you can just do that. This type gets serialized as data stored on disk and in checkpoints in our pipelines.
So what is the solution?
I don't know what the easiest way is, but at the very least it seems like:
- You need to assume there are now timestamps with millisecond precision (that can come from disk) and microsecond precision (that get stored to disk now); we need to support and handle both for the foreseeable future.
- This is a breaking storage version change, so you probably have to increase the storage version.
- You may be able to rewrite the rkyv serialization to take the storage version into account and convert the millisecond timestamps to microseconds when you read them from disk (I had to do the same thing for TupX, so you can look there to see how it was done).
FWIW, this will not go through CI; I just confirmed it breaks the storage-test-compat tests.
I will push a commit which implements the serialization differently depending on the storage version
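A minimal sketch of what version-dependent decoding could look like, in plain Rust rather than rkyv. The function name, the constant, and the assumption that v5 is the first storage version holding microseconds are all hypothetical (the thread only mentions golden files for v5); the actual commit may be structured quite differently.

```rust
/// Hypothetical: first storage-format version that stores microseconds.
/// The thread mentions golden files for v5; treating v5 as the cutover
/// is an assumption made for this sketch.
const FIRST_MICROSECOND_VERSION: u32 = 5;

/// Converts a raw on-disk timestamp value to microseconds since the Unix
/// epoch, based on the storage version that wrote it.
///
/// Older versions stored milliseconds, so their values are scaled by 1000.
/// Returns `None` if the scaling would overflow an `i64`.
fn stored_timestamp_to_micros(raw: i64, storage_version: u32) -> Option<i64> {
    if storage_version < FIRST_MICROSECOND_VERSION {
        // Legacy data: the stored value is a millisecond count.
        raw.checked_mul(1_000)
    } else {
        // Current data: the stored value is already a microsecond count.
        Some(raw)
    }
}
```

The point of the sketch is that the reader, not the writer, carries the compatibility burden: new data is always written in microseconds, and old data is upgraded on read.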
ryzhyk left a comment
I am yet to check how this interacts with connectors.
Looks good. Please also write a new set of files for this version in crates/storage-test-compat and make sure the tests pass; see the instructions at https://github.com/feldera/feldera/blob/main/crates/storage-test-compat/README.md
@gz I have amended my last commit to add the golden files for v5.
Did you push them? It doesn't look like they are in the branch.
Sorry, looks like I hadn't; I have now.
This is pretty bad: by disabling Calcite evaluation for some constant expressions of some types, we get plans which we cannot compile directly (e.g., window bounds which do not look constant). So this PR does not really work. I will investigate what can be done, but it's really messy.
2160fa0 to 90289d6
I have implemented a simple constant-folding optimization in the Calcite IR for interval values, which is sufficient to make all our tests pass. The existing Calcite optimization is incorrect, since it ignores fractions of milliseconds.
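To illustrate why folding constants at millisecond granularity loses information, here is a small sketch. It is neither Calcite's code nor the optimization added in this PR; both function names are hypothetical and the example only models the sum of two constant intervals expressed in microseconds.

```rust
/// Folds the sum of two constant intervals, both given in microseconds.
/// Returns `None` on overflow.
fn fold_add_micros(a_us: i64, b_us: i64) -> Option<i64> {
    a_us.checked_add(b_us)
}

/// Illustrative millisecond-based folding: each operand is first truncated
/// to whole milliseconds, so any sub-millisecond fraction is dropped before
/// the sum is computed. The result is scaled back to microseconds.
fn fold_add_millis_lossy(a_us: i64, b_us: i64) -> Option<i64> {
    let a_ms = a_us / 1_000;
    let b_ms = b_us / 1_000;
    a_ms.checked_add(b_ms)?.checked_mul(1_000)
}
```

For operands of 1500 and 700 microseconds, the microsecond fold yields 2200 microseconds, while the millisecond-based fold yields 1000 microseconds, because the fractional parts of both operands are discarded.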
Signed-off-by: Mihai Budiu <[email protected]>
Fixes #5516
Had to fight with Calcite, which does not understand that timestamps can have more precision than milliseconds.
This can be considered a breaking change:
Timestamp::new was renamed to Timestamp::from_microseconds and Timestamp::from_milliseconds. The previous name was very ambiguous.