feat: add BigQuery dialect #895
Open
This isn't tested yet; I just wanted to make sure I'm heading in the right direction. It may not cover every language difference, but it should be pretty close, at least for the major functions.
Function node
The function node is probably going to be the hairiest to support, since it's an untyped string match and requires some arg reordering, but it wasn't bad. Do you see any issues with the direct arg references?
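To make the arg-reordering question concrete, here is a minimal sketch of a name-keyed rewrite table. Everything in it (`Rewrite`, `bigQueryRewrites`, `emitFunction`, `datePart`) is a hypothetical name for illustration, not the visitor API in this PR, and it assumes the arguments have already been rendered to SQL strings. The two mappings shown rely on documented signatures: DuckDB's `strftime(timestamp, format)` versus BigQuery's `FORMAT_TIMESTAMP(format, timestamp)`, and DuckDB's `date_diff(part, start, end)` versus BigQuery's `DATE_DIFF(end, start, PART)`.

```typescript
// Hypothetical sketch: these names are illustrative, not Mosaic's visitor API.
// Arguments are assumed to be already-rendered SQL strings.
type Rewrite = (args: string[]) => string;

// Turn a quoted DuckDB part literal ('day') into a BigQuery keyword (DAY).
const datePart = (literal = ""): string => literal.replace(/'/g, "").toUpperCase();

const bigQueryRewrites = new Map<string, Rewrite>([
  // DuckDB strftime(ts, fmt) -> BigQuery FORMAT_TIMESTAMP(fmt, ts)
  ["strftime", ([ts, fmt]) => `FORMAT_TIMESTAMP(${fmt}, ${ts})`],
  // DuckDB date_diff('day', start, end) -> BigQuery DATE_DIFF(end, start, DAY)
  ["date_diff", ([part, start, end]) => `DATE_DIFF(${end}, ${start}, ${datePart(part)})`],
]);

function emitFunction(name: string, args: string[]): string {
  const rewrite = bigQueryRewrites.get(name.toLowerCase());
  // Unknown functions pass through: name uppercased, argument order unchanged.
  return rewrite ? rewrite(args) : `${name.toUpperCase()}(${args.join(", ")})`;
}
```

A `Map` is used rather than a plain object so that names such as `constructor` or `toString` never match an inherited property by accident.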
Types
There were a few untyped strings I ran across that would be easier to translate if they were a string union. I should've kept a better list, but these are the ones I remember:
Maybe the function names in the function node could be too?
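As a rough illustration of why a string union would help, the sketch below types a handful of function names as a union and keys the translation table on it. The union members and the names `KnownFunction`, `bigQueryName`, `isKnownFunction`, and `translateName` are assumptions made up for this example; they are not the untyped fields referred to above.

```typescript
// Hypothetical names throughout; the real AST fields and their values may differ.
type KnownFunction = "strftime" | "date_diff" | "regexp_matches";

// Record<KnownFunction, ...> requires one entry per union member, so adding a
// name to the union without a translation here becomes a compile error.
const bigQueryName: Record<KnownFunction, string> = {
  strftime: "FORMAT_TIMESTAMP",
  date_diff: "DATE_DIFF",
  regexp_matches: "REGEXP_CONTAINS",
};

// Type guard for names that still arrive as plain strings.
function isKnownFunction(name: string): name is KnownFunction {
  return Object.prototype.hasOwnProperty.call(bigQueryName, name);
}

function translateName(name: string): string {
  // Unknown names fall back to the uppercase form.
  return isKnownFunction(name) ? bigQueryName[name] : name.toUpperCase();
}
```

The guard keeps arbitrary function names usable while still giving the dialect exhaustive checking over the names it knows about.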
Casing
DuckDB by convention uses lowercase; BigQuery uses uppercase. I wrote the override functions in uppercase, but I'm not sure if those should still all be lowercase. I don't feel strongly about it.
Unsupported errors
Right now I'm checking for unsupported cases and throwing errors, following the example of a single throw in the DuckDB visitor. Not sure if there's a pattern otherwise to follow.
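One possible pattern, sketched here under the assumption that nothing like it exists yet, is a dedicated error type that records the dialect and the unsupported feature, so callers can tell it apart from other failures. `UnsupportedSqlError` and `assertSupported` are hypothetical names.

```typescript
// Hypothetical error type; neither the class nor its fields exist in Mosaic today.
class UnsupportedSqlError extends Error {
  readonly dialect: string;
  readonly feature: string;

  constructor(dialect: string, feature: string) {
    super(`${dialect} does not support ${feature}`);
    this.name = "UnsupportedSqlError";
    this.dialect = dialect;
    this.feature = feature;
  }
}

// Throws when a visitor reaches a construct its dialect cannot express.
function assertSupported(dialect: string, feature: string, supported: boolean): void {
  if (!supported) throw new UnsupportedSqlError(dialect, feature);
}
```

A visitor could then call `assertSupported("BigQuery", "SEMI JOIN", false)` at the point where it currently throws a plain error.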
Testing
We moved the visitor out, but not the tests. Should these SQL tests be moved into per-visitor files? I can add specific BigQuery tests once we decide on that organization.
https://github.com/uwdata/mosaic/tree/c381a4422193464aad796d741ad371dd460863ca/packages/mosaic/sql/test
AST manipulation
There are a few operations that could be supported, but not while walking through `toString`. One example is `SEMI|ANTI JOIN`. BigQuery and Snowflake don't support them, but they can be trivially added by adding a `WHERE` condition. The database engine shouldn't leak into the AST creation, where individual nodes have to run a `supportsSemiJoin` type of scenario, which is the other place I can think to add it. Maybe an optional method like `manipulateAst` could be exposed, and executed if the visitor needed to. There's nothing urgent on this, as Mosaic just barely got join support yesterday, so I wouldn't spend any time on it unless there was a materialized view use case for the coordinator. In the meantime, I filed an issue with Google to add syntax support. (https://issuetracker.google.com/issues/446157721)
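A minimal sketch of what such an optional pre-pass might look like follows. The AST shapes (`Query`, `JoinKind`), the `Visitor` interface, and the `emit` and `toSql` names are all hypothetical simplifications, not Mosaic's real nodes; only the `manipulateAst` name comes from the idea above. It rewrites semi and anti joins into `EXISTS` and `NOT EXISTS` predicates before the query is rendered.

```typescript
// Hypothetical AST shapes and hook; Mosaic's real nodes differ.
type JoinKind = "INNER" | "LEFT" | "SEMI" | "ANTI";

interface Query {
  from: string;
  joins: { kind: JoinKind; table: string; on: string }[];
  where: string[]; // predicates combined with AND
}

interface Visitor {
  manipulateAst?(query: Query): Query; // optional pre-pass, run before emit
  emit(query: Query): string;
}

// Rewrites SEMI/ANTI joins into [NOT] EXISTS predicates; other joins are kept.
function rewriteSemiAntiJoins(query: Query): Query {
  const joins: Query["joins"] = [];
  const where = [...query.where];
  for (const join of query.joins) {
    if (join.kind === "SEMI" || join.kind === "ANTI") {
      const not = join.kind === "ANTI" ? "NOT " : "";
      where.push(`${not}EXISTS (SELECT 1 FROM ${join.table} WHERE ${join.on})`);
    } else {
      joins.push(join);
    }
  }
  return { ...query, joins, where };
}

const bigQueryVisitor: Visitor = {
  manipulateAst: rewriteSemiAntiJoins,
  emit: (q) =>
    [
      `SELECT * FROM ${q.from}`,
      ...q.joins.map((j) => `${j.kind} JOIN ${j.table} ON ${j.on}`),
      ...(q.where.length ? [`WHERE ${q.where.join(" AND ")}`] : []),
    ].join(" "),
};

// Runs the pre-pass only when the visitor defines one.
function toSql(query: Query, visitor: Visitor): string {
  const prepared = visitor.manipulateAst ? visitor.manipulateAst(query) : query;
  return visitor.emit(prepared);
}
```

The rewrite assumes the join condition qualifies its columns with table names, since it is moved unchanged into a correlated subquery. Visitors that support the syntax natively would simply omit `manipulateAst`, so the AST builders never need to know which engine is in use.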