
Conversation

Collaborator

@derekperkins derekperkins commented Sep 19, 2025

This isn't tested yet; I just wanted to make sure I'm heading in the right direction. It may not cover every language difference, but it should be pretty close, at least for the major functions.

Function node

The function node is probably going to be the hairiest to support, since it's an untyped string match and requires some arg reordering, but it wasn't bad. Do you see any issues with the direct arg references?
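For reference, here is a minimal sketch of the kind of name match plus arg reordering I mean. The `FnNode` shape, the `rewrites` table, and `visitFunction` are hypothetical stand-ins, not the actual Mosaic types, and args are pre-rendered strings to keep it short. The two mappings rely on DuckDB's `strftime(ts, fmt)` and `date_trunc('part', ts)` versus BigQuery's `FORMAT_TIMESTAMP(fmt, ts)` and `TIMESTAMP_TRUNC(ts, PART)`; format specifier differences are ignored here.

```typescript
// Hypothetical, simplified stand-in for a function-call AST node.
interface FnNode {
  name: string;
  args: string[]; // already-rendered argument SQL, for brevity
}

type FnRewrite = (args: string[]) => string;

// Guard positional ("direct") arg references with an arity check.
function expectArgs(name: string, args: string[], count: number): void {
  if (args.length !== count) {
    throw new Error(`${name} expects ${count} args, got ${args.length}`);
  }
}

// Illustrative DuckDB -> BigQuery rewrites, keyed by lowercase function name.
const rewrites: Record<string, FnRewrite> = {
  strftime: (args) => {
    expectArgs("strftime", args, 2);
    const [ts = "", fmt = ""] = args;
    // Argument order is swapped in BigQuery.
    return `FORMAT_TIMESTAMP(${fmt}, ${ts})`;
  },
  date_trunc: (args) => {
    expectArgs("date_trunc", args, 2);
    const [part = "", ts = ""] = args;
    // BigQuery takes the granularity as a bare keyword, not a string literal.
    const granularity = part.replace(/^'|'$/g, "").toUpperCase();
    return `TIMESTAMP_TRUNC(${ts}, ${granularity})`;
  },
};

function visitFunction(node: FnNode): string {
  const rewrite = rewrites[node.name.toLowerCase()];
  if (rewrite) return rewrite(node.args);
  // Default: same name and arg order, uppercased by BigQuery convention.
  return `${node.name.toUpperCase()}(${node.args.join(", ")})`;
}
```

The arity check is the main thing I'd want around direct arg references, so a malformed node fails loudly instead of emitting `undefined` into the SQL.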

Types

There were a few untyped strings I ran across that would be easier to translate if they were a string union. I should've kept a better list, but these are the ones I remember:

  • CastNode.cast
  • IntervalNode.name

Maybe the function names in the function node could be too?

Casing

DuckDB by convention uses lowercase; BigQuery uses uppercase. I wrote the override functions in uppercase, but I'm not sure whether they should all stay lowercase. I don't feel strongly about it.

Unsupported errors

Right now I'm checking for unsupported cases and throwing errors, following the example of a single throw in the DuckDB visitor. Not sure if there's a pattern otherwise to follow.

Testing

We moved the visitor out, but not the tests. Should these sql tests be moved into per visitor files? I can add specific BigQuery tests once we decide on that organization.
https://github.com/uwdata/mosaic/tree/c381a4422193464aad796d741ad371dd460863ca/packages/mosaic/sql/test

AST manipulation

A few operations could be supported, but not while walking through toString. One example is SEMI|ANTI JOIN: BigQuery and Snowflake don't support them, but they can be trivially emulated by adding a WHERE condition. The other place I can think of to handle this is AST creation, with individual nodes checking something like supportsSemiJoin, but the database engine shouldn't leak into AST creation. Maybe an optional method like manipulateAst could be exposed and executed only if the visitor needs it. There's nothing urgent here, as Mosaic only got join support yesterday, so I wouldn't spend any time on it unless there is a materialized view use case for the coordinator. In the meantime, I filed an issue with Google to add syntax support: https://issuetracker.google.com/issues/446157721
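A rough sketch of what an optional pre-pass could look like, assuming a heavily simplified query shape. `Query`, `Visitor`, `render`, and `toSql` are hypothetical names for illustration only; `manipulateAst` is just the method name floated above, not an existing Mosaic API. A SEMI JOIN becomes a WHERE EXISTS condition and an ANTI JOIN becomes WHERE NOT EXISTS.

```typescript
// Hypothetical, heavily simplified query AST.
interface Query {
  from: string;
  join?: { type: "INNER" | "SEMI" | "ANTI"; table: string; on: string };
  where: string[];
}

// Hypothetical visitor shape with an optional AST pre-pass.
interface Visitor {
  manipulateAst?(q: Query): Query;
  render(q: Query): string;
}

// Run the pre-pass only when the visitor provides one.
function toSql(q: Query, v: Visitor): string {
  const ast = v.manipulateAst ? v.manipulateAst(q) : q;
  return v.render(ast);
}

const bigQueryVisitor: Visitor = {
  manipulateAst(q) {
    const { join } = q;
    if (!join || join.type === "INNER") return q;
    // Emulate SEMI / ANTI JOIN with a correlated (NOT) EXISTS condition.
    const exists = `EXISTS (SELECT 1 FROM ${join.table} WHERE ${join.on})`;
    const cond = join.type === "ANTI" ? `NOT ${exists}` : exists;
    return { from: q.from, where: [...q.where, cond] };
  },
  render(q) {
    const join = q.join
      ? ` ${q.join.type} JOIN ${q.join.table} ON ${q.join.on}`
      : "";
    const where = q.where.length ? ` WHERE ${q.where.join(" AND ")}` : "";
    return `SELECT * FROM ${q.from}${join}${where}`;
  },
};
```

With this shape, AST construction stays engine-agnostic and only visitors that need a rewrite pay for it; a DuckDB visitor would simply omit `manipulateAst`.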

Member

jheer commented Sep 19, 2025

This looks like a great start. I don't see any obvious red flags. Throwing an error for unsupported cases is good; we could consider adding a specialized Error subclass for unsupported query constructs if that would be helpful.
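A specialized error could be as small as the sketch below. The class name `UnsupportedQueryError`, its fields, and the `visitJoin` helper are hypothetical, shown only to illustrate the idea; nothing like this exists in the package yet.

```typescript
// Hypothetical Error subclass for constructs a dialect cannot express.
class UnsupportedQueryError extends Error {
  readonly dialect: string;
  readonly construct: string;

  constructor(dialect: string, construct: string) {
    super(`${dialect} does not support ${construct}`);
    this.name = "UnsupportedQueryError";
    this.dialect = dialect;
    this.construct = construct;
    // Keep instanceof working when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Example use inside a (hypothetical) BigQuery visitor method.
function visitJoin(type: string, table: string): string {
  if (type === "SEMI" || type === "ANTI") {
    throw new UnsupportedQueryError("BigQuery", `${type} JOIN`);
  }
  return `${type} JOIN ${table}`;
}
```

Callers could then use `instanceof UnsupportedQueryError` to tell dialect gaps apart from genuine bugs.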

As for testing, I'd recommend leaving the existing tests alone for now, but creating a new dedicated file/folder for BigQuery tests. We could then follow suit for DuckDB later on if it makes sense. That said, this package is of course DuckDB-centric, so I'm ok creating separate test folders for other dialects and leaving the DuckDB variants as the "core" use case for now.

Member

jheer commented Sep 19, 2025

Regarding untyped strings:

  • CastNode: I think we have to allow general strings here, as databases can add custom types and we'll never be comprehensive. We could define a union of common type names with a trailing string & Record<never, never> member, but that's mostly just for type-ahead support, which doesn't seem that critical given the existing helper methods.
  • IntervalNode: I think adding types here would be great!
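To make the two options concrete, here is a small sketch. The type names, the listed members, and the two helper functions are hypothetical and are not Mosaic's actual definitions. The open union keeps editor suggestions for common names while still accepting any string; the closed union rejects unknown interval units at compile time.

```typescript
// Hypothetical subset of common type names, for type-ahead only.
type CommonCastType = "INTEGER" | "DOUBLE" | "VARCHAR" | "DATE" | "TIMESTAMP";

// Open union: the trailing member accepts any other string
// without collapsing the literal suggestions into plain `string`.
type CastType = CommonCastType | (string & Record<never, never>);

// Closed union: interval units are a fixed, known set.
type IntervalUnit = "year" | "month" | "day" | "hour" | "minute" | "second";

function castSql(expr: string, type: CastType): string {
  return `CAST(${expr} AS ${type})`;
}

function intervalSql(unit: IntervalUnit, steps: number): string {
  return `INTERVAL ${steps} ${unit.toUpperCase()}`;
}

// A custom database type still type-checks for casts:
const custom = castSql("geom", "GEOMETRY");
// intervalSql("fortnight", 1) would be a compile-time error.
```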

Collaborator Author

"creating a new dedicated file/folder for BigQuery tests"

I'll go ahead and add a new folder. Should I copy 100% of the tests, or only the tests for the overridden methods?
