Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

r1b
Copy link

@r1b r1b commented Dec 8, 2024

This builds on the work in #1130

  • Export a JSON type
  • Implement the interface for the base JSON type
    • Support indexing to generate JSONPath expressions
    • Handle json_serializer / json_deserializer dialect args
  • Handle dialect-specific considerations
    • JSON is not a valid query parameter type, use STRING
    • Render PARSE_JSON around JSON-as-STRING bind expressions
    • Comparator support for BQ family of JSON -> SQL conversion functions, incl. LAX variants
    • Support JSONPath modes (preview feature)
  • Tests

TODO:

  • Add docs
  • Maybe support JSONIndexType?
    • I really struggled to get this working / not sure if its necessary - will comment with details
    • This would be necessary to support arbitrary expressions in indexing operations (as described in the docs)
  • Maybe remove the JSONPath mode support?
    • This was just for fun, but maybe we don't want to put this here until the feature is GA

References:


For context, I work on a system that exclusively uses the SQL expression language, so I may have some blind spots around support on the ORM side - let me know if anything is missing there.

Fixes #399 🦕

@r1b r1b requested review from a team as code owners December 8, 2024 16:54
@r1b r1b requested a review from suzmue December 8, 2024 16:54
@product-auto-label product-auto-label bot added the size: l Pull request size is large. label Dec 8, 2024
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Dec 8, 2024
@chalmerlowe
Copy link
Collaborator

@r1b

I appreciate your interest in our project and am grateful for you effort to put together this PR.

I see that it attempts to build upon the work that was found in PR #1130.

I also see that we have then attempted to add a number of other elements above and beyond that.

We are currently at 350+ lines of new code. I have not had a chance to dive deeply into this code, so my apologies. I have several questions (some of these are more rhetorical so don't feel as though you need to respond to each of them ... this is mostly me thinking out loud).

  • Does ALL of this have to be in a single PR?
  • Do ALL of these edits create a single atomic unit of change?
  • Could we simplify the process of creating, reviewing, editing, and approving this work by limiting it to small discrete units of change per PR? (I am the sole full-time maintainer of this task and need to try and balance my commitments appropriately).

If we were to break this into small units of change:

  • what are the most important elements? (what are optional)
  • what order makes sense for implementation? (most important OR foundational work first, least important work last).
  • do we have appropriate tests for each unit of change?

What are your thoughts?

@r1b
Copy link
Author

r1b commented Dec 10, 2024

Yep, happy to break this up (more of a conversation-starter vs expecting to get all of this in :)) - I think the right sequencing is something like:

  • Add the JSON type (what we have in Add JSON type compiler support #1130, with added tests)
  • Add the json_serializer / json_deserializer dialect kwargs (expected for all JSON implementors)

This will unblock users who want to reference a JSON column in table definitions / query results w/o indexing

  • Add JSONPath indexing, comparator support

This will allow users to generate queries that reach into the JSON column to pull out subfields + generate appropriate derived types for those expressions

^ For my own use case, these two are satisfactory


  • (Optional) Support JSON query parameters

Optional because this is probably a niche use case / requires workarounds for missing dbapi support - instead, we could just disallow this.

This will allow users to pass a JSON object as a query parameter, with appropriate type conversions when generating SQL

  • (Optional) Support JSONPath modes

Optional because this is a preview feature - OK to wait until this is GA / requested by user

This will allow users to use lax / lax recursive modes in generated JSON path expressions

@r1b
Copy link
Author

r1b commented Dec 10, 2024

Closing in favor of #1147 - I pulled out a smaller PR from the rough sequencing above

@r1b r1b closed this Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for JSON datatype
3 participants