feat: Add a new function to sync table with metastore #998
"""Create or update a table.
If detailed table info is given (parameter table and columns), it will just use
them to create/update the table. Otherwise, it will try to get the table
infofrom the metastore first and then create/update.
info from
schema_name, table_name, metastore_id, is_delete: bool = False
):
"""Sync table info from metastore. Delete the table if is_delete is True.
def sync_table_from_metastore(schema_name, table_name, metastore_id):
nit: since the refresh and sync code paths are so similar, can you rename the functions to be more consistent? For example, `sync_table_by_table_id` and `sync_table_by_name`.
metastore_loader.sync_create_or_update_table(
schema.name, table.name, session=session
)
metastore_loader.sync_table(schema.name, table.name, session=session)
What happens if the table is deleted and we then call `session.refresh(table)`?
Also, the frontend code assumes a table is always returned. Can you add some checks on the frontend too (for example, remove the table from the redux store, show a message, etc.)?
Let's address the follow-ups in another PR, to handle edge cases like refreshing a deleted table.
## [0.21.0](https://github.expedia.biz/eg-analytics-platform/querybook/compare/0.20.3...0.21.0) (2022-09-15)

### Features

* Add a new function to sync table with metastore (pinterest#998) ([6ffe0ff](https://github.expedia.biz/eg-analytics-platform/querybook/commit/6ffe0ff48d99045f0b73b1831978b0b752d8e940))
* allow user to bypass LIMIT (pinterest#1000) ([e9d11ed](https://github.expedia.biz/eg-analytics-platform/querybook/commit/e9d11edc8c8a96102ab2d7d9487f95c669c22a82))
* improve charting axis and value display (pinterest#999) ([3b36df2](https://github.expedia.biz/eg-analytics-platform/querybook/commit/3b36df2bd9aaa4874715a3754824a458c08e9e6f))
* improved query limit (pinterest#995) ([64f6e60](https://github.expedia.biz/eg-analytics-platform/querybook/commit/64f6e60ef0517a4c6f315882d94bc088bdaa4842))
* Merge branch 'upstream/master' ([88abf71](https://github.expedia.biz/eg-analytics-platform/querybook/commit/88abf71f93301c09ea80af8412d95c3b877cfc8d))
* pass execution type to executor client (pinterest#992) ([5f3a961](https://github.expedia.biz/eg-analytics-platform/querybook/commit/5f3a9613aabbcc8f9eb47d5e197ff41f1dd55915))
* show 404 page when table gets deleted (pinterest#1003) ([37a9434](https://github.expedia.biz/eg-analytics-platform/querybook/commit/37a94345884f934ab9c7d11a4ae088b4ecafdb63))

### Bug Fixes

* add acl check for metastore table sync (pinterest#1004) ([fdb5df0](https://github.expedia.biz/eg-analytics-platform/querybook/commit/fdb5df09480d9c0beb2f1ef38d755dae5c55ab96))
* disable new features in read only mode (pinterest#1002) ([5fa45d7](https://github.expedia.biz/eg-analytics-platform/querybook/commit/5fa45d7e292ecf34dd3260f23bc468eed3fa7e83))
* Hide export option on scheduler pop-up if there are no exporters available (pinterest#1001) ([d8a3112](https://github.expedia.biz/eg-analytics-platform/querybook/commit/d8a31125e4700b1acb4d7f5349e202786db32f0f))
* lint error doesn't disappear when switching to templating query (pinterest#994) ([a859d7a](https://github.expedia.biz/eg-analytics-platform/querybook/commit/a859d7a267e3ea7bb187d4c771b4aff81609a5c5))
* raise rate limit for the sync api (pinterest#1007) ([0251ffd](https://github.expedia.biz/eg-analytics-platform/querybook/commit/0251ffdd23cd25ab8ee5003f47926848d15a802e))
* transfer schedule's ownership along with datadoc's ownership (pinterest#1005) ([8732d97](https://github.expedia.biz/eg-analytics-platform/querybook/commit/8732d9793ad6d3dbdd809bb23dea9535c7842acf))
When syncing a table with the metastore, the caller may not know whether the table needs to be updated or deleted. It should be the sync function's responsibility to keep the table in sync with the metastore, whether that means creating, updating, or deleting it.

Also, the `get_all_table_names_in_schema` method used for checking table existence is not very efficient for schemas with a huge number of tables; it can take 1 to 2 seconds. So this new function uses `get_table_and_columns` to check instead, and its response is also used for creating/updating the table, since it is needed anyway when the table is not being deleted.
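
To illustrate the idea, here is a minimal sketch of the sync flow described above. It is not the code in this PR: the metastore and the local table store are modeled as plain dictionaries, and the signature of `get_table_and_columns`, the `sync_table` return values, and the `Metastore` alias are assumptions made only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in: maps (schema_name, table_name) -> list of column names.
Metastore = Dict[Tuple[str, str], List[str]]


def get_table_and_columns(
    metastore: Metastore, schema_name: str, table_name: str
) -> Optional[List[str]]:
    """Return the table's columns, or None if the table is not in the metastore.

    A single lookup replaces listing every table name in the schema.
    """
    return metastore.get((schema_name, table_name))


def sync_table(
    metastore: Metastore, local: Metastore, schema_name: str, table_name: str
) -> str:
    """Make the local copy match the metastore; the caller never picks the action."""
    key = (schema_name, table_name)
    columns = get_table_and_columns(metastore, schema_name, table_name)

    if columns is None:
        # The table is gone from the metastore, so drop the local copy if any.
        existed = local.pop(key, None) is not None
        return "deleted" if existed else "noop"

    # Reuse the same response for the create/update step.
    action = "updated" if key in local else "created"
    local[key] = list(columns)
    return action
```

The caller only invokes `sync_table(...)`; whether that results in a create, update, or delete is decided from the single metastore lookup.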