feat: add dry_run parameter to `read_gbq()`, `read_gbq_table()` and `read_gbq_query()` #1674

sycai · 2025-04-30T02:50:03Z

~~If a table reference is fed to read_gbq() with dry_run set to True, we will use SELECT * FROM {table_ref} for dry run~~

For read_gbq() and read_gbq_table() calls that do not ultimately lead to SQL conversions, we use the table metadata for the dry run stats report.

tswast · 2025-05-01T16:23:53Z

> If a table reference is fed to read_gbq() with dry_run set to True, we will use SELECT * FROM {table_ref} for dry run

👎 That's a bit misleading. Some code paths do fall back to a query (e.g. if max_results is set). Those should have a dry run because they run a query immediately. But for a deferred operation, I don't think a dry run makes sense. Instead, let's populate what we can from the table metadata and include some indicator that no query is actually run.
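A minimal sketch of the distinction drawn in this comment, in plain Python. The names `TableMetadata`, `needs_immediate_query`, `table_dry_run_stats`, and the report keys (including the `isQuery` indicator) are hypothetical and chosen only for illustration; they are not taken from the bigframes code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableMetadata:
    """Hypothetical stand-in for the metadata BigQuery reports for a table."""

    num_rows: int
    num_bytes: int
    column_names: List[str]


def needs_immediate_query(max_results: Optional[int]) -> bool:
    # Per the comment above, setting max_results is one case that falls back
    # to running a query right away. Other such cases may exist.
    return max_results is not None


def table_dry_run_stats(table: TableMetadata, max_results: Optional[int] = None) -> dict:
    if needs_immediate_query(max_results):
        # This path would need a real query dry run, which is out of scope here.
        raise NotImplementedError("query fallback needs a real query dry run")
    # Deferred read: no query is run, so report only what the metadata gives us
    # and flag explicitly that the stats do not come from a query.
    return {
        "isQuery": False,  # hypothetical indicator that no query was run
        "columnCount": len(table.column_names),
        "numRows": table.num_rows,
        "numBytes": table.num_bytes,
    }
```

With this shape, a caller can check the indicator before interpreting the numbers, e.g. `table_dry_run_stats(TableMetadata(10, 2048, ["a", "b"]))["isQuery"]`.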

sycai · 2025-05-01T20:04:48Z

> > If a table reference is fed to read_gbq() with dry_run set to True, we will use SELECT * FROM {table_ref} for dry run
>
> 👎 That's a bit misleading. Some code paths do fall back to a query (e.g. if max_results is set). Those should have a dry run because they run a query immediately. But for a deferred operation, I don't think a dry run makes sense. Instead, let's populate what we can from the table metadata and include some indicator that no query is actually run.

Sounds good. Code updated. Now read_gbq_table dry run looks like this: https://screenshot.googleplex.com/AHaxiSsafniVFRN

tswast · 2025-05-01T21:03:58Z

bigframes/session/dry_runs.py

+    col_dtypes = dtypes.bf_type_from_type_kind(table.schema)
+    index.append("tableColumnCount")
+    values.append(len(col_dtypes))
+    index.append("tableColumnTypes")


It's not super easy for an end user to predict whether something will result in a query or just read the table directly. Could we try to align these names so that users don't need as much logic to handle one case versus the other?
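To illustrate the naming concern, here is a small sketch in plain Python. It assumes both kinds of report expose column information under the same keys; `query_report`, `table_report`, `describe_columns`, and every key name below are hypothetical, not the names used in `bigframes/session/dry_runs.py`.

```python
from typing import Dict


def query_report(column_dtypes: Dict[str, str], total_bytes_processed: int) -> dict:
    # Hypothetical report for a read that runs a query dry run.
    return {
        "isQuery": True,
        "columnCount": len(column_dtypes),
        "columnDtypes": dict(column_dtypes),
        "totalBytesProcessed": total_bytes_processed,
    }


def table_report(column_dtypes: Dict[str, str], num_bytes: int) -> dict:
    # Hypothetical report for a direct table read, built from table metadata.
    # The column keys match query_report so callers need no special casing.
    return {
        "isQuery": False,
        "columnCount": len(column_dtypes),
        "columnDtypes": dict(column_dtypes),
        "numBytes": num_bytes,
    }


def describe_columns(report: dict) -> str:
    # Works for either report kind because the column keys are aligned.
    names = ", ".join(report["columnDtypes"])
    return f"{report['columnCount']} columns: {names}"
```

Because the shared keys are identical, `describe_columns` never has to branch on whether a query was involved; only the size-related keys differ between the two reports.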

tswast

Thanks!

feat: add dry_run parameter to read_gbq() and read_gbq_query()

1bc248e

product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Apr 30, 2025

sycai and others added 4 commits April 30, 2025 17:53

fix lint

073c341

Merge branch 'main' into sycai_dry_run

c0f5163

Merge branch 'main' into sycai_dry_run

8c8a723

Merge branch 'main' into sycai_dry_run

7b30eb1

sycai requested a review from tswast April 30, 2025 22:27

sycai marked this pull request as ready for review April 30, 2025 22:27

sycai requested review from a team as code owners April 30, 2025 22:27

blunderbuss-gcf bot assigned jiaxunwu Apr 30, 2025

release-please bot and others added 6 commits May 1, 2025 19:59

chore(main): release 2.2.0 (#1643)

2bceff7

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>

create a different stats report for reading gbq tables

42edce4

Merge branch 'main' into sycai_dry_run

d59c4bd

fix lint

4d6a59a

🦉 Updates from OwlBot post-processor

3624795

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Merge branch 'sycai_dry_run' of https://github.com/googleapis/python-…

089d586

…bigquery-dataframes into sycai_dry_run

tswast reviewed May 1, 2025

sycai and others added 5 commits May 1, 2025 22:36

rename column count and column dtypes

669c041

Merge branch 'main' into sycai_dry_run

ac8c05c

fix typo

7dc90d9

Merge branch 'main' into sycai_dry_run

698b409

format code

80e08bf

sycai requested a review from tswast May 1, 2025 22:46

sycai changed the title ~~feat: add dry_run parameter to read_gbq() and read_gbq_query()~~ feat: add dry_run parameter to read_gbq(), read_gbq_table() and read_gbq_query() May 1, 2025

Merge branch 'main' into sycai_dry_run

3fe8b8e

tswast approved these changes May 5, 2025

sycai merged commit 4c5dee5 into main May 5, 2025
24 checks passed

sycai deleted the sycai_dry_run branch May 5, 2025 20:05

release-please bot mentioned this pull request May 5, 2025

chore(main): release 2.3.0 #1682

Merged
