You can explore BigQuery query results by using Colab Enterprise notebooks in BigQuery.
In this tutorial, you query data from a BigQuery public dataset and explore the query results in a notebook.
Required permissions
To create and run notebooks, you need the following Identity and Access Management (IAM) roles:
- BigQuery User (roles/bigquery.user)
- Notebook Runtime User (roles/aiplatform.notebookRuntimeUser)
- Code Creator (roles/dataform.codeCreator)
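If you administer the project yourself, one way to grant these roles is with the gcloud CLI. The following is a minimal sketch that only builds and prints the `gcloud projects add-iam-policy-binding` commands for the three roles listed above; the project ID `my-project` and the member `user:alex@example.com` are placeholder assumptions, not values from this tutorial.

```python
import shlex

# The three roles this tutorial lists as required.
REQUIRED_ROLES = [
    "roles/bigquery.user",
    "roles/aiplatform.notebookRuntimeUser",
    "roles/dataform.codeCreator",
]


def grant_commands(project_id: str, member: str) -> list[str]:
    """Build one gcloud command per required role (nothing is executed)."""
    commands = []
    for role in REQUIRED_ROLES:
        parts = [
            "gcloud",
            "projects",
            "add-iam-policy-binding",
            project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        # Quote each part so the printed line is safe to paste into a shell.
        commands.append(" ".join(shlex.quote(part) for part in parts))
    return commands


# Placeholder project and member; replace with your own values.
for command in grant_commands("my-project", "user:alex@example.com"):
    print(command)
```

Review the printed commands before running them in a shell where you are authenticated with sufficient IAM permissions.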
Open query results in a notebook
You can run a SQL query and then use a notebook to explore the data. This approach is useful if you want to modify the data in BigQuery before working with it, or if you need only a subset of the fields in the table.
- In the Google Cloud console, go to the BigQuery page.
- In the Type to search field, enter bigquery-public-data. If the project is not shown, enter bigquery in the search field, and then click Search to all projects to match the search string with the existing projects.
- Select bigquery-public-data > ml_datasets > penguins.
- For the penguins table, click View actions, and then click Query.
- Add an asterisk (*) for field selection to the generated query, so that it reads like the following example:
  SELECT * FROM `bigquery-public-data.ml_datasets.penguins` LIMIT 1000;
- Click Run.
- In the Query results section, click Explore data, and then click Explore with Python notebook.
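The console steps above can also be approximated programmatically. The following is a minimal sketch, assuming the `google-cloud-bigquery` client library (with its pandas extras) is installed and application default credentials are available; the function name `run_penguins_query` and the `project_id` parameter are illustrative, not part of the tutorial.

```python
# Same query that the console steps generate.
QUERY = """
SELECT *
FROM `bigquery-public-data.ml_datasets.penguins`
LIMIT 1000
"""


def run_penguins_query(project_id: str):
    """Run the penguins query and return the rows as a pandas DataFrame.

    project_id is the project billed for the query (an assumption: any
    project where you hold the BigQuery User role).
    """
    # Imported here so this module loads even where the client library
    # is not installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # query() starts the job; to_dataframe() waits for it and downloads rows.
    return client.query(QUERY).to_dataframe()
```

Calling `run_penguins_query("my-project")` with your own project ID returns up to 1,000 rows, which you can then inspect with ordinary pandas methods.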
Prepare the notebook for use
Prepare the notebook for use by connecting to a runtime and setting application default values.
- In the notebook header, click Connect to connect to the default runtime.
- In the Setup code block, click Run cell.
Explore the data
- To load the penguins data into a BigQuery DataFrame and show the results, click Run cell in the code block in the Result set loaded from BigQuery job as a DataFrame section.
- To get descriptive metrics for the data, click Run cell in the code block in the Show descriptive statistics using describe() section.
- Optional: Use other Python functions or packages to explore and analyze the data.
The following code sample shows how to use bigframes.pandas to analyze data and bigframes.ml to create a linear regression model from penguins data in a BigQuery DataFrame:
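A minimal sketch of such an analysis, assuming the notebook runtime has the `bigframes` package installed and default credentials configured, might look like the following. The function name `explore_penguins`, the choice of species filter, and the feature and label columns are illustrative assumptions rather than requirements of the tutorial.

```python
TABLE = "bigquery-public-data.ml_datasets.penguins"

# Assumed modeling choices: predict body mass from these columns.
FEATURE_COLUMNS = [
    "island",
    "culmen_length_mm",
    "culmen_depth_mm",
    "flipper_length_mm",
    "sex",
]
LABEL_COLUMN = "body_mass_g"
SPECIES = "Adelie Penguin (Pygoscelis adeliae)"


def explore_penguins():
    """Summarize the penguins table, then fit and score a linear model."""
    # Imported here so the constants above can be used without bigframes.
    import bigframes.pandas as bpd
    from bigframes.ml.linear_model import LinearRegression

    # Load the public table into a BigQuery DataFrame.
    penguins = bpd.read_gbq(TABLE)

    # Basic analysis: overall mean body mass, then mean per species.
    print("Mean body mass (g):", penguins[LABEL_COLUMN].mean())
    print(
        penguins[LABEL_COLUMN]
        .groupby(by=penguins["species"])
        .mean()
        .sort_values(ascending=False)
        .head(10)
    )

    # Keep a single species and drop rows with missing values for training.
    training_data = (
        penguins[penguins["species"] == SPECIES]
        .drop(columns=["species"])
        .dropna()
    )
    X = training_data[FEATURE_COLUMNS]
    y = training_data[[LABEL_COLUMN]]

    # Fit the model in BigQuery and return its evaluation metrics.
    model = LinearRegression(fit_intercept=False)
    model.fit(X, y)
    return model.score(X, y)
```

In a notebook cell that is connected to a runtime, calling `explore_penguins()` prints the summary statistics and returns a DataFrame of evaluation metrics for the fitted model.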