Closed
Labels
- api: bigquery (Issues related to the googleapis/python-bigquery-sqlalchemy API.)
- priority: p2 (Moderately-important priority. Fix may not be included in next release.)
- type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
Description
pyarrow is an optional dependency of google-cloud-bigquery, but it's made mandatory by python-bigquery-sqlalchemy.
As pyarrow is quite large on disk (100 MB on x86_64 Linux), I don't want to install it when it's unused in my application. AFAICT I don't need google-cloud-bigquery-storage either, but that's not huge.
I suggest:
- Remove the direct dependencies on pyarrow and google-cloud-bigquery-storage.
- Add a bqstorage extra that depends on google-cloud-bigquery[bqstorage]. That'll respect upstream's version bounds without introducing local bounds that could cause conflicts for users.
- Document that users wanting improved performance with large result sets should install bigquery-sqlalchemy[bqstorage].
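To make the suggestion concrete, here is a rough sketch of what the packaging change might look like. It is not the project's actual setup.py: the `install_requires` entries are illustrative and carry no version bounds, and `has_module` is a hypothetical helper showing one way the code could detect whether the optional packages are present at runtime.

```python
import importlib.util

# Hypothetical setup.py excerpt; the entries are illustrative only.
install_requires = [
    "google-cloud-bigquery",
    "sqlalchemy",
]

extras_require = {
    # Delegate to upstream's extra so upstream's version bounds apply
    # and no local bounds are introduced.
    "bqstorage": ["google-cloud-bigquery[bqstorage]"],
}


def has_module(name: str) -> bool:
    """Return True if `name` can be imported, without raising if it cannot.

    Hypothetical helper, not part of any existing API.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when the parent package of a
        # dotted name is missing; treat that as "not installed".
        return False


# Only use the faster result-fetching path when both optional packages exist.
BQSTORAGE_AVAILABLE = has_module("pyarrow") and has_module(
    "google.cloud.bigquery_storage"
)
```

With something like this in place, a plain install would skip pyarrow, and installing the bqstorage extra would pull it in through upstream's own dependency declaration.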
There's an existing PR at #470 but it looks like it has stalled out, so I'm filing this issue to provide a blueprint for someone who's able to do this work.