Add support for BigQuery regional location #1917
Conversation
@fabriziodemaria, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mikekap, @DeaconDesperado and @mbruggmann to be potential reviewers.
👍 As you say, tests are good to add. I suggest one test with a location set, and also double-checking that the no-location case is covered properly in the existing tests.
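A rough sketch of such tests (the `location` field and the names here are assumptions based on this PR's diff, not the final API):

```python
import unittest

from luigi.contrib import bigquery


class DatasetLocationTest(unittest.TestCase):

    def test_dataset_with_location(self):
        # 'EU' is an illustrative BigQuery region.
        dataset = bigquery.BQDataset(project_id='my-project',
                                     dataset_id='my_dataset',
                                     location='EU')
        self.assertEqual('EU', dataset.location)

    def test_dataset_without_location(self):
        # The no-location case should keep working as before.
        dataset = bigquery.BQDataset(project_id='my-project',
                                     dataset_id='my_dataset',
                                     location=None)
        self.assertIsNone(dataset.location)
```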
tox.ini
Outdated
    cdh,hdp: hdfs>=2.0.4,<3.0.0
    postgres: psycopg2<3.0
    gcloud: google-api-python-client>=1.4.0,<2.0
    gcloud: testfixtures
Can you add an upper bound?
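For example (the version bound below is illustrative, not a vetted pin):

```ini
gcloud: testfixtures>=4.0.0,<5.0
```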
Force-pushed from 5baffbb to 9e62108
luigi/contrib/bigquery.py
Outdated
    if dataset.location is not None:
        fetched_location = response.get('location', '')
        if not fetched_location:
            fetched_location = 'undefined'
If we want 'undefined' as the default value, then we should specify that instead of ''...
Using proper Python None would seem even simpler.
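A minimal sketch of that suggestion (the mismatch handling at the end is an assumption, not necessarily the PR's actual behaviour):

```python
if dataset.location is not None:
    # dict.get() already returns None when 'location' is absent, so no
    # sentinel string such as '' or 'undefined' is needed.
    fetched_location = response.get('location')
    if fetched_location is not None and fetched_location != dataset.location:
        raise Exception(
            'Dataset already exists with a different location: %s'
            % fetched_location)
```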
test/contrib/bigquery_gcloud_test.py
Outdated
    from contrib import gcs_test
    from nose.plugins.attrib import attr

    from testfixtures import should_raise
We can avoid the new testfixtures dependency - nose has assertRaises
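For example, using nose's built-in helper instead (the `client` and `dataset_in_other_location` names below are hypothetical stand-ins for the surrounding test fixtures):

```python
from nose.tools import assert_raises

# Hypothetical check: validating a dataset against the wrong location
# is expected to raise.
assert_raises(Exception, client.dataset_exists, dataset_in_other_location)
```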
    PROJECT_ID = gcs_test.PROJECT_ID

    # In order to run this test, you should set your GCS/BigQuery project/bucket.
    # Unfortunately there's no mock
It would be exciting to have automatically generated mocks to speed up local testing... In the manner of http://martinfowler.com/bliki/SelfInitializingFake.html or similar...
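For what it's worth, a rough sketch of that idea (entirely hypothetical, not part of this PR): a wrapper that records real API responses on the first run and replays them afterwards.

```python
import json
import os


class SelfInitializingFake(object):
    """Record/replay wrapper around a real API client.

    The first run delegates to the real client and records responses on
    disk; subsequent runs are served from the recording, so no network
    access (or GCS credentials) is needed.
    """

    def __init__(self, real_client, cache_path):
        self._client = real_client
        self._cache_path = cache_path
        self._cache = {}
        if os.path.exists(cache_path):
            with open(cache_path) as f:
                self._cache = json.load(f)

    def call(self, method, *args):
        # Assumes arguments and responses are JSON-serializable.
        key = json.dumps([method] + list(args))
        if key not in self._cache:
            # Cache miss: hit the real API and record the response.
            self._cache[key] = getattr(self._client, method)(*args)
            with open(self._cache_path, 'w') as f:
                json.dump(self._cache, f)
        return self._cache[key]
```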
@fabriziodemaria @mrunesson @ulzha, have you guys checked that the build works well after this is merged? I see some failures on the...
I'm not sure, but I believe things started to break after #1917 got merged.
Yes, I saw a green status. Something maybe nondeterministic? Going to have to inspect... Yes, we do run py27-gcloud locally (in a Dockerized, easily usable setup... which I hope we get to open-source soon...)
@ulzha, you're right about the green status. I can explain why: Fabrizio opened this pull request from his own fabriziodemaria/luigi repo, so Travis didn't decrypt the gcs-credentials as it would have done if the PR had been sent from the spotify/luigi repository. I implemented this complicated machinery some months before leaving Spotify. See 375a470 :) Also, kudos on setting up the Dockerized stuff! Making luigi builds more stable would be so awesome!
Restore the gcloud tests that were disabled in spotify#1917. During local execution, py27-gcloud succeeds, while py34-gcloud fails. To run locally:

    export GCS_TEST_PROJECT_ID=macro-mile-158613 \
        GCS_TEST_BUCKET=macro-mile-158613 \
        DATAPROC_TEST_PROJECT_ID=macro-mile-158613
    tox -e py27-gcloud
Description
This PR adds support for specifying a regional location for BigQuery (BQ) datasets.
More information about regional locations is available here: https://cloud.google.com/bigquery/docs/managing_jobs_datasets_projects
The intended behaviour is as follows: when a location is specified for a dataset, it is used when the dataset is created; if the dataset already exists, its location is checked against the specified one.
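A usage sketch (the `location` argument is what this PR adds; the exact final signature may differ):

```python
from luigi.contrib.bigquery import BigQueryTarget

# 'EU' is an illustrative region. If the dataset does not exist, it is
# created in the given location; if it does exist, its location is
# expected to match.
target = BigQueryTarget(project_id='my-project',
                        dataset_id='my_dataset',
                        table_id='my_table',
                        location='EU')
```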
Motivation and Context
Users should have control over the regional location for BigQuery datasets.
Have you tested this? If so, how?
I have tested this by uploading tables to a BigQuery project set up for testing.
TODO