[Bigquery] Empty arrays are not supported as query parameter values  #2678

@jscinoz

It is perfectly valid to run queries with an empty array as a parameter. Our application has one such query that takes two ARRAY<INT64> parameters, either of which (but never both) may be empty. The query is short and is quoted in full below:

SELECT
  (SELECT AS STRUCT FT.*) AS form_transaction,
  IF(FS.id IS NULL, NULL, (SELECT AS STRUCT FS.*)) AS form_session
FROM
  dev1_1208.FormTransaction AS FT
LEFT JOIN
  dev1_1208.FormSession AS FS
ON
  FS.form_transaction_id = FT.id
WHERE
  FT.id IN UNNEST(@transactionIds) OR
  FS.id IN UNNEST(@sessionIds)

Running this query with explicit array literals (i.e. building the query string with the values pre-substituted rather than passed as query parameters) works fine, even if one of the arrays is empty (e.g. FS.id IN UNNEST([])). Executing the same request against the BigQuery REST API manually via the API Explorer also handles empty array parameters correctly, so the problem lies within google-cloud-bigquery itself and is not a backend API limitation.

Executing this query via google-cloud-bigquery fails while the client is handling the response from the BigQuery API: QueryParameterValue.fromPb throws a NullPointerException at the following line (427):

valueBuilder.setValue(valuePb.getValue());

java.lang.NullPointerException
	at com.google.cloud.bigquery.QueryParameterValue.fromPb(QueryParameterValue.java:427)
	at com.google.cloud.bigquery.QueryJobConfiguration$Builder.<init>(QueryJobConfiguration.java:143)
	at com.google.cloud.bigquery.QueryJobConfiguration$Builder.<init>(QueryJobConfiguration.java:85)
	at com.google.cloud.bigquery.QueryJobConfiguration.fromPb(QueryJobConfiguration.java:813)
	at com.google.cloud.bigquery.JobConfiguration.fromPb(JobConfiguration.java:140)
	at com.google.cloud.bigquery.JobInfo$BuilderImpl.<init>(JobInfo.java:184)
	at com.google.cloud.bigquery.Job.fromPb(Job.java:307)
	at com.google.cloud.bigquery.BigQueryImpl.query(BigQueryImpl.java:595)
	at com.google.cloud.bigquery.BigQueryImpl.query(BigQueryImpl.java:567)
	at com.avoka.transact.insights.common.model.BaseQuery$DoSyncQuery.call(BaseQuery.java:489)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at com.google.apphosting.runtime.ApiProxyImpl$CurrentRequestThreadFactory$1$1.run(ApiProxyImpl.java:1249)
	at java.security.AccessController.doPrivileged(Native Method)
	at com.google.apphosting.runtime.ApiProxyImpl$CurrentRequestThreadFactory$1.run(ApiProxyImpl.java:1243)
	at java.lang.Thread.run(Thread.java:745)
	at com.google.apphosting.runtime.ApiProxyImpl$CurrentRequestThread.run(ApiProxyImpl.java:1210)

Two things contribute to this, detailed below:

QueryParameterValue.toValuePb does not set arrayValues when the provided array is empty:

if (arrayValues != null && !arrayValues.isEmpty()) {
  valuePb.setArrayValues(
      Lists.transform(arrayValues, QueryParameterValue.TO_VALUE_PB_FUNCTION));
}
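To make the effect of that guard concrete, here is a minimal, self-contained sketch. It does not use the real library: ValuePbSketch and EmptyArraySerialization are hypothetical stand-ins, and keepEmpty is only one possible alternative guard, not necessarily how the library should be or was fixed. It shows that the current condition makes an empty array indistinguishable from an unset one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the wire-level parameter value; not the real library class.
final class ValuePbSketch {
    String value;                     // scalar payload, null for arrays
    List<ValuePbSketch> arrayValues;  // null means "field absent on the wire"
}

final class EmptyArraySerialization {
    // Mirrors the guard quoted above: an empty list is treated like a missing one.
    static ValuePbSketch skipEmpty(List<String> elements) {
        ValuePbSketch pb = new ValuePbSketch();
        if (elements != null && !elements.isEmpty()) {
            pb.arrayValues = toPbList(elements);
        }
        return pb;
    }

    // Assumed alternative: only a null list is skipped, so [] stays distinct from "unset".
    static ValuePbSketch keepEmpty(List<String> elements) {
        ValuePbSketch pb = new ValuePbSketch();
        if (elements != null) {
            pb.arrayValues = toPbList(elements);
        }
        return pb;
    }

    // Converts each element into its own stand-in value object.
    private static List<ValuePbSketch> toPbList(List<String> elements) {
        List<ValuePbSketch> out = new ArrayList<>();
        for (String element : elements) {
            ValuePbSketch item = new ValuePbSketch();
            item.value = element;
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        System.out.println(skipEmpty(empty).arrayValues);  // null: the empty array is lost
        System.out.println(keepEmpty(empty).arrayValues);  // []: the empty array survives
    }
}
```

Note that changing this guard alone may not be enough if the response from the API still omits the field, which is the second point below.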

google-cloud-bigquery defers to the older google-api-services-bigquery client library internally, which seems to strip empty arrays from the returned Job object, in HttpBigQueryRpc.create(Job, ...):

  @Override
  public Job create(Job job, Map<Option, ?> options) {
    try {
      String projectId = job.getJobReference() != null
          ? job.getJobReference().getProjectId() : this.options.getProjectId();

      // job.getConfiguration().getQuery().getQueryParameters() at this point is:
      //  [{name=transactionIds, parameterType={arrayType={type=INT64}, type=ARRAY}, parameterValue={arrayValues=[{value=688}, {value=689}, {value=690}, {value=686}, {value=687}]}}, {name=sessionIds, parameterType={arrayType={type=INT64}, type=ARRAY}, parameterValue={arrayValues=[]}}]
   
      return bigquery.jobs()
          .insert(projectId, job)
          .setFields(Option.FIELDS.getString(options))
          .execute();
      // getConfiguration().getQuery().getQueryParameters() on the returned job is as follows. Note that the 'arrayValues' field of the 'sessionIds' parameter has been dropped:
      // [{"name":"transactionIds","parameterType":{"arrayType":{"type":"INT64"},"type":"ARRAY"},"parameterValue":{"arrayValues":[{"value":"688"},{"value":"689"},{"value":"690"},{"value":"686"},{"value":"687"}]}}, {"name":"sessionIds","parameterType":{"arrayType":{"type":"INT64"},"type":"ARRAY"}}]
    } catch (IOException ex) {
      throw translate(ex);
    }
  }
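Since the returned job carries no parameterValue at all for the empty array, the reading side would need to tolerate the missing field. The following is a rough sketch of that idea only. ParamSketch, ParamValueSketch and readArray are hypothetical names modelled on the JSON quoted above, not the library's actual classes or its actual fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the response objects; field names follow the JSON quoted above.
final class ParamValueSketch {
    String value;
    List<ParamValueSketch> arrayValues;
}

final class ParamSketch {
    String name;
    String type;                      // e.g. "ARRAY" or "INT64"
    ParamValueSketch parameterValue;  // null when the response omits the field
}

final class NullTolerantRead {
    // Returns the array elements, treating an omitted parameterValue or an
    // omitted arrayValues field on an ARRAY parameter as an empty array.
    static List<String> readArray(ParamSketch param) {
        if (!"ARRAY".equals(param.type)) {
            throw new IllegalArgumentException(param.name + " is not an ARRAY parameter");
        }
        List<String> out = new ArrayList<>();
        ParamValueSketch pb = param.parameterValue;
        if (pb == null || pb.arrayValues == null) {
            return out;  // missing field: assume the array was empty
        }
        for (ParamValueSketch item : pb.arrayValues) {
            out.add(item.value);
        }
        return out;
    }

    public static void main(String[] args) {
        ParamSketch sessionIds = new ParamSketch();
        sessionIds.name = "sessionIds";
        sessionIds.type = "ARRAY";
        // parameterValue is left null, as in the returned job quoted above.
        System.out.println(readArray(sessionIds));  // []
    }
}
```

The design choice here is to rely on the declared parameter type: because the type still says ARRAY, a missing value can be read as an empty array instead of being dereferenced.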

Labels

api: bigquery (Issues related to the BigQuery API.)
priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)