Arrow conversion errors do not indicate problematic dataframe column.  #1621

@karkinissan

Description

The error currently raised when converting a dataframe to a pyarrow array in the dataframe_to_arrow function does not tell the user the name of the column in which the error occurred.

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:704, in dataframe_to_parquet(dataframe, bq_schema, filepath, parquet_compression, parquet_use_compliant_nested_type)
    697 kwargs = (
    698     {"use_compliant_nested_type": parquet_use_compliant_nested_type}
    699     if _helpers.PYARROW_VERSIONS.use_compliant_nested_type
    700     else {}
    701 )
    703 bq_schema = schema._to_schema_fields(bq_schema)
--> 704 arrow_table = dataframe_to_arrow(dataframe, bq_schema)
    705 pyarrow.parquet.write_table(
    706     arrow_table,
    707     filepath,
    708     compression=parquet_compression,
    709     **kwargs,
    710 )

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:647, in dataframe_to_arrow(dataframe, bq_schema)
    644 for bq_field in bq_schema:
    645     arrow_names.append(bq_field.name)
    646     arrow_arrays.append(
--> 647         bq_to_arrow_array(get_column_or_index(dataframe, bq_field.name), bq_field)
    648     )
    649     arrow_fields.append(bq_to_arrow_field(bq_field, arrow_arrays[-1].type))
    651 if all((field is not None for field in arrow_fields)):

File ~\miniconda3\envs\playground\Lib\site-packages\google\cloud\bigquery\_pandas_helpers.py:362, in bq_to_arrow_array(series, bq_field)
    360     return pyarrow.StructArray.from_pandas(series, type=arrow_type)
    361 try:
--> 362     return pyarrow.Array.from_pandas(series, type=arrow_type)
    363 except Exception as e: 
    364     _LOGGER.error(f"Error in column: {series.name}")

File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:1044, in pyarrow.lib.Array.from_pandas()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:316, in pyarrow.lib.array()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\array.pxi:83, in pyarrow.lib._ndarray_to_array()
File ~\miniconda3\envs\playground\Lib\site-packages\pyarrow\error.pxi:123, in pyarrow.lib.check_status()

ArrowTypeError: object of type <class 'str'> cannot be converted to int
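
For illustration, here is a minimal sketch of one way the failing column could be surfaced: wrap the per-column conversion and re-raise with the column name attached, chaining the original exception. This is not the library's code. `ColumnConversionError`, `convert_column`, and `to_int_list` are hypothetical names, and `to_int_list` is only a dependency-free stand-in for a call such as `pyarrow.Array.from_pandas(series, type=arrow_type)` so that the sketch runs without pyarrow installed.

```python
from typing import Any, Callable, Iterable, List


class ColumnConversionError(ValueError):
    """Hypothetical error type that records which column failed to convert."""

    def __init__(self, column: str, cause: Exception) -> None:
        super().__init__(f"Error converting column {column!r}: {cause}")
        self.column = column


def convert_column(
    name: str,
    values: Iterable[Any],
    convert: Callable[[Iterable[Any]], Any],
) -> Any:
    """Run `convert` on one column and attach the column name to any failure."""
    try:
        return convert(values)
    except Exception as exc:
        # `from exc` keeps the original error and traceback as __cause__.
        raise ColumnConversionError(name, exc) from exc


def to_int_list(values: Iterable[Any]) -> List[int]:
    """Stand-in converter (assumption): rejects anything that is not an int."""
    result = []
    for value in values:
        if not isinstance(value, int):
            raise TypeError(
                f"object of type {type(value)} cannot be converted to int"
            )
        result.append(value)
    return result


# Hypothetical data: the "age" column contains a string among integers.
columns = {"id": [1, 2, 3], "age": [30, "unknown", 41]}

failed_column = None
message = ""
try:
    for name, values in columns.items():
        convert_column(name, values, to_int_list)
except ColumnConversionError as err:
    failed_column = err.column
    message = str(err)

print(message)
```

With this pattern the message names the column ("age" here) alongside the original conversion error, rather than only logging the name separately as in the `_LOGGER.error` line visible in the traceback frame above.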

Labels

api: bigquery (Issues related to the googleapis/python-bigquery API.)
type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)
