Closed
Labels
api: bigquery (Issues related to the googleapis/python-bigquery-dataframes API.)
Description
Environment details
OS type and version: macOS Sequoia 15.6
"pip" version: uv 0.8.13
Python: 3.10.11
bigframes==2.18.0
google-cloud-bigquery==3.36.0
pandas==2.3.2
pyarrow==15.0.2
sqlglot==27.11.0
Steps to reproduce
- Create a pandas DataFrame with a column backed by a pyarrow ChunkedArray.
  This can be done by concatenating columns of dtype string[pyarrow].
- Load the pandas DataFrame into bigframes,
  e.g. by passing it to the bpd.DataFrame constructor or with read_pandas.
Code example
import bigframes.pandas as bpd
import pandas as pd
s = pd.Series(['a', 'b'], dtype="string[pyarrow]")
df1 = pd.DataFrame({"col": s})
df2 = pd.DataFrame({"col": s})
df = pd.concat([df1, df2])
bpd.DataFrame(df)
Stack trace
Traceback (most recent call last):
File "/dir/example.py", line 11, in <module>
bpd.get_global_session().read_pandas(df).to_pandas()
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/log_adapter.py", line 175, in wrapper
return method(*args, **kwargs)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/session/__init__.py", line 1006, in read_pandas
return self._read_pandas(pandas_dataframe, write_engine=write_engine)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/log_adapter.py", line 175, in wrapper
return method(*args, **kwargs)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/session/__init__.py", line 1040, in _read_pandas
return self._read_pandas_inline(pandas_dataframe)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/log_adapter.py", line 175, in wrapper
return method(*args, **kwargs)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/session/__init__.py", line 1059, in _read_pandas_inline
local_block = blocks.Block.from_local(pandas_dataframe, self)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/blocks.py", line 227, in from_local
managed_data = local_data.ManagedArrowTable.from_pandas(pd_data)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/local_data.py", line 75, in from_pandas
new_arr, bf_type = _adapt_pandas_series(col)
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/local_data.py", line 280, in _adapt_pandas_series
return _adapt_arrow_array(pa.array(series))
File "/dir/.venv/lib/python3.10/site-packages/bigframes/core/local_data.py", line 308, in _adapt_arrow_array
if array.offset != 0: # Offset arrays don't have all operations implemented
AttributeError: 'pyarrow.lib.ChunkedArray' object has no attribute 'offset'
Workaround
Cast the affected column to object dtype before passing the pandas DataFrame to bigframes:
df = pd.concat([df1, df2])
df['col'] = df['col'].astype('object')
bpd.DataFrame(df)