Updating dask-sql to version 2024.5.0 requires dask-expr to be installed.
On my machine I have installed:
pandas 2.2.3
dask 2024.9.0
dask-expr 1.1.14
dask_sql 2024.5.0
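For anyone comparing environments, the versions above can be listed with a short script. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of dask or dask-sql.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None instead of raising.
    (Hypothetical helper, written only for this report.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names as they appear on PyPI.
    for name, ver in installed_versions(
        ["pandas", "dask", "dask-expr", "dask-sql"]
    ).items():
        print(name, ver if ver is not None else "not installed")
```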
I am getting the error:
.venv/lib/python3.10/site-packages/dask/utils.py", line 1241, in __call__
    return getattr(__obj, self.method)(*args, **kwargs)
TypeError: data type 'boolean' not understood
when running:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context
data = {
    "column8": []
}
df = pd.DataFrame(data)
ddf = dd.from_pandas(df, npartitions=1)
c = Context()
c.create_table("tablename", ddf)
query = """
WITH
sampled_table AS (
    SELECT "column8" AS "NEW_NAME"
    FROM tablename t
),
table2 AS (
    SELECT "NEW_NAME" AS output1, COUNT(*) AS output2
    FROM sampled_table t
    GROUP BY "NEW_NAME"
),
outputtable AS (
    SELECT *
    FROM table2 t
    WHERE output1 IS NOT NULL
)
SELECT *
FROM outputtable"""
result = c.sql(query)
print(result.compute())
This code worked before the update (with dask==2024.1.1 and dask-sql==2024.3.0).
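The wording of the error matches what NumPy raises when it is asked to interpret the pandas extension dtype name "boolean". Below is a minimal sketch of that behaviour. It assumes, without confirmation from the traceback, that the failing call is an `astype`-style cast that reaches a NumPy-backed object instead of a pandas one. The helper name `numpy_cast_error` is hypothetical, and the snippet does not reproduce the dask-sql code path itself.

```python
import numpy as np
import pandas as pd

# The same empty column as in the reproduction above.
column = pd.DataFrame({"column8": []})["column8"]
print("column dtype:", column.dtype)

# pandas understands its own nullable "boolean" extension dtype,
# so this cast succeeds even on an empty column.
pandas_result = column.astype("boolean")
print("pandas cast dtype:", pandas_result.dtype)


def numpy_cast_error(dtype_name):
    """Return the TypeError message NumPy raises for dtype_name, or None.

    (Hypothetical helper, written only to illustrate the message.)
    """
    try:
        np.asarray([]).astype(dtype_name)
    except TypeError as exc:
        return str(exc)
    return None


# NumPy has no dtype named "boolean" (its own name is "bool"),
# so casting a plain ndarray with that name fails.
print("numpy cast error:", numpy_cast_error("boolean"))
```

If this is what happens, the cast is probably being applied to something that is no longer a pandas Series by the time it runs, but that is a guess on my side.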