
Commit 70c7dc8

Complete Fetch Phase (for INLINE disposition and JSON_ARRAY format) (#594)
* [squash from exec-sea] bring over execution phase changes
* remove excess test
* add docstring
* remove exec func in sea backend
* remove excess files
* remove excess models
* remove excess sea backend tests
* cleanup
* re-introduce get_schema_desc
* remove SeaResultSet
* clean imports and attributes
* pass CommandId to ExecResp
* remove changes in types
* add back essential types (ExecResponse, from_sea_state)
* fix fetch types
* excess imports
* reduce diff by maintaining logs
* fix int test types
* [squashed from exec-sea] init execution func
* remove irrelevant changes
* remove ResultSetFilter functionality
* remove more irrelevant changes
* remove more irrelevant changes
* even more irrelevant changes
* remove sea response as init option
* exec test example scripts
* formatting (black)
* [squashed from sea-exec] merge sea stuffs
* remove excess changes
* remove excess removed docstring
* remove excess changes in backend
* remove excess imports
* remove accidentally removed _get_schema_desc
* remove unnecessary init with sea_response tests
* remove unnecessary changes
* formatting (black)
* move guid_to_hex_id import to utils
* reduce diff in guid utils import
* improved models and filters from cloudfetch-sea branch
* move arrow_schema_bytes back into ExecuteResult
* maintain log
* remove unnecessary assignment
* remove unnecessary tuple response
* remove unnecessary verbose mocking
* filters stuff (align with JDBC)
* move Queue construction to ResultSet
* move description to List[Tuple]
* formatting (black)
* reduce diff (remove explicit tuple conversion)
* remove has_more_rows from ExecuteResponse
* remove unnecessary has_more_rows calc
* default has_more_rows to True
* return has_more_rows from ExecResponse conversion during GetRespMetadata
* remove unnecessary replacement
* better mocked backend naming
* remove has_more_rows test in ExecuteResponse
* introduce replacement of original has_more_rows read test
* call correct method in test_use_arrow_schema
* call correct method in test_fall_back_to_hive_schema
* re-introduce result response read test
* simplify test
* remove excess fetch_results mocks
* more minimal changes to thrift_backend tests
* move back to old table types
* remove outdated arrow_schema_bytes return
* backend from cloudfetch-sea
* remove filtering, metadata ops
* raise NotImplementedError for metadata ops
* align SeaResultSet with new structure
* correct sea res set tests
* add metadata commands
* formatting (black)
* add metadata command unit tests
* minimal fetch phase intro
* working JSON + INLINE
* change to valid table name
* remove redundant queue init
* large query results
* remove unnecessary changes covered by #588
* simplify test module
* logging -> debug level
* change table name in log
* remove unnecessary changes
* remove unnecessary backend changes
* remove un-needed GetChunksResponse
* remove un-needed GetChunksResponse, only relevant in Fetch phase
* reduce code duplication in response parsing
* reduce code duplication
* clearer docstrings
* introduce strongly typed ChunkInfo
* remove is_volume_operation from response
* add is_volume_op and more ResultData fields
* add test scripts
* Revert "Merge branch 'sea-migration' into exec-models-sea" (reverts commit 8bd12d8, reversing changes made to 030edf8)
* Revert "Merge branch 'exec-models-sea' into exec-phase-sea" (reverts commit be1997e, reversing changes made to 37813ba)
* change logging level
* remove unnecessary changes
* remove excess changes
* remove excess changes
* remove _get_schema_bytes (for now)
* redundant comments
* remove fetch phase methods
* reduce code repetition + introduce gaps after multi-line pydocs
* remove unused imports
* move description extraction to helper func
* formatting (black)
* add more unit tests
* streamline unit tests
* test getting the list of allowed configurations
* reduce diff
* reduce diff
* house constants in enums for readability and immutability
* add note on hybrid disposition
* remove redundant note on arrow_schema_bytes
* align SeaResultSet with ext-links-sea
* remove redundant methods
* update unit tests
* remove accidental venv changes
* add fetchmany_arrow and fetchall_arrow
* remove accidental changes in sea backend tests
* remove irrelevant changes
* remove unnecessary test changes
* remove unnecessary changes in thrift backend tests
* remove unimplemented methods test
* remove unimplemented method tests
* modify example scripts to include fetch calls
* fix sea connector tests
* remove unimplemented methods test
* remove invalid import
* better align queries with JDBC impl
* line breaks after multi-line PRs
* remove unused imports
* fix: introduce ExecuteResponse import
* remove unimplemented metadata methods test, unnecessary imports
* introduce unit tests for metadata methods
* remove verbosity in ResultSetFilter docstring (Co-authored-by: jayant <[email protected]>)
* remove unnecessary info in ResultSetFilter docstring
* remove explicit type checking, string literals around forward annotations
* house SQL commands in constants
* introduce unit tests for altered functionality
* remove unused imports
* remove unused imports
* run small queries with SEA during integration tests
* run some tests for sea
* remove catalog requirement in get_tables
* move filters.py to SEA utils
* ensure SeaResultSet
* prevent circular imports
* remove unused imports
* remove cast, throw error if not SeaResultSet
* pass param as TSparkParameterValue
* make SEA backend methods return SeaResultSet
* use spec-aligned Exceptions in SEA backend
* remove defensive row type check
* introduce type conversion for primitive types for JSON + INLINE
* remove SEA running on metadata queries (known failures)
* remove unnecessary docstrings
* align expected types with databricks sdk
* link rest api reference to validate types
* remove test_catalogs_returns_arrow_table test; metadata commands not expected to pass
* fix fetchall_arrow and fetchmany_arrow
* remove thrift-aligned test_cancel_during_execute from SEA tests
* remove unnecessary changes in example scripts
* remove unnecessary changes in example scripts
* _convert_json_table -> _create_json_table
* remove accidentally removed test
* remove new unit tests (to be re-added based on new arch)
* remove changes in sea_result_set functionality (to be re-added)
* introduce more integration tests
* remove SEA tests in parameterized queries
* remove partial parameter fix changes
* remove unnecessary timestamp tests (pass with minor disparity)
* slightly stronger typing of _convert_json_types
* stronger typing of JSON utility funcs
* stronger typing of fetch*_json
* remove unused helper methods in SqlType
* line breaks after multi-line pydocs, remove excess logs
* line breaks after multi-line pydocs, reduce diff of redundant changes
* reduce diff of redundant changes
* mandate ResultData in SeaResultSet constructor
* return empty JsonQueue in case of empty response (test ref: test_create_table_will_return_empty_result_set)
* remove string literals around SeaDatabricksClient declaration
* move conversion module into dedicated utils
* clean up _convert_decimal, introduce scale and precision as kwargs
* use stronger typing in convert_value (object instead of Any)
* make Manifest mandatory
* mandatory Manifest, clean up statement_id typing
* stronger typing for fetch*_json
* make description non-Optional, correct docstring, optimize col conversion
* fix type issues
* make description mandatory, not Optional
* n_valid_rows -> num_rows
* remove excess print statement
* remove empty bytes in SeaResultSet for arrow_schema_bytes
* move SeaResultSetQueueFactory and JsonQueue into separate SEA module
* move sea result set into backend/sea package
* improve docstrings
* correct docstrings, ProgrammingError -> ValueError
* let type of rows be List[List[str]] for clarity
* select Queue based on format in manifest
* make manifest mandatory
* stronger type checking in JSON helper functions in Sea Result Set
* assign empty array to data array if None
* stronger typing in JsonQueue

---------

Signed-off-by (this and each interim commit above): varun-edachali-dbx <[email protected]>
1 parent 45585d4 commit 70c7dc8

19 files changed: +1390 −281 lines
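For orientation, the pattern the updated example scripts below exercise is: run a query that returns a known number of rows, drain the result with a mix of fetchone, fetchmany, and fetchall, and compare the count against the requested total. A condensed sketch of that pattern (the helper name is made up for illustration; the fetch calls mirror the diffs below, and connection/cursor setup is elided):

def run_and_verify(cursor, query: str, requested_row_count: int) -> bool:
    """Execute a query and confirm all rows arrive via the DB-API fetch methods."""
    cursor.execute(query)

    # Drain the result set the same way the example tests do
    # (assumes requested_row_count >= 1, so fetchone() returns a row, not None)
    results = [cursor.fetchone()]          # one row
    results.extend(cursor.fetchmany(10))   # up to ten more
    results.extend(cursor.fetchall())      # everything that remains

    return len(results) == requested_row_count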

examples/experimental/tests/test_sea_async_query.py

Lines changed: 57 additions & 8 deletions
@@ -51,12 +51,20 @@ def test_sea_async_query_with_cloud_fetch():
         f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
     )
 
-    # Execute a simple query asynchronously
+    # Execute a query that generates large rows to force multiple chunks
+    requested_row_count = 5000
     cursor = connection.cursor()
+    query = f"""
+    SELECT
+        id,
+        concat('value_', repeat('a', 10000)) as test_value
+    FROM range(1, {requested_row_count} + 1) AS t(id)
+    """
+
     logger.info(
-        "Executing asynchronous query with cloud fetch: SELECT 1 as test_value"
+        f"Executing asynchronous query with cloud fetch to generate {requested_row_count} rows"
     )
-    cursor.execute_async("SELECT 1 as test_value")
+    cursor.execute_async(query)
     logger.info(
         "Asynchronous query submitted successfully with cloud fetch enabled"
     )
@@ -69,8 +77,25 @@ def test_sea_async_query_with_cloud_fetch():
 
     logger.info("Query is no longer pending, getting results...")
     cursor.get_async_execution_result()
+
+    results = [cursor.fetchone()]
+    results.extend(cursor.fetchmany(10))
+    results.extend(cursor.fetchall())
+    actual_row_count = len(results)
+
+    logger.info(
+        f"Requested {requested_row_count} rows, received {actual_row_count} rows"
+    )
+
+    # Verify total row count
+    if actual_row_count != requested_row_count:
+        logger.error(
+            f"FAIL: Row count mismatch. Expected {requested_row_count}, got {actual_row_count}"
+        )
+        return False
+
     logger.info(
-        "Successfully retrieved asynchronous query results with cloud fetch enabled"
+        "PASS: Received correct number of rows with cloud fetch and all fetch methods work correctly"
     )
 
     # Close resources
@@ -130,12 +155,20 @@ def test_sea_async_query_without_cloud_fetch():
         f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
     )
 
-    # Execute a simple query asynchronously
+    # For non-cloud fetch, use a smaller row count to avoid exceeding inline limits
+    requested_row_count = 100
     cursor = connection.cursor()
+    query = f"""
+    SELECT
+        id,
+        concat('value_', repeat('a', 100)) as test_value
+    FROM range(1, {requested_row_count} + 1) AS t(id)
+    """
+
     logger.info(
-        "Executing asynchronous query without cloud fetch: SELECT 1 as test_value"
+        f"Executing asynchronous query without cloud fetch to generate {requested_row_count} rows"
     )
-    cursor.execute_async("SELECT 1 as test_value")
+    cursor.execute_async(query)
     logger.info(
         "Asynchronous query submitted successfully with cloud fetch disabled"
     )
@@ -148,8 +181,24 @@ def test_sea_async_query_without_cloud_fetch():
 
     logger.info("Query is no longer pending, getting results...")
     cursor.get_async_execution_result()
+    results = [cursor.fetchone()]
+    results.extend(cursor.fetchmany(10))
+    results.extend(cursor.fetchall())
+    actual_row_count = len(results)
+
+    logger.info(
+        f"Requested {requested_row_count} rows, received {actual_row_count} rows"
+    )
+
+    # Verify total row count
+    if actual_row_count != requested_row_count:
+        logger.error(
+            f"FAIL: Row count mismatch. Expected {requested_row_count}, got {actual_row_count}"
+        )
+        return False
+
     logger.info(
-        "Successfully retrieved asynchronous query results with cloud fetch disabled"
+        "PASS: Received correct number of rows without cloud fetch and all fetch methods work correctly"
     )
 
     # Close resources

examples/experimental/tests/test_sea_sync_query.py

Lines changed: 28 additions & 9 deletions
@@ -49,13 +49,27 @@ def test_sea_sync_query_with_cloud_fetch():
         f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
     )
 
-    # Execute a simple query
+    # Execute a query that generates large rows to force multiple chunks
+    requested_row_count = 10000
     cursor = connection.cursor()
+    query = f"""
+    SELECT
+        id,
+        concat('value_', repeat('a', 10000)) as test_value
+    FROM range(1, {requested_row_count} + 1) AS t(id)
+    """
+
+    logger.info(
+        f"Executing synchronous query with cloud fetch to generate {requested_row_count} rows"
+    )
+    cursor.execute(query)
+    results = [cursor.fetchone()]
+    results.extend(cursor.fetchmany(10))
+    results.extend(cursor.fetchall())
+    actual_row_count = len(results)
     logger.info(
-        "Executing synchronous query with cloud fetch: SELECT 1 as test_value"
+        f"{actual_row_count} rows retrieved against {requested_row_count} requested"
     )
-    cursor.execute("SELECT 1 as test_value")
-    logger.info("Query executed successfully with cloud fetch enabled")
 
     # Close resources
     cursor.close()
@@ -114,13 +128,18 @@ def test_sea_sync_query_without_cloud_fetch():
         f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
     )
 
-    # Execute a simple query
+    # For non-cloud fetch, use a smaller row count to avoid exceeding inline limits
+    requested_row_count = 100
     cursor = connection.cursor()
-    logger.info(
-        "Executing synchronous query without cloud fetch: SELECT 1 as test_value"
+    logger.info("Executing synchronous query without cloud fetch: SELECT 100 rows")
+    cursor.execute(
+        "SELECT id, 'test_value_' || CAST(id as STRING) as test_value FROM range(1, 101)"
     )
-    cursor.execute("SELECT 1 as test_value")
-    logger.info("Query executed successfully with cloud fetch disabled")
+
+    results = [cursor.fetchone()]
+    results.extend(cursor.fetchmany(10))
+    results.extend(cursor.fetchall())
+    logger.info(f"{len(results)} rows retrieved against 100 requested")
 
     # Close resources
     cursor.close()
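Both scripts generate their data with range(1, N + 1), which yields exactly N rows, and pad each row with repeat() so the cloud-fetch variant produces large results while the inline variant stays within inline limits. A small sketch of that query builder (the helper name is hypothetical; concat, repeat, and range are the Spark SQL functions used above):

def build_row_generator_query(requested_row_count: int, pad_chars: int) -> str:
    # range(1, N + 1) emits ids 1..N, so the row count is deterministic;
    # pad_chars controls per-row size via repeat()
    return f"""
    SELECT
        id,
        concat('value_', repeat('a', {pad_chars})) as test_value
    FROM range(1, {requested_row_count} + 1) AS t(id)
    """

# e.g. large rows to exercise cloud fetch, small rows to stay inline:
cloud_fetch_query = build_row_generator_query(10000, 10000)
inline_query = build_row_generator_query(100, 100)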

src/databricks/sql/backend/sea/backend.py

Lines changed: 20 additions & 15 deletions
@@ -17,7 +17,7 @@
 
 if TYPE_CHECKING:
     from databricks.sql.client import Cursor
-    from databricks.sql.result_set import SeaResultSet
+    from databricks.sql.backend.sea.result_set import SeaResultSet
 
 from databricks.sql.backend.databricks_client import DatabricksClient
 from databricks.sql.backend.types import (
@@ -251,7 +251,7 @@ def close_session(self, session_id: SessionId) -> None:
         logger.debug("SeaDatabricksClient.close_session(session_id=%s)", session_id)
 
         if session_id.backend_type != BackendType.SEA:
-            raise ProgrammingError("Not a valid SEA session ID")
+            raise ValueError("Not a valid SEA session ID")
         sea_session_id = session_id.to_sea_session_id()
 
         request_data = DeleteSessionRequest(
@@ -290,7 +290,7 @@ def get_allowed_session_configurations() -> List[str]:
 
     def _extract_description_from_manifest(
         self, manifest: ResultManifest
-    ) -> Optional[List]:
+    ) -> List[Tuple]:
         """
         Extract column description from a manifest object, in the format defined by
         the spec: https://peps.python.org/pep-0249/#description
@@ -299,15 +299,12 @@ def _extract_description_from_manifest(
             manifest: The ResultManifest object containing schema information
 
         Returns:
-            Optional[List]: A list of column tuples or None if no columns are found
+            List[Tuple]: A list of column tuples
         """
 
         schema_data = manifest.schema
         columns_data = schema_data.get("columns", [])
 
-        if not columns_data:
-            return None
-
         columns = []
         for col_data in columns_data:
             # Format: (name, type_code, display_size, internal_size, precision, scale, null_ok)
@@ -323,7 +320,7 @@ def _extract_description_from_manifest(
                 )
             )
 
-        return columns if columns else None
+        return columns
 
     def _results_message_to_execute_response(
         self, response: GetStatementResponse
@@ -429,7 +426,7 @@ def execute_command(
         """
 
         if session_id.backend_type != BackendType.SEA:
-            raise ProgrammingError("Not a valid SEA session ID")
+            raise ValueError("Not a valid SEA session ID")
 
         sea_session_id = session_id.to_sea_session_id()
 
@@ -508,9 +505,11 @@ def cancel_command(self, command_id: CommandId) -> None:
         """
 
         if command_id.backend_type != BackendType.SEA:
-            raise ProgrammingError("Not a valid SEA command ID")
+            raise ValueError("Not a valid SEA command ID")
 
         sea_statement_id = command_id.to_sea_statement_id()
+        if sea_statement_id is None:
+            raise ValueError("Not a valid SEA command ID")
 
         request = CancelStatementRequest(statement_id=sea_statement_id)
         self.http_client._make_request(
@@ -531,9 +530,11 @@ def close_command(self, command_id: CommandId) -> None:
         """
 
         if command_id.backend_type != BackendType.SEA:
-            raise ProgrammingError("Not a valid SEA command ID")
+            raise ValueError("Not a valid SEA command ID")
 
         sea_statement_id = command_id.to_sea_statement_id()
+        if sea_statement_id is None:
+            raise ValueError("Not a valid SEA command ID")
 
         request = CloseStatementRequest(statement_id=sea_statement_id)
         self.http_client._make_request(
@@ -560,6 +561,8 @@ def get_query_state(self, command_id: CommandId) -> CommandState:
             raise ValueError("Not a valid SEA command ID")
 
         sea_statement_id = command_id.to_sea_statement_id()
+        if sea_statement_id is None:
+            raise ValueError("Not a valid SEA command ID")
 
         request = GetStatementRequest(statement_id=sea_statement_id)
         response_data = self.http_client._make_request(
@@ -592,9 +595,11 @@ def get_execution_result(
         """
 
         if command_id.backend_type != BackendType.SEA:
-            raise ProgrammingError("Not a valid SEA command ID")
+            raise ValueError("Not a valid SEA command ID")
 
         sea_statement_id = command_id.to_sea_statement_id()
+        if sea_statement_id is None:
+            raise ValueError("Not a valid SEA command ID")
 
         # Create the request model
         request = GetStatementRequest(statement_id=sea_statement_id)
@@ -608,18 +613,18 @@ def get_execution_result(
         response = GetStatementResponse.from_dict(response_data)
 
         # Create and return a SeaResultSet
-        from databricks.sql.result_set import SeaResultSet
+        from databricks.sql.backend.sea.result_set import SeaResultSet
 
         execute_response = self._results_message_to_execute_response(response)
 
         return SeaResultSet(
             connection=cursor.connection,
             execute_response=execute_response,
             sea_client=self,
-            buffer_size_bytes=cursor.buffer_size_bytes,
-            arraysize=cursor.arraysize,
             result_data=response.result,
             manifest=response.manifest,
+            buffer_size_bytes=cursor.buffer_size_bytes,
+            arraysize=cursor.arraysize,
         )
 
     # == Metadata Operations ==
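The backend changes above repeat one guard in cancel_command, close_command, get_query_state, and get_execution_result: reject IDs from other backends and reject a missing SEA statement ID, both with ValueError. Factored out as a standalone sketch (the helper function is illustrative, not part of the diff; the types come from databricks.sql.backend.types):

from databricks.sql.backend.types import BackendType, CommandId

def require_sea_statement_id(command_id: CommandId) -> str:
    # Same two checks the diff adds before every statement-level request
    if command_id.backend_type != BackendType.SEA:
        raise ValueError("Not a valid SEA command ID")
    sea_statement_id = command_id.to_sea_statement_id()
    if sea_statement_id is None:
        raise ValueError("Not a valid SEA command ID")
    return sea_statement_id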
New file (SEA result-set queue module)

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+from __future__ import annotations
+
+from abc import ABC
+from typing import List, Optional, Tuple
+
+from databricks.sql.backend.sea.backend import SeaDatabricksClient
+from databricks.sql.backend.sea.models.base import ResultData, ResultManifest
+from databricks.sql.backend.sea.utils.constants import ResultFormat
+from databricks.sql.exc import ProgrammingError
+from databricks.sql.utils import ResultSetQueue
+
+
+class SeaResultSetQueueFactory(ABC):
+    @staticmethod
+    def build_queue(
+        sea_result_data: ResultData,
+        manifest: ResultManifest,
+        statement_id: str,
+        description: List[Tuple] = [],
+        max_download_threads: Optional[int] = None,
+        sea_client: Optional[SeaDatabricksClient] = None,
+        lz4_compressed: bool = False,
+    ) -> ResultSetQueue:
+        """
+        Factory method to build a result set queue for SEA backend.
+
+        Args:
+            sea_result_data (ResultData): Result data from SEA response
+            manifest (ResultManifest): Manifest from SEA response
+            statement_id (str): Statement ID for the query
+            description (List[List[Any]]): Column descriptions
+            max_download_threads (int): Maximum number of download threads
+            sea_client (SeaDatabricksClient): SEA client for fetching additional links
+            lz4_compressed (bool): Whether the data is LZ4 compressed
+
+        Returns:
+            ResultSetQueue: The appropriate queue for the result data
+        """
+
+        if manifest.format == ResultFormat.JSON_ARRAY.value:
+            # INLINE disposition with JSON_ARRAY format
+            return JsonQueue(sea_result_data.data)
+        elif manifest.format == ResultFormat.ARROW_STREAM.value:
+            # EXTERNAL_LINKS disposition
+            raise NotImplementedError(
+                "EXTERNAL_LINKS disposition is not implemented for SEA backend"
+            )
+        raise ProgrammingError("Invalid result format")
+
+
+class JsonQueue(ResultSetQueue):
+    """Queue implementation for JSON_ARRAY format data."""
+
+    def __init__(self, data_array: Optional[List[List[str]]]):
+        """Initialize with JSON array data."""
+        self.data_array = data_array or []
+        self.cur_row_index = 0
+        self.num_rows = len(self.data_array)
+
+    def next_n_rows(self, num_rows: int) -> List[List[str]]:
+        """Get the next n rows from the data array."""
+        length = min(num_rows, self.num_rows - self.cur_row_index)
+        slice = self.data_array[self.cur_row_index : self.cur_row_index + length]
+        self.cur_row_index += length
+        return slice
+
+    def remaining_rows(self) -> List[List[str]]:
+        """Get all remaining rows from the data array."""
+        slice = self.data_array[self.cur_row_index :]
+        self.cur_row_index += len(slice)
+        return slice
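JsonQueue simply walks an in-memory list of string rows, which is what backs fetchone/fetchmany/fetchall for INLINE JSON_ARRAY results. A minimal usage illustration with made-up rows:

# Rows in SEA inline JSON_ARRAY results arrive as lists of strings
queue = JsonQueue([["1", "a"], ["2", "b"], ["3", "c"], ["4", "d"]])

assert queue.next_n_rows(1) == [["1", "a"]]               # backs fetchone()
assert queue.next_n_rows(2) == [["2", "b"], ["3", "c"]]   # backs fetchmany(2)
assert queue.remaining_rows() == [["4", "d"]]             # backs fetchall()
assert queue.next_n_rows(5) == []                         # exhausted: nothing left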
