Complete Fetch Phase (for INLINE disposition and JSON_ARRAY format) #594

Merged
merged 228 commits on Jul 2, 2025
Commits

138c2ae
[squash from exec-sea] bring over execution phase changes
varun-edachali-dbx Jun 9, 2025
3e3ab94
remove excess test
varun-edachali-dbx Jun 9, 2025
4a78165
add docstring
varun-edachali-dbx Jun 9, 2025
0dac4aa
remvoe exec func in sea backend
varun-edachali-dbx Jun 9, 2025
1b794c7
remove excess files
varun-edachali-dbx Jun 9, 2025
da5a6fe
remove excess models
varun-edachali-dbx Jun 9, 2025
686ade4
remove excess sea backend tests
varun-edachali-dbx Jun 9, 2025
31e6c83
cleanup
varun-edachali-dbx Jun 9, 2025
69ea238
re-introduce get_schema_desc
varun-edachali-dbx Jun 9, 2025
66d7517
remove SeaResultSet
varun-edachali-dbx Jun 9, 2025
71feef9
clean imports and attributes
varun-edachali-dbx Jun 9, 2025
ae9862f
pass CommandId to ExecResp
varun-edachali-dbx Jun 9, 2025
d8aa69e
remove changes in types
varun-edachali-dbx Jun 9, 2025
db139bc
add back essential types (ExecResponse, from_sea_state)
varun-edachali-dbx Jun 9, 2025
b977b12
fix fetch types
varun-edachali-dbx Jun 9, 2025
da615c0
excess imports
varun-edachali-dbx Jun 9, 2025
0da04a6
reduce diff by maintaining logs
varun-edachali-dbx Jun 9, 2025
ea9d456
fix int test types
varun-edachali-dbx Jun 9, 2025
8985c62
[squashed from exec-sea] init execution func
varun-edachali-dbx Jun 9, 2025
d9bcdbe
remove irrelevant changes
varun-edachali-dbx Jun 9, 2025
ee9fa1c
remove ResultSetFilter functionality
varun-edachali-dbx Jun 9, 2025
24c6152
remove more irrelevant changes
varun-edachali-dbx Jun 9, 2025
67fd101
remove more irrelevant changes
varun-edachali-dbx Jun 9, 2025
271fcaf
even more irrelevant changes
varun-edachali-dbx Jun 9, 2025
bf26ea3
remove sea response as init option
varun-edachali-dbx Jun 9, 2025
ed7cf91
exec test example scripts
varun-edachali-dbx Jun 9, 2025
dae15e3
formatting (black)
varun-edachali-dbx Jun 9, 2025
db5bbea
[squashed from sea-exec] merge sea stuffs
varun-edachali-dbx Jun 9, 2025
d5d3699
remove excess changes
varun-edachali-dbx Jun 9, 2025
6137a3d
remove excess removed docstring
varun-edachali-dbx Jun 9, 2025
75b0773
remove excess changes in backend
varun-edachali-dbx Jun 9, 2025
4494dcd
remove excess imports
varun-edachali-dbx Jun 9, 2025
4d0aeca
remove accidentally removed _get_schema_desc
varun-edachali-dbx Jun 9, 2025
7cece5e
remove unnecessary init with sea_response tests
varun-edachali-dbx Jun 9, 2025
8977c06
rmeove unnecessary changes
varun-edachali-dbx Jun 9, 2025
0216d7a
formatting (black)
varun-edachali-dbx Jun 9, 2025
d97463b
move guid_to_hex_id import to utils
varun-edachali-dbx Jun 9, 2025
139e246
reduce diff in guid utils import
varun-edachali-dbx Jun 9, 2025
4cb15fd
improved models and filters from cloudfetch-sea branch
varun-edachali-dbx Jun 9, 2025
e3ee4e4
move arrow_schema_bytes back into ExecuteResult
varun-edachali-dbx Jun 9, 2025
f448a8f
maintain log
varun-edachali-dbx Jun 9, 2025
82ca1ee
remove un-necessary assignment
varun-edachali-dbx Jun 9, 2025
e96a078
remove un-necessary tuple response
varun-edachali-dbx Jun 9, 2025
27158b1
remove un-ncessary verbose mocking
varun-edachali-dbx Jun 9, 2025
dee47f7
filters stuff (align with JDBC)
varun-edachali-dbx Jun 10, 2025
d3200c4
move Queue construction to ResultSert
varun-edachali-dbx Jun 10, 2025
8a014f0
move description to List[Tuple]
varun-edachali-dbx Jun 10, 2025
39c41ab
frmatting (black)
varun-edachali-dbx Jun 10, 2025
2cd04df
reduce diff (remove explicit tuple conversion)
varun-edachali-dbx Jun 10, 2025
067a019
remove has_more_rows from ExecuteResponse
varun-edachali-dbx Jun 10, 2025
48c83e0
remove un-necessary has_more_rows aclc
varun-edachali-dbx Jun 10, 2025
281a9e9
default has_more_rows to True
varun-edachali-dbx Jun 10, 2025
192901d
return has_more_rows from ExecResponse conversion during GetRespMetadata
varun-edachali-dbx Jun 10, 2025
55f5c45
remove unnecessary replacement
varun-edachali-dbx Jun 10, 2025
edc36b5
better mocked backend naming
varun-edachali-dbx Jun 10, 2025
81280e7
remove has_more_rows test in ExecuteResponse
varun-edachali-dbx Jun 10, 2025
c1d3be2
introduce replacement of original has_more_rows read test
varun-edachali-dbx Jun 10, 2025
5ee4136
call correct method in test_use_arrow_schema
varun-edachali-dbx Jun 10, 2025
b881ab0
call correct method in test_fall_back_to_hive_schema
varun-edachali-dbx Jun 10, 2025
53bf715
re-introduce result response read test
varun-edachali-dbx Jun 10, 2025
45a32be
simplify test
varun-edachali-dbx Jun 10, 2025
e3fe299
remove excess fetch_results mocks
varun-edachali-dbx Jun 10, 2025
e8038d3
more minimal changes to thrift_backend tests
varun-edachali-dbx Jun 10, 2025
2f6ec19
move back to old table types
varun-edachali-dbx Jun 10, 2025
73bc282
remove outdated arrow_schema_bytes return
varun-edachali-dbx Jun 10, 2025
e385d5b
backend from cloudfetch-sea
varun-edachali-dbx Jun 11, 2025
484064e
remove filtering, metadata ops
varun-edachali-dbx Jun 11, 2025
030edf8
raise NotImplementedErrror for metadata ops
varun-edachali-dbx Jun 11, 2025
4e07f1e
align SeaResultSet with new structure
varun-edachali-dbx Jun 11, 2025
65e7c6b
correct sea res set tests
varun-edachali-dbx Jun 11, 2025
30f8266
add metadata commands
varun-edachali-dbx Jun 11, 2025
033ae73
formatting (black)
varun-edachali-dbx Jun 11, 2025
33821f4
add metadata command unit tests
varun-edachali-dbx Jun 11, 2025
71b451a
minimal fetch phase intro
varun-edachali-dbx Jun 11, 2025
170f339
Merge branch 'exec-resp-norm' into fetch-json-inline
varun-edachali-dbx Jun 11, 2025
40f79b5
Merge branch 'sea-res-set' into fetch-json-inline
varun-edachali-dbx Jun 11, 2025
c038d5a
working JSON + INLINE
varun-edachali-dbx Jun 11, 2025
3e22c6c
change to valid table name
varun-edachali-dbx Jun 11, 2025
716304b
rmeove redundant queue init
varun-edachali-dbx Jun 11, 2025
e96e5b8
large query results
varun-edachali-dbx Jun 11, 2025
787f1f7
Merge branch 'sea-migration' into sea-test-scripts
varun-edachali-dbx Jun 11, 2025
165c4f3
remove un-necessary changes
varun-edachali-dbx Jun 11, 2025
a6e40d0
simplify test module
varun-edachali-dbx Jun 11, 2025
52e3088
logging -> debug level
varun-edachali-dbx Jun 11, 2025
641c09b
change table name in log
varun-edachali-dbx Jun 11, 2025
8bd12d8
Merge branch 'sea-migration' into exec-models-sea
varun-edachali-dbx Jun 11, 2025
ffded6e
remove un-necessary changes
varun-edachali-dbx Jun 11, 2025
227f6b3
remove un-necessary backend cahnges
varun-edachali-dbx Jun 11, 2025
68657a3
remove un-needed GetChunksResponse
varun-edachali-dbx Jun 11, 2025
3940eec
remove un-needed GetChunksResponse
varun-edachali-dbx Jun 11, 2025
37813ba
reduce code duplication in response parsing
varun-edachali-dbx Jun 11, 2025
267c9f4
reduce code duplication
varun-edachali-dbx Jun 11, 2025
2967119
more clear docstrings
varun-edachali-dbx Jun 11, 2025
47fd60d
introduce strongly typed ChunkInfo
varun-edachali-dbx Jun 11, 2025
982fdf2
remove is_volume_operation from response
varun-edachali-dbx Jun 12, 2025
9e14d48
add is_volume_op and more ResultData fields
varun-edachali-dbx Jun 12, 2025
be1997e
Merge branch 'exec-models-sea' into exec-phase-sea
varun-edachali-dbx Jun 12, 2025
e8e8ee7
Merge branch 'sea-test-scripts' into exec-phase-sea
varun-edachali-dbx Jun 12, 2025
05ee4e7
add test scripts
varun-edachali-dbx Jun 12, 2025
3ffa898
Merge branch 'exec-models-sea' into metadata-sea
varun-edachali-dbx Jun 12, 2025
2952d8d
Revert "Merge branch 'sea-migration' into exec-models-sea"
varun-edachali-dbx Jun 12, 2025
89e2aa0
Merge branch 'exec-phase-sea' into metadata-sea
varun-edachali-dbx Jun 12, 2025
cbace3f
Revert "Merge branch 'exec-models-sea' into exec-phase-sea"
varun-edachali-dbx Jun 12, 2025
c075b07
change logging level
varun-edachali-dbx Jun 12, 2025
c62f76d
remove un-necessary changes
varun-edachali-dbx Jun 12, 2025
199402e
remove excess changes
varun-edachali-dbx Jun 12, 2025
8ac574b
remove excess changes
varun-edachali-dbx Jun 12, 2025
398ca70
Merge branch 'sea-migration' into exec-phase-sea
varun-edachali-dbx Jun 12, 2025
b1acc5b
remove _get_schema_bytes (for now)
varun-edachali-dbx Jun 12, 2025
ef2a7ee
redundant comments
varun-edachali-dbx Jun 12, 2025
699942d
Merge branch 'sea-migration' into exec-phase-sea
varun-edachali-dbx Jun 12, 2025
af8f74e
remove fetch phase methods
varun-edachali-dbx Jun 12, 2025
5540c5c
reduce code repetititon + introduce gaps after multi line pydocs
varun-edachali-dbx Jun 12, 2025
efe3881
remove unused imports
varun-edachali-dbx Jun 12, 2025
36ab59b
move description extraction to helper func
varun-edachali-dbx Jun 12, 2025
1d57c99
formatting (black)
varun-edachali-dbx Jun 12, 2025
df6dac2
add more unit tests
varun-edachali-dbx Jun 12, 2025
ad0e527
streamline unit tests
varun-edachali-dbx Jun 12, 2025
ed446a0
test getting the list of allowed configurations
varun-edachali-dbx Jun 12, 2025
38e4b5c
reduce diff
varun-edachali-dbx Jun 12, 2025
94879c0
reduce diff
varun-edachali-dbx Jun 12, 2025
1809956
house constants in enums for readability and immutability
varun-edachali-dbx Jun 13, 2025
da5260c
add note on hybrid disposition
varun-edachali-dbx Jun 13, 2025
0385ffb
remove redundant note on arrow_schema_bytes
varun-edachali-dbx Jun 16, 2025
23963fc
align SeaResultSet with ext-links-sea
varun-edachali-dbx Jun 16, 2025
dd43715
remove redundant methods
varun-edachali-dbx Jun 16, 2025
34a7f66
update unit tests
varun-edachali-dbx Jun 16, 2025
715cc13
remove accidental venv changes
varun-edachali-dbx Jun 16, 2025
a0705bc
add fetchmany_arrow and fetchall_arrow
varun-edachali-dbx Jun 17, 2025
1b90c4a
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 17, 2025
f7c11b9
remove accidental changes in sea backend tests
varun-edachali-dbx Jun 17, 2025
349c021
Merge branch 'exec-phase-sea' into metadata-sea
varun-edachali-dbx Jun 17, 2025
6229848
remove irrelevant changes
varun-edachali-dbx Jun 17, 2025
fd52356
remove un-necessary test changes
varun-edachali-dbx Jun 17, 2025
64e58b0
remove un-necessary changes in thrift backend tests
varun-edachali-dbx Jun 17, 2025
2903473
remove unimplemented methods test
varun-edachali-dbx Jun 17, 2025
b300709
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 17, 2025
021ff4c
remove unimplemented method tests
varun-edachali-dbx Jun 17, 2025
adecd53
modify example scripts to include fetch calls
varun-edachali-dbx Jun 17, 2025
bfc1f01
fix sea connector tests
varun-edachali-dbx Jun 17, 2025
0a2cdfd
remove unimplemented methods test
varun-edachali-dbx Jun 17, 2025
90bb09c
Merge branch 'sea-migration' into exec-phase-sea
varun-edachali-dbx Jun 17, 2025
cd22389
remove invalid import
varun-edachali-dbx Jun 17, 2025
82e0f8b
Merge branch 'sea-migration' into exec-phase-sea
varun-edachali-dbx Jun 17, 2025
e64b81b
Merge branch 'exec-phase-sea' into metadata-sea
varun-edachali-dbx Jun 17, 2025
27564ca
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 17, 2025
5ab9bbe
better align queries with JDBC impl
varun-edachali-dbx Jun 18, 2025
1ab6e87
line breaks after multi-line PRs
varun-edachali-dbx Jun 18, 2025
f469c24
remove unused imports
varun-edachali-dbx Jun 18, 2025
68ec65f
fix: introduce ExecuteResponse import
varun-edachali-dbx Jun 18, 2025
ffd478e
Merge branch 'sea-migration' into metadata-sea
varun-edachali-dbx Jun 18, 2025
f6d873d
remove unimplemented metadata methods test, un-necessary imports
varun-edachali-dbx Jun 18, 2025
28675f5
introduce unit tests for metadata methods
varun-edachali-dbx Jun 18, 2025
3578659
remove verbosity in ResultSetFilter docstring
varun-edachali-dbx Jun 20, 2025
8713023
remove un-necessary info in ResultSetFilter docstring
varun-edachali-dbx Jun 20, 2025
22dc252
remove explicit type checking, string literals around forward annotat…
varun-edachali-dbx Jun 20, 2025
390f592
house SQL commands in constants
varun-edachali-dbx Jun 20, 2025
28308fe
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 23, 2025
2712d1c
introduce unit tests for altered functionality
varun-edachali-dbx Jun 23, 2025
984e8ee
remove unused imports
varun-edachali-dbx Jun 23, 2025
0ce144d
remove unused imports
varun-edachali-dbx Jun 23, 2025
50cc1e2
run small queries with SEA during integration tests
varun-edachali-dbx Jun 24, 2025
242307a
run some tests for sea
varun-edachali-dbx Jun 24, 2025
35f1ef0
remove catalog requirement in get_tables
varun-edachali-dbx Jun 26, 2025
a515d26
move filters.py to SEA utils
varun-edachali-dbx Jun 26, 2025
59b1330
ensure SeaResultSet
varun-edachali-dbx Jun 26, 2025
293e356
Merge branch 'sea-migration' into metadata-sea
varun-edachali-dbx Jun 26, 2025
dd40beb
prevent circular imports
varun-edachali-dbx Jun 26, 2025
14057ac
remove unused imports
varun-edachali-dbx Jun 26, 2025
a4d5bdb
remove cast, throw error if not SeaResultSet
varun-edachali-dbx Jun 26, 2025
156421a
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 26, 2025
eb1a9b4
pass param as TSparkParameterValue
varun-edachali-dbx Jun 26, 2025
e9b1314
make SEA backend methods return SeaResultSet
varun-edachali-dbx Jun 26, 2025
8ede414
use spec-aligned Exceptions in SEA backend
varun-edachali-dbx Jun 26, 2025
09a1b11
remove defensive row type check
varun-edachali-dbx Jun 26, 2025
5e01e7b
Merge branch 'metadata-sea' into fetch-json-inline
varun-edachali-dbx Jun 26, 2025
21c389d
introduce type conversion for primitive types for JSON + INLINE
varun-edachali-dbx Jun 27, 2025
734321a
Merge branch 'sea-migration' into fetch-json-inline
varun-edachali-dbx Jun 27, 2025
9f0f969
remove SEA running on metadata queries (known failures
varun-edachali-dbx Jun 27, 2025
04a1936
remove un-necessary docstrings
varun-edachali-dbx Jun 27, 2025
278b8cd
align expected types with databricks sdk
varun-edachali-dbx Jun 27, 2025
91b7f7f
link rest api reference to validate types
varun-edachali-dbx Jun 27, 2025
7a5ae13
remove test_catalogs_returns_arrow_table test
varun-edachali-dbx Jun 27, 2025
f1776f3
fix fetchall_arrow and fetchmany_arrow
varun-edachali-dbx Jun 27, 2025
6143331
remove thrift aligned test_cancel_during_execute from SEA tests
varun-edachali-dbx Jun 27, 2025
8949d0c
Merge branch 'sea-migration' into fetch-json-inline
varun-edachali-dbx Jun 27, 2025
5eaded4
remove un-necessary changes in example scripts
varun-edachali-dbx Jun 27, 2025
eeed9a1
remove un-necessary chagnes in example scripts
varun-edachali-dbx Jun 27, 2025
f233886
_convert_json_table -> _create_json_table
varun-edachali-dbx Jun 27, 2025
68ac437
remove accidentally removed test
varun-edachali-dbx Jun 27, 2025
7fd0845
remove new unit tests (to be re-added based on new arch)
varun-edachali-dbx Jun 27, 2025
ea7ff73
remove changes in sea_result_set functionality (to be re-added)
varun-edachali-dbx Jun 27, 2025
563da71
introduce more integration tests
varun-edachali-dbx Jun 27, 2025
a018273
remove SEA tests in parameterized queries
varun-edachali-dbx Jun 27, 2025
c0e98f4
remove partial parameter fix changes
varun-edachali-dbx Jun 27, 2025
7343035
remove un-necessary timestamp tests
varun-edachali-dbx Jun 27, 2025
ec500b6
slightly stronger typing of _convert_json_types
varun-edachali-dbx Jun 27, 2025
0b3e91d
stronger typing of json utility func s
varun-edachali-dbx Jun 27, 2025
7664e44
stronger typing of fetch*_json
varun-edachali-dbx Jun 27, 2025
db7b8e5
remove unused helper methods in SqlType
varun-edachali-dbx Jun 27, 2025
f75f2b5
line breaks after multi line pydocs, remove excess logs
varun-edachali-dbx Jun 27, 2025
e2d4ef5
line breaks after multi line pydocs, reduce diff of redundant changes
varun-edachali-dbx Jun 27, 2025
21e3078
reduce diff of redundant changes
varun-edachali-dbx Jun 27, 2025
bb015e6
mandate ResultData in SeaResultSet constructor
varun-edachali-dbx Jun 27, 2025
bb948a0
return empty JsonQueue in case of empty response
varun-edachali-dbx Jun 27, 2025
921a8c1
remove string literals around SeaDatabricksClient declaration
varun-edachali-dbx Jun 27, 2025
cc5203d
move conversion module into dedicated utils
varun-edachali-dbx Jul 1, 2025
cc86832
clean up _convert_decimal, introduce scale and precision as kwargs
varun-edachali-dbx Jul 1, 2025
3f1fd93
use stronger typing in convert_value (object instead of Any)
varun-edachali-dbx Jul 1, 2025
0bdf8f9
make Manifest mandatory
varun-edachali-dbx Jul 1, 2025
28b4d7b
mandatory Manifest, clean up statement_id typing
varun-edachali-dbx Jul 1, 2025
245aa77
stronger typing for fetch*_json
varun-edachali-dbx Jul 1, 2025
7d21ad1
make description non Optional, correct docstring, optimize col conver…
varun-edachali-dbx Jul 1, 2025
4d10dcc
fix type issues
varun-edachali-dbx Jul 1, 2025
14c5625
make description mandatory, not Optional
varun-edachali-dbx Jul 1, 2025
31a0e52
n_valid_rows -> num_rows
varun-edachali-dbx Jul 1, 2025
e86e755
remove excess print statement
varun-edachali-dbx Jul 1, 2025
7035098
remove empty bytes in SeaResultSet for arrow_schema_bytes
varun-edachali-dbx Jul 1, 2025
4566cb1
move SeaResultSetQueueFactory and JsonQueue into separate SEA module
varun-edachali-dbx Jul 1, 2025
72a5cd3
move sea result set into backend/sea package
varun-edachali-dbx Jul 1, 2025
511c449
improve docstrings
varun-edachali-dbx Jul 1, 2025
c21ff5e
correct docstrings, ProgrammingError -> ValueError
varun-edachali-dbx Jul 1, 2025
72100b9
let type of rows by List[List[str]] for clarity
varun-edachali-dbx Jul 1, 2025
aab33a1
select Queue based on format in manifest
varun-edachali-dbx Jul 2, 2025
9a6db30
make manifest mandatory
varun-edachali-dbx Jul 2, 2025
0135d33
stronger type checking in JSON helper functions in Sea Result Set
varun-edachali-dbx Jul 2, 2025
cc9db8b
assign empty array to data array if None
varun-edachali-dbx Jul 2, 2025
bb135fc
stronger typing in JsonQueue
varun-edachali-dbx Jul 2, 2025

Files changed

65 changes: 57 additions & 8 deletions examples/experimental/tests/test_sea_async_query.py
@@ -51,12 +51,20 @@ def test_sea_async_query_with_cloud_fetch():
f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
)

# Execute a simple query asynchronously
# Execute a query that generates large rows to force multiple chunks
requested_row_count = 5000
cursor = connection.cursor()
query = f"""
SELECT
id,
concat('value_', repeat('a', 10000)) as test_value
FROM range(1, {requested_row_count} + 1) AS t(id)
"""

logger.info(
"Executing asynchronous query with cloud fetch: SELECT 1 as test_value"
f"Executing asynchronous query with cloud fetch to generate {requested_row_count} rows"
)
cursor.execute_async("SELECT 1 as test_value")
cursor.execute_async(query)
logger.info(
"Asynchronous query submitted successfully with cloud fetch enabled"
)
@@ -69,8 +77,25 @@ def test_sea_async_query_with_cloud_fetch():

logger.info("Query is no longer pending, getting results...")
cursor.get_async_execution_result()

results = [cursor.fetchone()]
results.extend(cursor.fetchmany(10))
results.extend(cursor.fetchall())
actual_row_count = len(results)

logger.info(
f"Requested {requested_row_count} rows, received {actual_row_count} rows"
)

# Verify total row count
if actual_row_count != requested_row_count:
logger.error(
f"FAIL: Row count mismatch. Expected {requested_row_count}, got {actual_row_count}"
)
return False

logger.info(
"Successfully retrieved asynchronous query results with cloud fetch enabled"
"PASS: Received correct number of rows with cloud fetch and all fetch methods work correctly"
)

# Close resources
@@ -130,12 +155,20 @@ def test_sea_async_query_without_cloud_fetch():
f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
)

# Execute a simple query asynchronously
# For non-cloud fetch, use a smaller row count to avoid exceeding inline limits
requested_row_count = 100
cursor = connection.cursor()
query = f"""
SELECT
id,
concat('value_', repeat('a', 100)) as test_value
FROM range(1, {requested_row_count} + 1) AS t(id)
"""

logger.info(
"Executing asynchronous query without cloud fetch: SELECT 1 as test_value"
f"Executing asynchronous query without cloud fetch to generate {requested_row_count} rows"
)
cursor.execute_async("SELECT 1 as test_value")
cursor.execute_async(query)
logger.info(
"Asynchronous query submitted successfully with cloud fetch disabled"
)
@@ -148,8 +181,24 @@ def test_sea_async_query_without_cloud_fetch():

logger.info("Query is no longer pending, getting results...")
cursor.get_async_execution_result()
results = [cursor.fetchone()]
results.extend(cursor.fetchmany(10))
results.extend(cursor.fetchall())
actual_row_count = len(results)

logger.info(
f"Requested {requested_row_count} rows, received {actual_row_count} rows"
)

# Verify total row count
if actual_row_count != requested_row_count:
logger.error(
f"FAIL: Row count mismatch. Expected {requested_row_count}, got {actual_row_count}"
)
return False

logger.info(
"Successfully retrieved asynchronous query results with cloud fetch disabled"
"PASS: Received correct number of rows without cloud fetch and all fetch methods work correctly"
)

# Close resources
37 changes: 28 additions & 9 deletions examples/experimental/tests/test_sea_sync_query.py
@@ -49,13 +49,27 @@ def test_sea_sync_query_with_cloud_fetch():
f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
)

# Execute a simple query
# Execute a query that generates large rows to force multiple chunks
requested_row_count = 10000
cursor = connection.cursor()
query = f"""
SELECT
id,
concat('value_', repeat('a', 10000)) as test_value
FROM range(1, {requested_row_count} + 1) AS t(id)
"""

logger.info(
f"Executing synchronous query with cloud fetch to generate {requested_row_count} rows"
)
cursor.execute(query)
results = [cursor.fetchone()]
results.extend(cursor.fetchmany(10))
results.extend(cursor.fetchall())
actual_row_count = len(results)
logger.info(
"Executing synchronous query with cloud fetch: SELECT 1 as test_value"
f"{actual_row_count} rows retrieved against {requested_row_count} requested"
)
cursor.execute("SELECT 1 as test_value")
logger.info("Query executed successfully with cloud fetch enabled")

# Close resources
cursor.close()
@@ -114,13 +128,18 @@ def test_sea_sync_query_without_cloud_fetch():
f"Successfully opened SEA session with ID: {connection.get_session_id_hex()}"
)

# Execute a simple query
# For non-cloud fetch, use a smaller row count to avoid exceeding inline limits
requested_row_count = 100
cursor = connection.cursor()
logger.info(
"Executing synchronous query without cloud fetch: SELECT 1 as test_value"
logger.info("Executing synchronous query without cloud fetch: SELECT 100 rows")
cursor.execute(
"SELECT id, 'test_value_' || CAST(id as STRING) as test_value FROM range(1, 101)"
)
cursor.execute("SELECT 1 as test_value")
logger.info("Query executed successfully with cloud fetch disabled")

results = [cursor.fetchone()]
results.extend(cursor.fetchmany(10))
results.extend(cursor.fetchall())
logger.info(f"{len(results)} rows retrieved against 100 requested")

# Close resources
cursor.close()
35 changes: 20 additions & 15 deletions src/databricks/sql/backend/sea/backend.py
@@ -17,7 +17,7 @@

if TYPE_CHECKING:
from databricks.sql.client import Cursor
from databricks.sql.result_set import SeaResultSet
from databricks.sql.backend.sea.result_set import SeaResultSet

from databricks.sql.backend.databricks_client import DatabricksClient
from databricks.sql.backend.types import (
@@ -251,7 +251,7 @@ def close_session(self, session_id: SessionId) -> None:
logger.debug("SeaDatabricksClient.close_session(session_id=%s)", session_id)

if session_id.backend_type != BackendType.SEA:
raise ProgrammingError("Not a valid SEA session ID")
raise ValueError("Not a valid SEA session ID")
sea_session_id = session_id.to_sea_session_id()

request_data = DeleteSessionRequest(
@@ -290,7 +290,7 @@ def get_allowed_session_configurations() -> List[str]:

def _extract_description_from_manifest(
self, manifest: ResultManifest
) -> Optional[List]:
) -> List[Tuple]:
"""
Extract column description from a manifest object, in the format defined by
the spec: https://peps.python.org/pep-0249/#description
@@ -299,15 +299,12 @@ def _extract_description_from_manifest(
manifest: The ResultManifest object containing schema information

Returns:
Optional[List]: A list of column tuples or None if no columns are found
List[Tuple]: A list of column tuples
"""

schema_data = manifest.schema
columns_data = schema_data.get("columns", [])

if not columns_data:
return None

columns = []
for col_data in columns_data:
# Format: (name, type_code, display_size, internal_size, precision, scale, null_ok)
@@ -323,7 +320,7 @@
)
)

return columns if columns else None
return columns

def _results_message_to_execute_response(
self, response: GetStatementResponse
@@ -429,7 +426,7 @@ def execute_command(
"""

if session_id.backend_type != BackendType.SEA:
raise ProgrammingError("Not a valid SEA session ID")
raise ValueError("Not a valid SEA session ID")

sea_session_id = session_id.to_sea_session_id()

@@ -508,9 +505,11 @@ def cancel_command(self, command_id: CommandId) -> None:
"""

if command_id.backend_type != BackendType.SEA:
raise ProgrammingError("Not a valid SEA command ID")
raise ValueError("Not a valid SEA command ID")

sea_statement_id = command_id.to_sea_statement_id()
if sea_statement_id is None:
raise ValueError("Not a valid SEA command ID")

request = CancelStatementRequest(statement_id=sea_statement_id)
self.http_client._make_request(
@@ -531,9 +530,11 @@ def close_command(self, command_id: CommandId) -> None:
"""

if command_id.backend_type != BackendType.SEA:
raise ProgrammingError("Not a valid SEA command ID")
raise ValueError("Not a valid SEA command ID")

sea_statement_id = command_id.to_sea_statement_id()
if sea_statement_id is None:
raise ValueError("Not a valid SEA command ID")

request = CloseStatementRequest(statement_id=sea_statement_id)
self.http_client._make_request(
@@ -560,6 +561,8 @@ def get_query_state(self, command_id: CommandId) -> CommandState:
raise ValueError("Not a valid SEA command ID")

sea_statement_id = command_id.to_sea_statement_id()
if sea_statement_id is None:
raise ValueError("Not a valid SEA command ID")

request = GetStatementRequest(statement_id=sea_statement_id)
response_data = self.http_client._make_request(
@@ -592,9 +595,11 @@ def get_execution_result(
"""

if command_id.backend_type != BackendType.SEA:
raise ProgrammingError("Not a valid SEA command ID")
raise ValueError("Not a valid SEA command ID")

sea_statement_id = command_id.to_sea_statement_id()
if sea_statement_id is None:
raise ValueError("Not a valid SEA command ID")

# Create the request model
request = GetStatementRequest(statement_id=sea_statement_id)
@@ -608,18 +613,18 @@ def get_execution_result(
response = GetStatementResponse.from_dict(response_data)

# Create and return a SeaResultSet
from databricks.sql.result_set import SeaResultSet
from databricks.sql.backend.sea.result_set import SeaResultSet

execute_response = self._results_message_to_execute_response(response)

return SeaResultSet(
connection=cursor.connection,
execute_response=execute_response,
sea_client=self,
buffer_size_bytes=cursor.buffer_size_bytes,
arraysize=cursor.arraysize,
result_data=response.result,
manifest=response.manifest,
buffer_size_bytes=cursor.buffer_size_bytes,
arraysize=cursor.arraysize,
)

# == Metadata Operations ==
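
For reference, a minimal illustrative sketch (not part of the diff) of the cursor description shape that the updated _extract_description_from_manifest always returns, following the PEP-249 convention noted in the code above. The column names and type codes below are hypothetical:

# Hypothetical two-column result; the helper now returns a (possibly empty)
# list of these tuples instead of Optional[List].
description = [
    # (name, type_code, display_size, internal_size, precision, scale, null_ok)
    ("id", "int", None, None, None, None, None),
    ("test_value", "string", None, None, None, None, None),
]

column_names = [col[0] for col in description]  # ["id", "test_value"]
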
71 changes: 71 additions & 0 deletions src/databricks/sql/backend/sea/queue.py
@@ -0,0 +1,71 @@
from __future__ import annotations

from abc import ABC
from typing import List, Optional, Tuple

from databricks.sql.backend.sea.backend import SeaDatabricksClient
from databricks.sql.backend.sea.models.base import ResultData, ResultManifest
from databricks.sql.backend.sea.utils.constants import ResultFormat
from databricks.sql.exc import ProgrammingError
from databricks.sql.utils import ResultSetQueue


class SeaResultSetQueueFactory(ABC):
@staticmethod
def build_queue(
sea_result_data: ResultData,
manifest: ResultManifest,
statement_id: str,
description: List[Tuple] = [],
max_download_threads: Optional[int] = None,
sea_client: Optional[SeaDatabricksClient] = None,
lz4_compressed: bool = False,
) -> ResultSetQueue:
"""
Factory method to build a result set queue for SEA backend.

Args:
sea_result_data (ResultData): Result data from SEA response
manifest (ResultManifest): Manifest from SEA response
statement_id (str): Statement ID for the query
description (List[List[Any]]): Column descriptions
max_download_threads (int): Maximum number of download threads
sea_client (SeaDatabricksClient): SEA client for fetching additional links
lz4_compressed (bool): Whether the data is LZ4 compressed

Returns:
ResultSetQueue: The appropriate queue for the result data
"""

if manifest.format == ResultFormat.JSON_ARRAY.value:
# INLINE disposition with JSON_ARRAY format
return JsonQueue(sea_result_data.data)
elif manifest.format == ResultFormat.ARROW_STREAM.value:
# EXTERNAL_LINKS disposition
raise NotImplementedError(
"EXTERNAL_LINKS disposition is not implemented for SEA backend"
)
raise ProgrammingError("Invalid result format")


class JsonQueue(ResultSetQueue):
"""Queue implementation for JSON_ARRAY format data."""

def __init__(self, data_array: Optional[List[List[str]]]):
"""Initialize with JSON array data."""
self.data_array = data_array or []
self.cur_row_index = 0
self.num_rows = len(self.data_array)

def next_n_rows(self, num_rows: int) -> List[List[str]]:
"""Get the next n rows from the data array."""
length = min(num_rows, self.num_rows - self.cur_row_index)
slice = self.data_array[self.cur_row_index : self.cur_row_index + length]
self.cur_row_index += length
return slice

def remaining_rows(self) -> List[List[str]]:
"""Get all remaining rows from the data array."""
slice = self.data_array[self.cur_row_index :]
self.cur_row_index += len(slice)
return slice
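
As context for the new module above, a minimal usage sketch (assumed, not taken from the PR) of JsonQueue, which SeaResultSetQueueFactory.build_queue selects when the manifest format is JSON_ARRAY. Inline rows arrive as lists of strings and are converted to Python types later by the result set:

from databricks.sql.backend.sea.queue import JsonQueue

# Inline JSON_ARRAY rows as the SEA backend returns them: string-valued cells.
queue = JsonQueue([["1", "value_a"], ["2", "value_b"], ["3", "value_c"]])

first_two = queue.next_n_rows(2)   # [["1", "value_a"], ["2", "value_b"]]
rest = queue.remaining_rows()      # [["3", "value_c"]]

assert queue.num_rows == 3
assert queue.cur_row_index == 3
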