@Jeadie Jeadie commented Oct 18, 2025

πŸ“ Summary

Before: Loaded 0 rows for dataset financebench.data in 6m 15s 986ms.
After:  Loaded 0 rows for dataset financebench.data in 3s 37ms.

🔗 Related

🚨 Breaking Changes

📚 Docs

👀 Notes for Reviewers

github-actions bot commented Oct 18, 2025

✅ Pull with Spice Passed

Passing checks:

  • ✅ Title meets minimum length requirement (10 characters)
  • ✅ Has at least one of the required labels: kind/refactor, kind/bug, kind/enhancement, kind/documentation, kind/optimization, kind/dependencies, kind/endgame, kind/task, kind/performance
  • ✅ No banned labels detected
  • ✅ Has at least one assignee: Jeadie

@Jeadie Jeadie marked this pull request as ready for review October 19, 2025 23:59
@Jeadie Jeadie requested a review from a team as a code owner October 19, 2025 23:59
Copilot AI review requested due to automatic review settings October 19, 2025 23:59
Copilot AI left a comment

Pull Request Overview

This PR introduces a physical query optimization for hash joins with empty inputs and integrates it into the VectorScanTableProvider. In the example from the PR summary, the reported load time for the financebench.data dataset drops from 6m 15s 986ms to 3s 37ms.

  • Added EmptyHashJoinExecPhysicalOptimization to detect and optimize hash joins with empty inputs
  • Moved cache invalidation logic to a new datafusion-optimizer-rules crate for better organization
  • Enhanced VectorScanTableProvider with filter pushdown support and join optimization

Reviewed Changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/datafusion-optimizer-rules/src/physical_plan/hash_join_optimization.rs | New physical optimizer rule for empty hash joins |
| crates/runtime/src/embeddings/index/scan_table.rs | Added filter pushdown and join optimization support |
| crates/runtime/src/datafusion/builder.rs | Registered the new physical optimizer rule |
| crates/datafusion-optimizer-rules/src/logical_plan/cache_invalidation.rs | Moved cache invalidation logic to new crate |
| crates/runtime/src/datafusion/extension/mod.rs | Moved tests from cache invalidation module |

phillipleblanc previously approved these changes Oct 20, 2025
Copilot AI review requested due to automatic review settings October 20, 2025 04:29
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated 1 comment.


@phillipleblanc phillipleblanc added this pull request to the merge queue Oct 20, 2025
Merged via the queue into trunk with commit 631a8d8 Oct 20, 2025
147 of 150 checks passed
@phillipleblanc phillipleblanc deleted the jeadie/25-10-18/vector-join branch October 20, 2025 12:24
lukekim pushed a commit that referenced this pull request Oct 20, 2025
…ider (#7587)

* EmptyHashJoinExecPhysicalOptimization, and use in VectorScanTableProvider

* 'datafusion-optimizer-rules' crate

* move more to datafusion-optimizer-rules

* clppy

* testing

* snapshots

* PR comments

* update snapshots
github-merge-queue bot pushed a commit that referenced this pull request Oct 21, 2025
* Initial Pepper data accelerator
This renames Vortex to Pepper

* Fixes

* Update crates/pepper/README.md

* Update crates/runtime/src/component/dataset/acceleration.rs

Co-authored-by: Copilot <[email protected]>

* Update name

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/pepper/src/lib.rs

Co-authored-by: Copilot <[email protected]>

* Improvements

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/runtime/src/dataaccelerator/pepper.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/pepper/README.md

Co-authored-by: Copilot <[email protected]>

* Update crates/runtime/src/dataaccelerator/mod.rs

Co-authored-by: Copilot <[email protected]>

* Update crates/pepper/README.md

Co-authored-by: Copilot <[email protected]>

* Apply suggestions from code review

Co-authored-by: Phillip LeBlanc <[email protected]>

* fix score order for one test case (#7595)

* `ObjectMeta` filter pushdown for `ObjectStoreTextTable` (#7572)

* setup code for document table filtering

* pushdown ObjectMeta filters to ObjectStoreTextTable

* fix filter

* Prefetch ObjectMeta to improve execution plan statistics

* PR comment refactors

* unit tests

* bad merge

* clippy

* clppy

* Return `TableProvider` from `CandidateGeneration::search`.  (#7559)

* Remove 'SearchIndex::metadata_columns'

* add non-filterable metadata to FTS index

* integration tests

* clppy

* clppy

* clppy

* clppy

* clppy

* compiles

* clppy

* clppy

* working

* docs etc

* revert

* fix match projection; nan in scores

* clppy

* fmt

* snapshots

* clppy

* multi-thread some tokio tests

---------

Co-authored-by: Luke Kim <[email protected]>

* EmptyHashJoinExecPhysicalOptimization, and use in VectorScanTableProvider (#7587)

* EmptyHashJoinExecPhysicalOptimization, and use in VectorScanTableProvider

* 'datafusion-optimizer-rules' crate

* move more to datafusion-optimizer-rules

* clppy

* testing

* snapshots

* PR comments

* update snapshots

* Update official Docker builds to use release binaries (#7597)

* Update official Docker builds to use release binaries

* update endgame

* fix docker builds

* Fix cuda build

* New Generate Changelog workflow (#7562)

* New Generate Changelog workflow

* Set default versions for reference

* Improvements for changelog

* Add comments

* Update scripts/generate_changelog.py

Co-authored-by: Copilot <[email protected]>

* Improvements to make it more reliable

* Update scripts/generate_changelog.py

Co-authored-by: Copilot <[email protected]>

* remove old changelog generator

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: Phillip LeBlanc <[email protected]>

* BytesProcessedExec to allow optimizer to do limit pushdown (#7539)

* fix limit pushdown for children of bytesprocessedexec

* accept: limits push down lower after bytesprocessed allows passthrough

---------

Co-authored-by: Luke Kim <[email protected]>
Co-authored-by: Phillip LeBlanc <[email protected]>

* GitHub Data Connector add Projects, improve rate-limiting and error handling (#7547)

* Add better graphql validation

* WIP

* Formatting

* Fix

* Fix query

* Add validation for GitHub API

* More error handling improvements

* Revert "Merge branch 'trunk' into lukim/github-data-connector"

This reverts commit 8982710, reversing
changes made to 7b8f8bc.

* Fix issues

* Updares

* Consolidate

* Add Debug

* Bump golang.org/x/mod from 0.28.0 to 0.29.0 (#7530)

Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.28.0 to 0.29.0.
- [Commits](golang/mod@v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-version: 0.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Luke Kim <[email protected]>

* Hive-style partitioning for DuckDB file mode (#7563)

* more advanced partition_by config

* add tests

* wip

* rename name

* add PartitionedBy to vector

* use new PartitionedBy in partition_by_expressions

* modify DataAccelerator::create_external_table

* use PartitionedBy more

* create hive style files

* discover hive style partitions

* discover hive partitions for duckdb

* remove unwraps

* fix clippy lints

* more clippy lints

* Update crates/spicepod/src/partitioning.rs

Co-authored-by: Copilot <[email protected]>

* fix spicepod tests

---------

Co-authored-by: Copilot <[email protected]>

* Vortex Data Accelerator (Dev grade) (#7566)

* Vortex Data Accelerator

* Fix imports

* Add back feature

* fix vortex

* fix build

* Remove memory

* Fix tests

* Add tests

* Update tests

* Fixes

* Update

* Fix

* Update tests

* Use StreamTable instead of ListingTable

* Update tests

* Use async buffered writes

* Update tests

* Works!

* Perf improvements

* Fix

* Add check for partition_by

* Fix memory leak

* fix lint issues

* fmt

* Improve benchmark tests

* fix lint

* Fix duplicate code.

* vendor vortex-datafusion

* fix

* finally clean lint

* Don't create dummy file, just specify the schema

* fix lint

* Property integrate vendored vortex-datafusion

* WIP

* Fix tests

* remove custom writing code

* fix lint

* Update crates/vortex-datafusion/src/persistent/opener.rs

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Phillip LeBlanc <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Only load eval scorers when eval defined (#7549)

* Only load eval scorers when eval defined

* Reinstate eval verification in async workflow

* Bump octocrab from 0.45.0 to 0.47.0 (#7531)

Bumps [octocrab](https://github.com/XAMPPRocky/octocrab) from 0.45.0 to 0.47.0.
- [Release notes](https://github.com/XAMPPRocky/octocrab/releases)
- [Changelog](https://github.com/XAMPPRocky/octocrab/blob/main/CHANGELOG.md)
- [Commits](XAMPPRocky/octocrab@v0.45.0...v0.47.0)

---
updated-dependencies:
- dependency-name: octocrab
  dependency-version: 0.47.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump regex from 1.11.3 to 1.12.1 (#7532)

Bumps [regex](https://github.com/rust-lang/regex) from 1.11.3 to 1.12.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](rust-lang/regex@1.11.3...1.12.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-version: 1.12.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix custom file path for Vortex Data Accelerator (#7570)

* Only support append refresh

* Remove validation

* fix lint

* Fix the tests

* Add List type support to Vortex Data Accelerator (#7569)

* Vortex Data Accelerator

* Fix imports

* Add back feature

* fix vortex

* fix build

* Remove memory

* Fix tests

* Add tests

* Update tests

* Fixes

* Update

* Fix

* Update tests

* Use StreamTable instead of ListingTable

* Update tests

* Use async buffered writes

* Update tests

* Works!

* Perf improvements

* Fix

* Add check for partition_by

* Fix memory leak

* fix lint issues

* fmt

* Improve benchmark tests

* fix lint

* Fix duplicate code.

* vendor vortex-datafusion

* fix

* finally clean lint

* Don't create dummy file, just specify the schema

* fix lint

* Property integrate vendored vortex-datafusion

* WIP

* Fix tests

* remove custom writing code

* WIP

* Only support append mode for Vortex

* Add additional validation

* Add List to vortex supported types

* Fix linting issues

* Remove memory mode test, not supported anymore.

---------

Co-authored-by: Phillip LeBlanc <[email protected]>

* Bump parking_lot from 0.12.4 to 0.12.5 (#7534)

Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.12.4 to 0.12.5.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](Amanieu/parking_lot@parking_lot-v0.12.4...parking_lot-v0.12.5)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-version: 0.12.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump tokio-postgres from 0.7.14 to 0.7.15 (#7533)

Bumps [tokio-postgres](https://github.com/rust-postgres/rust-postgres) from 0.7.14 to 0.7.15.
- [Release notes](https://github.com/rust-postgres/rust-postgres/releases)
- [Commits](rust-postgres/rust-postgres@tokio-postgres-v0.7.14...tokio-postgres-v0.7.15)

---
updated-dependencies:
- dependency-name: tokio-postgres
  dependency-version: 0.7.15
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Remove duplicate line from 1.8.1 release notes (#7580)

* Upgrade Go from v1.24.2 to v1.25.3 (#7582)

* check if index/bucket exists after ConflictException (#7577)

* Add `runtime-async` crate with managed Tokio runtime (#7575)

* Add `runtime-async` crate with managed Tokio runtime

* fix

* remove test

* fix

* fix lint

* Optimize GitHub Actions workflows (#7584)

* Optimize builds for speed

* Update .github/workflows/pr.yml

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

* Add prepared statements

* Remove dupe

* Revert "Add prepared statements"

This reverts commit 5f8a36b.

* Update crates/runtime/src/dataconnector/github/projects.rs

Co-authored-by: Copilot <[email protected]>

* Fix copilot's complaints

* Fixes

* Filter out empty segments

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kevin Zimmerman <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Phillip LeBlanc <[email protected]>
Co-authored-by: Jack Eadie <[email protected]>
Co-authored-by: Viktor Yershov <[email protected]>

* Add copilot-instructions to help improve Copilot reviews (#7606)

* Add copilot-instructions to help improve Copilot reviews.

* Updates

* Fixes

* Add support for DuckDB table-based partitioning (#7581)

* Support for partitioning based on table names

* Simplify infer_existing_partitions

* Better structure

* Add DuckDBPartitionedDataSink

* Update insert overwrite

* insert_append

* Update

* Update

* logic to delete old internal tables for full refresh

* Include partitioned_duckdb param

* Use statement for list_partitioned_tables

* Fix schema mismatch error

* lint

* Add tests for the DuckDBPartitionedDataSink

* Add test for TablesModePartitionedDuckDBAccelerator

* Update

* Primary key support

* on-conflict support

* Indexes support for append and full refresh

* Update crates/runtime/src/dataaccelerator/partitioned_duckdb/tables_mode/mod.rs

Co-authored-by: Phillip LeBlanc <[email protected]>

* Update

* Make PassThruExec public

* Update to the latest table-providers version

---------

Co-authored-by: Phillip LeBlanc <[email protected]>

* Add clarification

* Fix build

* Update deny.toml

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Phillip LeBlanc <[email protected]>
Co-authored-by: Jack Eadie <[email protected]>
Co-authored-by: Viktor Yershov <[email protected]>
Co-authored-by: Phillip LeBlanc <[email protected]>
Co-authored-by: David Stancu <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kevin Zimmerman <[email protected]>
Co-authored-by: Sergei Grebnov <[email protected]>
krinart added a commit that referenced this pull request Oct 21, 2025
* Initial Pepper data accelerator
This renames Vortex to Pepper

@mach-kernel mach-kernel added this to the v1.8.3 milestone Oct 24, 2025
mach-kernel pushed a commit that referenced this pull request Oct 26, 2025
…ider (#7587)
