
@FabioCanedo
Contributor

Description

Summary

This PR modifies the build.rs script in the libduckdb-sys crate to ensure deterministic builds by sorting the C++ file paths before including them in the build process.

Closes #582.

Motivation

Currently, cpp_files is derived from a HashSet, whose iteration order is not deterministic (Rust's default hasher is randomly seeded). As a result, the order in which the files are passed to the compiler can vary across runs, leading to non-reproducible binaries. This change addresses the “Unreproducible binaries due to cpp_files HashSet in build.rs” issue by sorting the files into a stable order before they are added to the build.
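To illustrate the problem, here is a minimal sketch. It is not taken from build.rs: the helper name `sorted_paths` and the file names are made up for illustration. It shows that two sets with the same entries always produce the same sequence once sorted, whatever order the `HashSet` happens to iterate in.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns the set's entries in sorted order.
fn sorted_paths(files: &HashSet<String>) -> Vec<String> {
    let mut paths: Vec<String> = files.iter().cloned().collect();
    paths.sort();
    paths
}

fn main() {
    // Same entries, inserted in different orders.
    let a: HashSet<String> = ["src/b.cpp", "src/a.cpp", "src/c.cpp"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let b: HashSet<String> = ["src/c.cpp", "src/a.cpp", "src/b.cpp"]
        .iter()
        .map(|s| s.to_string())
        .collect();

    // Iterating `a` or `b` directly gives an unspecified order that may
    // change between runs; the sorted views are always identical.
    assert_eq!(sorted_paths(&a), sorted_paths(&b));
    println!("{:?}", sorted_paths(&a));
}
```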

Implementation Details

Convert the cpp_files iterator into a Vec.

Apply .sort() to that vector to enforce a consistent ordering.

Iterate over the sorted vector to invoke cfg.file(...) on each entry.
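The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual diff: `RecordingBuild` is a hypothetical stand-in for the real build configuration (`cfg`) that only records the order of `file(...)` calls, and `add_sorted` and the file names are invented for the example.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `cfg`; it only records the order of added files.
#[derive(Default)]
struct RecordingBuild {
    files: Vec<PathBuf>,
}

impl RecordingBuild {
    fn file(&mut self, path: &Path) -> &mut Self {
        self.files.push(path.to_path_buf());
        self
    }
}

/// Hypothetical helper: adds every path to `cfg` in sorted order.
fn add_sorted(cfg: &mut RecordingBuild, cpp_files: HashSet<PathBuf>) {
    // Step 1: collect the set into a Vec.
    let mut sorted: Vec<PathBuf> = cpp_files.into_iter().collect();
    // Step 2: sort so the order no longer depends on hashing.
    sorted.sort();
    // Step 3: hand each file to the build in that fixed order.
    for path in &sorted {
        cfg.file(path);
    }
}

fn main() {
    let cpp_files: HashSet<PathBuf> = ["duckdb/b.cpp", "duckdb/a.cpp"]
        .iter()
        .map(|s| PathBuf::from(*s))
        .collect();

    let mut cfg = RecordingBuild::default();
    add_sorted(&mut cfg, cpp_files);

    assert_eq!(
        cfg.files,
        vec![PathBuf::from("duckdb/a.cpp"), PathBuf::from("duckdb/b.cpp")]
    );
}
```

Sorting the collected `Vec` keeps the change local to the point where files are added, so the code that builds the set does not need to change.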

Test Results (Unchanged from Before)

These two tests continue to fail. They were already failing before this PR and are unrelated to its change:

failures:

---- test_all_types::test_large_arrow_types stdout ----
Error: DuckDBFailure(Error { code: Unknown, extended_code: 1 }, Some("Binder Error: Column \"varint\" in EXCLUDE list not found in FROM clause"))

---- test_all_types::test_all_types stdout ----
Error: DuckDBFailure(Error { code: Unknown, extended_code: 1 }, Some("Binder Error: Column \"varint\" in EXCLUDE list not found in FROM clause"))

failures:
    test_all_types::test_all_types
    test_all_types::test_large_arrow_types

test result: FAILED. 123 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.88s

error: test failed, to rerun pass `-p duckdb --lib`

Both failures report the same Binder Error, which states that the column "varint" in the EXCLUDE list was not found in the FROM clause. They are unrelated to the sorting of the C++ files.

FabioCanedo and others added 2 commits August 19, 2025 17:49
Ensure that C++ files are sorted before being added in build.rs to
guarantee that the compilation order is consistent across builds. This
change helps produce deterministic build outputs.
Member

@mlafeldt mlafeldt left a comment


The change looks good to me, but the ASAN tests are currently failing.

@FabioCanedo
Contributor Author

> The change looks good to me, but the ASAN tests are currently failing.

Were they not already failing before this commit?

PS: I did not find clear instructions on how to run the ASAN tests.

@mlafeldt
Member

I fixed the (unrelated) ASAN problem in #584. Merging once green. Thanks!

@mlafeldt mlafeldt merged commit ef394fc into duckdb:main Aug 21, 2025
3 checks passed
@FabioCanedo FabioCanedo deleted the libduckdb-sys-reproducible branch August 21, 2025 19:14

Development

Successfully merging this pull request may close these issues.

Unreproducible binaries due to cpp_files HashSet in build.rs
