Contributor

@CurtHagenlocher commented Sep 12, 2025

What's Changed

Enables bounds checking for Flatbuf as input can't generally be trusted.
Adds a test for a malformed column name length.
Makes some fixes required to be able to benchmark the change and demonstrate that there's no significant regression.

Closes #48.
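
As an illustration of the kind of check involved, here is a minimal sketch of reading a length-prefixed UTF-8 string from untrusted bytes. It is not the FlatBuffers or Arrow code: the `LengthPrefixedReader` class, the `ReadString` method, and the 4-byte little-endian length layout are all assumptions made for the example.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical helper for illustration; not part of FlatBuffers or Apache Arrow.
public static class LengthPrefixedReader
{
    // Reads a UTF-8 string stored as a 4-byte little-endian length followed by its bytes.
    public static string ReadString(ReadOnlySpan<byte> buffer, int offset)
    {
        // The length prefix itself must lie entirely inside the buffer.
        if (offset < 0 || offset > buffer.Length - sizeof(int))
            throw new ArgumentOutOfRangeException(nameof(offset));

        int length = BinaryPrimitives.ReadInt32LittleEndian(buffer.Slice(offset, sizeof(int)));
        int start = offset + sizeof(int);

        // Compare against the remaining byte count instead of computing
        // start + length, which could overflow for a hostile length value.
        if (length < 0 || length > buffer.Length - start)
            throw new ArgumentOutOfRangeException(nameof(offset), "String length exceeds the buffer.");

        return Encoding.UTF8.GetString(buffer.Slice(start, length));
    }
}
```

A malformed length such as `0x7FFFFFFF` is rejected with an exception here rather than being used to read past the end of the data.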

@kou changed the title from "Bounds checking for Flatbuf should be enabled in the default build" to "feat: Bounds checking for Flatbuf should be enabled in the default build" on Sep 12, 2025
@kou requested a review from adamreeve on September 12, 2025 21:32
Contributor

@adamreeve left a comment

Looks good to me, thanks Curt.

#if ENABLE_SPAN_T && UNSAFE_BYTEBUFFER
        public unsafe string GetStringUTF8(int startPos, int len)
        {
            AssertOffsetAndLength(startPos, len);
Contributor

For reference, this has been added in the upstream Flatbuffers project but not yet released: google/flatbuffers#8673

So if we upgrade the bundled Flatbuffers code, we should either do so after that change is released or add the check in manually again. The new test should catch this though.

  <PropertyGroup>
    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
-    <DefineConstants>$(DefineConstants);UNSAFE_BYTEBUFFER;BYTEBUFFER_NO_BOUNDS_CHECK;ENABLE_SPAN_T</DefineConstants>
+    <DefineConstants>$(DefineConstants);UNSAFE_BYTEBUFFER;ENABLE_SPAN_T</DefineConstants>
Contributor

Just checking: are we happy with leaving UNSAFE_BYTEBUFFER enabled? As far as I understand, bounds checks should prevent any invalid reads, so this should be safe in theory, but it means bugs in the Flatbuffers implementation (like missing bounds checks) could still cause memory safety issues. It sounds like this can increase performance significantly though, so it is worth keeping enabled (https://flatbuffers.dev/languages/c_sharp/#conditional-compilation-symbols).

Contributor Author

I didn't try benchmarking without UNSAFE_BYTEBUFFER but I did audit all the code and confirmed that there was only one missing bounds check.
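
For readers following this thread, a rough sketch of how a compile-time symbol can gate a bounds check is shown below. The `CheckedBuffer` class is hypothetical and uses only safe code; the symbol name mirrors `BYTEBUFFER_NO_BOUNDS_CHECK` from the project file diff above, and the body is an assumption about the general pattern, not a copy of the bundled Flatbuffers source.

```csharp
using System;

// Hypothetical buffer wrapper for illustration; not the bundled Flatbuffers ByteBuffer.
public sealed class CheckedBuffer
{
    private readonly byte[] _data;

    public CheckedBuffer(byte[] data) => _data = data;

    // The check is compiled in unless BYTEBUFFER_NO_BOUNDS_CHECK is defined,
    // which is the symbol this pull request removes from DefineConstants.
    private void AssertOffsetAndLength(int offset, int length)
    {
#if !BYTEBUFFER_NO_BOUNDS_CHECK
        if (offset < 0 || length < 0 || offset > _data.Length - length)
            throw new ArgumentOutOfRangeException(nameof(offset));
#endif
    }

    // Every accessor has to call the assert itself; an accessor that forgets
    // to do so is the kind of gap the audit mentioned above was looking for.
    public byte Get(int offset)
    {
        AssertOffsetAndLength(offset, sizeof(byte));
        return _data[offset];
    }
}
```

In this safe-code sketch the runtime would still reject an out-of-range index on its own. With pointer-based reads, as under UNSAFE_BYTEBUFFER, the explicit assert is the only guard, which is why a single missing call matters.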

@CurtHagenlocher merged commit 0194c4d into apache:main on Sep 15, 2025
15 checks passed
@kou mentioned this pull request on Sep 15, 2025
@CurtHagenlocher deleted the BoundsCheck branch on September 23, 2025 19:17
Development

Successfully merging this pull request may close these issues.

ArrowStreamReader can crash due to invalid field name lengths