Fuzzing Crash: VarBinArray filter panic on AllTrue/AllFalse mask #6262

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-array/src/arrays/varbin/compute/filter.rs:40 in the filter_select_var_bin function

Error Message:

AllTrue and AllFalse are handled by filter fn

Stack Trace:

   0: std::backtrace_rs::backtrace::libunwind::trace
   1: std::backtrace_rs::backtrace::trace_unsynchronized
   2: <std::backtrace::Backtrace>::create
   3: {closure#0}<&vortex_mask::MaskValues>
             at ./vortex-error/src/lib.rs:322:26
   4: unwrap_or_else<&vortex_mask::MaskValues, vortex_error::{impl#12}::vortex_expect::{closure_env#0}<&vortex_mask::MaskValues>>
   5: vortex_expect<&vortex_mask::MaskValues>
             at ./vortex-error/src/lib.rs:319:14
   6: filter_select_var_bin
             at ./vortex-array/src/arrays/varbin/compute/filter.rs:40:10
   7: filter
             at ./vortex-array/src/arrays/varbin/compute/filter.rs:31:9

Root Cause:

The filter_select_var_bin function at vortex-array/src/arrays/varbin/compute/filter.rs:37-48 assumes that AllTrue and AllFalse mask variants have been handled by the caller. However, in certain code paths (particularly through FSST-encoded arrays being filtered during file I/O operations with complex nested structs), the filter operation receives a mask that is AllTrue or AllFalse but hasn't been short-circuited.

The problematic code:

fn filter_select_var_bin(arr: &VarBinArray, mask: &Mask) -> VortexResult<VarBinArray> {
    match mask
        .values()
        .vortex_expect("AllTrue and AllFalse are handled by filter fn")  // Line 40 - PANIC!
        .threshold_iter(0.5)
    {
        MaskIter::Indices(indices) => {
            filter_select_var_bin_by_index(arr, indices, mask.true_count())
        }
        MaskIter::Slices(slices) => filter_select_var_bin_by_slice(arr, slices, mask.true_count()),
    }
}

The mask.values() method returns None for AllTrue and AllFalse mask variants, which causes the panic. The expectation encoded in the error message is that these cases should be handled earlier, but the filter operation path through FSSTArray -> VarBinArray doesn't properly handle these edge cases.
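The failure mode described above can be reproduced in miniature. The sketch below is not the Vortex API: `ToyMask`, `select_unchecked`, and `panics_on` are hypothetical names, and the only behaviour it assumes from the report is that `values()` yields nothing for the AllTrue and AllFalse variants.

```rust
use std::panic;

/// Hypothetical stand-in for a filter mask; not the real `vortex_mask::Mask`.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyMask {
    AllTrue(usize),
    AllFalse(usize),
    Values(Vec<bool>),
}

impl ToyMask {
    /// Only the mixed variant carries per-row values, as the report describes.
    pub fn values(&self) -> Option<&[bool]> {
        match self {
            ToyMask::Values(bits) => Some(bits.as_slice()),
            ToyMask::AllTrue(_) | ToyMask::AllFalse(_) => None,
        }
    }
}

/// Selects items on the assumption that the caller already handled
/// AllTrue/AllFalse. Panics otherwise, like the `vortex_expect` call above.
pub fn select_unchecked(items: &[&str], mask: &ToyMask) -> Vec<String> {
    let bits = mask
        .values()
        .expect("AllTrue and AllFalse are handled by filter fn");
    items
        .iter()
        .zip(bits)
        .filter(|(_, keep)| **keep)
        .map(|(item, _)| item.to_string())
        .collect()
}

/// Returns true if `select_unchecked` panics for the given mask.
pub fn panics_on(items: &[&str], mask: &ToyMask) -> bool {
    panic::catch_unwind(|| select_unchecked(items, mask)).is_err()
}
```

In this toy model, any caller that reaches `select_unchecked` without first matching on the mask variant hits the panic, which is the shape of the crash reported here.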

Call Path:

  1. File I/O operations trigger a compare operation on a StructArray with nested List(Utf8) fields
  2. Arrow conversion executes filtering on FSSTArray-encoded strings
  3. FSST kernel delegates to parent (VarBinArray) via execute_parent (encodings/fsst/src/kernel.rs:42)
  4. VarBinArray filter kernel calls filter_select_var_bin without checking for AllTrue/AllFalse
  5. Panic occurs because mask is AllTrue/AllFalse but wasn't short-circuited
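One possible mitigation for step 4 is to have the kernel entry point handle every mask variant itself, so it stays correct even when reached through a path that skips the usual short-circuit. The sketch below illustrates that idea with hypothetical types (`ToyMask`, `filter_kernel`, `select_by_values`); it is not the project's actual fix and does not use the real Vortex types.

```rust
/// Hypothetical stand-in for a filter mask; not the real `vortex_mask::Mask`.
pub enum ToyMask {
    AllTrue(usize),
    AllFalse(usize),
    Values(Vec<bool>),
}

/// Inner routine that only understands explicit per-row values.
fn select_by_values(items: &[String], bits: &[bool]) -> Vec<String> {
    items
        .iter()
        .zip(bits)
        .filter(|(_, keep)| **keep)
        .map(|(item, _)| item.clone())
        .collect()
}

/// Kernel-style entry point that does not rely on the caller having
/// short-circuited the trivial masks.
pub fn filter_kernel(items: &[String], mask: &ToyMask) -> Result<Vec<String>, String> {
    // Validate the mask length for all variants before selecting anything.
    let mask_len = match mask {
        ToyMask::AllTrue(len) | ToyMask::AllFalse(len) => *len,
        ToyMask::Values(bits) => bits.len(),
    };
    if mask_len != items.len() {
        return Err(format!(
            "mask length {} != array length {}",
            mask_len,
            items.len()
        ));
    }
    Ok(match mask {
        // Keep everything: return a copy of the input.
        ToyMask::AllTrue(_) => items.to_vec(),
        // Keep nothing: return an empty result.
        ToyMask::AllFalse(_) => Vec::new(),
        // Mixed mask: defer to the value-based routine.
        ToyMask::Values(bits) => select_by_values(items, bits),
    })
}
```

Handling the trivial variants at the kernel boundary trades a small amount of duplicated logic for robustness against delegation paths like the one in step 3.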

Array Structure:

  • StructArray with length 10, field name "\r\r\r"
  • Field contains ChunkedArray with List(Utf8(Nullable), Nullable) dtype
  • Two chunks: ListViewArray with VarBinViewArray elements (length 7 and 3)
  • FSSTArray encoding is applied during file operations
  • Compare operation generates AllTrue or AllFalse mask that isn't handled properly

Debug Output

```
FuzzFileAction {
    array: StructArray {
        len: 10,
        dtype: Struct(
            StructFields {
                names: FieldNames(
                    [
                        FieldName(
                            "\r\r\r",
                        ),
                    ],
                ),
                dtypes: [
                    FieldDType {
                        inner: Owned(
                            List(
                                Utf8(
                                    Nullable,
                                ),
                                Nullable,
                            ),
                        ),
                    },
                ],
            },
            NonNullable,
        ),
        fields: [
            ChunkedArray {
                dtype: List(
                    Utf8(
                        Nullable,
                    ),
                    Nullable,
                ),
                len: 10,
                chunk_offsets: PrimitiveArray {
                    dtype: Primitive(
                        U64,
                        NonNullable,
                    ),
                    buffer: BufferHandle(
                        Host(
                            Buffer {
                                length: 24,
                                alignment: Alignment(
                                    8,
                                ),
                                as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, ...],
                            },
                        ),
                    ),
                    validity: NonNullable,
                    stats_set: ArrayStats { ... },
                },
                chunks: [
                    ListViewArray {
                        dtype: List(Utf8(Nullable), Nullable),
                        elements: VarBinViewArray { ... },
                        offsets: PrimitiveArray { ... },
                        sizes: PrimitiveArray { ... },
                        is_zero_copy_to_list: true,
                        validity: NonNullable,
                        stats_set: ArrayStats { ... },
                    },
                    ListViewArray { ... },
                ],
                stats_set: ArrayStats { ... },
            },
        ],
        stats_set: ArrayStats { ... },
    },
    projection_expr: Some(...),
    filter_expr: None,
    compressor_strategy: Default,
}
```

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:
    ```bash
    # The artifact contains file_io/crash-6e749c06cc6fbc4da220d7e81a086238e366156c
    cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-6e749c06cc6fbc4da220d7e81a086238e366156c -- -rss_limit_mb=0
    ```

  3. Get full backtrace:
    ```bash
    RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-6e749c06cc6fbc4da220d7e81a086238e366156c -- -rss_limit_mb=0
    ```

Auto-created by fuzzing workflow with Claude analysis
