
Zero-copy ArrayExporter #3632

@gatesn

Description


Vortex is generally quite good about minimising copies of array data. It can load a segment from disk into a FlatLayout, zero-copy deserialize it into an in-memory array, and then perform computations over it.

There is one place where we often copy when ideally we wouldn't: in the FlatLayoutReader, when we eventually come to filter the zero-copy deserialized array. We invoke the filter(array, mask) compute function, which for many kernels will create a new canonicalized array.

When exporting to Arrow this is fine, since we can use a Vortex buffer zero-copy inside Arrow. But for other systems (e.g. DuckDB), or when the caller wishes to reuse a pre-allocated buffer of their own (Arrow, NumPy, DuckDB Vector), we need a way to pass the output buffer into the filter compute function and have the result written into it.

For example:

```rust
filter(&dyn Array, &Mask) -> VortexResult<ArrayRef>
filter_into(&dyn Array, &Mask, &mut Canonical) -> VortexResult<()>
```

In the first function, the kernel has the option to return a non-canonical array. For example, a DictArray just needs to filter its codes leaving its values untouched.
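As a toy sketch of that DictArray point (using plain Vecs and an Arc-shared slice in place of real Vortex arrays; none of these types are the actual Vortex API), filtering touches only the codes while the values buffer stays shared, with no copy of the values:

```rust
use std::sync::Arc;

// Simplified stand-in for a dictionary-encoded array: one code per row,
// pointing into a shared buffer of unique values.
struct DictArray {
    codes: Vec<u32>,
    values: Arc<[f64]>, // shared across arrays; never copied by filter
}

/// Filter a DictArray by filtering only its codes. The values buffer is
/// reference-shared, so the kernel can return a non-canonical result
/// without canonicalizing (and copying) the dictionary.
fn filter_dict(array: &DictArray, mask: &[bool]) -> DictArray {
    assert_eq!(array.codes.len(), mask.len());
    let codes = array
        .codes
        .iter()
        .zip(mask)
        .filter_map(|(&code, &keep)| keep.then_some(code))
        .collect();
    DictArray {
        codes,
        values: Arc::clone(&array.values), // bumps a refcount, no data copy
    }
}
```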

In the second function, we can only really take a canonical array to write into (similar, I suppose, to the APIs inside DuckDB that pass around an output vector). With some changes to vortex-buffer (support for external BufferMut), this would allow us to wrap up pre-existing buffers and write results into them. In this case, Canonical::len == Mask::true_count.
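A minimal standalone sketch of the filter_into shape, with plain slices standing in for Vortex arrays and a caller-provided output slice standing in for a mutable Canonical backed by an external buffer (all names here are illustrative, not the real API):

```rust
/// Write the mask-selected elements of `values` directly into `out`,
/// which the caller pre-allocated (e.g. wrapping a NumPy or DuckDB buffer).
/// Enforces the Canonical::len == Mask::true_count invariant.
fn filter_into(values: &[f64], mask: &[bool], out: &mut [f64]) -> Result<(), String> {
    let true_count = mask.iter().filter(|&&m| m).count();
    if out.len() != true_count {
        return Err(format!(
            "expected output of len {true_count}, got {}",
            out.len()
        ));
    }
    let mut i = 0;
    for (&v, &keep) in values.iter().zip(mask) {
        if keep {
            out[i] = v; // written in place; no intermediate allocation
            i += 1;
        }
    }
    Ok(())
}
```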

For a scan, we cannot know the output length. So we should resort to an "exporter" style API (similar to how we currently implement the DuckDB exporter). This might look something like:

```rust
trait ArrayStreamExt {
  fn into_exporter(self) -> Box<dyn ArrayExporter>;
}

trait ArrayExporter {
  // Returns num rows exported <= Canonical::len.
  fn export(&mut self, output: &mut Canonical) -> VortexResult<usize>;
}
```

Note the &mut Canonical could also be a non-resizable impl of an ArrayBuilder, or it could be something entirely new such as a trait Exportable. (Having a trait here would allow us to dynamically allocate additional buffers via the external system, e.g. asking DuckDB to attach a validity buffer to the vector when it's needed, rather than having to pre-allocate one and wrap it up inside a mutable Canonical PrimitiveArray.)
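To make the exporter calling convention concrete, here is a toy stand-in (a Vec-backed source and a plain f64 slice instead of an ArrayStream and a Canonical) showing the intended loop: each export call fills the caller's fixed-size buffer and reports how many rows were written, returning fewer than the buffer length only at the end of the stream:

```rust
/// Hypothetical, simplified exporter: drains a queued set of values into a
/// reused, fixed-size output buffer, one chunk per call.
struct VecExporter {
    remaining: std::vec::IntoIter<f64>,
}

impl VecExporter {
    fn new(values: Vec<f64>) -> Self {
        Self { remaining: values.into_iter() }
    }

    /// Write up to `out.len()` rows into `out`; returns the number of rows
    /// actually exported (analogous to the proposed
    /// `export(&mut self, &mut Canonical) -> VortexResult<usize>`).
    fn export(&mut self, out: &mut [f64]) -> usize {
        let mut n = 0;
        for slot in out.iter_mut() {
            match self.remaining.next() {
                Some(v) => {
                    *slot = v;
                    n += 1;
                }
                None => break, // stream exhausted; partial (or zero) chunk
            }
        }
        n
    }
}
```

The caller keeps invoking export with the same buffer until it returns 0, which is exactly the shape the NumPy use-case below needs.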

To summarize, a concrete example use-case might be:

  • Pre-allocate a NumPy array of 2k floats.
  • Set up a Vortex scan of a compressed float column, filter using row indices, no filter expr.
  • Export the result of the scan 2k floats at a time into the same reused NumPy array with only a single copy (from the compressed form into the uncompressed NumPy buffer).

Labels: ext/duckdb (Relates to the DuckDB integration)