MapAsync with subranges. #605

@Kangz

YADUP (Yet Another Data Upload Proposal)

Last meeting there seemed to be appetite for asynchronous mapping that would allow requesting subranges, but the group wanted to see a more fleshed-out proposal.

This version of data upload is very similar to what we have in the spec today with mapWriteAsync and mapReadAsync, but the resolution of the mapping promise doesn't give an ArrayBuffer. Instead, it stores the ArrayBuffer in an internal slot of the GPUBuffer, and a GPUBuffer.getMappedRange method allows getting subranges of the internal ArrayBuffer.

This is close to @kainino0x's old GPUMappedMemory idea.

Proposal

partial interface GPUBufferUsage {
    const GPUBufferUsageFlags MAP_READ  = 0x0001;
    const GPUBufferUsageFlags MAP_WRITE = 0x0002;
};

partial interface GPUBuffer {
  Promise<void> mapAsync();
  ArrayBuffer getMappedRange(unsigned long offset = 0, unsigned long size = 0);
  void unmap();
};

partial dictionary GPUBufferDescriptor {
  boolean mappedAtCreation = false;
};

Calling GPUBuffer.mapAsync is an error if the buffer is not valid or if it is not in the "unmapped" state (which means it is not destroyed either). Upon error, mapAsync returns a promise that rejects. Upon success, mapAsync puts the buffer in the "mapping" state and returns a promise that, when it resolves, puts the buffer in the "mapped" state.

Calling GPUBuffer.getMappedRange when the buffer is not in the "mapped" state returns null. When called in the "mapped" state, it returns a new ArrayBuffer that is a view into the content of the buffer at the half-open range [offset, offset + size) (a bad range throws a JS exception). size and offset default to 0, and a size of 0 means the remaining size of the buffer after offset, so buffer.getMappedRange() with no arguments returns the whole range.
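
To make the defaulting rule concrete, here is a minimal sketch of how the (offset, size) arguments could be resolved into a byte range. The helper name resolveMappedRange and the choice of RangeError are assumptions made for illustration; the proposal only says that a bad range raises a JS exception.

```typescript
// Hypothetical helper (not part of the proposal): resolves the arguments of
// getMappedRange into a half-open byte range [start, end).
function resolveMappedRange(
  bufferSize: number,
  offset: number = 0,
  size: number = 0,
): { start: number; end: number } {
  const isWholeNumber = (n: number): boolean => n >= 0 && Math.floor(n) === n;
  if (!isWholeNumber(offset) || !isWholeNumber(size)) {
    throw new RangeError("offset and size must be non-negative integers");
  }
  if (offset > bufferSize) {
    throw new RangeError("offset is past the end of the buffer");
  }
  // A size of 0 means "everything that remains after offset".
  const length = size === 0 ? bufferSize - offset : size;
  if (length > bufferSize - offset) {
    throw new RangeError("range extends past the end of the buffer");
  }
  return { start: offset, end: offset + length };
}
```

With a 16-byte buffer, resolveMappedRange(16) covers bytes 0 to 16, resolveMappedRange(16, 4) covers bytes 4 to 16, and resolveMappedRange(16, 12, 8) throws.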

Calling GPUBuffer.unmap is an error if the buffer is not valid or if it is in the unmapped state. On success:

  • if the buffer is in the "mapping" state, then the promise is rejected and the buffer put in the "unmapped" state
  • if the buffer is in the "mapped" state, all ArrayBuffers returned by GPUBuffer.getMappedRange() are detached and the buffer is put in the "unmapped" state
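
The state transitions described for mapAsync and unmap can be sketched as a small model. This is only an illustration under stated assumptions: BufferStateModel, MapRequest, completeMapping and takeRange are hypothetical names, the promise is replaced by a plain object so the sketch stays synchronous, and detaching ArrayBuffers is modeled as resetting a counter.

```typescript
type MapState = "unmapped" | "mapping" | "mapped";

// Stand-in for the promise returned by mapAsync (assumption of this sketch).
interface MapRequest {
  outcome: "pending" | "resolved" | "rejected";
}

// Hypothetical model of the proposed buffer states; not a real GPUBuffer.
class BufferStateModel {
  state: MapState = "unmapped";
  liveRanges = 0;
  private pending: MapRequest | null = null;
  private valid: boolean;

  constructor(valid: boolean = true) {
    this.valid = valid;
  }

  // Error case: the returned request is already rejected and the state is unchanged.
  mapAsync(): MapRequest {
    if (!this.valid || this.state !== "unmapped") {
      return { outcome: "rejected" };
    }
    this.state = "mapping";
    this.pending = { outcome: "pending" };
    return this.pending;
  }

  // Stand-in for the moment the implementation resolves the mapping promise.
  completeMapping(): void {
    if (this.state === "mapping" && this.pending !== null) {
      this.pending.outcome = "resolved";
      this.pending = null;
      this.state = "mapped";
    }
  }

  // Stand-in for getMappedRange: counts ranges, returns false where the proposal returns null.
  takeRange(): boolean {
    if (this.state !== "mapped") return false;
    this.liveRanges += 1;
    return true;
  }

  // Returns false where the proposal says unmap is an error.
  unmap(): boolean {
    if (!this.valid || this.state === "unmapped") return false;
    if (this.pending !== null) {
      this.pending.outcome = "rejected"; // unmap during "mapping" rejects the promise
      this.pending = null;
    }
    this.liveRanges = 0; // models detaching every ArrayBuffer handed out
    this.state = "unmapped";
    return true;
  }
}
```

A second mapAsync call while the first is still pending yields an already-rejected request and leaves the first one untouched, which matches the rule that mapAsync is only valid in the "unmapped" state.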

Note that modifications to the content of ArrayBuffer returned by getMappedRange are semantically modifications of the content of the buffer itself.

Calling GPUDevice.createBuffer with descriptor.mappedAtCreation can be done even if descriptor.usage doesn't contain the MAP_READ or MAP_WRITE flags. If mappedAtCreation is true, the buffer is created in the "mapped" state and its content can be modified before unmap() and other uses like in a queue.submit().
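
As an illustration of the mappedAtCreation path, the following sketch fills a buffer right after creation. The ProposedBuffer and ProposedDevice types are hypothetical structural stand-ins that mirror only the members discussed in this proposal, and createBufferWithData is an invented helper name.

```typescript
// Hypothetical structural types mirroring only the members proposed above.
interface ProposedBuffer {
  getMappedRange(offset?: number, size?: number): ArrayBuffer | null;
  unmap(): void;
}

interface ProposedDevice {
  createBuffer(descriptor: {
    size: number;
    usage: number;
    mappedAtCreation?: boolean;
  }): ProposedBuffer;
}

// Creates a buffer that starts out mapped, copies data into it, then unmaps it.
// The usage does not need MAP_READ or MAP_WRITE for this path.
function createBufferWithData(
  device: ProposedDevice,
  usage: number,
  data: Uint8Array,
): ProposedBuffer {
  const buffer = device.createBuffer({
    size: data.byteLength,
    usage: usage,
    mappedAtCreation: true,
  });
  const range = buffer.getMappedRange(); // no arguments: the whole buffer
  if (range === null) {
    throw new Error("buffer was not created in the mapped state");
  }
  new Uint8Array(range).set(data); // writes land in the buffer's content
  buffer.unmap(); // required before the buffer is used in a submit
  return buffer;
}
```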

As usual, other uses of GPUBuffer, like in a GPUQueue.submit(), would validate that the buffer is in the "unmapped" state. Similar to other proposals, there would be restrictions on the usages that can be combined with MAP_READ and MAP_WRITE. Contrary to other proposals, MAP_READ and MAP_WRITE could be set at the same time, and I suggest the following rules:

  • If MAP_WRITE is present, COPY_SRC is allowed.
  • If MAP_READ is present, COPY_DST is allowed.
  • If MAP_READ and MAP_WRITE are present, then both COPY_SRC and COPY_DST are allowed.
  • (Example for a UMA feature) If the adapter is UMA and MAP_WRITE is present, then VERTEX and UNIFORM are also allowed.
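
The rules above can be sketched as a validation function. This reads the list as an exhaustive allow-list for mappable buffers, which is an interpretation rather than something the proposal states. Only the MAP_READ and MAP_WRITE values come from the IDL above; the other flag values and the function name isMappableUsageAllowed are placeholders for this sketch.

```typescript
// MAP_READ and MAP_WRITE values are from the IDL above. The remaining values
// are placeholders chosen for this sketch, not taken from the proposal.
const Usage = {
  MAP_READ: 0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC: 0x0004,
  COPY_DST: 0x0008,
  VERTEX: 0x0020,
  UNIFORM: 0x0040,
};

// Hypothetical validator: returns true when `usage` follows the suggested rules.
function isMappableUsageAllowed(usage: number, adapterIsUMA: boolean = false): boolean {
  const mapRead = (usage & Usage.MAP_READ) !== 0;
  const mapWrite = (usage & Usage.MAP_WRITE) !== 0;
  if (!mapRead && !mapWrite) {
    return true; // the rules only restrict mappable buffers
  }
  // Start from the map flags that are actually present, then add what each one permits.
  let allowed = usage & (Usage.MAP_READ | Usage.MAP_WRITE);
  if (mapWrite) allowed |= Usage.COPY_SRC;
  if (mapRead) allowed |= Usage.COPY_DST;
  if (adapterIsUMA && mapWrite) allowed |= Usage.VERTEX | Usage.UNIFORM;
  // Any bit outside the allowed mask makes the combination invalid.
  return (usage & ~allowed) === 0;
}
```

Under this reading, MAP_WRITE | COPY_DST is rejected, while MAP_READ | MAP_WRITE | COPY_SRC | COPY_DST is accepted because each map flag contributes its own copy usage.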

This mapping mechanism would live side-by-side with a writeToBuffer path.

There are also threading constraints: all calls to getMappedRange and unmap() must be in the same worker so ArrayBuffers can be detached.

Alternative choices

A single mapAsync is present instead of mapWriteAsync and mapReadAsync. The proposal talks about the ArrayBuffer being the content of the GPUBuffer directly, so it was a bit weird to have two map functions. The downside is that if the implementation can't wrap shmem in a GPU resource:

  • either a copy will have to take place on unmap() even for MAP_READ buffers to update the content with writes the application did in the ArrayBuffer
  • or range-tracking needs to happen for MAP_READ buffers so the implementation knows what to overwrite

It could be possible to not return a promise from mapAsync and instead make the GPUBuffer itself act like a promise with a .then method and maybe a synchronous "state" member.

The assumption is that multi-process browsers will allocate one large shmem corresponding to the whole size of mapped buffers, so multiple ArrayBuffers could look at the same memory and overlap. If we don't want to force one large contiguous allocation, getMappedRange could enforce that the ranges are all disjoint between calls to unmap.
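
If the disjoint-range alternative were chosen, the check could look roughly like the sketch below. DisjointRangeTracker and its method names are hypothetical. The linear scan is chosen for brevity, and an implementation with many ranges might prefer a sorted structure.

```typescript
// Half-open byte range [start, end).
interface ByteRange {
  start: number;
  end: number;
}

// Hypothetical tracker: remembers the ranges handed out since the last unmap
// and refuses any request that overlaps one of them.
class DisjointRangeTracker {
  private ranges: ByteRange[] = [];

  // Returns true and records the range if it overlaps nothing handed out so far.
  tryAdd(start: number, end: number): boolean {
    for (let i = 0; i < this.ranges.length; i++) {
      const r = this.ranges[i];
      // Two half-open ranges overlap when each starts before the other ends.
      if (start < r.end && r.start < end) {
        return false;
      }
    }
    this.ranges.push({ start: start, end: end });
    return true;
  }

  // Called on unmap: every previously returned range is gone.
  reset(): void {
    this.ranges = [];
  }
}
```

Adjacent ranges such as [0, 16) and [16, 32) are accepted because they share no byte, while [8, 24) is refused until reset() is called.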
