YADUP (Yet Another Data Upload Proposal)
Last meeting there seemed to be appetite for asynchronous mapping that would allow requesting subranges but the group wanted to see a more fleshed out proposal.
This version of data upload is very similar to what we have in the spec today with mapWriteAsync and mapReadAsync, but the resolution of the mapping promise doesn't give an ArrayBuffer. Instead it stores the ArrayBuffer in an internal slot of the GPUBuffer, and there's a GPUBuffer.getMappedRange method that allows getting subranges of the internal ArrayBuffer.
This is close to @kainino0x's old GPUMappedMemory idea.
Proposal
partial interface GPUBufferUsage {
    const GPUBufferUsageFlags MAP_READ = 0x0001;
    const GPUBufferUsageFlags MAP_WRITE = 0x0002;
};
partial interface GPUBuffer {
    Promise<void> mapAsync();
    ArrayBuffer getMappedRange(unsigned long offset = 0, unsigned long size = 0);
    void unmap();
};
partial dictionary GPUBufferDescriptor {
    boolean mappedAtCreation = false;
};
Calling GPUBuffer.mapAsync is an error if the buffer is not valid or if it is not in the "unmapped" state (which means it is not destroyed either). Upon error, mapAsync returns a promise that will reject. Upon success, mapAsync puts the buffer in the "mapping" state and returns a promise that, when it resolves, will put the buffer in the "mapped" state.
Calling GPUBuffer.getMappedRange when the buffer is not in the "mapped" state returns null. If called in the "mapped" state, it returns a new ArrayBuffer that's a view into the content of the buffer at the half-open range [offset, offset + size) (a failed range check throws a JS exception). size and offset default to 0, and a size of 0 means the remaining size of the buffer after offset: buffer.getMappedRange() with no arguments returns the whole range.
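The defaulting rules for offset and size can be sketched as a small pure function. This is only a toy model of the range arithmetic described above, not how an implementation would do it: the names resolveMappedRange and ByteRange are hypothetical, and treating an offset equal to the buffer size as a valid empty range is an assumption.

```typescript
// Toy model of the getMappedRange range rules; names are hypothetical.
interface ByteRange {
  start: number; // inclusive
  end: number; // exclusive
}

function resolveMappedRange(bufferSize: number, offset = 0, size = 0): ByteRange {
  if (!Number.isInteger(offset) || offset < 0 || offset > bufferSize) {
    throw new RangeError(`offset ${offset} is outside the buffer`);
  }
  // A size of 0 means "everything after offset".
  const effectiveSize = size === 0 ? bufferSize - offset : size;
  if (
    !Number.isInteger(effectiveSize) ||
    effectiveSize < 0 ||
    offset + effectiveSize > bufferSize
  ) {
    throw new RangeError(`range [${offset}, ${offset + effectiveSize}) is outside the buffer`);
  }
  return { start: offset, end: offset + effectiveSize };
}
```

For a 256-byte buffer, calling it with no offset or size covers the whole buffer, with offset 64 it covers everything from byte 64 on, and with offset 200 and size 100 it throws.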
Calling GPUBuffer.unmap is an error if the buffer is not valid or if it is in the "unmapped" state. On success:
- if the buffer is in the "mapping" state, then the promise is rejected and the buffer is put in the "unmapped" state
- if the buffer is in the "mapped" state, all ArrayBuffers returned by GPUBuffer.getMappedRange() are detached and the buffer is put in the "unmapped" state
Note that modifications to the content of an ArrayBuffer returned by getMappedRange are semantically modifications of the content of the buffer itself.
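To make the state transitions concrete, here is a minimal toy model of the "unmapped" / "mapping" / "mapped" lifecycle. It is a sketch under several assumptions: ModelBuffer and completeMapping are hypothetical names (completeMapping stands in for the browser finishing the mapping), views are Uint8Array subarrays rather than real ArrayBuffers, the unmap error is modelled as a thrown exception, and detaching on unmap is not modelled.

```typescript
// Toy model of the proposed mapping state machine; not a real GPUBuffer.
type MapState = "unmapped" | "mapping" | "mapped";

class ModelBuffer {
  state: MapState = "unmapped";
  private readonly storage: Uint8Array;
  private resolvePending: (() => void) | null = null;
  private rejectPending: ((reason: Error) => void) | null = null;

  constructor(size: number) {
    this.storage = new Uint8Array(size);
  }

  mapAsync(): Promise<void> {
    if (this.state !== "unmapped") {
      return Promise.reject(new Error("mapAsync: buffer is not unmapped"));
    }
    this.state = "mapping";
    return new Promise<void>((resolve, reject) => {
      this.resolvePending = () => resolve();
      this.rejectPending = (reason) => reject(reason);
    });
  }

  // Hypothetical hook: simulates the browser completing the mapping.
  completeMapping(): void {
    const resolve = this.resolvePending;
    if (this.state !== "mapping" || resolve === null) return;
    this.state = "mapped";
    this.resolvePending = null;
    this.rejectPending = null;
    resolve();
  }

  getMappedRange(offset = 0, size = 0): Uint8Array | null {
    if (this.state !== "mapped") return null;
    const end = size === 0 ? this.storage.length : offset + size;
    if (offset < 0 || offset > end || end > this.storage.length) {
      throw new RangeError("getMappedRange: bad range");
    }
    // The view shares memory with the buffer content.
    return this.storage.subarray(offset, end);
  }

  unmap(): void {
    if (this.state === "unmapped") throw new Error("unmap: buffer is already unmapped");
    const reject = this.rejectPending;
    if (this.state === "mapping" && reject !== null) {
      reject(new Error("unmap() called before mapping finished"));
    }
    this.resolvePending = null;
    this.rejectPending = null;
    this.state = "unmapped";
  }
}
```

Because every view aliases the same storage, a write through one range is visible through any overlapping range, which mirrors the note that writes to the returned ArrayBuffer are writes to the buffer itself.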
Calling GPUDevice.createBuffer with descriptor.mappedAtCreation can be done even if descriptor.usage doesn't contain the MAP_READ or MAP_WRITE flags. If mappedAtCreation is true, the buffer is created in the "mapped" state and its content can be modified before unmap() and other uses like in a queue.submit().
As usual, other uses of GPUBuffer like in a GPUQueue.submit() would validate that the buffer is in the "unmapped" state. And similar to other proposals, there would be restrictions on the usages that can be used in combination with MAP_READ and MAP_WRITE. Contrary to other proposals, MAP_READ and MAP_WRITE could be set at the same time, and I suggest the following rules:
- If MAP_WRITE is present, COPY_SRC is allowed.
- If MAP_READ is present, COPY_DST is allowed.
- If MAP_READ and MAP_WRITE are present, then both COPY_SRC and COPY_DST are allowed.
- (example for a UMA feature) if the adapter is UMA, then if MAP_WRITE is present, then VERTEX and UNIFORM are also allowed.
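The rules above amount to a bitmask check, sketched below. MAP_READ and MAP_WRITE use the values from the proposed IDL; the values of COPY_SRC, COPY_DST, VERTEX and UNIFORM, the function name isValidMappableUsage, and the adapterIsUMA parameter are assumptions made for illustration. The sketch also assumes that usages without either MAP flag are not restricted by these rules.

```typescript
// MAP_READ / MAP_WRITE values come from the proposal's IDL.
const MAP_READ = 0x0001;
const MAP_WRITE = 0x0002;
// The remaining flag values are assumptions for illustration only.
const COPY_SRC = 0x0004;
const COPY_DST = 0x0008;
const VERTEX = 0x0020;
const UNIFORM = 0x0040;

// Hypothetical validator for the suggested usage-combination rules.
function isValidMappableUsage(usage: number, adapterIsUMA = false): boolean {
  const mappable = usage & (MAP_READ | MAP_WRITE);
  if (mappable === 0) return true; // the rules only restrict mappable buffers

  let allowed = mappable;
  if (usage & MAP_WRITE) allowed |= COPY_SRC;
  if (usage & MAP_READ) allowed |= COPY_DST;
  if (adapterIsUMA && usage & MAP_WRITE) allowed |= VERTEX | UNIFORM;

  // Valid only if no bit outside the allowed set is present.
  return (usage & ~allowed) === 0;
}
```

With these rules, MAP_WRITE | COPY_SRC passes, MAP_WRITE | COPY_DST fails, and MAP_WRITE | VERTEX passes only when the adapter is reported as UMA.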
This mapping mechanism would live side-by-side with a writeToBuffer path.
There are also threading constraints: all calls to getMappedRange and unmap() must be made in the same worker so that the ArrayBuffers can be detached.
Alternative choices
A single mapAsync is present instead of mapWriteAsync and mapReadAsync. The proposal talks about the ArrayBuffer being the content of the GPUBuffer directly, so it was a bit weird to have two map functions. The downside is that if the implementation can't wrap shmem in a GPU resource:
- either a copy will have to take place on unmap() even for MAP_READ buffers to update the content with writes the application did in the ArrayBuffer
- or range-tracking needs to happen for MAP_READ buffers so the implementation knows what to overwrite
It could be possible to not return a promise from mapAsync and instead make the GPUBuffer itself act like a promise with a .then method and maybe a synchronous "state" member.
The assumption is that multi-process browsers will allocate one large shmem corresponding to the whole size of mapped buffers, so multiple ArrayBuffers could look at the same memory and overlap. If we don't want to force one large contiguous allocation, getMappedRange could enforce that the ranges are all disjoint between calls to unmap.
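If the disjoint-ranges option were chosen, the check could look roughly like the sketch below. DisjointRangeTracker, tryAdd and reset are hypothetical names, and the linear scan is just the simplest thing that works, not a suggestion for how an implementation should store ranges.

```typescript
// Hypothetical helper: tracks ranges handed out since the last unmap().
class DisjointRangeTracker {
  private ranges: Array<[number, number]> = []; // half-open [start, end)

  // Records the range and returns true if it overlaps no earlier range;
  // returns false (and records nothing) otherwise.
  tryAdd(offset: number, size: number): boolean {
    const end = offset + size;
    const overlaps = this.ranges.some(([s, e]) => offset < e && s < end);
    if (overlaps) return false;
    this.ranges.push([offset, end]);
    return true;
  }

  // Would be called from unmap(): all outstanding ranges are released.
  reset(): void {
    this.ranges = [];
  }
}
```

In this model getMappedRange would call tryAdd and fail when it returns false. Adjacent ranges such as [0, 64) and [64, 128) are accepted, while a range that falls inside an earlier one is refused until the next unmap.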