Conversation

@kdashg commented Mar 26, 2020

spec/index.bs Outdated

Embed a copy of |size| bytes of |source|, starting at |sourceOffset|, into the {{GPUCommandEncoder}}.
Encode a command into the {{GPUCommandEncoder}} that copies |size| bytes of data from the embedded copy to offset |destinationOffset| of another {{GPUBuffer}} |destination|.
|size| must be less than or equal to 65536 bytes.
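A rough sketch of how the quoted command might be used from JavaScript; the method name `updateBuffer`, its parameter order, and the `device`/`uniformBuffer` objects are assumptions for illustration, not part of this PR's visible text:

```ts
declare const device: GPUDevice;
declare const uniformBuffer: GPUBuffer; // created with GPUBufferUsage.COPY_DST

// Hypothetical usage sketch of the proposed inline-update command. Only the
// semantics come from the quoted spec text: embed at most 65536 bytes into
// the encoder, then copy them into the destination buffer at the given
// offset when the command buffer executes.
const matrices = new Float32Array(32); // e.g. two 4x4 matrices, 128 bytes

const encoder = device.createCommandEncoder();
// Assumed signature: updateBuffer(destination, destinationOffset, source, sourceOffset, size)
(encoder as any).updateBuffer(uniformBuffer, 0, matrices.buffer, 0, matrices.byteLength);
device.queue.submit([encoder.finish()]);
```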
Possibly implementable with Metal's set{,Vertex,Fragment}Bytes which are limited to (or recommended to be under?) 4k.

setBytes is only transient. It's basically for setting uniforms OpenGL-style.

You're right. I guess those are only useful for a push-constant-like feature.

@Kangz commented Mar 27, 2020

Apart from being closer to Vulkan, I'm not sure what the improvement is compared to writeBuffer (sketched below):

  • writeBuffer is less constrained.
  • writeBuffer doesn't require allocating storage for the data for an unknown duration; the data can simply be streamed.
  • writeBuffer doesn't make GPUCommandEncoders hold on to data in addition to commands.
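For comparison, a minimal sketch of the writeBuffer path in the JS API; buffer creation is omitted, and `device` and `uniformBuffer` are assumed to exist:

```ts
declare const device: GPUDevice;
declare const uniformBuffer: GPUBuffer; // created with GPUBufferUsage.COPY_DST

// GPUQueue.writeBuffer copies the data at call time, so the caller can reuse
// or free the ArrayBuffer immediately and the implementation is free to
// stream the bytes however it likes.
const data = new Float32Array(16);
device.queue.writeBuffer(
  uniformBuffer, // destination buffer
  0,             // destination offset in bytes
  data           // source data; copied before writeBuffer returns
);
```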

@kvark commented Mar 27, 2020

That approach is very similar to #154, which we discussed at length at the San Diego F2F. The points about returning ArrayBuffer objects obviously don't apply to this one, but the concerns about lifetimes mentioned here by @Kangz still do.

@Kangz commented Apr 2, 2020

The WebGPU API we have at the moment has the nice property of increasing in complexity only as you try to extract the most from it, while working well enough if you stick to the simple things. writeBuffer is another step in that direction, but inlineUpdateBuffer isn't, in my opinion.

inlineUpdateBuffer can cause the application to break in the wild even though it worked on the developer's machine, because some data suddenly became big enough. @jdashg mentioned in the call that this is similar to the max texture size limits, but I think there are two differences: inlineUpdateBuffer doesn't have a limit, and texture sizes don't tend to vary as much as, say, the size of the model data that's put in an inlineUpdateBuffer.

Also, inlineUpdateBuffer is the simple option that developers will be looking to use, and the natural thing to do when they want to upload more data will be to call it multiple times, which will result in surprisingly bad performance. Sure, we can have console warnings, but for an important use case like this one it's just better to give developers an option that works well enough and doesn't have footguns.

Also inlineUpdateBuffer essentially forces 4 copies:

  • JS ArrayBuffer to shmem
  • shmem to a temporary allocation that lives next to the GPU-process object for a GPUCommandBuffer
  • that allocation to staging memory
  • staging memory to the final place

Due to its streaming and "instantaneous" nature, writeBuffer skips the temporary allocation, getting down to 3 copies, and the data can even be streamed into GPU-visible shmem, reducing it to 2 copies.
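For cases where even writeBuffer's remaining copy matters, one existing lower-copy path is to write directly into a mapped staging buffer and record a GPU-side copy. A minimal sketch, assuming `vertexBuffer` and `vertexData` exist and sizes are 4-byte aligned:

```ts
declare const device: GPUDevice;
declare const vertexBuffer: GPUBuffer;  // created with GPUBufferUsage.COPY_DST
declare const vertexData: Float32Array; // byteLength assumed to be a multiple of 4

// Write straight into CPU-visible mapped memory, then copy on the GPU timeline.
const staging = device.createBuffer({
  size: vertexData.byteLength,
  usage: GPUBufferUsage.COPY_SRC,
  mappedAtCreation: true,
});
new Float32Array(staging.getMappedRange()).set(vertexData);
staging.unmap();

const encoder = device.createCommandEncoder();
encoder.copyBufferToBuffer(staging, 0, vertexBuffer, 0, vertexData.byteLength);
device.queue.submit([encoder.finish()]);
```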

For example, updating model-view and projection matrices before or interleaved with rendering of a scene.
When these uploads are small, it's viable to inline the update data into the command buffer.
This does require more copies than other upload paths, but for small data sizes this overhead is negligible.
Implementations are expected to warn against using this for medium-to-large buffer updates (e.g. >64k).
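A minimal sketch of the use case the quoted note describes, reusing the hypothetical `updateBuffer` encoder method from above; the pipeline, bind group, and render-pass setup are assumed to exist:

```ts
declare const device: GPUDevice;
declare const mvpBuffer: GPUBuffer; // uniform buffer referenced by bindGroup
declare const pipeline: GPURenderPipeline;
declare const bindGroup: GPUBindGroup;
declare const renderPassDescriptor: GPURenderPassDescriptor;

// Per-frame: inline the freshly computed matrices into the command stream,
// then render with them.
const mvp = new Float32Array(16); // model-view-projection matrix for this frame

const encoder = device.createCommandEncoder();
(encoder as any).updateBuffer(mvpBuffer, 0, mvp.buffer, 0, mvp.byteLength); // hypothetical
const pass = encoder.beginRenderPass(renderPassDescriptor);
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
pass.draw(36); // e.g. a cube
pass.end();
device.queue.submit([encoder.finish()]);
```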
Where do we draw the line for "suboptimal" behavior? I.e., what if an application updates 64k across multiple different buffers? What if it updates 64k of data in the same buffer? Does it matter whether the updated range is the same? Etc.

For example, updating model-view and projection matrices before or interleaved with rendering of a scene.
When these uploads are small, it's viable to inline the update data into the command buffer.
This does require more copies than other upload paths, but for small data sizes this overhead is negligible.
Implementations are expected to warn against using this for medium-to-large buffer updates (e.g. >64k).
The problem with warnings here is that this size could easily be unknown at build/development time. Say the developer loads a mesh and updates some vertices using this function, and everything works on their machine. But then later a user loads a bigger mesh, and not only do they get warning spam, the app also animates suspiciously slowly because of how many copies the data needs to go through (i.e. 4 on this path, as estimated by @Kangz).


In Vulkan, this is similar to |vkCmdUpdateBuffer|.
In D3D12, implementations can leverage |ID3D12GraphicsCommandList2::WriteBufferImmediate|.
Metal might use |makeBuffer(bytesNoCopy:length:options:deallocator:)| around some section of shared command buffer serialization memory.
I wonder how this would work in practice. It requires page-size alignment for both the pointer and the size, and also:

The existing memory allocation must be covered by a single VM region, typically allocated with vm_allocate or mmap. Memory allocated by malloc is specifically disallowed.

@Kangz commented May 20, 2020

Closing now that #708 has landed.

@Kangz closed this May 20, 2020

@kdashg (author) commented May 20, 2020

We may still want this later.

@kdashg reopened this May 20, 2020
@kvark changed the base branch from master to main on June 23, 2020
@kainino0x added this to the Polish post-V1 milestone on Aug 25, 2022
@kainino0x marked this pull request as draft on Aug 25, 2022
ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request on Sep 6, 2022
@kainino0x modified the milestones: Polish post-V1, Milestone 2? on Aug 15, 2023
@kainino0x added the "api WebGPU API" label on May 21, 2024
@kainino0x modified the milestones: Polish post-V1, Milestone 1 on Jul 2, 2024