Description
From our conversation about occlusion queries.
Motivation
One of the natural use cases for occlusion queries is in conjunction with predicated rendering. One can achieve object culling by rendering the bounding box of some object to an offscreen render target (or with rasterization disabled) using some trivial shader, and then using an occlusion query to determine whether the object actually ends up hitting the screen. If it does, then the object can be rendered to the real render target like normal, but if it doesn't, then the object can be culled.
The problem is that the CPU shouldn't have to wait for the results of the occlusion query to know whether to issue the appropriate draw calls, because that can cause 2-3 frames of latency, which is too late. Instead, predicated rendering allows the GPU to turn a draw call into a no-op based on whether or not a buffer has a nonzero value in a particular place. The occlusion query can be hooked up to populate this value.
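To make the timing argument concrete, here is a minimal CPU-side model of predication. It is not any real API: `Command`, `setPredicate`, `clearPredicate`, `draw`, and `execute` are hypothetical names, and the "GPU" is an ordinary loop. It only illustrates that the predicate value is read when the commands execute, not when they are recorded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU-visible buffer of predicate values.
using PredicateBuffer = std::vector<std::uint64_t>;

// Hypothetical recorded command.
struct Command {
    enum class Kind { SetPredicate, ClearPredicate, Draw } kind;
    std::size_t predicateIndex;  // only meaningful for SetPredicate
    std::string drawName;        // only meaningful for Draw
};

Command setPredicate(std::size_t index) {
    return Command{Command::Kind::SetPredicate, index, ""};
}
Command clearPredicate() {
    return Command{Command::Kind::ClearPredicate, 0, ""};
}
Command draw(const std::string& name) {
    return Command{Command::Kind::Draw, 0, name};
}

// Model of GPU execution. The predicate is read here, at execution time,
// so the recording code never needs to know the occlusion query result.
std::vector<std::string> execute(const std::vector<Command>& commands,
                                 const PredicateBuffer& buffer) {
    std::vector<std::string> executed;
    bool predicated = false;
    std::size_t index = 0;
    for (const Command& c : commands) {
        switch (c.kind) {
        case Command::Kind::SetPredicate:
            predicated = true;
            index = c.predicateIndex;
            break;
        case Command::Kind::ClearPredicate:
            predicated = false;
            break;
        case Command::Kind::Draw:
            // A zero predicate value turns the draw into a no-op.
            if (!predicated || buffer.at(index) != 0) {
                executed.push_back(c.drawName);
            }
            break;
        }
    }
    return executed;
}
```

In this model the command vector can be built before the buffer contents are known; the occlusion query result only has to be in the buffer by the time `execute` runs.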
Direct3D 12
There's a command that can be added to the command list: ID3D12GraphicsCommandList::SetPredication(). It takes a buffer, an offset into the buffer, and a comparison operation. After issuing this command, future draw calls will consult the 64-bit value at that offset to determine if they should be turned into no-ops. There's a new resource state, D3D12_RESOURCE_STATE_PREDICATION, which the buffer needs to be in (you can transition it to the state using ID3D12GraphicsCommandList::ResourceBarrier()).
But it's not just draw calls. Other operations are predicated too:
- DrawInstanced
- DrawIndexedInstanced
- Dispatch
- CopyTextureRegion
- CopyBufferRegion
- CopyResource
- CopyTiles
- ResolveSubresource
- ClearDepthStencilView
- ClearRenderTargetView
- ClearUnorderedAccessViewUint
- ClearUnorderedAccessViewFloat
- ExecuteIndirect
Vulkan - VK_EXT_conditional_rendering
GPUInfo says support is at 31% on Windows, 54% on Linux, and 1% on Android.
Works very similarly to D3D12. There's a new command you can issue in a command buffer: vkCmdBeginConditionalRenderingEXT(), and it takes a buffer and an offset just like D3D12 does. And, like D3D12, there are new flag values for this buffer: it has to be created with VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT, and it needs to be vkCmdPipelineBarrier()-ed to VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT at VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT.
These operations are conditional:
- drawing commands
- dispatching commands
- vkCmdClearAttachments()
Metal
Metal doesn't have a direct analogue to the other APIs. Instead, the effect can be achieved two different ways:
- The indirect flavor of draw commands and dispatch commands. The vertexCount and instanceCount fields, as present inside the indirect buffer, can be set to 0, which effectively no-ops the draw call.
- Using Indirect Command Buffers. These can be recorded on the GPU, which means the draw calls themselves can be omitted from the recording. See the related issue for more discussion.
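The first option can be sketched on the CPU as follows. The struct mirrors the four 32-bit fields of a non-indexed indirect draw, in the same order as Metal's MTLDrawPrimitivesIndirectArguments. `DrawIndirectArgs` and `applyVisibility` are hypothetical names, and on a real device the patching would presumably happen in a compute pass that reads the occlusion result, not in CPU code.

```cpp
#include <cassert>
#include <cstdint>

// CPU-side mirror of a non-indexed indirect draw's arguments.
struct DrawIndirectArgs {
    std::uint32_t vertexCount;
    std::uint32_t instanceCount;
    std::uint32_t vertexStart;
    std::uint32_t baseInstance;
};
static_assert(sizeof(DrawIndirectArgs) == 16,
              "four tightly packed 32-bit fields");

// Hypothetical helper: returns the arguments to store in the indirect
// buffer. If no samples passed the occlusion test, instanceCount becomes 0
// and the indirect draw renders nothing.
DrawIndirectArgs applyVisibility(DrawIndirectArgs args,
                                 std::uint64_t samplesPassed) {
    if (samplesPassed == 0) {
        args.instanceCount = 0;
    }
    return args;
}
```

Zeroing instanceCount rather than vertexCount is an arbitrary choice in this sketch; either field being 0 makes the draw emit no primitives.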
Both D3D12 and Vulkan have optional support for indirect commands; indeed, they are already part of WebGPU. Both D3D12 and Vulkan also have optional support for Indirect Command Buffers, though in Vulkan, it's only achievable using the vendor-specific VK_NVX_device_generated_commands.
Indirect draws/dispatches are available on all relevant Metal devices. Indirect Command Buffers are only available on some relevant Metal devices. The Metal Feature Set Tables contain additional information. Given that predicated rendering already would be optional because Vulkan support for VK_EXT_conditional_rendering is optional, both of these options seem reasonable.