GPU Web 2024 08 28

GPU Web WG 2024-08-28 Atlantic-time

Chair: CW

Scribe: KR/CW

Location: Google Meet

Tentative agenda

Administrivia
- Add items to the F2F agenda!
CTS Update
Talk from UCSC on memory isolation in WebGPU
Milestone 0
- Missing dynamic offset validation in "Validate encoder bind groups" #4811
- remove sourceMap from GPUShaderModuleDescriptor #4808
- mapAsync early-reject doesn't generate a validation error #4690
(Ken) WebGPU benchmark(s) in MotionMark?
Hardware feature levels (and compat) #4656
Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721
[editorial] Consider if there's a much better name for "image copies" #4573 (comment)
Agenda for next meeting

Attendance

Apple
- Mike Wyrzykowski
- Myles C. Maxfield
Google
- Brandon Jones
- Corentin Wallez
- Dan Sinclair
- David Neto
- Gregg Tavares
- James Price
- Kai Ninomiya
- Loko Kung
- Peter McNeeley
- Stephen White
Mozilla
- Kelsey Gilbert
Nvidia
- Anders Leino
Mehmet Oguz Derin
Mendy Berger
Sean Isom
Reese Levine
Tyler Sorensen
Xiaoshen X

Administrivia

Add items to the F2F agenda!
Interaction with Immersive Web WG
- BJ: People most actively looking at WebGPU WebXR are BJ and MW so we can be liaison easily.

CTS Update

Gregg continuing texture builtin tests, finding more weird issues
- textureGather with integer-format cubemaps seems to be broken on all AMD GPUs
- Should we restrict the API or leave this as a driver bug?
- mipmap selection #4818
Experimental subgroup tests
Some test flakiness in the past month and haven’t yet determined where it’s coming from

Talk from UCSC on memory isolation in WebGPU

Slides
Try the fuzzer: https://gpuharbor.ucsc.edu/webgpu-race-testing/
MM: A natural way to have test cases would be to look at the source code and the assembly code produced by the GPU shader compiler. Are you looking at the underlying assembly?
RL: Could have access to SAS and others but would be very difficult to do. More looking at C assembly at the moment. Would love to collaborate on the low-level assembly.
TS: We have evidence that the underlying compilers are not optimizing things out. We have confidence that the fuzzer does real things because without robustness we see problems (in ⅓ of the fuzz shaders on M1).
RC: Have you had a chance to run it on Windows?
RL: Kind of the blindspot, but we know that D3D has a much stronger robustness story, but would love
CW: long term solution - suggestions? Previous ideas:
- Store special idea that driver doesn't know, to launder memory access
- Smart attacker could double-XOR to make things disappear
- Only solution: std::launder or similar, forget anything you know about this value
- RL: Faith Ekstrand during Vulkanized thought yes, have a secret instruction that compilers are instructed not to mess with. Need Vulkan, DX, Metal specs for the same thing. Not a bad idea. JS did think about this, have discussion around data races - dependencies on unsafe lower code? Or are the JS rules ensconced in the lower-level assembly code that JS engines generate?
- CW: their concerns were more around Spectre than data races. JS engines are the end-to-end optimizing compiler; nothing underneath them.
- RL: good point, can't optimize everything yourself for WebGPU because there are lower-level compilers. Let's keep thinking about enforcing this in lower-level specs.
GT: some APIs have robust access. Can't (easily) write a test from WebGPU that checks, even if you read out-of-bounds in the buffer, it's allowed that you get data from elsewhere in the same app.
- RL: also worrying for WebGPU. In Chrome, all WebGPU goes through the same process. Generally, in the same process, the protections are a lot more limited than if everything was in separate processes. How to know it's from your app, another tab's app, etc.?
- TS: easiest thing might be to highlight values you didn't put into that specific buffer, even if it's a false positive.

Milestone 0

Missing dynamic offset validation in "Validate encoder bind groups" #4811
- Resolved to fix the bug, Teo to make a PR
remove sourceMap from GPUShaderModuleDescriptor #4808
- CW: suggest we figure this out in M2. Nobody's implementing this yet.
- MM: sure, as long as applications using this don't break in the bindings layer.
- (more discussion)
- Consensus to remove it for M0
mapAsync early-reject doesn't generate a validation error #4690
- PR now only waiting on Apple review: Generate a validation error on mapAsync early-reject #4786
- MW will review.
(added late, OK to defer if necessary) Change mapAsync to also early-reject if already mapped #4787
- blocked on the other mapAsync one
- Makes it so we avoid lots of round-trips to the device timeline
- For both pending-map and already-mapped, reject immediately on the client side. We have the state on both client and device-timeline sides
- KG: if we can do it on the client timeline, let's do it on the client timeline
- MW: OK for WebKit too
- Resolved to make this change
(added late, OK to defer if necessary) maxInterStageShaderComponents seems to overlap with maxInterStageShaderVariables #4688
- Allow unknown limits to be requested with value undefined #4781
- Remove maxInterStageShaderComponents #4783
- KG: intent is that implementations should do this
- KN: like doing non-normative notes for this
- MW: thumbs up
- KN: there's also maxInterStageShaderComponents removal. We did agree to do this, but if you want to look at the mechanics, please look at the linked PR.

[editorial] Consider if there's a much better name for "image copies" #4573 (comment)

CW: technically not a change in the WebGPU API. Not visible anywhere. Kai has a pull request up. Last chance to bikeshed. FYI only.

(Ken) WebGPU benchmark(s) in MotionMark?

Focusing on things that browsers can actually control? Like overhead of creating BindGroups, issuing lots of small draw calls (to measure browser overhead), etc.
Want to avoid turning these tests into benchmarks of the GPU itself.
Thoughts? More ideas for areas to test?
KG: MotionMark's goal is to reflect real-world use cases. Trying to not be a microbenchmark.
MM: Might be confusing with Speedometer that tries to be real world use cases. MotionMark tests individual rendering primitives
KR: yeah I really dispute that. It's tons of microbenchmarks. Canvas arcs, the Images test which only does CPU readbacks - a pessimal path.
MM: MM is made of a bunch of benchmarks but the front page runs only a few of them by default. The bar is much higher to get it on the front page.
CW: Think the intent is that they would graduate to the front page.
RC: agree with Ken, we should have WebGPU tests in MotionMark. Think they should be reflective of real-world things. But if these become GPU benchmarks I don't think that's necessarily a bad side-effect.
KR: It is difficult in the 3D space to not end up bottlenecked by something like fill-rate. Should focus on things browsers can optimize, like creating many bindgroups, or many small draws instead of big draws. Gregg tried making benchmarks for WebGL for years and it was very hard.
MM: The tests in MM also try to benchmark specific things, like we want to optimize lines, so let's do a test for lines. The methodology for MM makes it difficult to put a whole game frame. MM measure the complexity in a time budget, and not how fast a frame renders.
KR: The WebGL MM draws a bunch of triangles badly.
KG: Let's be careful of not being fill-rate limited, e.g. keep the resolution very small. We can still do a complexity thing for larger workloads if simple enough "how many frames per frame".
CW: Seems we have general appetite to add some tests, but should talk with MotionMark people about how to design these

Hardware feature levels (and compat) #4656

Skip, Jim is unavailable today.

Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721

Agenda for next meeting ()

Plan to meet again next week, Atlantic-time
Agenda
- [editorial] Rename image copies -> texel copies #4838 (issue #4573)
- Hardware feature levels (and compat) #4656
- Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721
- [M0] Add usage to GPUTextureViewDescriptor #4746 (issue #4426)
- [M0] What happens on the device timeline after GPUDevice.destroy is called? #4294
- [M0] More strongly push developers toward correct use of GPUAdapter #4784
- (: as of that is absolutely everything in M0 so I’ll try to make sure if anything else comes up it gets listed here)
- [M1] Feature detection for HDR canvas #4828

GPU Web 2024 08 28

GPU Web WG 2024-08-28 Atlantic-time

Tentative agenda

Attendance

Administrivia

CTS Update

Talk from UCSC on memory isolation in WebGPU

Milestone 0

[editorial] Consider if there's a much better name for "image copies" #4573 (comment)

(Ken) WebGPU benchmark(s) in MotionMark?

Hardware feature levels (and compat) #4656

Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721

Agenda for next meeting ()

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!