precise_math attribute on functions #2080

kvark · 2021-09-01T19:28:53Z

Closes #2077
Fixes #2076

On SPIR-V, this would decorate affected expressions with NoContraction.
On Metal, this would add "-fno-fast-math" (or a subset of it) to the affected MTLLibrary.
On DX12 this would add precise to the variable declarations used by the affected functions.

Note: SignedZeroInfNanPreserve and other features of VK_KHR_shader_float_controls are intentionally not included.

github-actions · 2021-09-01T19:33:41Z

Previews, as seen when this build job started (c2e4178):
WebGPU | IDL
WGSL
Explainer

dneto0

Thanks for taking this stab. It's getting there

wgsl/index.bs

dneto0 · 2021-09-01T20:09:03Z

wgsl/index.bs

+  <tr><td><dfn noexport dfn-for="attribute">`precise_math`</dfn>
+    <td>*None*
+
+    Indicates that the arithmetic computations in the function need to be performed with


I have trouble with the word "precision" here, because that means "with more bits represented".
(Never mind the "precise" part of the attribute name, inherited from GLSL. It's good to reuse the GLSL word.)

Also, this should be constrained to floating point, I think.

How about:

Indicates that the floating point arithmetic computations in the function should be performed

without [=reassociation/reassociating=] subexpressions

while preserving infinities, NaNs, and signed zeroes

Apply this attribute when the correctness of the function is numerically sensitive, and it is acceptable to incur potential performance loss when forbidding such optimizations.

blah blah blah?

Thank you!
I took the liberty of modifying this a bit more. Let me know if it needs more fixing!

I felt that it was important to refer to the floating point evaluation section from here

wgsl/index.bs

github-actions · 2021-09-01T20:26:27Z

Previews, as seen when this build job started (9c8c768):
WebGPU | IDL
WGSL
Explainer

dneto0

Seems ok to me now.

The key word is "should", instead of "must"

The group should review this.

litherum · 2021-09-07T19:54:38Z

Metal exposes fastmath on the entire module: https://developer.apple.com/documentation/metal/mtlcompileoptions?language=objc. So this is a good idea, but it should be elevated to module-level (either by something at the global scope in the language, or as additional data to createShaderModule()).

litherum · 2021-09-07T21:09:48Z

The SPIR-V registry says SignedZeroInfNanPreserve is missing before version 1.4. The earliest version of Vulkan to require SPIR-V 1.4 is Vulkan 1.2, which I thought was unavailable on most Android devices. Can we really require it?

kvark · 2021-09-07T21:55:18Z

I don't think we require this SignedZeroInfNanPreserve. The precise_math is basically a "best effort" attribute. If SPIR-V doesn't support SignedZeroInfNanPreserve, then we don't use it.

So this is a good idea, but it should be elevated to module-level

There is definitely value in having it exposed in a more granular level than the module scope:

SPIR-V implementations can use it
Metal implementations that generate the MTLLibrary at pipeline creation time (wgpu and Dawn, at least) can compile some entry points as precise and others with fast math, if requested by the user.

kdashg · 2021-09-08T01:07:27Z

WGSL meeting minutes 2021-09-07

DN: Further discussion today: Concerns that it’s not testable. Also concerned that it’s not stable, no way to make sure that (because untestable) it will keep working. One possibility is to make it an extension with actual strict requirements, but without signing us up for these sometimes-impossible strict requirements in core.
MM: In version of spir-v that wgsl is targeting, there’s no way to guarantee/require ieee floats?
DN: OpenCL can, but not spir-v in general.
DM: SPIR-V supports some things, but not everything we need?
DN: NoContraction is visible, a feature in Vulkan SPIR-V. Not sure the Vulkan conformance tests are strict enough to guarantee what we need. I would need more time to test whether it’s feasible on Vulkan.
MM: When this extension is enabled, we can test that the math would be right. Problem is when we don’t have the extension, where we can’t really guarantee anything.
DN: When this was “SHOULD”, it’s easy to try for. Making an extension would require more work to see if we can support “MUST”.
GR: Why was it said that this would have no effect on DX12?
DN: I think the original poster did minor testing and didn’t have issues, so didn’t look into this?
GR: We do have precise which should work, but yeah, not sure how to test non-precise.
DM: precise is applied to variables? (yes)
MM: Another question is, can global variables be precise? This leads me to a recommendation that it be per-module, and also this is all Metal can handle.
DN: Could (galaxy-brain idea) compile multiple times with and without precise as needed, since we control when entrypoints are used?
MM: Compiling multiple times would be bad, because compiling is already slow.
DM: I think we sort of already handle this (function granularity) in Metal backends.
MM: Why per-function, when no native API does that. Metal is per-module, others are per-variable.
DN: There’s some concerns about how deep (into function calls?) to propagate precise when on variables, at least in a way that’s not super verbose.
(timebox hit, tabled to next meeting)

dneto0 · 2021-09-08T13:53:59Z

To fill in some detail:

First, @kvark is right about the implications of the best-effort framing.
That said, SignedZeroInfNanPreserve was first made available in a Vulkan extension VK_KHR_shader_float_controls / SPIR-V extension SPV_KHR_float_controls, which has support going back almost three years.
You can use the SPIR-V extension in pre 1.4 SPIR-V modules if you declare the extension (OpExtension "SPV_KHR_float_controls"). Saying the feature was "incorporated into 1.4" means you can use the feature without having to declare the extension.

kvark · 2021-09-08T14:05:50Z

My reading of the current state of the debate is that we need to decide if this functionality is testable or not. I believe having it testable would make a stronger API, and thus we need to explore this path before proceeding (with this PR as it stands now).

It sounds like DX12 and Metal support this "precise" mode unconditionally, and there is a chance we'll be able to test it. In Vulkan, it's more complicated. As @dneto0 noted, there is an extension. However, one has to check for the properties of this extension before using them: https://vulkan.gpuinfo.org/listpropertiesextensions.php?extension=VK_KHR_shader_float_controls&platform=all
It's concerning to see "shaderSignedZeroInfNanPreserveFloat32" only supported by "46%" of reports.

If we make this an optional feature, we'd deny access to it for users who either don't care about shaderSignedZeroInfNanPreserveFloat32 specifically, or are happy with the Vulkan driver behavior by default. I don't think we want to end up in a situation where people write if features.contains(PreciseMath) || IsVulkan().

dneto0 · 2021-09-08T14:37:53Z

My reading of the current state of the debate is that we need to decide if this functionality is testable or not.

Agreed. I wasn't sure on the call yesterday, so I investigated what Vulkan does to test NoContraction:

The NoContraction feature has been supported by SPIR-V / Vulkan from the start.
Its test is here

The test tempts the compiler to fuse a multiply-add into one operation (FMA).
FMA is spec'd to produce a rounded result where the intermediate results are computed with infinite precision and accuracy. The test uses sample values that produce catastrophic cancellation. A fused operation would produce a tiny-magnitude number (2**-46), but a non-fused result produces either zero or a small but larger number (2**-24).

This depends crucially on the fact that certain basic operations (add, subtract, multiply) are "correctly rounded" (as defined by IEEE 754, and adopted by Vulkan and WGSL).
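
A rough illustration of the idea behind that test, not the Vulkan CTS code itself. This sketch uses Python double precision rather than the float32 values the real test uses, and the sample inputs and the `fma_exact` helper are assumptions picked for illustration. It models a correctly rounded FMA with exact rational arithmetic and compares it with a separately rounded multiply followed by an add:

```python
from fractions import Fraction


def fma_exact(a, b, c):
    """Model a fused multiply-add: compute a*b + c exactly, then round once."""
    # Fraction holds the exact value of each double; float() rounds to nearest.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def mul_then_add(a, b, c):
    """Unfused evaluation: the product is rounded before the addition."""
    product = a * b
    return product + c


# Hypothetical sample values, chosen so that a*b lies just below 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fma_exact(a, b, c)       # keeps the tiny residual -2**-60
unfused = mul_then_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
```

The two results differ by the whole value of the answer, which is what makes the contraction observable from outside the compiler.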

In general, catastrophic cancellation can be used to magnify errors for other undesirable cases: reassociation, distribution of multiply over addition.

So I think fusing, reassociation, and distribution aspects are testable.
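
To make the reassociation and distribution cases concrete, here is a small sketch in Python doubles. The input values are assumptions chosen so that cancellation exposes the difference; a real conformance test would use float32 values and run on the GPU.

```python
# Reassociation: (x + y) + z versus x + (y + z).
x, y, z = 2.0**53, -(2.0**53), 0.25
left_to_right = (x + y) + z   # x + y cancels exactly, leaving 0.25
reassociated = x + (y + z)    # y + z rounds back to -2**53, so the 0.25 is lost

# Distribution: a * (b - c) versus a * b - a * c.
a, b, c = 1.0 + 2.0**-30, 1.0 + 2.0**-30, 1.0
factored = a * (b - c)        # b - c is exact, so the product keeps the 2**-60 term
distributed = a * b - a * c   # a * b is rounded first, dropping the 2**-60 term
```

In both pairs the expressions are algebraically equal but produce different doubles, so a test can tell which form the compiler actually evaluated.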

mrshannon · 2021-09-08T14:38:15Z

In an ideal world precise (meaning no fusing, reassociation, or distribution) and the other fast math optimizations would be separate. It seems they can be on DX12 and Vulkan, but as far as I could tell Metal is all or nothing. I think a majority of use cases could be solved with precise alone. So perhaps a lesser feature could be made core where at some level of:

module
function
variable

precise mode could be enabled which would not enable shaderSignedZeroInfNanPreserveFloat32 (as support for that is not great, even on desktops) but would just:

Use precise on DX12
Use NoContraction and possibly Invariant on Vulkan
Use -fno-fast-math on Metal.

We already have the invariant qualifier which maps to precise in HLSL but it can only be used for the built-in position output. Also this maps to Invariant in SPIR-V and not NoContraction while precise in HLSL implies both.

kvark · 2021-09-08T15:01:46Z

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff. It sounds like you are suggesting to adopt the current PR but cut out everything related to VK_KHR_shader_float_controls, since it's not universally available. This means the Metal compiler wouldn't need "-fno-signed-zeros", for example, and possibly other things. Do I understand your proposal, @mrshannon ?

Then we can have an optional feature exposing something that captures VK_KHR_shader_float_controls functionality, as a follow-up.

mrshannon · 2021-09-08T15:10:47Z

Do I understand your proposal, @mrshannon ?

Yes, just disable fusing, reassociation, and distribution. With signed zeros and such not being universal, and given the lack of example code that would be affected by them, I am proposing scaling back to only what precise in HLSL promises, as there is plenty of rendering code in the wild which relies on that.

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

I was not sure if that was kosher since it was not documented in the Metal spec.

Then we can have an optional feature exposing something that captures VK_KHR_shader_float_controls functionality, as a follow-up.

Or you could wait until someone needs it. It's probably a failure of my imagination, but I can't think of a case where asymptotic limits would be of use in rendering.

kvark · 2021-09-08T15:21:13Z

The last commit here describes this semantics. I'm sure @dneto0 would want to put more technical details of what is preserved, adding examples and such, and I'm hoping we can follow-up with this.

munrocket · 2021-09-08T18:12:01Z

Love to see where this is going, thanks @kvark. Floating-point expansions definitely do not rely on NaN/Infinity/signed zeros.

litherum · 2021-09-09T08:24:15Z

From talking with the Metal team, we haven't gotten requests to apply fastMath per function rather than per MTLLibrary.

This makes intuitive sense, because the use cases that need IEEE precision are things like scientific computing, where it's likely that all the functions in the library will need to be precise. Conversely, for use cases like games, it's likely that none of the functions in the library will need to be precise.

(Games do need things like the invariant keyword, but that's a different thing.)

litherum · 2021-09-09T08:28:11Z

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

These things aren't API. Ideally, WebGPU / WGSL wouldn't rely on anything that isn't API in the 3 backend APIs. The API is a single boolean switch.

(Anything that isn't API is unsupported, and able/willing to be removed at any point in the future.)

litherum · 2021-09-09T08:35:34Z

It would be unfortunate to make fastMath a "best effort" attribute.

From an author's perspective, what's the point of a precision guarantee if the guarantee isn't actually guaranteed?

From an implementor's perspective, why would an implementor implement any of the feature at all if it just slows down code and doesn't actually have any expected (testable) behavior? Or, stated a different way: Let's say I want to implement this feature in a particular WebGPU implementation, and I sit down and start typing code into the computer to do it. How do I know when I'm done? Why shouldn't I consider myself to be done implementing the feature before writing a single line of code?

kvark · 2021-09-09T13:39:42Z

@litherum it sounds like the desire to have this behavior testable is shared between all parties, so it's good to have this settled. The last version of the PR, which I mentioned in #2080 (comment), already makes it normative. It just doesn't spell out the exact norms affecting it, which is intended to be written at some point. So, no "best effort" any more.

As for the scope of the change, I'm curious what use cases are to consider. From the distance, it felt useful to be able to make, say, vertex shaders precise but not the fragment shaders. Or even just computation of one specific output of a vertex shader. But I haven't used this myself, so happy to hear ISV feedback!

@mrshannon could you share the intended usage of this attribute? Would you be doing it for the whole module, or potentially more granularly?

mrshannon · 2021-09-09T14:11:53Z

@mrshannon could you share the intended usage of this attribute? Would you be doing it for the whole module, or potentially more granularly?

First, I am specifically talking about precise as it exists in HLSL (rearrangement etc), not signed zero and the rest. We have two use cases:

The first is extremely large scale terrain generation in a compute shader which requires double precision. An existing example of this is Elite Dangerous which uses real doubles on some cards and emulated doubles (which require precise) on others. Their reason for emulation is because it's faster than the real thing on some cards, our reason is because we don't have real doubles at all. In this case, while there will be calculations in the compute shader which do not require precise it is likely that at least half of the compute shader module will require it.

Use in the wild: Generating the Universe in Elite Dangerous

The second case is when rendering very large objects (which cannot be handled in other ways). To avoid jitter we need to perform the model to camera space transform in double precision sometimes. Therefore, again emulated doubles. But in this case the calculation is in the vertex shader and is pretty narrow in scope as it is just used for the model to world transform and furthermore is only used on a small subset of vertices (those close to the camera). Therefore it would be undesirable to require precise at the module level since variable, statement, or function level would allow the disabling of the optimizations at a narrow scope for a tiny part of the vertex shader and in our case only on some invocations.

Use in the wild: 3D Engine Design for Virtual Globes
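
A minimal sketch of why a single float32 is not enough here and what the emulated-double representation buys. The coordinate (about one Earth radius in meters plus 10 cm) and the `to_f32` helper are assumptions for illustration; float32 is modelled on the CPU with `struct`, not with shader code.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical vertex coordinate: roughly one Earth radius in meters, plus 10 cm.
position = 6378137.1

single = to_f32(position)    # float32 spacing here is 0.5 m, so the 10 cm vanish
hi = to_f32(position)        # leading part of the emulated double
lo = to_f32(position - hi)   # remainder that a single float32 cannot hold
reconstructed = hi + lo      # summed in double here, only to inspect the error
```

Arithmetic on such (hi, lo) pairs only stays accurate if the compiler evaluates each step exactly as written, which is where the need for precise comes from.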

Conversely, for use cases like games, it's likely that none of the functions in the library will need to be precise.

(Games do need things like the invariant keyword, but that's a different thing.)

This is not true, see Generating the Universe in Elite Dangerous. What is required is not IEEE but specifically the guarantees that HLSL gives with its precise decorator which is more than what invariant guarantees, except on DX12 where invariant maps to precise.

In general there are cases where floating point error needs to be mitigated, even in rendering, which requires controlling the order of operations.

kainino0x · 2021-09-10T06:32:09Z

As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

FWIW the flags I pointed to can probably only be used when invoking an MSL compiler via command line, but not via newLibraryWithSource. However I found the associated clang pragmas:
https://clang.llvm.org/docs/LanguageExtensions.html#extensions-to-specify-floating-point-flags
(I haven't tested them, and it's entirely possible they don't actually work because MSL's LLVM backend doesn't understand them.)

Of course @litherum's point that these aren't officially supported still stands.

mrshannon · 2021-09-14T22:51:20Z

Here is an implementation of emulated doubles in WebGPU that works right now in Chrome/Firefox on a MacBook and on a PC with Linux; uncomment the trick with mix if you are testing precise math. https://codepen.io/munrocket/pen/vYZgyqa

@munrocket Not sure that it is working on Windows. Is the top of the fractal supposed to be filled with strange bands?

Also, I'm not sure you need mix; select, or an if statement, should be enough.

munrocket · 2021-09-14T23:27:04Z

@mrshannon yes, it shows float32 with its limited precision. It’s intentional.

I am starting to think that fast-math is pretty OK even for these purposes, because the Dekker multiplication algorithm becomes about 10x smaller (2 FLOP vs 17 FLOP) with a hardware fma instruction. It is implicitly inherited from fma(a,b,c) in the current WebGPU implementations in Chrome/Firefox. Also, with the select trick you can implement NoContraction for Moller/Knuth summation, and it will be calculated correctly but a little bit slower. At least on my machines it all works pretty well. 😍

If you are going to expose precise math in this PR, then fma(a, b, c) will become the twice-rounded expression RN(RN(a * b) + c), and you will need to use a slower algorithm. I don’t know whether you could add support for hardware fma in this PR or not. But currently it is a trade-off.

Fast multiplication and slow summation VS slow multiplication and fast summation
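
For readers unfamiliar with the algorithms named above, here is a sketch of the two error-free transformations in Python doubles (the WGSL versions operate on f32 and use a different splitter constant). Function names are my own. In this form the FMA-free product uses 17 floating-point operations: one multiply, two 4-operation splits, and four multiply/subtract pairs.

```python
from fractions import Fraction


def two_sum(a, b):
    """Knuth/Moller TwoSum: returns (s, e) with s = RN(a + b) and s + e == a + b."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def split(a):
    """Veltkamp split of a double into high and low halves."""
    c = 134217729.0 * a  # 2**27 + 1, the splitter constant for 53-bit doubles
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a, b):
    """Dekker product without FMA: returns (p, e) with p + e == a * b."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    err1 = p - a_hi * b_hi
    err2 = err1 - a_lo * b_hi
    err3 = err2 - a_hi * b_lo
    e = a_lo * b_lo - err3
    return p, e


x, y = 1.0 + 2.0**-30, 1.0 - 2.0**-30
s, se = two_sum(1.0, 2.0**-60)
p, pe = two_prod(x, y)
# Check against exact rational arithmetic that no information was lost.
product_is_exact = Fraction(p) + Fraction(pe) == Fraction(x) * Fraction(y)
```

Every error term here is algebraically zero, so a compiler that is free to reassociate can legally simplify `e` away. With a hardware FMA the product error collapses to `fma(a, b, -p)`, which is the 2 FLOP variant.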

mrshannon · 2021-09-14T23:31:33Z

@munrocket We just tested the select trick in our implementation and it works. Thanks for the idea.

We are likely to use it over this PR (even if it is merged in) as it has better performance on Metal due to not disabling all optimizations and works at the expression and not at the module level.

munrocket · 2021-09-15T00:54:15Z

Glad to help. The only reason someone would still need to turn off fast-math is if they detect that the host doesn’t support FMA in hardware. In that case, fma emulation with the select trick would be painful.

I have removed mix from the NoContraction trick, thanks, because reordering is turned off without it. Also, if you find that some devices do not support this, please share.

kvark · 2021-09-15T14:28:54Z

Hey users, if you keep finding nice hacks and workarounds for this, we'll have no incentive to do anything with the spec! 🤪

munrocket · 2021-09-15T15:24:22Z

Ha-ha, that was fun.

It's actually a miracle that it works, because the current rounding is UB and not specified, and neither is the fast-math mode. This PR still has potential. For example, if somebody figures out how to turn on correct math and fused multiply-add at the same time, mrshannon will probably use it.

kdashg · 2021-09-15T17:14:24Z

WGSL meeting minutes 2021-09-14

DN: Discussing earlier and with MS. Our concern is two things
- Making sure when targeting FXC that the math survives as well as we hope. We’re concerned about stability/reliability here
- The name precise may end up promising more than all underlying platforms can do, so we may want to revisit the name if we can’t guarantee its behavior everywhere.
- Thanks for the feedback about infinities and NaNs not being too important
MS: Requires operations to not be reordered
DN: Ops that are correctly rounded are +/-/*, but division is a harder request. Do you need division? (maybe?)
DM: MM worried about spooky action at a distance. (SAAAD)
KN: If this is defined as best-effort, at least for Metal, I think would prefer to use clang pragmas rather than SAAAD. I’m worried that we wouldn’t want to use all the flags, for perf reasons. Just reordering is less bad.
DN: Rough consensus that we want this to be testable, rather than a pure hint.
DN: I want an investigation to show that assured non-reordering and non-reassociation are both implementable. On DX11 (FXC), DX12, and Vulkan (desktop and mobile).
DM: Sounds like something we need to figure out before v1.
AB: Why?
DM: We see real issues, MS and users of MoltenVk exist and they need this.
JG: Is this a need that e.g. webgl already had.
MS: We’re going from desktop directly to WebGPU. There’s extant code for this. Without this, we would need to do vert shading on CPU. We do that today, but we’re expecting to want to change this. We’d be pretty sad to not have this.

dneto0 · 2021-09-15T18:29:07Z

Hey @munrocket thanks for this technique!

And thank you also for a nice compact test case. We had been discussing the need for a good way to test the behaviour.

Some thoughts:

Will this continue to work with future implementations? I think basically yes. The select introduces a data dependency and possibly a control dependency on the test condition. The compiler must not be able to know the value of that condition (statically). The safest thing (from the programmer's perspective) is to pass in a known-to-be-zero value from the outside (a parameter buffer), and compare a value against that. This is what GLSL fuzz does (blog paper)
What's the performance cost? I think probably low? That assumes a few things: (a) we care most about throughput (b) the test condition is cheap (e.g. compare against opaque zero) (c) the implementation evaluates both options, and uses predicated execution, and at least one side is cheap. Then I would guess the additional costs are small, so performance is probably going to be pretty good.

So two thumbs up for this technique!

dneto0 · 2021-09-15T18:29:37Z

Hey users, if you keep finding nice hacks and workarounds for this, we'll have no incentive to do anything with the spec!

It's a feature, not a bug. :-)
And this is why open processes can be so great.

dneto0 · 2021-09-15T18:30:47Z

Another thing about the performance cost: Yes, this prevents the compiler from rearranging code to go faster, but that's exactly what the programmer wanted.

munrocket · 2021-09-15T19:53:24Z

Will this continue to work with future implementations?

It works with round-to-nearest-even floating point rounding, which is usually the default, but which for some reason is not specified in DX11, for example. Also, as mentioned: floating point arithmetic is not associative, and muladd should be allowed only for fma.

What's the performance cost?

Usually emulated double addition costs 20 flop, and multiplication costs 24 flop with software FMA or 9 flop with hardware FMA. So it's cheap. With select I don't know, but it is possible to measure. Perhaps select is not so ideal; is it branching?

This is what GLSL fuzz does, papers

Interesting. If we need stronger confidence, we can pass a variable there.

dneto0 · 2021-09-15T23:07:09Z

About the performance cost, I meant the additional performance cost of using the select. Thanks for the extra info for the cost of the double precision emulation. :-)

Right, rounding mode is not specified for graphics APIs because some devices use round-to-even, some use round-to-zero (which is cheaper in hardware).

Does select do branching? It is common for GPUs to use predicated execution: they execute both paths, but selectively turn off the side effects of the path "not taken", and then only use the chosen result (Wikipedia). This trades off possibly wasting cycles stepping through the dead code path, but saves the machine from taking a branch and destroying internal state.

So that's why I would hope to make the evaluation of the "other" path and the condition cheap: we want that so on a predicated execution they don't waste too much extra time.

litherum · 2021-09-21T07:42:14Z

@mrshannon

The first is extremely large scale terrain generation in a compute shader which requires double precision.

The second case is when rendering very large objects ... in this case the calculation is in the vertex shader and is pretty narrow in scope

Both of these use cases are supported by putting the precise attribute on the entire module rather than the individual function - just put these two entry points and their dependent functionality in a separate module. Linking a vertex shader from one module and a fragment shader from a different module is supported.

kvark · 2021-09-21T15:24:47Z

If I understand correctly, we are mostly fine with introducing precise_math attribute (in a way that we can test), we just can't agree on what scope it covers:

function scope (the current shape of this PR):
- This is nice for SPIR-V and HLSL.
- On Metal, it has a SAaaD effect: adding an attribute to one function can end up affecting other functions. On wgpu and Dawn, it would affect all the functions in the call graph of a specific entry point (if one of them has the attribute). On Safari's implementation, it would affect all the functions in the module.
entry point scope:
- This is nice for Metal via wgpu or Dawn, since they build MTLLibrary per entry point.
- Has SAaaD on SPIR-V and HLSL, since using a function from another entry point (which has the attribute enabled) would make it slower for other entry points using this function.
- Has SAaaD on Metal in Safari, since one entry point will affect the others.
module scope:
- Can be mapped to all of the APIs
- less optimal for SPIR-V and HLSL
- No SAaaD

dneto0 · 2021-09-21T17:05:32Z

The user has a reasonable workaround, and it appears to be performant and likely stable over time. I thought this was an easy "not in V1.0" decision.

kvark · 2021-09-21T17:16:35Z

I'm not happy about this workaround becoming sort of a tribal knowledge thing.
If we consider it good that people do this, can we hide the workaround behind something like:

fn compute(val: T) -> T

So doing let x = compute(a * b) + c would effectively put the NoContraction decoration on the intermediate result (and precise in HLSL). On Metal, it could use the select trick internally.

mrshannon · 2021-09-21T21:53:28Z

Both of these use cases are supported by putting the precise attribute on the entire module rather than the individual function - just put these two entry points and their dependent functionality in a separate module. Linking a vertex shader from one module and a fragment shader from a different module is supported.

It would be wasteful in the 2nd case. Perhaps as much as 10% of vertices (depending on camera location) in any given object need emulated double vertex position. The rest can take the faster 32-bit float path as they are further from the camera.

mrshannon · 2021-09-21T22:02:08Z

So doing let x = compute(a * b) + c would effectively put the NoContraction decoration on the intermediate result (and precise in HLSL). On Metal, it could use the select trick internally.

Either this or actual function scope (including Metal) would keep us from using the select trick. I agree with the tribal knowledge issue but I am not going to severely harm performance to avoid it. Not sure compute is the right term but I can't think of anything better at the moment.

kvark · 2021-09-28T14:39:01Z

@litherum it looks like MSL supports [[clang::optnone]] on functions - KhronosGroup/SPIRV-Cross#1746 . We could consider it as a direct effect of [[precise_math]] in WGSL.

kainino0x · 2021-09-28T16:31:50Z

[[clang::optnone]] seems like far too heavy of a hammer to me.

The optnone attribute suppresses essentially all optimizations on a function or method, regardless of the optimization level applied to the compilation unit as a whole. This is particularly useful when you need to debug a particular function, but it is infeasible to build the entire application without optimization. Avoiding optimization on the specified function can improve the quality of the debugging information for that function.

kdashg · 2021-09-29T17:05:37Z

WGSL meeting minutes 2021-09-28

(Previously: MM: Let’s postpone.)
DM: Offline, discussed MSL’s clang::optnone, which is on function scope. Kai noted it’s a big hammer and may not do what’s wanted.
MM: I thought we postponed this until after MVP.
DM: New information came up.
MM: Would like to re-propose postpone to MVP. That attribute is not part of the Metal API. So you can’t rely on it. Don’t think it’s a good solution.
DM: Other idea is to expose a function that shields optimizations across its argument vs. its result. Think we can implement it well on all backends. (NoContraction on SPIR-V, or select trick as discussed on the issue.)
MM: Not familiar with the technique, and I didn’t prepare; think it was a mistake to put this on the agenda.
DN: Also think this can be postponed until after MVP. Have workaround, even if it’s in the “lore” category.
JG: Will mark as milestone Post-V1.

revoking my own review. Let's reconsider with fresh eyes

greggman · 2024-04-07T20:36:14Z

I'm not sure this idea appeared above but .... what about a module-level flag that only works if a feature like "high-precision" exists? So you check if the adapter supports "high-precision". If it does, you request a device with {requiredFeatures: ['high-precision']}. Now you can pass 'high-precision' to createShaderModule.

This way, if a GPU/driver can't pass the high-precision CTS tests, it doesn't advertise the feature.

If you don't like features bleeding into WGSL, you could move the check into pipeline creation: you use the precision keywords/options in WGSL, but when you try to make a pipeline without having requested the 'high-precision' feature, you get an error that your shader isn't supported on this device.

TimTheBig · 2025-10-26T19:05:03Z

Is there any way I can get this moving again?

TimTheBig · 2025-10-26T19:06:32Z

wgsl/index.bs

+      * without [=Reassociation|reassociating=] subexpressions
+
+    Note: this translates to `NoContraction` decoration in SPIR-V, `precise` qualifier in HLSL,
+    and a subset of `"-fno-fast-math" group of compile options in MSL.


and a subset of `"-fno-fast-math" group of compile options in MSL.

Should the subset not be documented?

precise_math attribute on functions

c2e4178

kvark mentioned this pull request Sep 1, 2021

Add method to disable fast-math on a per-shader basis. #2076

Open

dneto0 reviewed Sep 1, 2021

Apply David's suggestions

9c8c768

kvark requested a review from dneto0 September 1, 2021 20:21

kvark commented Sep 1, 2021

dneto0 previously approved these changes Sep 1, 2021

dneto0 added the wgsl WebGPU Shading Language Issues label Sep 1, 2021

dneto0 added this to the V1.0 milestone Sep 1, 2021

Remove the nan/zero handling

09e6e9c

kdashg modified the milestones: V1.0, post-V1 Sep 28, 2021

kdashg mentioned this pull request Apr 26, 2022

Expose fast-math #2077

Closed

kdashg mentioned this pull request Nov 28, 2023

Missing intrinsic functions like rcp and rsqrt that are useful in other shading languages #4092

Open

rconde01 mentioned this pull request Apr 7, 2024

Do we need a high-precision vs speed option? #4562

Closed

TimTheBig reviewed Oct 26, 2025
