NumericalCompliance

Rounding modes

The floating-point operations of the VideoCore IV QPU processors round using the IEEE 754 round-to-zero rounding mode. This rounding mode corresponds to the OpenCL CL_FP_ROUND_TO_ZERO mode, and the OpenCL 1.2 embedded profile allows it to be the only supported rounding mode. [Source]

NOTE: The CPU uses the IEEE 754 round-to-nearest-even rounding mode, so the results of the same floating-point operation on the CPU and on the GPU might not match! To be exact, the result of the GPU operation is either the same as the result of the corresponding CPU operation or 1 ULP closer to zero.

As an example, consider the calculation 358.6662292480469 - 3.0502657890319824:

Calculation	Rounding mode	Result	Bit-cast integer
"Exact"	double precision	355.6159634590149
CPU	round-to-nearest-even	355.615966796875	0x43b1ced8
GPU	round-to-zero	355.6159362792969	0x43b1ced7

Neither of these results matches the "exact" one, but each is the closest value that can be represented as a single-precision floating-point value under the corresponding rounding mode. In the bit-cast integer representation, the two results differ by exactly 1 in the last place, which corresponds to 1 ULP in floating-point terms.
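The table above can be reproduced on the host with a short Python sketch. It is not part of the VC4CL code base. The helper names (`f32_bits`, `bits_f32`, `round_nearest_even`, `round_to_zero`) are made up for this illustration, and the round-to-zero result is emulated by stepping the round-to-nearest result one bit pattern towards zero whenever it overshoots the exact value.

```python
import struct
from fractions import Fraction

def f32_bits(x: float) -> int:
    """Bit pattern of x after conversion to float32 (round-to-nearest-even)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(bits: int) -> float:
    """float32 value (widened to a Python float) for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def round_nearest_even(exact: Fraction) -> int:
    """Emulates the CPU result: nearest float32 to the exact value.

    The value passes through a double first, which is good enough for
    this example but is not a rigorous single rounding step.
    """
    return f32_bits(float(exact))

def round_to_zero(exact: Fraction) -> int:
    """Emulates the GPU result: float32 truncated towards zero.

    Assumes a finite, non-zero result; decrementing the bit pattern moves
    a float32 of either sign one step towards zero.
    """
    bits = round_nearest_even(exact)
    if abs(Fraction(bits_f32(bits))) > abs(exact):
        bits -= 1
    return bits

# Exact difference of the two operands from the example above
exact = Fraction(358.6662292480469) - Fraction(3.0502657890319824)

cpu_bits = round_nearest_even(exact)
gpu_bits = round_to_zero(exact)
print(f"CPU round-to-nearest-even: {bits_f32(cpu_bits)!r} 0x{cpu_bits:08x}")
print(f"GPU round-to-zero:         {bits_f32(gpu_bits)!r} 0x{gpu_bits:08x}")
print(f"difference in bit patterns: {cpu_bits - gpu_bits}")
```

Under these assumptions the two bit patterns come out as 0x43b1ced8 and 0x43b1ced7, matching the CPU and GPU rows of the table.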

Inf, NaN, Denormals

Inf is supported (at least by SFU)
NaN is not supported

Relative Error

If x is a real number that lies between two finite consecutive floating-point numbers a and b, without being equal to one of them, then ulp(x) = |b - a|, otherwise ulp(x) is the distance between the two non-equal finite floating-point numbers nearest x. Moreover, ulp(NaN) is NaN.

[Source]

The relative ULP is 2^-23 ≈ 1.19e-07 [1] for single-precision floating-point values. For example, nextafter(1, 2) returns 1 + 2^-23 ≈ 1.000000119 [2].
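As a rough host-side illustration of these numbers, the following Python sketch derives the float32 successor and the ULP size from the bit pattern. The function names (`next_up_f32`, `ulp_f32`) are hypothetical, and the sketch only handles positive, finite inputs.

```python
import struct

def f32_to_bits(x: float) -> int:
    """Bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(bits: int) -> float:
    """float32 value (widened to a Python float) for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def next_up_f32(x: float) -> float:
    """Next float32 above a positive, finite x (like nextafter(x, +inf))."""
    return bits_to_f32(f32_to_bits(x) + 1)

def ulp_f32(x: float) -> float:
    """Distance from the float32 value of x to the next float32 above it."""
    x32 = bits_to_f32(f32_to_bits(x))
    return next_up_f32(x32) - x32

print(next_up_f32(1.0))          # 1 + 2^-23
print(ulp_f32(1.0))              # 2^-23
print(ulp_f32(355.6159362792969))  # ULP size near the example result above
```

Because the ULP size scales with the exponent, the absolute spacing near 355.6 is 2^-15, while the spacing relative to the value never exceeds 2^-23.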

Built-in Functions

Function	Allowed error (ULP)	Maximal error (ULP)
x + y	correctly rounded (round-to-zero)	0
x - y	correctly rounded (round-to-zero)	0
x * y	correctly rounded (round-to-zero)	0
1.0 / x	3
x / y	3
acos	4
acospi	5
asin	4
asinpi	5
atan	5	1
atan2	6
atanpi	5
atan2pi	6
acosh	4
asinh	4
atanh	5
cbrt	4
ceil	correctly rounded	0
clamp	0	0
copysign	0	0
cos	4	2
cosh	4	2
cospi	4
cross	3	0
degrees	2	2
distance	5.5 + 2 * len(vector)	4
dot	2 * len(vector) - 1	0
erfc	16
erf	16	1
exp	4	1
exp2	4
exp10	4
expm1	4
fabs	0	0
fdim	correctly rounded
floor	correctly rounded	0
fma	correctly rounded	0
fmax	0	0
fmin	0	0
fmod	0
fract	correctly rounded
frexp	0
hypot	4
ilogb	0
length	5.5 + len(vector)	4
ldexp	correctly rounded
log	4	4
log2	4
log10	4
log1p	4
logb	0
mad	infinite
max	0	0
maxmag	0
min	0	0
minmag	0
mix	absolute error of 1e-3	0
modf	0
nan	0
nextafter	0	0
normalize	4.5 + len(vector)	7
pow	16
pown	16
powr	16
radians	2	2
remainder	0
remquo	0
rint	correctly rounded	0
rootn	16
round	correctly rounded	0
rsqrt	4	1
sign	0	0
sin	4	1
sincos	4 (both)	2
sinh	4	2
sinpi	4
smoothstep	absolute error of 1e-5
sqrt	4	1
step	0	0
tan	5
tanh	5
tanpi	6
tgamma	16
trunc	correctly rounded	0
half_cos	8192
half_divide	8192	8192
half_exp	8192	8192
half_exp2	8192	8192
half_exp10	8192	8192
half_log	8192	8192
half_log2	8192	8192
half_log10	8192	8192
half_powr	8192	8192
half_recip	8192	8192
half_rsqrt	8192	8192
half_sin	8192
half_sqrt	8192
half_tan	8192
fast_distance	8192 + 2 * len(vector)
fast_length	8192 + len(vector)
fast_normalize	8192 + len(vector)
native_cos	impl.-defined
native_divide	impl.-defined	8192
native_exp	impl.-defined	8192
native_exp2	impl.-defined	8192
native_exp10	impl.-defined	8192
native_log	impl.-defined	8192
native_log2	impl.-defined	8192
native_log10	impl.-defined	8192
native_powr	impl.-defined	8192
native_recip	impl.-defined	8192
native_rsqrt	impl.-defined	8192
native_sin	impl.-defined
native_sqrt	impl.-defined	8192
native_tan	impl.-defined

Sources: OpenCL 1.2 FULL PROFILE, OpenCL 1.2 EMBEDDED PROFILE

Calculations of ULP are done via one of the following methods:

Plotting the difference between the original function and the approximation with kmplot
Calculating the result of the function with both the native C implementation and the custom approximation and checking the difference (on host only)
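The second method can be sketched as follows. This is not the actual test code, only a minimal Python illustration under stated assumptions: the reference is taken from Python's double-precision `math` functions, the "approximation" is simply that reference rounded to float32, and the names `ulp_at`, `ulp_error` and `max_ulp_error` are hypothetical. Only positive, finite reference values are handled.

```python
import math
import struct

def to_bits(x: float) -> int:
    """Bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def from_bits(bits: int) -> float:
    """float32 value (widened to a Python float) for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def ulp_at(reference: float) -> float:
    """Spacing of the two float32 values bracketing a positive reference."""
    lo_bits = to_bits(reference)
    if from_bits(lo_bits) > reference:
        lo_bits -= 1  # rounding went up, step back below the reference
    return from_bits(lo_bits + 1) - from_bits(lo_bits)

def ulp_error(reference: float, result: float) -> float:
    """Error of result against the reference, in units of float32 ULP."""
    return abs(result - reference) / ulp_at(reference)

def max_ulp_error(reference_fn, approx_fn, inputs) -> float:
    """Largest ULP error of approx_fn against reference_fn over the inputs."""
    return max(ulp_error(reference_fn(x), approx_fn(x)) for x in inputs)

# Stand-in for a custom approximation: the reference rounded to float32
def approx_sqrt(x: float) -> float:
    return from_bits(to_bits(math.sqrt(x)))

inputs = [0.5 + i * 0.37 for i in range(100)]
print(max_ulp_error(math.sqrt, approx_sqrt, inputs))

# The rounding-mode example from above, expressed as ULP errors
print(ulp_error(355.6159634590149, 355.615966796875))   # CPU result
print(ulp_error(355.6159634590149, 355.6159362792969))  # GPU result
```

A correctly rounded result stays within 0.5 ULP of the reference, and a round-to-zero result stays below 1 ULP, so both remain well inside bounds such as the 4 ULP listed for sqrt in the table.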

Edge case behavior

Currently not supported
