Conversation

stephentoub
Member

This just replicates the float path to be used for double as well, expanding out the constants used to be more precise.

With #97846, closes #96452.

@ghost

ghost commented Feb 2, 2024

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

This just replicates the float path to be used for double as well, expanding out the constants used to be more precise.

With #97846, closes #96452.

Author: stephentoub
Assignees: -
Labels:

area-System.Numerics

Milestone: -

Comment on lines 13063 to 13065
private const double LOGV = 0.6931610107421875;
private const double HALFV = 1.000013828277588;
private const double INVV2 = 0.24999308586120605;
Member

@tannergooding tannergooding Feb 2, 2024


These are likely going to introduce more imprecision than necessary.

LOGV is 0.693161, which is meant to be near the infinitely precise abs(log(0.5)) but with a touch of imprecision that accounts for the rounding error over the entire set of logic. HALFV is likewise 1, and INVV2 is likewise 0.25, both with a touch of explicit imprecision.

This comes from the actual algorithm, where cosh(x) is (e^x + e^-x) / 2, or ((e^(2 * x)) + 1) / (2 * e^x), or (1 + e^(-2 * x)) / (2 * e^-x). All of this can simplify down to e^(|x| + log(0.5)) + 0.25 / e^(|x| + log(0.5)) (since e^(|x| + log(0.5)) = 0.5 * e^|x|), i.e. the algorithm used here.

AOCL LIBM doesn't provide an official vectorized implementation for double, unfortunately, so we'd have to compute the correct adjustments ourselves. Short of that, simply using 0.6931471805599453, 1.0, and 0.25 is the next best thing and will be the closest to correct for us.
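To see why the plain constants work, here's a minimal Python sketch (not the PR's C# implementation) verifying that, with the unadjusted values log(2), 1.0, and 0.25, the e^(|x| + log(0.5)) + 0.25 / e^(|x| + log(0.5)) form matches cosh(x) to within a few ulps:

```python
import math

# Plain (unadjusted) double constants suggested in the review. With these,
# HALFV * e^(|x| - LOGV) + INVV2 / e^(|x| - LOGV) is mathematically equal
# to cosh(x), since e^(|x| - log(2)) = 0.5 * e^|x|.
LOGV = 0.6931471805599453   # log(2), i.e. abs(log(0.5))
HALFV = 1.0
INVV2 = 0.25

def cosh_via_exp(x: float) -> float:
    z = math.exp(abs(x) - LOGV)  # e^(|x| + log(0.5)) == 0.5 * e^|x|
    return HALFV * z + INVV2 / z

# Only a couple of ulps of rounding error remain for ordinary inputs.
for x in (-3.0, -0.5, 0.0, 1.0, 7.25):
    assert math.isclose(cosh_via_exp(x), math.cosh(x), rel_tol=1e-13)
```

The small residual error here is exactly what the adjusted constants in the float path are tuned to reduce further.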


The actual logic used to pick these constants requires us to consider that for the infinitely precise logv, the nearest representable float (0.693147182464599609375) is off by approx 0.000000001904654299958.

The LOGV value used for float ends up picking 0.6931610f, which is off by 0.0000138301822422 instead (note that this is part of the imprecision we then see in HALFV).

The value used for HALFV is then off by 0.000000001904654309375 from the infinitely precise 1.0000138301822422 (note that this is roughly the amount by which the defined LOGV is off).

For INVV2 it picks 0.24999309, which is off from 0.25 by 0.0000069141387939453125. I've not yet done the work to figure out where this adjustment factors in, as I've been helping on a few other issues.
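The error figures quoted above can be checked with a short Python sketch (again, not the PR's code); it rounds the double constants to float32 precision via `struct` and prints the offsets:

```python
import struct

def to_float(x):
    """Round a double to the nearest IEEE-754 single (float32)."""
    return struct.unpack('f', struct.pack('f', x))[0]

LOG2 = 0.6931471805599453        # nearest double to log(2) == abs(log(0.5))
LOGV_FLOAT = 0.6931610107421875  # the LOGV used by the float path

# Nearest representable float to log(2) is off by roughly 1.9e-9:
print(to_float(LOG2) - LOG2)        # ~0.0000000019046543

# The float path's LOGV is instead off from log(2) by ~1.383e-5:
print(LOGV_FLOAT - LOG2)            # ~0.0000138301822422

# INVV2 (0.24999309f) is off from 0.25 by ~6.914e-6:
print(0.25 - to_float(0.24999309))  # ~0.0000069141387939453125
```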

Member Author


Thanks. Meaning I should just update them to 0.6931471805599453, 1.0, and 0.25, right?

Member


Yeah, for now. I'll work on finishing figuring out the adjustment factor for 0.25, and then adjust to the "proper" values for double later.

@stephentoub stephentoub merged commit d1f0e29 into dotnet:main Feb 3, 2024
@stephentoub stephentoub deleted the vectorizehyp branch February 3, 2024 23:52
@github-actions github-actions bot locked and limited conversation to collaborators Mar 5, 2024
Development

Successfully merging this pull request may close these issues.

Vectorize several TensorPrimitive operations for double