align equality check in yeo johnson transform #436
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #436      +/-   ##
==========================================
- Coverage   87.90%   81.94%    -5.96%
==========================================
  Files          40       44        +4
  Lines        1745     1939      +194
==========================================
+ Hits         1534     1589       +55
- Misses        211      350      +139

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Can you define eps = np.spacing(np.float64(1))? Or actually, if we don't do it for an array, eps = np.finfo(np.float64).eps?
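(Editor's note: the two suggested definitions give the same value; a quick illustrative check, not part of the PR:)

```python
import numpy as np

eps_spacing = np.spacing(np.float64(1))  # distance from 1.0 to the next float64
eps_finfo = np.finfo(np.float64).eps     # machine epsilon for float64

print(eps_spacing)               # 2.220446049250313e-16
print(eps_finfo)                 # 2.220446049250313e-16
print(eps_spacing == eps_finfo)  # True
```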
I think another PR would make sense.
Let's merge this then?
Can you merge main? I need to think about this.
The question is whether, when lambda is exactly eps, we consider that to be zero or not. I just wanted to make it consistent with the sklearn function. However, I think they are not consistent themselves, because they write:

```python
# when x >= 0
if abs(lmbda) < np.spacing(1.0):
    out[pos] = np.log1p(x[pos])
else:  # lmbda != 0
    out[pos] = (np.power(x[pos] + 1, lmbda) - 1) / lmbda

# when x < 0
if abs(lmbda - 2) > np.spacing(1.0):
    out[~pos] = -(np.power(-x[~pos] + 1, 2 - lmbda) - 1) / (2 - lmbda)
else:  # lmbda == 2
    out[~pos] = -np.log1p(-x[~pos])
```

Maybe that's the reason why Shruti wrote it herself? Because for the inverse transform she actually just uses the one from sklearn...

mesmer/mesmer/mesmer_m/power_transformer.py (line 272 in 03ab48c)
Shruti wrote this herself so she could have variable (or dependent) lambdas, I think. The ...
scipy does the same but I think it's written by the same author: https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/stats/_morestats.py#L1572
Hm, I mean we could also pass every (value, lambda) pair to the sklearn power transform, no? Like we do for the inverse transform.
In our case it is because lambda is derived from a logistic function.
I don't get how it makes sense that in one case ...
Yes, but then we have to check if this is vectorized - otherwise it will be too slow.
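(Editor's note: to make the vectorization concern concrete, a rough sketch of what a per-(value, lambda) approach could look like. This uses the public scipy.stats.yeojohnson rather than sklearn internals, purely for illustration; the data and variable names are made up. One call per pair is easy to write but runs a Python-level loop, which is the speed worry raised above.)

```python
import numpy as np
from scipy import stats

# hypothetical data: one lambda per value, e.g. derived from a logistic function
x = np.array([0.5, -1.2, 3.0])
lmbdas = np.array([0.0, 1.3, 2.0])

# one scipy call per (value, lambda) pair -- simple, but not vectorized
out = np.array(
    [stats.yeojohnson(np.atleast_1d(xi), lmbda=li)[0] for xi, li in zip(x, lmbdas)]
)
```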
Ah ok, sorry - there are too many open PRs and comments. But then this is a function of the covariate, and it's not optimal if this is in mesmer/mesmer/mesmer_m/power_transformer.py (line 219 in 10c8c43).
(Technically the user will not be able to easily replace the ...
I agree, it is pretty hard to see through it all. The ...
No - I was trying to understand why it's inconsistent in scikit-learn (and I think I do now). |
mathause left a comment
Ok, some more comments.
- I don't think what we choose will matter in practice.
- eps is much smaller for 0 than for 1: np.spacing([0, 1, 2]) gives array([4.94065646e-324, 2.22044605e-016, 4.44089210e-016]).
- That means there are values between 0 and eps (otherwise np.abs(lambdas) < eps would be equal to np.abs(lambdas) == 0).
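(Editor's note: the last point can be checked directly with a value strictly between 0 and eps; the example value is chosen arbitrarily.)

```python
import numpy as np

eps = np.spacing(1.0)      # ~2.22e-16
lam = np.float64(1e-20)    # strictly between 0 and eps

print(np.abs(lam) < eps)   # True  -> the eps-based check treats lam as zero
print(lam == 0)            # False -> an exact check would take the power branch
```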
Ok, let's merge this but I think we need to fix the title (we don't change a sign). Maybe: "align equality check in yeo johnson transform"?
* align equality check in yeo johnson transform with scikit learn yeo-johnson transform
* add NOTE
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Changing the comparison to eps in _yeo_johnson_transform to be consistent with sklearn's Yeo-Johnson power transform.
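(Editor's note: as a rough illustration of the kind of check being aligned, a minimal sketch of an elementwise Yeo-Johnson transform that uses an eps-based comparison analogous to sklearn's abs(lmbda) < np.spacing(1.0) instead of an exact lambdas == 0 test. The function name, signature, and eps default are assumptions for this example; it is not the actual MESMER code.)

```python
import numpy as np


def _yeo_johnson_transform_sketch(x, lmbdas, eps=np.finfo(np.float64).eps):
    """Illustrative sketch: Yeo-Johnson transform with one lambda per element."""
    x = np.asarray(x, dtype=np.float64)
    lmbdas = np.asarray(lmbdas, dtype=np.float64)
    out = np.empty_like(x)

    pos = x >= 0
    lam_zero = np.abs(lmbdas) < eps      # lambda "equal" to 0 within eps
    lam_two = np.abs(lmbdas - 2) < eps   # lambda "equal" to 2 within eps

    # x >= 0
    sel = pos & lam_zero
    out[sel] = np.log1p(x[sel])
    sel = pos & ~lam_zero
    out[sel] = (np.power(x[sel] + 1, lmbdas[sel]) - 1) / lmbdas[sel]

    # x < 0
    sel = ~pos & lam_two
    out[sel] = -np.log1p(-x[sel])
    sel = ~pos & ~lam_two
    out[sel] = -(np.power(-x[sel] + 1, 2 - lmbdas[sel]) - 1) / (2 - lmbdas[sel])

    return out
```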