In #9633, we spotted a fragile floating-point equality comparison in naive_bayes.py:

```python
if priors.sum() != 1.0:
    raise ValueError('The sum of the priors should be 1.')
```

which sometimes fails unexpectedly due to rounding error.
```python
import numpy as np

priors = np.array([0.08, 0.14, 0.03, 0.16, 0.11, 0.16, 0.07, 0.14, 0.11, 0.0])
my_sum = np.sum(priors)
print('my_sum:', my_sum)                 # my_sum: 1.0000000000000002
print('naive:', my_sum == 1.0)           # naive: False
print('safe:', np.isclose(my_sum, 1.0))  # safe: True
```
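As a sketch of the fix, the exact comparison can be replaced by `np.isclose` with a tolerance. The helper name `check_priors` and the `atol` parameter below are my own, not scikit-learn API:

```python
import numpy as np

def check_priors(priors, atol=1e-8):
    """Validate that class priors sum to 1, tolerating float rounding.

    Hypothetical helper; ``atol`` is the absolute tolerance passed
    through to ``np.isclose``.
    """
    priors = np.asarray(priors, dtype=float)
    if not np.isclose(priors.sum(), 1.0, atol=atol):
        raise ValueError('The sum of the priors should be 1.')
    return priors

# The priors from the example above now pass the check:
check_priors([0.08, 0.14, 0.03, 0.16, 0.11, 0.16, 0.07, 0.14, 0.11, 0.0])
```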
The problem is solved in GaussianNB for `priors` (see #10005), but that's far from enough; we have similar problems elsewhere, e.g., in discriminant_analysis.py:

scikit-learn/sklearn/discriminant_analysis.py, lines 440 to 442 in 074a521:

```python
if self.priors_.sum() != 1:
    warnings.warn("The priors do not sum to 1. Renormalizing",
                  UserWarning)
```
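That branch could use the same tolerant comparison before renormalizing. A minimal sketch (the function `normalize_priors` is hypothetical, not the actual scikit-learn code):

```python
import warnings
import numpy as np

def normalize_priors(priors_):
    """Renormalize priors, warning only when the sum is genuinely off.

    Uses np.isclose so that pure float rounding error does not
    trigger a spurious warning.
    """
    priors_ = np.asarray(priors_, dtype=float)
    if not np.isclose(priors_.sum(), 1.0):
        warnings.warn("The priors do not sum to 1. Renormalizing",
                      UserWarning)
        priors_ = priors_ / priors_.sum()
    return priors_
```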
So the problem here is:
(1) Figure out a way to detect similar issues in the repo.
(2) Fix these issues (honestly, I don't think we need a test for these).
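For (1), a rough one-off scanner could grep the source tree for exact comparisons of a `.sum()` against 1. The regex and function name below are my own sketch, not a proposed scikit-learn utility, and the pattern is only a heuristic:

```python
import re
from pathlib import Path

# Heuristic: flag lines like ``priors.sum() != 1.0`` or ``x.sum() == 1``.
PATTERN = re.compile(r"\.sum\(\)\s*[!=]=\s*1(\.0+)?\b")

def find_exact_sum_comparisons(root):
    """Return (path, line number, line) for each suspicious comparison."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Running it over `sklearn/` would give a starting list of candidates to inspect by hand, since some exact comparisons (e.g. against integer counts) are legitimate.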