[MRG + 2] Sample-weight support for GaussianNB #4346
sklearn/tests/test_naive_bayes.py
Outdated
I think I'd prefer a test where you explicitly construct a dataset with duplicate points, so that not all sample weights are the same.
+1. Besides that, it looks clean.
I've extended the test by comparing a dataset with duplicate entries to a dataset without duplicates but with correspondingly increased sample weights.
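For readers following along, the idea behind such a test can be sketched as below. This is an illustration only, not the test added in this PR: the data, the random seed, and the variable names are made up, and it assumes a scikit-learn version in which `GaussianNB.fit` accepts `sample_weight`.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: 6 samples, 2 features, 2 classes.
rng = np.random.RandomState(0)
X = rng.normal(size=(6, 2))
y = np.array([0, 0, 0, 1, 1, 1])

# Dataset with explicit duplicates: samples 0 and 3 appear twice.
dup = np.array([0, 1, 2, 3, 4, 5, 0, 3])
clf_dup = GaussianNB().fit(X[dup], y[dup])

# Same dataset without duplicates; the duplicated samples get weight 2.
sample_weight = np.array([2.0, 1.0, 1.0, 2.0, 1.0, 1.0])
clf_sw = GaussianNB().fit(X, y, sample_weight=sample_weight)

# Both fits should agree on class means, class priors and predictions
# (up to floating-point tolerance).
theta_match = np.allclose(clf_dup.theta_, clf_sw.theta_)
prior_match = np.allclose(clf_dup.class_prior_, clf_sw.class_prior_)
proba_match = np.allclose(clf_dup.predict_proba(X), clf_sw.predict_proba(X))
print(theta_match, prior_match, proba_match)
```

Because the weights are not all equal, a bug that silently ignored `sample_weight` would make the two fits disagree.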
sklearn/naive_bayes.py
Outdated
Shouldn't it be "None for uniform weights"? Not sure what the docstring usually says ^^
This is copy-paste from BaseDiscreteNB to keep docstrings consistent in the module
(n_samples,)
I've refactored _update_mean_variance() to reduce repetitions and added comments
Great, thanks :) LGTM
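As background, the quantity a sample-weighted Gaussian estimate needs per class is a weighted mean and a weighted variance. The following is a rough batch sketch with a hypothetical helper name (`weighted_mean_var`); it is not the code of `_update_mean_variance()` from this PR, which is not shown in this conversation.

```python
import numpy as np


def weighted_mean_var(X, sample_weight=None):
    """Per-feature weighted mean and (biased) weighted variance.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    if sample_weight is None:
        # None means uniform weights.
        sample_weight = np.ones(X.shape[0])
    sample_weight = np.asarray(sample_weight, dtype=float)
    mean = np.average(X, axis=0, weights=sample_weight)
    var = np.average((X - mean) ** 2, axis=0, weights=sample_weight)
    return mean, var


X = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]])
w = np.array([2, 1, 1])  # first sample counts twice

mean_w, var_w = weighted_mean_var(X, w)

# Reference: physically duplicate the first row and use plain statistics.
X_dup = np.vstack([X[0], X])
mean_d, var_d = X_dup.mean(axis=0), X_dup.var(axis=0)

print(mean_w, var_w)  # mean [2.5, 2.0], variance [2.75, 2.0]
```

With integer weights, the weighted statistics match the plain statistics of the dataset in which each row is repeated according to its weight.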
sklearn/naive_bayes.py
Outdated
shape (n_samples,) with comma. It's a tuple with 1 element.
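A quick illustration of why the trailing comma matters in the docstring (the array here is just an example, not taken from the PR):

```python
import numpy as np

sample_weight = np.ones(4)

# The shape of a 1-D array is a tuple with a single element.
shape = sample_weight.shape

one_tuple = (4,)   # a tuple containing one element
not_a_tuple = (4)  # just the integer 4 in parentheses

print(shape, type(one_tuple).__name__, type(not_a_tuple).__name__)
```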
Besides that, +1 for merge. Thanks @jmetzen!
FIX Converting sample_weight.sum() to float
REFACTOR Reorganizing and commenting _update_mean_variance() in GaussianNB
TST Test GaussianNB sample-weight by comparing with duplicate entries
Docstring inconsistencies have been addressed
Is the backticks commit from this PR? Otherwise could you rebase on master?
Also, please add an entry to whatsnew under a new 0.17 heading.
added to whatsnew and pushed.
Thanks for merging! I was too slow ;-)
Or I was too fast ;) Thanks for the contribution!
This PR adds support for sample weights to GaussianNB.