Description
Describe the issue linked to the documentation
The documentation page for the `fit` method of the `LinearRegression` class mentions that the `sample_weight` parameter must be of type array_like or None (docs). However, this is not entirely true, since we can also pass a `float` or `int` for this parameter. Floats or ints get transformed into an array of that same value repeated n times. Code snippet here:
scikit-learn/sklearn/utils/validation.py
Lines 2000 to 2003 in f59c503
As a result, a `sample_weight` given as a `float` or `int` is essentially equivalent to `None`, since all samples end up with the same relative weight (I'm not sure if I'm overlooking something, but I could not think of any case where a float or int for `sample_weight` could be meaningful).
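To illustrate the point, here is a minimal pure-Python sketch of a weighted least-squares line fit. It is not scikit-learn's code: `expand_sample_weight` and `weighted_line_fit` are hypothetical helpers that only mimic the scalar-to-array expansion described above, and the data is made up for the example.

```python
from numbers import Real


def expand_sample_weight(sample_weight, n_samples):
    """Hypothetical helper mimicking the behavior described in this issue."""
    if sample_weight is None:
        # No weights given: every sample gets weight 1.
        return [1.0] * n_samples
    if isinstance(sample_weight, Real):
        # A scalar is repeated once per sample.
        return [float(sample_weight)] * n_samples
    return [float(w) for w in sample_weight]


def weighted_line_fit(x, y, sample_weight=None):
    """Weighted least-squares fit of y = slope * x + intercept (1-D only)."""
    w = expand_sample_weight(sample_weight, len(x))
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Made-up data, not taken from the issue.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.1, 4.9, 7.2]

fit_none = weighted_line_fit(x, y)
fit_scalar = weighted_line_fit(x, y, sample_weight=2.5)
fit_array = weighted_line_fit(x, y, sample_weight=[1.0, 1.0, 1.0, 10.0])
```

In this sketch the constant factor cancels in every ratio, so `fit_scalar` matches `fit_none` up to floating-point rounding, while the non-uniform `fit_array` gives a different line.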
Suggest a potential alternative/fix
I see two possible fixes:
- Change the documentation to address the fact that numbers are valid values for `sample_weight`; however, they have no effect, since there is no difference in the relative weight of the samples.
- Change the code so that an error or warning is raised if the `sample_weight` parameter is a `float` or an `int`.
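
The second option could look roughly like the sketch below. This is only an assumption about how such a check might be written: `warn_on_scalar_sample_weight` is a hypothetical function that does not exist in scikit-learn, and the warning class and message are placeholders.

```python
import warnings
from numbers import Real


def warn_on_scalar_sample_weight(sample_weight):
    """Hypothetical check (not part of scikit-learn).

    Emits a warning and returns True when sample_weight is a single number,
    otherwise returns False without warning.
    """
    if isinstance(sample_weight, Real):
        warnings.warn(
            "A scalar sample_weight gives every sample the same relative "
            "weight; pass None or an array-like of per-sample weights.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Record warnings so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scalar_flagged = warn_on_scalar_sample_weight(2.0)
    none_flagged = warn_on_scalar_sample_weight(None)
    array_flagged = warn_on_scalar_sample_weight([1.0, 2.0, 3.0])
    n_warnings = len(caught)
```

Raising an error instead would be a stricter variant of the same check, at the cost of breaking existing code that passes a number.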