Description
I would like us to refer to the Glossary in the API reference for parameter descriptions that come up frequently, or that have associated caveats too long for a parameter description, most notably n_jobs and random_state.
So instead of something like:
random_state : int, RandomState instance or None, optional, default: None
    If int, random_state is the seed used by the random number generator;
    If RandomState instance, random_state is the random number generator;
    If None, the random number generator is the RandomState instance used
    by `np.random`.
in both KMeans and MiniBatchKMeans, we might have:
KMeans:
random_state : int, RandomState instance or None (default)
    Determines random number generation for centroid initialization.
    See :term:`random_state`.
MiniBatchKMeans:
random_state : int, RandomState instance or None (default)
    Determines random number generation for centroid initialization and
    random reassignment. See :term:`random_state`.
One question is how much verbosity we should have in describing how the user may parametrise random_state. We could have just "See :term:`random_state`.", or we could have "An int seeds the random number generator deterministically, while None uses the current np.random state. See :term:`random_state`."
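To make the three cases concrete, here is a minimal sketch of the behaviour the longer wording describes. The helper name `resolve_random_state` is hypothetical and written only for this illustration. scikit-learn's own utility for this is `sklearn.utils.check_random_state`, which this sketch does not call, so that it depends on NumPy alone.

```python
import numpy as np


def resolve_random_state(random_state=None):
    """Hypothetical helper: turn a random_state argument into a RandomState."""
    if random_state is None:
        # None: reuse the global RandomState that backs the np.random.* functions.
        return np.random.mtrand._rand
    if isinstance(random_state, (int, np.integer)):
        # int: build a fresh generator seeded deterministically.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # RandomState instance: use it as-is, so its state is shared with the caller.
        return random_state
    raise ValueError(f"{random_state!r} cannot be used to seed a RandomState")


# The same int seed gives the same draws on every call.
first = resolve_random_state(0).rand(3)
second = resolve_random_state(0).rand(3)
print(np.array_equal(first, second))

# None follows whatever state np.random is currently in.
np.random.seed(42)
from_none = resolve_random_state(None).rand()
np.random.seed(42)
from_global = np.random.rand()
print(from_none == from_global)
```

The caveats around sharing a RandomState instance between estimators are the kind of detail that would live in the Glossary entry rather than in each parameter description.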
Just as I see us trying to describe what is random about the algorithm when describing random_state, I would like to see the n_jobs description state whether parallelism is only in fit, or in fit and predict, and what backend is used by default.
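As a rough sketch of what such an n_jobs entry could look like, here is a stub class with a numpydoc-style docstring. Everything in it is an assumption made for illustration: the class name `ExampleEstimator`, the default value, the claim that parallelism covers both fit and predict, and the `<backend>` placeholder, which stands in for whatever backend a real estimator uses.

```python
class ExampleEstimator:
    """Hypothetical estimator, used only to illustrate docstring wording.

    Parameters
    ----------
    n_jobs : int or None (default)
        Number of jobs to run in parallel in both ``fit`` and ``predict``.
        Parallelism uses the ``<backend>`` backend by default (placeholder).
        See :term:`n_jobs`.
    """

    def __init__(self, n_jobs=None):
        # Stored unchanged; a real estimator would pass this to its parallel code.
        self.n_jobs = n_jobs


# The short description says what is parallelised; the Glossary link carries
# the caveats shared by every estimator.
print(ExampleEstimator.__doc__)
```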
What do others think?