Ensure that functions's docstrings pass numpydoc validation #21350

Closed
thomasjpfan opened this issue Oct 16, 2021 · 215 comments · Fixed by #22780
Labels
Documentation; good first issue (Easy with clear instructions to resolve); Meta-issue (General issue associated to an identified list of tasks); Sprint

Comments

@thomasjpfan
Member

thomasjpfan commented Oct 16, 2021

Background / Objective

Docstrings in Python are string literals that occur as the first statement in a module, function, class, or method definition.

These are some of the characteristics of a docstring:

  • Triple quotes are used to encompass the docstring text.
  • There is no blank line before or after the docstring.
  • The docstring is a phrase ending in a period.
  • more details

numpydoc is one set of criteria to check for consistent documentation structure.
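As a rough illustration of the structure numpydoc looks for, here is a small function with a numpydoc-style docstring. The function `scale_values` and its parameters are hypothetical and not part of scikit-learn, and the exact set of checks applied depends on how a project configures validation.

```python
def scale_values(values, factor=1.0):
    """Multiply each value by a constant factor.

    Parameters
    ----------
    values : list of float
        The numbers to scale.
    factor : float, default=1.0
        The multiplier applied to every element of `values`.

    Returns
    -------
    scaled : list of float
        A new list containing each input value multiplied by `factor`.

    Examples
    --------
    >>> scale_values([1.0, 2.0], factor=3.0)
    [3.0, 6.0]
    """
    # The docstring is the first statement; the body follows with no blank line.
    return [value * factor for value in values]
```

The summary line is a single phrase ending in a period, and each section header is underlined with dashes of the same length.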

Validating docstrings in scikit-learn

To ensure consistent documentation structure in scikit-learn, we are using numpydoc validation. Currently, documentation tests are failing for various functions. As a temporary fix, we have suppressed error messages in test_docstrings.py. Many of the functions in scikit-learn need to be updated to comply with numpydoc validation. In this issue, we provide step-by-step instructions on how contributors can test and update functions.

Note

For those who are running into "YD01: No Yields section found", it could be the cv parameter. Update "An iterable yielding (train, test) splits as arrays of indices" to:

        - An iterable that generates (train, test) splits as arrays of indices.
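One plausible reading of why the wording matters, assuming the validator decides whether a function is a generator by searching its source text (docstring included) for the substring `yield`: the word "yielding" alone is enough to trigger the error. The helper `looks_like_generator` below is a hypothetical simplification for illustration, not numpydoc's actual code.

```python
def looks_like_generator(source: str) -> bool:
    """Hypothetical naive check: does 'yield' appear anywhere in the text?"""
    return "yield" in source


# Source text of a made-up function whose docstring uses the old wording.
before = '''
def cross_val(cv):
    """Evaluate a model.

    cv : An iterable yielding (train, test) splits as arrays of indices.
    """
    return list(cv)
'''

# The same source with the wording suggested in the note above.
after = before.replace("yielding", "that generates")

print(looks_like_generator(before))
print(looks_like_generator(after))
```

Under that assumption, the first call reports a match even though the function never yields, and the reworded docstring no longer does.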

Steps

  1. Make sure you have the development dependencies and documentation dependencies installed.
  2. Pick a function from the list below and leave a comment saying you are going to work on it. This way we can keep track of what everyone is working on.
    2.1 Make sure you've created a separate branch from main before editing files for your new contribution. Refer to our contributing guidelines for more information.
  3. Remove the function from the list at:
    FUNCTION_DOCSTRING_IGNORE_LIST = [
  4. Let's say you picked sklearn._config.config_context. Run numpydoc validation as follows:
    pytest sklearn/tests/test_docstrings.py -k sklearn._config.config_context
  5. If you see failing tests, fix them by following the recommendations provided in the test output.
  6. If all the tests pass, you do not need to make any additional changes.
  7. Commit your changes.
  8. Open a Pull Request with an opening message Addresses #21350. Note that each item should be submitted in a separate Pull Request.
  9. Include the function name in the title of the pull request. For example: "DOC Ensures that config_context passes numpydoc validation".
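Steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration: the list contents are a made-up excerpt (the real list, presumably in sklearn/tests/test_docstrings.py, is edited by hand), and the helpers `claim_function` and `validation_command` are hypothetical names.

```python
# Illustrative excerpt only; the real ignore list is not reproduced here.
FUNCTION_DOCSTRING_IGNORE_LIST = [
    "sklearn._config.config_context",
    "sklearn.cluster._kmeans.k_means",
    "sklearn.metrics._regression.max_error",
]


def claim_function(ignore_list, function_path):
    """Return a copy of ignore_list without function_path (step 3)."""
    if function_path not in ignore_list:
        raise ValueError(f"{function_path} is not in the ignore list")
    return [name for name in ignore_list if name != function_path]


def validation_command(function_path):
    """Build the pytest invocation from step 4 as an argument list."""
    return ["pytest", "sklearn/tests/test_docstrings.py", "-k", function_path]


picked = "sklearn._config.config_context"
remaining = claim_function(FUNCTION_DOCSTRING_IGNORE_LIST, picked)
command = validation_command(picked)
print(" ".join(command))
```

The resulting argument list could be passed to `subprocess.run(command)` from the repository root; once the function is no longer ignored, its docstring errors show up in the test output.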

Note: once you have opened three such PRs, feel free to move on to more complex pull requests that involve more thinking, and leave these fixes to first-time contributors so they can learn the GitHub contribution workflow :)

Functions to Update

@thomasjpfan added the Documentation, Sprint, and good first issue labels Oct 16, 2021
@thomasjpfan changed the title from "Ensure that docstrings in functions pass numpydoc validation" to "Ensure that functions's docstrings pass numpydoc validation" Oct 16, 2021
@ABHIGPT401
Contributor

I would like to work on this but I am a beginner so can you explain a bit more on where and how to run?

@ogrisel
Member

ogrisel commented Oct 22, 2021

> I would like to work on this but I am a beginner so can you explain a bit more on where and how to run?

@ABHIGPT401 The explanation is already quite detailed. Combined with our contributors guide I am not sure what we can add. Please ask specific questions on gitter: https://gitter.im/scikit-learn/scikit-learn if you need interactive help.

@chritter
Contributor

chritter commented Oct 23, 2021

@fortune-uwha and I are going to work on sklearn.cluster._agglomerative.linkage_tree.

@gpablo6
Contributor

gpablo6 commented Oct 23, 2021

Hi, we are working on sklearn.cluster._kmeans.k_means @jmloyola

@arisayosh
Contributor

Hi we are working on sklearn.metrics._regression.max_error. @iofall

@miwojc
Contributor

miwojc commented Oct 23, 2021

Hi we are working on sklearn._config.config_context with @majauhar

@embandera
Contributor

Working on sklearn.metrics.pairwise.euclidean_distances with @genvalen

@MaggieChege
Contributor

Working on sklearn.cluster._optics.cluster_optics_dbscan with @muokicaleb

@fortune-uwha
Contributor

@chritter and I are going to work on sklearn.model_selection._validation.cross_val_score

@iofall
Contributor

iofall commented Oct 23, 2021

@arisayosh and I are working on sklearn.model_selection._validation.cross_val_predict

@isaacknjama
Contributor

@dephans and I are working on sklearn.metrics.pairwise.paired_distances

@majauhar
Contributor

Hi! We are working on validation.indexable @miwojc

@thomasjpfan
Member Author

thomasjpfan commented Oct 23, 2021

For those who are running into "YD01: No Yields section found", it could be the cv parameter. Update "An iterable yielding (train, test) splits as arrays of indices" to:

        - An iterable that generates (train, test) splits as arrays of indices.

@embandera
Contributor

Working on sklearn.metrics._classification.accuracy_score

@arisayosh
Contributor

Working on sklearn.model_selection._split.train_test_split

@genvalen
Contributor

Working on sklearn.covariance._empirical_covariance.log_likelihood

@genvalen
Contributor

Working on sklearn.covariance._empirical_covariance.empirical_covariance

@awinml
Contributor

awinml commented Oct 3, 2022

Working on sklearn.utils.extmath.safe_sparse_dot

@awinml
Contributor

awinml commented Oct 4, 2022

Working on sklearn.utils.extmath.weighted_mode

@suryakapurothu

Working on sklearn.metrics.pairwise.pairwise_kernels

@mansi1597
Contributor

mansi1597 commented Oct 5, 2022

Working on sklearn.utils.extmath.svd_flip

@mansi1597
Contributor

Working on sklearn.utils.fixes.linspace

@awinml
Contributor

awinml commented Oct 5, 2022

Working on sklearn.utils.extmath.fast_logdet

@awinml
Contributor

awinml commented Oct 5, 2022

Working on sklearn.utils.extmath.randomized_svd

@irene000
Contributor

irene000 commented Oct 5, 2022

Working on sklearn.utils.metaestimators.available_if

@thatgeeman
Contributor

Working on sklearn.utils.gen_even_slices

@thatgeeman
Contributor

Working on sklearn.utils.gen_batches

@michpara
Contributor

Working on sklearn.utils.metaestimators.if_delegate_has_method

@jeremiedbb
Member

All functions now pass numpydoc validation. Thanks to everyone who contributed to this long-standing issue!
We can close this issue and consider the numpydoc arc over :)

Repository owner moved this from Issues that are list of sub-issues to Done in WiMLDS Paris Sprint Oct 14, 2022
@glemaitre
Member

glemaitre commented Oct 17, 2022

Nice. Thank you to everyone that contributed.

@avm19
Contributor

avm19 commented Dec 30, 2022

> For those who are running into "YD01: No Yields section found", it could be the cv parameter. Update "An iterable yielding (train, test) splits as arrays of indices" to:
>
>         - An iterable that generates (train, test) splits as arrays of indices.

As of right now, not all occurrences of "yielding" have been removed from the source code, e.g.,

- An iterable yielding (train, test) splits as arrays of indices.

The error "YD01: No Yields section found" is still a problem for numpydoc 1.2, which is the currently required minimum version. Manually updating numpydoc, for example to 1.3.1, resolves it. This is another way of avoiding the YD01 error.

Perhaps the minimum version requirement should be bumped. I also wonder why no one else has written about this, given that the quoted line has been around for 4 years according to blame: I can't be the only person whose package manager chooses numpydoc version 1.2, or am I?
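A quick way to see which numpydoc version a given environment resolved is sketched below. The threshold 1.3.1 is simply the version reported as working in this comment, not necessarily the first release with the fix, and the helper names are hypothetical.

```python
import itertools
from importlib import metadata


def version_tuple(version):
    """Turn '1.3.1' into (1, 3, 1), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_numpydoc_version():
    """Return the installed numpydoc version string, or None if it is absent."""
    try:
        return metadata.version("numpydoc")
    except metadata.PackageNotFoundError:
        return None


found = installed_numpydoc_version()
if found is None:
    print("numpydoc is not installed")
else:
    print(found, "meets threshold:", is_at_least(found, "1.3.1"))
```

This ignores pre-release suffixes and other subtleties of version ordering, so treat it as a rough check rather than a full comparison.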
