DOC: Remove the tables of scalar types, and use ..autoclass to create link targets instead #17331

Merged
merged 4 commits into from
Oct 6, 2020

Member

@eric-wieser eric-wieser commented Sep 16, 2020

The page in question after this PR: https://16096-908607-gh.circle-artifacts.com/0/doc/build/html/reference/arrays.scalars.html

What's changed:

  • The tables of (name, short description, character code) have been replaced with standard class documentation blocks
  • The types are split into "canonical" and "aliased types". Arguably, this division is somewhat subjective, but the important thing is to distinguish the sized names from the C-linked names. To make this work, there's a very nasty hack in conf.py that adjusts PyTypeObject.tp_name on the scalar objects.
  • All of the scalar types now have valid link targets, which should fix broken references elsewhere

This page probably could do with some further cleanup in a future PR, but this should fix most of the technical issues and omissions.

See the commit message for more details

This likely closes gh-16884.

@eric-wieser
Member Author

numpydoc seems to have the bizarre behavior of inserting all of the members and attributes under autoclass even if no :members: option is present. I don't know how to make that not happen.

@eric-wieser
Member Author

eric-wieser commented Sep 16, 2020

There also seems to be a bug that turns :Alias *on this platform*: into Alias on this platform<em>on this platform</em>, which might be fixed in a newer sphinx

=================== ============================= ===============
.. inheritance-diagram:: byte short intc int_ longlong ubyte ushort uintc uint ulonglong half single double longdouble csingle cdouble clongdouble bool_ datetime64 timedelta64 object_ bytes_ str_ void

Signed integer types
Member

@mattip mattip Sep 16, 2020


Rather than .. autoclass:: numpy.byte, try

.. autosummary::
    :toctree: generated
    :nosignatures:

    numpy.byte
    numpy.short
    ...

Member Author

@eric-wieser eric-wieser Sep 16, 2020


That would probably work, but I don't really want a page per class, because then we lose the ability to see the character codes and aliases all in one place.

Member


Then I think you need to create a template to override the default one in doc/source/_templates/autosummary/class.rst and use a :template: option on the autosummary. I don't know if :template: works with autoclass

Member Author


Autosummary shouldn't be relevant here - it's built on top of autodoc, not vice versa. I think numpydoc is injecting an autosummary into the autodoc docstring or something.

@takanori-pskq

.. autoclass:: numpy.byte
   :exclude-members:

This prevents all the members from being listed. (This behavior may be undocumented, but it is already used in the NumPy documentation.)

@eric-wieser
Member Author

Thanks @takanori-pskq, I'll try that

@eric-wieser
Member Author

Wonderful, that did the trick!

@bjnath
Contributor

bjnath commented Oct 2, 2020

+1 on this PR.

Here are suggestions; apologies if any are known issues.

image

  1. The entries would be clearer without the final period/full stop
  2. "Canonical name" should not be included; it just repeats the 'class' heading
  3. The [source] links all show the same place on the same page
  4. I'd suggest just "On this platform:" rather than "Alias on this platform"
  5. Can we suggest a test to find the platform alias if a user's platform is not the same as their browser's?
  6. The entries for classes numpy.longlong and numpy.ulonglong just show a character code
  7. There's inconsistency between the class headings, which use numpy, and the entries, which use np
  8. In the "Built-in scalar types" table, float_ and complex_ aren't linking
  9. The text at the start of the "Attributes" section should repeat the earlier point that they're read-only
  10. The numeric ranges should be punctuated with commas
  11. The second hierarchy figure is beautiful and unlike the first is complete. Perhaps it should replace the earlier one?
    image

@eric-wieser
Member Author

eric-wieser commented Oct 2, 2020

Thanks for the feedback. Commenting on these one-by-one:

  1. The entries would be clearer without the final period/full stop

The full stop seems to be suggested by PEP 257

  2. "Canonical name" should not be included; it just repeats the 'class' heading

Agreed - that's there for users typing help(np.uint8), which the nasty hack in conf.py affects. We need to generate subtly different docstrings for sphinx and non-sphinx

  3. The [source] links all show the same place on the same page

That's true of all C functions, and probably something that should be fixed in numpydoc or sphinx

  4. I'd suggest just "On this platform:" rather than "Alias on this platform"

I'd like to keep the word alias there somehow

  5. Can we suggest a test to find the platform alias if a user's platform is not the same as their browser's?

That would probably make sense, perhaps at the top of the page, or in a separate section about sized aliases

  6. The entries for classes numpy.longlong and numpy.ulonglong just show a character code

Yes, because on linux they have no aliases.

  7. There's inconsistency between the class headings, which use numpy, and the entries, which use np

This is true of all autoclass directives. It might be possible to hide the numpy. prefix. Obviously we could just change the docstrings.

  8. In the "Built-in scalar types" table, float_ and complex_ aren't linking

Yes, I don't know how we want to handle that.

  9. The text at the start of the "Attributes" section should repeat the earlier point that they're read-only

I didn't think I changed this section

  10. The numeric ranges should be punctuated with commas

That's reasonable, though underscores would be better, because then the number is still legal Python syntax for people copy-pasting.

  11. The second hierarchy figure is beautiful and unlike the first is complete. Perhaps it should replace the earlier one?

Personally I think neither is what we want. The second one has some major drawbacks:

  • The layout is much less dense
  • There is no distinction between abstract and concrete classes
  • The HTML <map> element is misaligned, because the <img> is scaled with max-width: 100%, and browsers do not support scaling <map> from what I can tell

Regarding 1, 4, 7, and 10: These docstrings are already present when you use help(np.int64). Please feel free to make a PR to tidy those up (but for now, make sure it makes sense for terminal use).
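
On item 10, a small sketch of why underscore grouping survives copy-pasting and comma grouping does not. The int32 upper bound is used here purely as an illustrative value; nothing below is taken from the NumPy docstrings themselves.

```python
import ast

INT32_MAX = 2**31 - 1

# Underscore-grouped digits are still a valid integer literal (PEP 515).
with_underscores = format(INT32_MAX, "_")
parsed_underscores = ast.literal_eval(with_underscores)

# Comma-grouped digits are valid Python too, but they parse as a tuple.
with_commas = format(INT32_MAX, ",")
parsed_commas = ast.literal_eval(with_commas)

print(with_underscores, "->", repr(parsed_underscores))
print(with_commas, "->", repr(parsed_commas))
```

Pasting the comma form into a session silently yields a tuple of small integers rather than an error, which is arguably worse than a syntax error.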

@bjnath
Contributor

bjnath commented Oct 3, 2020

  6. The entries for classes numpy.longlong and numpy.ulonglong just show a character code

Yes, because on linux they have no aliases.

I may not have made the problem clear. They don't have a canonical name entry either. There's nothing but a character code.

image

@eric-wieser
Member Author

eric-wieser commented Oct 3, 2020

Right, that's deliberate too. The idea behind the canonical name is that in normal use, t.__name__ is not always the canonical name. So help(t) prints the canonical name in the docstring only if it is not already __name__. Try doing help(np.intc) and help(np.longlong). On most sane platforms, one of those will print as a sized int and show a canonical name, while the other will just print as itself. We can always change this, but the key takeaway is that on platforms with no 128-bit integer, there is always at least one integer scalar type which has no sized alias (as that alias is already taken by another integer of the same size).
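
The counting argument in that last sentence can be checked with the standard library alone. This is only a sketch, not NumPy's alias logic: it assumes the five signed C-linked names correspond to the C types `signed char`, `short`, `int`, `long` and `long long` (the mapping in use around the time of this PR), and it asks `ctypes` for their sizes on the current platform.

```python
import ctypes
from collections import defaultdict

# Assumed mapping from the C-linked scalar names to C integer types.
C_SIGNED_INTS = {
    "byte": ctypes.c_byte,
    "short": ctypes.c_short,
    "intc": ctypes.c_int,
    "int_": ctypes.c_long,
    "longlong": ctypes.c_longlong,
}

# Group the C-linked names by their width in bits on this platform.
names_by_bits = defaultdict(list)
for name, c_type in C_SIGNED_INTS.items():
    names_by_bits[8 * ctypes.sizeof(c_type)].append(name)

# Widths claimed by more than one name: only one of them can own `int<bits>`.
shared = {bits: names for bits, names in names_by_bits.items() if len(names) > 1}

for bits, names in sorted(names_by_bits.items()):
    print(f"int{bits}: {names}")
print("widths with competing names:", shared)
```

With five C types and at most four widths (8, 16, 32, 64) on platforms without a 128-bit integer, at least two names must land on the same width, so one of them is left without a sized alias.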

Really, the "canonical name" entry should not be present on any of the scalar sphinx docs (as you remark in 2). This would be the case already, were it not for the fact that my hack executes after the docstrings have been computed.

@eric-wieser eric-wieser force-pushed the remove-scalar-tables branch from 93f5136 to 475295f Compare October 3, 2020 15:20
@eric-wieser
Member Author

eric-wieser commented Oct 3, 2020

I've pushed an attempted fix for "2." in the list above; "canonical name" no longer appears in the sphinx docs.

@eric-wieser eric-wieser marked this pull request as ready for review October 3, 2020 17:39
@eric-wieser
Member Author

"8." is now fixed too

@bjnath
Contributor

bjnath commented Oct 3, 2020

Wow, you got stuff under control quickly. Thanks, it's looking good.


Regarding the 2nd figure (item 11 above):

  • I didn't realize it was clickable, and lack of clickability doesn't seem a loss.
  • Though it doesn't distinguish classes, a caption can say that every leaf is concrete and every interior node is abstract.

Did you plan to replace both figures with a better figure? Otherwise, it's perplexing to see both. I like the second, but you're right, it isn't very compact.


The full stop seems to be suggested by PEP 257

The way PEP 257 applies the rule makes sense: There's a summary description that includes a verb, the question of punctuation comes up naturally, and a rule is useful. It seems improbable they want a full stop after every field of any content or length. Is that how you read it? Or am I looking at the wrong section?


Since the [Source] links are so misleading -- people will think the doc is broken -- is there any way to suppress them?

@eric-wieser
Member Author

I didn't realize it was clickable, and lack of clickability doesn't seem a loss.

That was the main reason I added it

Did you plan to replace both figures with a better figure?

No, probably not. Perhaps I'll comment out the new inheritance diagram, and leave merging them for someone else to take on in future.


It seems improbable they want a full stop after every field of any content or length. Is that how you read it?

Oh, I thought by "entry" you meant "docstring". Sure, we could remove the . from the :fields: if you think that helps


Since the [Source] links are so misleading -- people will think the doc is broken -- is there any way to suppress them?

I have no idea.

@bjnath
Contributor

bjnath commented Oct 3, 2020

we could remove the . from the :fields: if you think that helps

That's great! The wish list:

  • No periods on Character Code or Alias.
  • Prefer none for Alias on this Platform despite the roman text afterward, but OK if you deem it a PEP 257 case.
  • No problem on the text under the class heading ("Extended-precision floating-point...") -- I assume that's the case you thought I meant.

image

@eric-wieser
Member Author

Comments addressed, I think

@eric-wieser eric-wieser force-pushed the remove-scalar-tables branch 6 times, most recently from 7ffb338 to d8cc5fa Compare October 3, 2020 23:02
@bjnath
Contributor

bjnath commented Oct 4, 2020

Thank you, @eric-wieser, the changes look great.

There's a sentence possibly outside the scope of this PR, but perhaps you can comment. Can the indecisive

Class from which most (all?) numpy scalar types are derived.

instead read

Class from which most (all?) numpy scalar types are derived.

@eric-wieser
Member Author

Yeah, I'd like to declare that out of scope, along with the scalar types that have no docstring at all.

@bjnath
Contributor

bjnath commented Oct 4, 2020

Sure, no need to change it here. Were you planning to open issues for the scalar types that are missing docstrings?

@eric-wieser
Member Author

I borrowed Guido's time machine to file #10106

@bjnath
Contributor

bjnath commented Oct 4, 2020 via email

@mattip
Member

mattip commented Oct 4, 2020

docs build is still not happy, there must be a stray * without a \:

docstring of numpy.ma.copy:2: WARNING: Inline emphasis start-string without end-string.

Maybe the error already exists and this PR exposes it somehow?

@eric-wieser
Member Author

It's also possible the build is segfaulting due to the hacks in conf.py

@eric-wieser
Member Author

Ah, the failure is real. What's happening is:

  • conf.py deliberately intercepts numpy.core._add_newdocs to prevent it from running
  • conf.py imports numpy, which imports np.ma.core, which derives the np.ma docstrings from the np.core docstrings. These are not yet present, so the ma docstrings end up as nonsense like copy(...)\nNone (we handle None badly)
  • conf.py lets numpy.core._add_newdocs import as normal, but the damage is done

How do we want to deal with the scalar renaming hack? We could add a builtins.__NUMPY_SPHINX_DOC_HOOK__ callable, which if present is run prior to the content of add_newdocs.py?
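
A minimal sketch of what such a hook could look like, for illustration only. The attribute name comes from the question above; the helper `run_doc_hook` and the placement of the call are hypothetical, and this is the idea being floated here rather than anything confirmed to be in NumPy.

```python
import builtins

HOOK_NAME = "__NUMPY_SPHINX_DOC_HOOK__"


def run_doc_hook():
    """Call the hook if a docs build installed one; return whether it ran."""
    hook = getattr(builtins, HOOK_NAME, None)
    if hook is None:
        return False
    hook()
    return True


# What a docs conf.py might do before importing the package (hypothetical).
calls = []
setattr(builtins, HOOK_NAME, lambda: calls.append("rename scalar types"))

# What the top of the docstring module might do (hypothetical).
ran = run_doc_hook()

# Clean up so the sketch leaves builtins untouched.
delattr(builtins, HOOK_NAME)
ran_after_cleanup = run_doc_hook()
```

In a normal, non-docs import the attribute is absent and the call is a no-op, which is the appeal of the idea; the cost is a docs-only concept leaking into the library.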

@eric-wieser eric-wieser force-pushed the remove-scalar-tables branch 2 times, most recently from 349f340 to b142a75 Compare October 4, 2020 15:20
@eric-wieser
Member Author

Build passes. If everything looks good, I can rewrite history to squash to a nasty hack commit and a doc cleanup commit.

@mattip
Member

mattip commented Oct 4, 2020

conf.py deliberately intercepts numpy.core._add_newdocs to prevent it from running

Where and why do we do this?

('tp_name', ctypes.c_char_p),
]

# prevent numpy attaching docstrings to the scalar types
Member


Ahh, this is what @eric-wieser meant about docstrings.

@mattip
Member

mattip commented Oct 4, 2020

The new page looks great. If it is not too much trouble to clean up git history that would be nice.

Previously, these would all link to `numpy/core/__init__.py`.
Now the scalar type and `ndarray` link to the files where the `PyTypeObject` is defined.
In future, we should do this for all extension types, probably automatically.
… builds

By default, the `.__name__` of the numeric `np.generic` subclasses is their bitlength name, such as `np.int64`.
This is convenient when working interactively, because it lets users see the size of their array easily; but in docs it is confusing, as the sizes of the integers in the doc build may not match their size on the platform of the user reading them.
Without this change, `..autoclass:: numpy.short` would just display "alias of uint16", which is backwards.

Rather than changing the names globally, or adding a build flag to change the names, this uses `ctypes` to modify the scalar names at startup.
This resembles the approach taken by the `forbiddenfruit` module for patching builtin slots, although that would be overkill here.
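
To make the `ctypes` approach concrete, here is a read-only sketch that mirrors the first few fields of `PyTypeObject` and reads `tp_name` from a builtin type. The struct name `PyTypeObjectHead` and the helper `read_tp_name` are illustrative, and the field layout is an assumption that holds for default CPython builds (not debug or free-threaded ones); it is not a copy of the code in conf.py.

```python
import ctypes


class PyTypeObjectHead(ctypes.Structure):
    # Leading fields of CPython's PyTypeObject on a default build.
    # Assumption of this sketch: no debug-only or free-threading header fields.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("tp_name", ctypes.c_char_p),
    ]


def read_tp_name(cls):
    """Return the C-level tp_name of a type object as bytes."""
    # In CPython, id() of an object is its memory address.
    head = PyTypeObjectHead.from_address(id(cls))
    return head.tp_name


print(read_tp_name(int))
print(read_tp_name(float))
```

Per the description above, the hack assigns to this same field rather than reading it. A write would also need to keep the new bytes object alive for the life of the process, because the struct stores only a pointer to it.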

The timing of when we perform this patching is important - we can't do it until after `numpy.core._multiarray_umath` has been loaded, but we need to do it before `numpy.core._add_newdocs` generates the name-based docstrings.
Similarly, we can't just disable `numpy.core._add_newdocs` until later, as it populates docstrings in `ndarray` on which `numpy.ma.core` does further processing.
To resolve this, we split out the scalar docstrings in `numpy.core._add_newdocs` into a new module `numpy.core._add_newdocs_scalars` that _is_ safe to disable until later.
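
The "disable until later" step can be sketched with a placeholder entry in `sys.modules`. This is a generic illustration under stated assumptions: the helper `import_after` is hypothetical, and the stdlib module `colorsys` merely stands in for a module such as `numpy.core._add_newdocs_scalars`.

```python
import importlib
import sys
import types


def import_after(name, hook):
    """Keep module `name` from executing until `hook` has run, then import it."""
    if name in sys.modules:
        raise RuntimeError(f"{name} was already imported; too late to defer it")
    placeholder = types.ModuleType(name)
    sys.modules[name] = placeholder  # any `import name` now gets the placeholder
    try:
        hook(placeholder)
    finally:
        del sys.modules[name]
    return importlib.import_module(name)  # the real module body runs here


seen = []


def hook(placeholder):
    # Stands in for "import the package and patch the type names".
    import colorsys  # resolves to the placeholder; the real module does not run
    seen.append(colorsys is placeholder)


sys.modules.pop("colorsys", None)  # make sure the demo starts from a clean state
real = import_after("colorsys", hook)
print(seen, hasattr(real, "rgb_to_hsv"))
```

The trick is only safe for a module nothing else depends on during the blocked window, which is the reason given above for splitting the scalar docstrings into their own module.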
This removes the tables, since they only had three columns, and using the character code is advised against anyway.

With this change, the individual scalar types as well as their aliases are now valid sphinx python domain targets.
@eric-wieser eric-wieser force-pushed the remove-scalar-tables branch from b142a75 to 3edc19f Compare October 4, 2020 20:04
@eric-wieser
Member Author

Rebased, with an extensive commit message on the hack

Contributor

@rossbar rossbar left a comment


Thanks @eric-wieser for the detailed commit messages, they really helped me to better understand what's going on.

It's a little unfortunate that so much metaprogramming is required to get resolvable links for the scalar types, but with my limited knowledge of sphinx I can't think of a way around it. The resulting scalars refguide page looks really good and this change reduces the number of link warnings from 623 (on e6b8b19) down to 484.

@mattip mattip merged commit 23e42e9 into numpy:master Oct 6, 2020
@mattip
Member

mattip commented Oct 6, 2020

Thanks @eric-wieser

@eric-wieser
Member Author

eric-wieser commented Oct 6, 2020

this change reduces the number of link warnings from 623 (on e6b8b19) down to 484.

Wonderful, thanks for counting that.
