DOC: re-add api example #26624
I don't think self-contained should be a goal b/c that's how we end up w/ example drift and scope creep. I think as much as possible we should be cross linking to one source of truth cause that's more maintainable and makes things more discoverable. |
I think the example as given is cluttered in a way that obfuscates the goal of trying to highlight the API differences. I also think that since this is a doc regression it should have two approvals.
ETA: I also very much appreciate that you put the example into the pyplot and axes explanations.
x = np.arange(0, 4, 0.05)
y = np.sin(x*np.pi)
# Create a Figure and an Axes:
fig, ax = plt.subplots(figsize=(3,2), layout='constrained')
# Use the Axes to plot and label:
ax.plot(x, y)
ax.set_xlabel('t [s]')
ax.set_ylabel('S [V]')
ax.set_title('Sine wave')
# Change a property of the Figure:
fig.set_facecolor('lightsteelblue')
Suggested change:
- x = np.arange(0, 4, 0.05)
- y = np.sin(x*np.pi)
- # Create a Figure and an Axes:
- fig, ax = plt.subplots(figsize=(3,2), layout='constrained')
- # Use the Axes to plot and label:
- ax.plot(x, y)
- ax.set_xlabel('t [s]')
- ax.set_ylabel('S [V]')
- ax.set_title('Sine wave')
- # Change a property of the Figure:
- fig.set_facecolor('lightsteelblue')
+ # Create a Figure and an Axes:
+ fig, ax = plt.subplots(figsize=(3,2), layout='constrained')
+ # Add a plot to the Axes ax:
+ ax.plot(np.sin(np.linspace(0, 2*np.pi)))
+ # Label the Axes and the x and y Axis:
+ ax.set(xlabel='t [s]', ylabel='S [V]', title='Sine wave')
+ # Change the face color of the Figure:
+ fig.set_facecolor('lightsteelblue')
I think the original is a bit too long, and I also think we want to encourage set unless they need the individual properties.
ax.set is pretty obscure as a way to shorten things.
But I think we want folks to know about it. I also agree w/ Tim that this should be as short as possible, and probably in a 2 column grid rather than tabs 'cause it's that side by side that really illustrates the difference.
Two column is fine, though it looks pretty cramped.
Re ax.set: that seems an advanced usage to me, whereas I thought we wanted this to be straightforward? ax.set also does not allow one to customize anything in the set methods (e.g. fontsize in title), so I'm not sure it should become canonical usage.
Out of scope here but I think that's a point we're trying to make in all sorts of places - that if you don't need customization we've got this nice method that will do all the things at once, but if you need customization then you've gotta use the specific methods to do so cause we try to scope our functions.
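For anyone skimming this sub-thread, here is a minimal sketch (not part of the PR) of the trade-off being discussed: ax.set sets several properties in one call, while the individual setters also accept extra text properties such as fontsize. It assumes the non-interactive Agg backend so it runs headless, and the names ax_short and ax_custom are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Two Axes side by side: one labelled via ax.set, one via individual setters.
fig, (ax_short, ax_custom) = plt.subplots(1, 2, figsize=(6, 2), layout='constrained')

# One call sets several properties, but only the values themselves.
ax_short.set(xlabel='t [s]', ylabel='S [V]', title='Sine wave')

# Individual setters take the same values plus Text keyword arguments.
ax_custom.set_xlabel('t [s]')
ax_custom.set_ylabel('S [V]')
ax_custom.set_title('Sine wave', fontsize=14)

print(ax_short.get_title(), ax_custom.title.get_fontsize())
```

Both Axes end up with the same label text; only the second one carries the customized title size.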
There is a lot of value in the 1:1 mapping between the lines so am very 👎🏻 on changing to use ax.set.
At least for me clicking between the tabs and seeing what does/does not change is way easier to see the differences than trying to scan between two columns.
The example originally came in #19727. It was removed in #26402 without mention of the removal in the PR description, or discussion, on the strength of one review. I'm not clear how re-instating it is considered a regression. |
B/c it was removed and now it's going back? I think if we're thrashing back and forth then it's worth slowing down and giving space for at least one other person to sign off. In this case in particular I think it'd be good if @timhoffm at least saw this PR. |
In #26402, I motivated the change:
This page is the entry point for the API reference, the collection of all function / class docstrings. Categorizing according to https://documentation.divio.com/ it is information-oriented, not learning-oriented or explanation-oriented. My train of thought for deciding on the contents of the page is:
In particular, I believe a visual plot would be harming the cause here. It draws a lot of visual attention and does not explain anything about the API or Interfaces. I would be +/-0 on adding minimal pure code-blocks to the existing interface cards
On a general note, the major issue with our documentation is not that it's missing content or explanations. It's structure and guidance that is lacking. IMHO we gain a lot if we focus pages on their primary purpose and strip / link other aspects. |
I strongly feel the example helps set the context for what is being discussed on this page. Our API is very bizarre in that 95% of it is on the Axes objects we create, and showing a short example of that on a top-level landing page helps orient the reader.
Divio is a nice way to think about the purposes of documentation, but I don't think it is a rational way to organize things with pristine separation of the types of information. Information doesn't mean anything without some context. For others who do not wish to click through: |
Fully agree and that's why I added bullet point 3. in my reasoning. The key here is some. Can we agree on the principle that some means "enough to help understanding the main information, but as little as possible to not distract from the main information"? The judgement call, and where I think we disagree, is what is "enough".
For the context, the priority, determined by relevance, is:
I think it is a strange aesthetic to just show the code, but not show the code result in a plotting library. Code tends to be a lot easier to parse if you can see its result. You say it's "cluttered", and "distracts". But from what? There is no other content on this page - the rest of this page is just an index (that is repeated verbatim on the left sidebar). |
I mean that there's a lot of cognitive overload happening. Your example demonstrates data creation, figure and axes creation, plotting, labeling, and fig saving - and yes that's all very useful and things folks always want to do, but the question is are those concepts we want to be demoing here? Especially for a user who does not know the API, they may get distracted by creation->plot->label and not focus on the difference between pyplot and axes, which is what we want to be highlighting.
It depends on what you want to achieve with that code. There are lots of Examples sections in our docstrings that do not generate the plot. |
ax.plot(x, y)
ax.set_xlabel('t [s]')
ax.set_ylabel('S [V]')
ax.set_title('Sine wave')
Suggested change:
- ax.set_title('Sine wave')
+ ax.set_title('Sine wave (explicit)')
plt.plot(x, y)
plt.xlabel('t [s]')
plt.ylabel('S [V]')
plt.title('Sine wave')
Suggested change:
- plt.title('Sine wave')
+ plt.title('Sine wave (implicit)')
The point of this PR was to re-add the figure. #26671 did not re-add the figure, and obviously there is some disagreement about whether figures are desirable in our docs. |
The disagreement is about whether a figure in this specific document clarifies or obfuscates the intended takeaway from the document, which is why #26671 added code to highlight the differences instead of a figure. I think at this point we probably need to have a full discussion cause otherwise we'll end up w/ more add-remove cycles, so I'm gonna put up a block until that happens. |
I'm wary of an add-remove-add-remove cycle happening, so blocking until broader consensus is achieved.
I think the longer example with an image is actually quite helpful here, particularly as written with the 1:1 mapping of the lines of code. By flipping back and forth between the tabs it is easy to see what the required changes are and that you get the same output either way (the only change I would suggest would be to make the titles different so it is clear that the image is changing between the tabs; currently the only way to tell is to mouse over them and see the URLs are different). We (the developers) know that the APIs are interchangeable, but that may not be clear to all of our users, so showing that you can get exactly the same output using either path is a good thing. It is also good to show that the explicit API is not necessarily more verbose. We probably should rewrite some of the examples in https://matplotlib.org/devdocs/users/explain/figure/api_interfaces.html using this 2 tab style.
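As a rough illustration of the 1:1 mapping and "same output either way" point, here is a minimal sketch (not the exact code from the PR's tabs) that builds the plot once through the explicit Axes interface and once through pyplot, then checks that the resulting Axes carry the same data and labels. The Agg backend and the names fig_explicit, fig_implicit and ax_implicit are assumptions made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 4, 0.05)
y = np.sin(x * np.pi)

# Explicit (Axes) interface: call methods on the objects you created.
fig_explicit, ax = plt.subplots(figsize=(3, 2), layout='constrained')
ax.plot(x, y)
ax.set_xlabel('t [s]')
ax.set_ylabel('S [V]')
ax.set_title('Sine wave')

# Implicit (pyplot) interface: pyplot tracks the current Figure and Axes.
fig_implicit = plt.figure(figsize=(3, 2), layout='constrained')
plt.plot(x, y)
plt.xlabel('t [s]')
plt.ylabel('S [V]')
plt.title('Sine wave')

# Compare what each path produced.
ax_implicit = fig_implicit.gca()
same_labels = (
    (ax.get_xlabel(), ax.get_ylabel(), ax.get_title())
    == (ax_implicit.get_xlabel(), ax_implicit.get_ylabel(), ax_implicit.get_title())
)
same_data = np.array_equal(ax.lines[0].get_ydata(), ax_implicit.lines[0].get_ydata())
print(same_labels, same_data)
```

Each pyplot line has a direct Axes counterpart, which is the property the tabbed and side-by-side layouts are both trying to show.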
Having to flip back and forth between tabs to see the difference itself obfuscates the difference because it requires the learner to hold the not visible code in memory while looking at the visible one - and that's a lot of cognitive overload for someone who may not know what they're supposed to be holding in their head while flipping between the tabs. I love tabs and think they're fantastic for presenting mutually exclusive - you can either do this thing or that thing and not both - information. But I don't think the goal on this page is necessarily to get folks to pick one - my guess is they'll go w/ the one they know cause that's easiest - but to see both so they can compare. And tables are better for comparison. Here's one example of universal design folks talking about tabs for reading flow controls universaldesign.ie |
My major concerns:
I still find the current side-by-side solution really concise: I don't think the rendered figure is relevant, but if you consider it valuable, I suggest to make a full-width div below the two columns:
At the end of the day, these are all taste issues. I think the final documentation say is @tacaswell, so he should just state his preference. I will point out some usability issues with the current links - I think that |
I think they're best practices issues.
I think @timhoffm's suggestion of putting one figure at the bottom of the grid showing that both code blocks produce the same visual is a good compromise in terms of having a visual.
Again, whether something looks better tabbed or side-by-side is not subject to any hard and fast rules. FWIW I prefer side-by-side, but I wouldn't claim that my preference is "right" or a "best-practice". I had a slight preference for tabbed here just because it was more compact.
The original impetus for improving this page (#19727), and adding an example, is that this is the top-level API landing page, arrived at via "Reference". Lots of people who think they are familiar with Matplotlib probably land on "Reference", and then don't know where to go. I distinctly remember getting started with our API, and not understanding that most of it was in the In my mind, the goal of these improvements was not primarily to "communicate the difference between the two APIs". Indeed the original version of this had all the different API discussion below the TOC (https://matplotlib.org/3.5.3/api/index.html). Rather the goal was to provide a quick guide/reminder of our API using code snippets, and their results, that users may have seen elsewhere, be that in our docs, Stack Overflow etc. I agree that the difference between the APIs needs to be communicated, but as a small part of the larger goal of providing a guide to the API reference.
Why do you prefer it? And do you think the users reading this page may prefer it for the same reasons? The choice of tabs or tables creates different reading/learning experiences for comparing A and B:
How does adding data generation and a few more set methods do that? I mean this in earnest, as the current code on the page shows the three major components of making a graph that Matplotlib is in charge of:
I especially worry about including data creation b/c we already have enough trouble getting folks to realize that Matplotlib is just the graphs part, and I don't think putting Numpy on our API reference page will help. This is also making me think that what may be a better way to improve that page would be to put modules in alphabetical order in a tab/different page and reorganize the modules list so that is instead grouped into some reasonable categories so folks who do not know what we call things could still navigate to the right modules. Something like: |
The data creation is pretty trivial - I hope we can assume our readers have basic familiarity with numpy. Further, it makes it clear what kind of objects x and y are, versus completely unspecified. As for the setters, I think labelling a plot is best practice and common enough to warrant a couple of lines here. As for tabs versus side by side, in addition to size issues, we are trying to discourage one of the APIs - recall the original version didn't even have the pyplot API example. In fact, for the purpose of indexing I'm not even sure it's needed at all, since the pyplot API is easy to find. I wouldn't object to reorganizing the API on the page, but that is somewhat orthogonal here.
They may, but also may not realize that it's not built into Matplotlib. (ETA: bootcamp curricula in particular have an unfortunate tendency to lump the libraries together).
But isn't the point here that
this table exists to orient users to what we mean by API/OO/explicit vs pyplot/implicit API -> this is the back and forth in #26623 - so that they can find the right module. If they need usage guidance, they should be directed to the user guide where all this stuff is explained. Granted, I dunno that I'd be opposed to putting everything in tabs (so each column of the table gets integrated into a tab) but my guess is that the folks who most need the definitions would also benefit most from seeing them side by side.
100% agree it's best practice, but this page isn't supposed to be a "how to use matplotlib" page. |
Again, we disagree fundamentally on most of these points, none of which has objective criteria. Someone, presumably @tacaswell, will have to decide what they want this page to look like.
Or the API docs lead, who is @timhoffm |
Sure, if that falls under that bailiwick, I'll close this. |
PR summary
This re-introduces the example in api/index, and adds a pyplot counterexample. As discussed in #26623 - this provides concrete examples of the abstract concepts being discussed and makes the API pages more self-contained.
PR checklist