Add documentation page on the internal representation of NumPy arrays #15793

Closed
Irio opened this issue Mar 21, 2020 · 8 comments

@Irio

Irio commented Mar 21, 2020

Based on my understanding of the definitions in NEP 44, my proposal is to add an Explanation page on this topic.

I am aware of two places in the docs that mention this topic for NumPy users: the Quickstart tutorial and NumPy: the absolute basics for beginners. The former starts with the following:

When operating and manipulating arrays, their data is sometimes copied into a new array and sometimes not. This is often a source of confusion for beginners.

Sometimes, even after five years of using NumPy, I have to go with my gut feeling to predict whether an operation will trigger a copy of the underlying data. I'd love to have an official documentation page that helps people build a better mental model of this.

Some questions and ideas:

  • What does an advanced user of NumPy need to know about the underlying data structure?
  • Which operations cause data to be moved/copied in memory?
  • How can you tell whether NumPy did or will copy data in an operation?
  • When is it recommended to create a view rather than make a copy, and vice versa?
  • May include or link to a good overview of indexing
  • Shallow copy vs deep copy
  • Cite the classic SettingWithCopyWarning in pandas, which I believe is caused by not having a good understanding of this very same subject
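
As a rough illustration of the kind of check such a page could show, here is a minimal sketch that uses `np.shares_memory` and the `.base` attribute to tell views from copies. The array and variable names are made up for this example, and it is not meant as an exhaustive list of which operations copy.

```python
import numpy as np

# np.zeros returns an array that owns its data, so .base checks are unambiguous.
a = np.zeros((3, 4))

# Basic slicing returns a view: no data is copied.
row_view = a[1, :]

# Advanced ("fancy") indexing with a list of indices returns a copy.
fancy_copy = a[[0, 2], :]

# Transposing only rearranges shape/strides, so it is a view as well.
transposed = a.T

print(np.shares_memory(a, row_view))    # view shares the buffer
print(np.shares_memory(a, fancy_copy))  # copy does not
print(np.shares_memory(a, transposed))  # view shares the buffer

# A view created directly from an owning array reports that array as its base.
print(row_view.base is a)

# Writing through a view changes the original; writing to a copy does not.
row_view[:] = 1.0
fancy_copy[:] = 5.0
print(a)

# The same method can return a view or a copy depending on memory layout:
# ravel() on a C-contiguous array is a view, on its transpose it must copy.
flat_view = a.ravel()
flat_copy = a.T.ravel()
print(np.shares_memory(a, flat_view), np.shares_memory(a, flat_copy))
```

The last two lines are the sort of case where gut feeling tends to fail: whether `ravel()` copies depends on the memory layout of its input, not on the method name.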

@rgommers
Member

Yes, this'd be great! Agree that this is an Explanation

@rossbar
Contributor

rossbar commented Mar 25, 2020

Just a quick addition: some of the proposed material is also covered in NumPy Internals in the reference guide. This material is pretty dense so I definitely agree that a distillation to a NEP44-style Explanation with a more user-centric theme would be very nice. I mention it here so that it can be drawn upon (and hopefully kept somewhat synced with) if/when anyone undertakes the Explanation doc.

@mrityagi

mrityagi commented Jul 5, 2020

@melissawm @rgommers, I have tried to analyse this. The issue could be resolved with an FAQ section in the documentation, where we give brief explanations for questions like these, along with others sourced from Stack Overflow. Questions of this type may sound trivial or basic, but they can sometimes be a real pain in the workflow, even for an experienced developer. If we look at this link, we'll see that issues of this type are widespread. I'll try to structure a demo FAQ page where questions of this nature can be documented, so that we can also resolve similar issues that may come up in the future.

@melissawm
Member

Hi @mrityagi, I don't think we have considered a FAQ format for this; initially we thought about an "Explanation" document. However, I do think that the questions might be a good guide to which content to cover, and this might end up being an excellent guide to the internals of NumPy. How does that sound to you?

@mrityagi

mrityagi commented Jul 6, 2020

@melissawm Yeah, I think "Explanation/Questions" sounds like a more reliable and trustworthy name for NumPy. In fact, we are on the same page about structuring the docs as an excellent guide to the internals of NumPy. I have tried to give it a structure that can be useful for such issues. I totally agree with you on the need to index them by the content to cover. Say there is a topic named Copies and Views: under it we can give explanations like No Copy vs Shallow Copy (i.e. Views) and Shallow Copy (i.e. Views) vs Deep Copy.
Similarly for array indexing and other topics: we need to order and group (categorize) these Explanations as above.
This way we are basically answering questions, doubts, or whatever we choose to call them, in a well-structured guide, because a person coming to this page should be able to tell at a glance, merely from the names of the topics covered, whether we have an answer or explanation for their doubt.
This way our objective of giving solutions for popular queries will surely be covered. What do you say?
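
To make the three cases named above concrete, here is a minimal sketch of "no copy", "shallow copy (view)" and "deep copy". The variable names are illustrative only, and the terminology follows this comment rather than any finalized docs page.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# No copy: plain assignment just binds a second name to the same array object.
same = a

# Shallow copy (view): a new array object with its own shape,
# looking at the same data buffer. reshape() on a contiguous array returns one.
view = a.reshape(2, 2)

# Deep copy: a new array object with its own, independent data buffer.
deep = a.copy()

print(same is a)                   # same object
print(view is a)                   # different object...
print(np.shares_memory(a, view))   # ...but the same data
print(np.shares_memory(a, deep))   # independent data

# Modifying the data through the view is visible via `a` and `same`,
# but not via the deep copy.
view[0, 0] = 99
print(a, same, deep)

# The view's shape is its own metadata; `a` keeps its original shape.
print(a.shape, view.shape)
```

One caveat that I believe is worth stating on such a page: for arrays of dtype `object`, `.copy()` duplicates the references, not the Python objects they point to.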

@melissawm
Member

Hello, @mrityagi - I think we talked about this in the documentation meeting but I'm not sure if you intend to follow up on this? If not maybe @Mukulikaa can pick this up. Thanks!

@Mukulikaa
Contributor

I think this issue can be closed by gh-19791 for now. There is also a separate tracking issue (gh-20112) for future work on this topic.

@melissawm
Member

Agreed, thanks for flagging!
