Conversation

@nwlandry
Collaborator

@nwlandry nwlandry commented Aug 18, 2023

Fixes #117. Adds a method to compute the distribution of an array of numbers. The design roughly follows the discussion in #117.
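
For readers skimming the thread, here is a minimal sketch of what "computing the distribution of an array of numbers" can look like with NumPy. It is not the code from this PR: the function name `distribution_sketch` and the `(centers, heights)` return value are assumptions made for illustration, and only the parameter names `bins`, `density`, and `log_binning` are taken from the signature quoted in the review.

```python
import numpy as np


def distribution_sketch(vals, bins=10, density=False, log_binning=False):
    """Return (bin_centers, heights) for a 1D collection of numbers.

    Hypothetical helper for illustration; not the function added in this PR.
    """
    vals = np.asarray(vals, dtype=float)
    if log_binning:
        # Geometrically spaced edges; only defined for strictly positive values.
        if vals.min() <= 0:
            raise ValueError("log binning needs strictly positive values")
        edges = np.geomspace(vals.min(), vals.max(), bins + 1)
    else:
        edges = np.linspace(vals.min(), vals.max(), bins + 1)
    heights, edges = np.histogram(vals, bins=edges, density=density)
    # Arithmetic midpoints; a geometric mean is another option for log bins.
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, heights


centers, heights = distribution_sketch([1, 2, 2, 3, 3, 3, 4], bins=3)
```

Both returned arrays have one entry per bin, which makes them easy to pass straight to a plotting call.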

@codecov

codecov bot commented Aug 18, 2023

Codecov Report

Patch coverage: 92.00% and project coverage change: +0.01% 🎉

Comparison is base (290729a) 91.84% compared to head (9037834) 91.86%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #452      +/-   ##
==========================================
+ Coverage   91.84%   91.86%   +0.01%     
==========================================
  Files          60       60              
  Lines        4255     4300      +45     
==========================================
+ Hits         3908     3950      +42     
- Misses        347      350       +3     
Files Changed             Coverage Δ
xgi/stats/__init__.py     85.11% <66.66%> (+0.08%) ⬆️
xgi/utils/utilities.py    96.00% <100.00%> (+0.71%) ⬆️

... and 2 files with indirect coverage changes


@maximelucas
Collaborator

Great, I've been hoping for this feature for so long!

Looks good. Two thoughts:

  • you return ndarrays (of the same size) and not a DataFrame as Leo suggested, right? I'm fine with either, but better check with @leotrs. Maybe one of the reasons was that it's more easily plottable?
  • when I read asdist I tend to think of "distance" rather than "distribution". Do you think ashist() could make sense?

@nwlandry nwlandry requested a review from leotrs August 19, 2023 15:14
Collaborator

@leotrs leotrs left a comment

I'm willing to merge this asap because the corresponding issue has been open for so long. However, I'd like to say that I spent a good amount of time proposing a spec that I thought was reasonable at the time, and this PR does not conform to that proposal.


def dist(self):
    return [np.histogram(data, density=True) for data in self.asnumpy().T]
def dist(self, bins=10, density=False, log_binning=False):
Collaborator

Why do we need to overwrite the IDStat.dist method?

Collaborator Author

Not sure what you mean. I'm importing dist (now called hist) from utils to avoid rewriting the same function for both IDStat and MultiIDStat.
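
The delegation described in this reply can be sketched roughly as follows. This is an illustrative stand-in, not the xgi source: the classes `IDStatSketch` and `MultiIDStatSketch` are hypothetical, and the shared `hist` helper here is a simplified pure-Python counter rather than the function in xgi's utilities module.

```python
def hist(vals, bins=10):
    """Shared helper: count values falling in `bins` equal-width bins."""
    lo, hi = min(vals), max(vals)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * bins
    for v in vals:
        # Clamp the index so that the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


class IDStatSketch:
    """Hypothetical stand-in for a single statistic mapping IDs to values."""

    def __init__(self, values):
        self.values = dict(values)

    def ashist(self, bins=10):
        # No binning logic here; everything is delegated to the shared helper.
        return hist(list(self.values.values()), bins=bins)


class MultiIDStatSketch:
    """Hypothetical stand-in for several statistics over the same IDs."""

    def __init__(self, *stats):
        self.stats = stats

    def ashist(self, bins=10):
        # One histogram per statistic, all produced by the same helper.
        return [hist(list(s.values.values()), bins=bins) for s in self.stats]


degree = IDStatSketch({"a": 1, "b": 2, "c": 2, "d": 4})
other = IDStatSketch({"a": 0, "b": 0, "c": 3, "d": 3})
multi = MultiIDStatSketch(degree, other)
```

With this layout the binning logic lives in one place, so a fix or new option in the helper reaches both classes at once.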

@nwlandry
Collaborator Author

nwlandry commented Aug 21, 2023

Good call. There are definitely some discrepancies between what we decided in that issue and what I did. My biggest takeaways on what I need to fix are (1) call this ashist() instead of dist, (2) allow users to get bin edges as well as centers, and (3) return a pandas DataFrame instead of two NumPy arrays. From looking back at that conversation, I think that implementing both hist and dist seems like overkill, but there are some nice suggestions that I will spend some time incorporating.
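
Points (2) and (3) could look something like the sketch below. It is an assumption-laden illustration, not the merged implementation: the function name `hist_frame_sketch`, the `bin_edges` flag, and the column names `bin_center`, `value`, `bin_lo`, and `bin_hi` are all hypothetical choices made for this example.

```python
import numpy as np
import pandas as pd


def hist_frame_sketch(vals, bins=10, bin_edges=False, density=False):
    """Return a histogram as a DataFrame with one row per bin.

    Hypothetical helper; column names are illustrative assumptions.
    """
    heights, edges = np.histogram(vals, bins=bins, density=density)
    data = {
        "bin_center": 0.5 * (edges[:-1] + edges[1:]),
        "value": heights,
    }
    if bin_edges:
        # Optionally expose the lower and upper edge of every bin.
        data["bin_lo"] = edges[:-1]
        data["bin_hi"] = edges[1:]
    return pd.DataFrame(data)


df = hist_frame_sketch([1, 2, 2, 3, 3, 3, 4], bins=3, bin_edges=True)
```

Keeping centers, edges, and heights as columns of one frame means they cannot drift out of alignment, and the frame can still be plotted directly from its columns.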

@nwlandry nwlandry changed the title from "Added dist method to stats" to "Added ashist() method to stats" Aug 22, 2023
@nwlandry
Collaborator Author

@maximelucas @leotrs I think I addressed your comments - let me know if there's anything else I can address!

@maximelucas
Collaborator

Thanks Nich, functionality looks good to me.

Two questions:

  • it looks like only xgi.hist() is tested, should we also test say H.nodes.degree.ashist()?
  • Unsure but might still need to be added to the docs?

@nwlandry
Collaborator Author

nwlandry commented Aug 22, 2023

> Thanks Nich, functionality looks good to me.
>
> Two questions:
>
>   • it looks like only xgi.hist() is tested, should we also test say H.nodes.degree.ashist()?
>   • Unsure but might still need to be added to the docs?

I opted to test only xgi.hist() because the tests for the stats package are a bit of a mess right now IMO, and the stats methods call xgi.hist() directly, so all the functionality is covered. If it's okay with you, I'll open an issue to reorganize the stats tests (we can discuss the best way to do this).

Good catch on the docs - fixed now.

@nwlandry nwlandry merged commit bf2a099 into main Aug 22, 2023
@nwlandry nwlandry deleted the add-stats-hist branch August 22, 2023 17:38

Development

Successfully merging this pull request may close these issues.

Add features to degree_histogram

4 participants