Add hetero neighbor sampler benchmark #106
kgajdamo
commented
Sep 19, 2022
- Added a hetero neighbor sampler benchmark to pyg-lib.
- The benchmark measures the performance of hetero_neighbor_sample from pyg-lib as well as hetero_neighbor_sample from pytorch_sparse.
benchmark/sampler/hetero_neighbor.py
path = osp.join(osp.dirname(osp.realpath(__file__)), '../../data/OGB')
transform = T.ToUndirected(merge=True)
dataset = OGB_MAG(path, preprocess='metapath2vec', transform=transform)
We can move this to pyg_lib.testing.withDataset. WDYT? And then just have it return a dictionary of (rowptr, col) entries?
Ok, I'll try
I changed the code so that the dataset can be retrieved using the decorator. But for the purposes of the benchmark, it is necessary not only to return the (rowptr_dict, col_dict) but also the number of nodes, edge types and node types. Please see if such an implementation suits you.
In addition, I noticed that sampling with replacement is not working properly for hetero sample in pyg-lib.
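A decorator that hands the benchmark everything it needs might look roughly like the sketch below. This is an illustration only: `with_toy_hetero_graph`, the toy graph, and the tuple layout are hypothetical and are not the `pyg_lib.testing` implementation.

```python
import functools


def with_toy_hetero_graph(func):
    """Hypothetical decorator: passes a tiny heterogeneous graph to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        node_types = ['author', 'paper']
        edge_types = [('author', 'writes', 'paper')]
        num_nodes_dict = {'author': 2, 'paper': 3}
        # Toy compressed adjacency, keyed by edge type (hand-written values).
        rowptr_dict = {edge_types[0]: [0, 1, 3, 4]}
        col_dict = {edge_types[0]: [0, 0, 1, 1]}
        graph = (rowptr_dict, col_dict, num_nodes_dict, node_types, edge_types)
        return func(graph, *args, **kwargs)
    return wrapper


@with_toy_hetero_graph
def count_edges(graph):
    # Unpack everything the benchmark would need from the injected tuple.
    rowptr_dict, col_dict, num_nodes_dict, node_types, edge_types = graph
    return {edge_type: len(col_dict[edge_type]) for edge_type in edge_types}
```

Calling `count_edges()` without arguments works because the decorator supplies the graph.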
Thanks! Can you clarify what is not working for sampling with replacement? Does it simply crash?
It crashes in uniform_sample() -> add()
Thanks, will take a look!
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #106   +/-   ##
=======================================
  Coverage   89.32%   89.32%
=======================================
  Files          16       16
  Lines         412      412
=======================================
  Hits          368      368
  Misses         44       44
596e57a to f904d98
This is great, thank you very much! It's interesting that
Thanks @rusty1s for the updates. I have a question about your changes. In hetero_neighbor.py, for the pytorch_sparse sampler case, you put some declarations such as node_types and edge_types inside the loop. These declarations increase the measured time, and there is probably no need to repeat them on every iteration. Was that on purpose?
Yes, IMO that makes the comparison fairer, as our Python wrapper around pyg-lib's sample code does the same.
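The effect of where the declarations live can be sketched with a small timing helper. This is a simplified illustration, not the benchmark script from this PR; `time_loop`, `setup_fn`, and `sample_fn` are hypothetical names.

```python
import time


def time_loop(sample_fn, num_iters, setup_fn, setup_in_loop):
    """Return the seconds spent in the timed loop.

    With setup_in_loop=True the per-call setup is repeated and counted on
    every iteration; otherwise it runs once, before the timer starts.
    """
    context = None if setup_in_loop else setup_fn()
    start = time.perf_counter()
    for _ in range(num_iters):
        if setup_in_loop:
            context = setup_fn()  # counted in the measurement
        sample_fn(context)
    return time.perf_counter() - start


def make_type_lists():
    # Stand-in for declarations such as node_types and edge_types.
    return ['paper', 'author'], [('author', 'writes', 'paper')]


with_setup = time_loop(lambda ctx: None, 1000, make_type_lists, True)
without_setup = time_loop(lambda ctx: None, 1000, make_type_lists, False)
```

Keeping the setup inside the loop for both samplers charges each of them the same per-call overhead, which is the fairness argument made above.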
Yes, this would be very valuable.
We need to use CSC here since that is the only format
Right, thank you for the detailed answer :)
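For readers unfamiliar with the CSC layout mentioned in this thread: it stores, for each destination node, a contiguous slice of source-node indices. A minimal pure-Python sketch of building it from an edge list follows. `coo_to_csc` and its arguments are illustrative names, not functions from pyg-lib or pytorch_sparse.

```python
def coo_to_csc(src, dst, num_dst_nodes):
    """Convert an edge list (src[i] -> dst[i]) into CSC form.

    Returns (colptr, row): the sources of all edges pointing at destination
    node d are row[colptr[d]:colptr[d + 1]].
    """
    # Count incoming edges per destination node.
    counts = [0] * num_dst_nodes
    for d in dst:
        counts[d] += 1

    # Prefix sum of the counts gives the pointer array.
    colptr = [0]
    for count in counts:
        colptr.append(colptr[-1] + count)

    # Place each source index into the next free slot of its destination.
    row = [0] * len(src)
    next_slot = colptr[:-1]  # slicing makes a copy, so colptr is untouched
    for s, d in zip(src, dst):
        row[next_slot[d]] = s
        next_slot[d] += 1
    return colptr, row


colptr, row = coo_to_csc(src=[0, 0, 1, 1], dst=[0, 1, 1, 2], num_dst_nodes=3)
```

In this example, destination node 1 has two incoming edges, so its slice `row[colptr[1]:colptr[2]]` holds two source indices.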