Add hetero neighbor sampler benchmark #106

kgajdamo · 2022-09-19T13:11:02Z

Added hetero neighbor sampler benchmark to pyg-lib.
The benchmark measures the performance of hetero_neighbor_sample from pyg-lib as well as hetero_neighbor_sample from pytorch_sparse.


rusty1s · 2022-09-20T07:17:49Z

benchmark/sampler/hetero_neighbor.py

+
+path = osp.join(osp.dirname(osp.realpath(__file__)), '../../data/OGB')
+transform = T.ToUndirected(merge=True)
+dataset = OGB_MAG(path, preprocess='metapath2vec', transform=transform)


We can move this to pyg_lib.testing.withDataset. WDYT? And then just have it return a dictionary of (rowptr, col) entries?

Ok, I'll try

I changed the code so that the dataset can be retrieved using the decorator. For the purposes of the benchmark, however, it needs to return not only (rowptr_dict, col_dict) but also the number of nodes, the edge types, and the node types. Please check whether this implementation suits you.
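To make the shape of that return value concrete, here is a minimal sketch of a decorator that injects a dataset loader into a benchmark function. The names `with_toy_dataset` and `load_toy_hetero_graph` and the toy graph are hypothetical and are not the actual pyg_lib.testing.withDataset API; the sketch uses plain Python lists instead of tensors.

```python
import functools


def load_toy_hetero_graph():
    """Hypothetical loader: CSR data per edge type plus graph metadata."""
    node_types = ['paper', 'author']
    edge_types = [('author', 'writes', 'paper'), ('paper', 'cites', 'paper')]
    num_nodes_dict = {'paper': 3, 'author': 2}
    # One CSR pair per edge type; rowptr has (number of source nodes + 1) entries.
    rowptr_dict = {
        ('author', 'writes', 'paper'): [0, 2, 3],
        ('paper', 'cites', 'paper'): [0, 1, 2, 2],
    }
    col_dict = {
        ('author', 'writes', 'paper'): [0, 1, 2],
        ('paper', 'cites', 'paper'): [1, 2],
    }
    return (rowptr_dict, col_dict), num_nodes_dict, node_types, edge_types


def with_toy_dataset(func):
    """Pass the loader as the first argument so that loading stays lazy."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(load_toy_hetero_graph, *args, **kwargs)
    return wrapper


@with_toy_dataset
def count_edges(get_dataset):
    (rowptr_dict, col_dict), num_nodes_dict, node_types, edge_types = get_dataset()
    # The last rowptr entry equals the number of edges of that type.
    return {edge_type: rowptr_dict[edge_type][-1] for edge_type in edge_types}
```

Passing a callable rather than the loaded data means the dataset is only materialized when the decorated function asks for it.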

In addition, I noticed that sampling with replacement does not work properly for heterogeneous sampling in pyg-lib.

Thanks! Can you clarify what is not working for sampling with replacement? Does it simply crash?

It crashes in uniform_sample() -> add()

Thanks, will take a look!


codecov-commenter · 2022-09-22T12:52:42Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.32%. Comparing base (38ba009) to head (b7a9105).
Report is 178 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #106   +/-   ##
=======================================
  Coverage   89.32%   89.32%           
=======================================
  Files          16       16           
  Lines         412      412           
=======================================
  Hits          368      368           
  Misses         44       44


rusty1s · 2022-09-23T09:29:28Z

This is great, thank you very much!

It's interesting that torch-sparse performs better on [-1] neighborhoods. Wondering why this is the case.

kgajdamo · 2022-09-23T12:56:21Z

> This is great, thank you very much!
>
> It's interesting that torch-sparse performs better on [-1] neighborhoods. Wondering why this is the case.

Thanks @rusty1s for the updates. I have a question about your changes. In the hetero_neighbor.py file, for the pytorch_sparse sampler case, you put some declarations such as node_types, edge_types, etc. inside the loop. These declarations increase the measured time, and there is probably no need to repeat them on every iteration. Was it on purpose?
Another question: should I add the dgl hetero neighbor sampler to the script?
I would also like to ask why we take the matrix in CSR format but the variable names are colptr and row.
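The first question comes down to what falls inside the timed region. A minimal sketch of the two options follows, assuming hypothetical `setup` and `sample` stand-ins rather than the actual benchmark code:

```python
import time


def benchmark(sample, setup, num_iters, time_setup):
    """Return total measured seconds over `num_iters` iterations.

    `time_setup` decides whether the per-iteration setup is measured too.
    """
    total = 0.0
    for _ in range(num_iters):
        if time_setup:
            start = time.perf_counter()
            args = setup()
        else:
            args = setup()
            start = time.perf_counter()
        sample(*args)
        total += time.perf_counter() - start
    return total


def setup():
    # Stand-in for per-call preparation such as building node_types/edge_types.
    time.sleep(0.002)
    return (list(range(100)),)


def sample(nodes):
    # Stand-in for the sampler call itself.
    return sum(nodes)


with_setup = benchmark(sample, setup, num_iters=5, time_setup=True)
without_setup = benchmark(sample, setup, num_iters=5, time_setup=False)
print(f'setup timed: {with_setup:.4f}s, setup excluded: {without_setup:.4f}s')
```

Which variant is appropriate depends on whether the compared implementation performs equivalent preparation inside its own call.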

kgajdamo · 2022-09-23T14:39:03Z

> This is great, thank you very much!
>
> It's interesting that torch-sparse performs better on [-1] neighborhoods. Wondering why this is the case.

Yes, the pytorch_sparse sampler is faster when sampling all one-hop neighbors. Out of curiosity, I checked num_neighbors=[-1, -1], and in that case pyg-lib is ~2 times faster. Here are the results:

rusty1s · 2022-09-25T08:23:57Z

> Was it on purpose?

Yes, IMO that makes the comparison fairer, as our Python wrapper around pyg-lib's sample code does the same.

> Another question is should I add dgl hetero neighbor sampler to the script?

Yes, this would be very valuable.

> And also I would like to ask why we take the matrix in csr format but the variable names are colptr and row?

We need to use CSC here since that is the only format torch-sparse supports. Note that we take the CSR format of the transposed adjacency matrix, which corresponds to CSC.
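The equivalence between the CSR format of the transposed matrix and the CSC format of the original can be checked on a small example. This is a pure-Python sketch with hypothetical helper names (`to_csr`, `to_csc`), not code from pyg-lib or torch-sparse:

```python
def to_csr(num_rows, rows, cols):
    """Build (rowptr, col) from COO edge lists, compressing along rows."""
    rowptr = [0] * (num_rows + 1)
    for r in rows:
        rowptr[r + 1] += 1
    for i in range(num_rows):
        rowptr[i + 1] += rowptr[i]
    # Order edges by row, then by column within each row.
    order = sorted(range(len(rows)), key=lambda e: (rows[e], cols[e]))
    return rowptr, [cols[e] for e in order]


def to_csc(num_cols, rows, cols):
    """Build (colptr, row) from COO edge lists, compressing along columns."""
    colptr = [0] * (num_cols + 1)
    for c in cols:
        colptr[c + 1] += 1
    for i in range(num_cols):
        colptr[i + 1] += colptr[i]
    # Order edges by column, then by row within each column.
    order = sorted(range(len(cols)), key=lambda e: (cols[e], rows[e]))
    return colptr, [rows[e] for e in order]


# A 3x3 adjacency matrix with edges 0->1, 0->2, 1->2, 2->0.
rows = [0, 0, 1, 2]
cols = [1, 2, 2, 0]

# Transposing swaps the roles of rows and columns.
csr_of_transpose = to_csr(3, cols, rows)
csc_of_original = to_csc(3, rows, cols)
print(csr_of_transpose == csc_of_original)
```

This is why arrays produced by a CSR conversion of the transposed graph can be passed under the names colptr and row.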

kgajdamo · 2022-09-28T06:43:40Z

> > Was it on purpose?
>
> Yes, IMO that makes the comparison fairer, as our Python wrapper around pyg-lib's sample code does the same.
>
> > Another question is should I add dgl hetero neighbor sampler to the script?
>
> Yes, this would be very valuable.
>
> > And also I would like to ask why we take the matrix in csr format but the variable names are colptr and row?
>
> We need to use CSC here since that is the only format torch-sparse supports. Note that we take the CSR format of the transposed adjacency matrix, which corresponds to CSC.

Right, thank you for the detailed answer :)

Add hetero neighbor sampler benchmark

f31d6f3

kgajdamo requested a review from rusty1s September 19, 2022 13:11

rusty1s reviewed Sep 20, 2022


kgajdamo force-pushed the hetero_neighbor_bench branch from 596e57a to f904d98 Compare September 22, 2022 13:18

kgajdamo and others added 5 commits September 22, 2022 15:28

get dataset using decorator, change matrix format to csr

ed87a1d

update CHANGELOG.md

f904d98

Merge branch 'master' into hetero_neighbor_bench

5200972

update

00c1e29

typo

c643086

rusty1s approved these changes Sep 23, 2022


Merge branch 'master' into hetero_neighbor_bench

5a7ef64

update

b7a9105

rusty1s enabled auto-merge (squash) September 23, 2022 09:32

rusty1s assigned kgajdamo Sep 23, 2022

rusty1s added the 0 - Priority P0, benchmark, and sampler labels Sep 23, 2022

rusty1s merged commit d0ce8be into pyg-team:master Sep 23, 2022

kgajdamo deleted the hetero_neighbor_bench branch November 3, 2023 07:56


3 participants
