-
Notifications
You must be signed in to change notification settings - Fork 78
generate_loop_schedule_v2 #350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
inducer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts from a quick look here.
loopy/schedule/__init__.py
Outdated
| tree = get_loop_nest_tree(kernel) | ||
|
|
||
| iname_to_insns = kernel.iname_to_insns() | ||
| loop_nest_around_map = defaultdict(frozenset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my view, the loop_nest_around_map should go. The loop nest tree is a strictly more informative data structure, and it is not of size O(n^2). I think we should get rid of it. I looked through the uses, and it shouldn't be hard to replace, and it'll fix another bottleneck in linearization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new version, I didn't touch the old implementation (v1-scheduler) to keep things simpler in this PR. So, for such cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can handle it as a separate PR.
loopy/schedule/tools.py
Outdated
| Updates *tree* to make *inames_to_pull_out* a loop nesting level in | ||
| *loop_nests* | ||
| :returns: a :class:`tuple` ``(outer_loop_nest, inner_loop_nest)``, where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe the types of the arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
loopy/schedule/tools.py
Outdated
| Updates *tree* to make *inames_to_pull_out* a loop nesting level in | ||
| *loop_nests* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe which is nested "inside" and which is "outside".
96856f8 to
0203d09
Compare
|
Does #372 supersede this? |
e14a03d to
2d526dc
Compare
e8b03c8 to
72334e8
Compare
6945d16 to
d74da04
Compare
d74da04 to
1aef04f
Compare
1aef04f to
581e82e
Compare
e20c6ca to
931c6e9
Compare
931c6e9 to
36a1a2f
Compare
f10f24c to
c1f8e0d
Compare
c1f8e0d to
8c42d10
Compare
8c42d10 to
15cea20
Compare
15cea20 to
5b55bcd
Compare
5b55bcd to
f8dc959
Compare
f8dc959 to
80425bd
Compare
|
I was just rebasing this. You beat me to it! :) |
inducer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first look. I haven't really gotten very far into the core algorithm, because I'm not sure I understood _pull_out_loop_nest_tree, which seems like it's a core operation.
| return wrapper | ||
|
|
||
|
|
||
| # {{{ tree data structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is big enough to be its own file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#694 puts the Tree class into its own file (loopy/schedule/tree.py).
| return wrapper | ||
|
|
||
|
|
||
| # {{{ tree data structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this has type annotations, mypy should like them (and there should be CI saying as much).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, mypy checks and passes the annotations.
| @dataclass(frozen=True) | ||
| class Tree(Generic[T]): | ||
| """ | ||
| An immutable tree implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe the role of the type variable T. Specifically describe that there's one Tree object, but many nodes T. Maybe rename T to NodeT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c53a9af hopefully clarifies this (and renames T to NodeT).
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class Tree(Generic[T]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this mature enough to be in pytools?
loopy/tools.py
Outdated
| _parent_to_children: "PMap[T, FrozenSet[T]]" | ||
| _child_to_parent: "PMap[T, Optional[T]]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of data structure, this is a forest (i.e. it allows multiple trees). It wouldn't take much to allow that fully... YAGNI though.
|
|
||
| loop_nests = sorted(loop_nests, key=lambda nest: tree.depth(nest)) | ||
|
|
||
| for outer, inner in zip(loop_nests[:-1], loop_nests[1:]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that all you'd need to pass here is the innermost node in the loop nest tree, and tree.ancestors (at least the proposed changed version) would cheaply give you this without the need to enforce these invariants.
inducer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Finally made it all the way through. There's a bit to digest here, but as far as I remember, it's mostly superficial. I didn't find any flaws in the thinking. Nice job!
| if outer_loop == "": | ||
| continue | ||
|
|
||
| for child in loop_nest_tree.children(outer_loop): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment that this only works because loop_nest_tree contains a total order of loops. Otherwise it might windup with mismatched enter/leaves.
|
Unsubscribing... @-mention or request review once it's ready for a look or needs attention. But I'm eager to get this in soon! |
65a6b62 to
c21c092
Compare
c21c092 to
fadb652
Compare
|
Most review comments from here are addressed in #864. |
Implementation for finding loop nest around map in O(N.k), 'N' being the number of inames and 'k' being the max. loop depth.
For comparison, let's consider the kernel in #288: on
mainthis map in computed in 5 minutes and this branch takes300.4 seconds.