
Conversation

@srkreddy1238
Contributor

@srkreddy1238 srkreddy1238 commented Jan 21, 2025

@srkreddy1238
Contributor Author

@tvm-bot rerun

@tqchen
Member

tqchen commented Jan 25, 2025

@Hzfengsy do you mind taking a look, given that it touches FuseOps/TIR?

@tqchen
Member

tqchen commented Jan 25, 2025

also cc @yongwww for memory scope related changes

Member

@Hzfengsy Hzfengsy left a comment


some initial comments

@srkreddy1238
Contributor Author

@tvm-bot rerun

Member

@Hzfengsy Hzfengsy left a comment


LGTM

@tqchen
Member

tqchen commented Feb 3, 2025

Thanks @srkreddy1238 for the updates. I took a closer look and now understand the motivation behind add_attributes. It is mainly there to handle conv2d operators where texture can be supported.

However, attaching op attributes to call_tir does have a less desirable impact: the specification of call_tir originally does not have to deal with these attributes, and having them results in a "leak through". This would increase the surface area for developers working with call_tir.

I also now understand that the demand is to let the final fused call_tir function decide whether texture memory is feasible.

I think it is cleaner to try a different approach. Instead of relying on the legalize pass, let us introduce an Adreno-specific conv_dispatch that can be used before legalize to offload these conv operators. We specifically attach the attribute tir.opencl_texture_2d_supported = true to the call node.

Now the remaining question is where the schedule can appear:

  • The cleanest way is to have relax.adreno.conv_dispatch call the dlight schedule, construct the call_tir, and mark it as already scheduled. The only issue is that the follow-up FuseOps/TIR passes must then treat it as opaque and cannot yet run further fusions on it. But we should be fine getting the right conv2d op scheduled.

To further enable fusion, one can try adopting the following customized legalize sequence:
- S0: relax.adreno.conv_dispatch: run conv dispatch and mark the result as opaque with tir.opencl_texture_2d_supported = true
- S1: Run legalize and analysis
- S2: Do a pattern match to manually fuse the ewise ops onto the conv2d by creating a sub function that calls into conv2d and then ewise; this sub function can then be consumed by FuseTIR
- S3: Run FuseOps (this will try to fuse the other ops)
- S4: Run FuseTIR
- S5: Run dlight
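
As a rough illustration of the ordering above, here is a toy model in plain Python. It does not use TVM: `Op`, `conv_dispatch`, `fuse_conv_ewise`, the `"opaque"` key, and the `EWISE_OPS` set are hypothetical names chosen for this sketch, and ops are plain records rather than Relax calls. It only shows S0 (mark conv2d as opaque with the texture attribute) followed by S2 (wrap a marked conv2d and the elementwise op right after it into one group).

```python
from dataclasses import dataclass, field

TEXTURE_ATTR = "tir.opencl_texture_2d_supported"
EWISE_OPS = {"add", "relu", "multiply"}  # assumed set, for illustration only


@dataclass
class Op:
    name: str
    attrs: dict = field(default_factory=dict)
    body: list = field(default_factory=list)  # non-empty only for fused groups


def conv_dispatch(ops):
    """S0: mark every conv2d as already scheduled (opaque) and texture-capable."""
    out = []
    for op in ops:
        if op.name == "conv2d":
            # Build a new record so the input graph is left untouched.
            op = Op(op.name, {**op.attrs, TEXTURE_ATTR: True, "opaque": True})
        out.append(op)
    return out


def fuse_conv_ewise(ops):
    """S2: wrap each marked conv2d and the elementwise op right after it."""
    out, i = [], 0
    while i < len(ops):
        op = ops[i]
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if op.attrs.get(TEXTURE_ATTR) and nxt is not None and nxt.name in EWISE_OPS:
            group = Op("fused_conv2d_" + nxt.name, {TEXTURE_ATTR: True}, [op, nxt])
            out.append(group)
            i += 2
        else:
            out.append(op)
            i += 1
    return out


def run_sequence(ops):
    # S1 (legalize) and the later FuseOps / FuseTIR / dlight steps are omitted.
    return fuse_conv_ewise(conv_dispatch(ops))


graph = [Op("conv2d"), Op("relu"), Op("softmax")]
result = run_sequence(graph)
print([op.name for op in result])
```

In this model the conv2d and relu end up in one group that carries the texture attribute, while softmax is left for the generic fusion steps that would follow.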

@srkreddy1238
Contributor Author

I realized late that I could have drafted an RFC to describe the approach. I have done so now: https://discuss.tvm.apache.org/t/rfc-annotate-custom-scope-layout-relax-pass-for-adreno-gpu/18052

@tqchen thanks for the thoughts. I have a few concerns about this approach:

  • tir.opencl_texture_2d_supported = true: I assume this flag will be used to realize the VDevice in struct_info after FuseTIR. In that case the flag alone may not be sufficient; we might need scope information for each input, and this information has to stay consistent as we pass through the fusion ops.
  • Another moderate challenge is in S2, where we need to define and maintain a BYOC-like pattern table to ensure maximum fusion possibilities.
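
To make the first concern concrete, here is a small plain-Python sketch (not TVM code) of why a single boolean may carry too little information. `ScopedCall` and `scopes_consistent` are hypothetical names, and the scope strings are illustrative values only; the point is that each input can need its own scope, and that a producer's output scope has to match what the consumer expects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCall:
    name: str
    input_scopes: tuple  # one memory scope per input, in argument order
    output_scope: str


def scopes_consistent(producer, consumer, arg_index):
    """True if the producer's output scope matches the consumer's expected input scope."""
    return producer.output_scope == consumer.input_scopes[arg_index]


# Illustrative scope strings; a single true/false flag could not tell the two
# conv2d inputs apart.
conv = ScopedCall("conv2d", ("global.texture", "global.texture-weight"), "global.texture")
relu = ScopedCall("relu", ("global.texture",), "global.texture")
softmax = ScopedCall("softmax", ("global",), "global")

print(scopes_consistent(conv, relu, 0))     # conv2d output feeds relu input 0
print(scopes_consistent(relu, softmax, 0))  # mismatch: a copy or re-annotation would be needed
```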

Please advise.

@srkreddy1238 srkreddy1238 force-pushed the annotate_texture_scope branch 2 times, most recently from c1b22d5 to 57e46e0 February 26, 2025 17:41
@srkreddy1238
Contributor Author

@srkreddy1238 srkreddy1238 requested a review from tqchen February 27, 2025 02:45
@srkreddy1238
Contributor Author

@tvm-bot rerun

@srkreddy1238 srkreddy1238 force-pushed the annotate_texture_scope branch from 294b7a6 to 3b402e9 March 11, 2025 10:09
@srkreddy1238
Contributor Author

@tqchen can you take a look at this?

Member

@tqchen tqchen left a comment


Thanks @srkreddy1238, sorry for the delayed review. I think we are getting close. The main comments are about test cases and about grouping some of the Adreno-specific passes under relax/backend.

@srkreddy1238 srkreddy1238 force-pushed the annotate_texture_scope branch from 3b402e9 to 68b615b March 25, 2025 19:44
@srkreddy1238
Contributor Author

@tvm-bot rerun

@srkreddy1238 srkreddy1238 requested a review from tqchen March 26, 2025 14:59
@srkreddy1238 srkreddy1238 force-pushed the annotate_texture_scope branch from 9479ee9 to a43d9f7 July 21, 2025 07:13
@srkreddy1238 srkreddy1238 force-pushed the annotate_texture_scope branch from a43d9f7 to f1d6847 July 21, 2025 07:41
@srkreddy1238 srkreddy1238 requested a review from tqchen July 26, 2025 15:56