This repository was archived by the owner on Dec 22, 2022. It is now read-only.

Conversation

@rheacangeo
Contributor

@rheacangeo rheacangeo commented May 13, 2021

Purpose

a2b_ord4 was made into a class, but the stencils were only partly merged toward the single-stencil PR version. This PR takes another swing at it to try to improve speed and GPU utilization. It still seems to need to be split into multiple stencils to validate on the gt backends, but the blocking computation dependence is removed: the different regions no longer rely on each other, because components are recomputed rather than read from a previously computed offset value. The qxx and qyy temporaries are removed. Corner computations are done in a larger stencil rather than one at a time. Putting 2 together worked and no longer ran into the NaN issue that had been happening before using gtscript functions in regions. Putting all 4 together results in an error with the gtcuda backend: excessive recursion at instantiation of class "gridtools::meta::lazy::lfold<gridtools::meta::dedup_step_impl
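
To illustrate the "recompute rather than read a previously computed offset value" idea, here is a minimal plain-Python sketch. It is not the a2b_ord4 code: `edge_value`, `staged`, and `recomputed` are hypothetical names, and the 2-point average is only a stand-in for the real interpolation.

```python
def edge_value(q, i):
    """Hypothetical stand-in: 2-point average at the edge between cells i-1 and i."""
    return 0.5 * (q[i - 1] + q[i])


def staged(qin):
    """Two passes: the second pass reads a temporary at an offset."""
    n = len(qin)
    # Pass 1: fill a temporary field at every interior edge.
    qx = [0.0] * n
    for i in range(1, n):
        qx[i] = edge_value(qin, i)
    # Pass 2: reads qx[i + 1], so it cannot start until pass 1 has finished.
    qout = [0.0] * n
    for i in range(1, n - 1):
        qout[i] = 0.5 * (qx[i] + qx[i + 1])
    return qout


def recomputed(qin):
    """One pass: both edge values are recomputed from the input, no temporary."""
    n = len(qin)
    qout = [0.0] * n
    for i in range(1, n - 1):
        qout[i] = 0.5 * (edge_value(qin, i) + edge_value(qin, i + 1))
    return qout


qin = [float(i * i) for i in range(8)]
assert staged(qin) == recomputed(qin)
```

The single-pass version does some arithmetic twice, but each output point depends only on the input field, so no pass has to wait on a temporary written by another.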

Code changes:

  • stencil refactors to a2b_ord4

Checklist

Before submitting this PR, please make sure:

@rheacangeo rheacangeo marked this pull request as draft May 13, 2021 00:40
@rheacangeo rheacangeo changed the title more a2b_ord4 [WIP] a2b_ord4 with less stencils but still validating and not slower May 13, 2021
@rheacangeo
Contributor Author

launch jenkins

@rheacangeo
Contributor Author

launch jenkins

@rheacangeo rheacangeo marked this pull request as ready for review May 20, 2021 04:36
@rheacangeo rheacangeo changed the title [WIP] a2b_ord4 with less stencils but still validating and not slower a2b_ord4 with less stencils but still validating and not slower May 20, 2021
@rheacangeo rheacangeo requested a review from twicki May 20, 2021 04:36
qin[1, 1, 0],
)
qout = (ec1 + ec2 + ec3) * (1.0 / 3.0)
tmp = 0.0
Contributor

is it still an issue to have regions first?

If we need this, should we put a TODO comment here?

Contributor Author

It is still an issue: a horizontalIf error is raised. I added a TODO.

from __externals__ import i_end, i_start

with computation(PARALLEL), interval(...):
# ppm_volume_mean_x
Contributor

can we make this a docstring?

Contributor Author

I now use a gtscript function instead

@rheacangeo
Contributor Author

blarg is 10x slower?!

@rheacangeo rheacangeo marked this pull request as draft May 28, 2021 17:22
eddie-c-davis pushed a commit that referenced this pull request Jun 8, 2021
@jdahm jdahm added the inactive Not currently worked on label Oct 13, 2021
4 participants