fixes an issue with macro directives for `!$acc kernels` #926

sbryngelson · 2025-07-08T14:07:48Z

User description

#883 introduced a problem: it substituted `!$acc parallel` for `!$acc kernels` in three places. It turns out this does not work on NVHPC. This PR is a stopgap fix. @prathi-wind will follow up with a more thorough fix.

PR Type

Bug fix

Description

Replace GPU_PARALLEL macro with !$acc kernels directives
Fix NVHPC compiler compatibility issues
Update OpenACC directives in data output and time stepping modules

Changes diagram

flowchart LR
  A["GPU_PARALLEL macro"] -- "replace with" --> B["!$acc kernels directives"]
  B --> C["NVHPC compatibility"]

Changes walkthrough 📝

Relevant files

Bug fix

File	Change	Lines
m_data_output.fpp (src/simulation/m_data_output.fpp)	Update OpenACC directives in data output module: replace `#:call GPU_PARALLEL()` with `!$acc kernels` for the `icfl_max_loc` calculation; update the viscous flow section with `!$acc kernels` directives for `vcfl_max_loc` and `Rc_min_loc`	+7/-7
m_time_steppers.fpp (src/simulation/m_time_steppers.fpp)	Fix OpenACC directives in time stepping module: replace `#:call GPU_PARALLEL()` with `!$acc kernels` for the `dt_local` calculation	+3/-3


qodo-merge-pro · 2025-07-08T14:08:11Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Duplicate Code

The same pattern of wrapping maxval/minval operations with !$acc kernels directives is repeated multiple times. This could be refactored into a reusable macro or subroutine to reduce code duplication and improve maintainability.

        !$acc kernels
        icfl_max_loc = maxval(icfl_sf)
        !$acc end kernels

        if (viscous) then
            !$acc kernels
            vcfl_max_loc = maxval(vcfl_sf)
            Rc_min_loc = minval(Rc_sf)
            !$acc end kernels
        end if

Performance Concern

Using !$acc kernels for a single minval operation may not provide optimal GPU performance compared to !$acc parallel. The kernels directive relies on compiler analysis, which may not be as efficient for simple reduction operations.

        !$acc kernels
        dt_local = minval(max_dt)
        !$acc end kernels

Copilot

Pull Request Overview

This PR replaces the custom GPU_PARALLEL() macro with raw OpenACC kernels directives as a stopgap for NVHPC compatibility.

Substituted #:call GPU_PARALLEL()/#:endcall GPU_PARALLEL with !$acc kernels/!$acc end kernels
Applied the change in both the time-stepping and data-output modules

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
src/simulation/m_time_steppers.fpp	Replaced GPU_PARALLEL macro around `minval(max_dt)` with kernels
src/simulation/m_data_output.fpp	Replaced GPU_PARALLEL macro around `maxval`/`minval` calls

Comments suppressed due to low confidence (3)

src/simulation/m_time_steppers.fpp:996

Wrapping a scalar minval call in a kernels region may incur unnecessary kernel launch overhead and may not generate a reduction on the device. Consider using !$acc parallel loop reduction(min:dt_local) around the explicit loop over max_dt with a collapse if multiple dimensions are involved.

        !$acc kernels

src/simulation/m_data_output.fpp:319

Enclosing maxval(icfl_sf) in a kernels region may not produce an efficient reduction; consider converting this to a !$acc parallel loop reduction(max:icfl_max_loc) over the underlying array to leverage device-side reductions.

        !$acc kernels

src/simulation/m_data_output.fpp:323

This kernels region wraps two scalar reductions (vcfl_max_loc, Rc_min_loc); you may get better performance by using a combined !$acc parallel loop reduction(max:vcfl_max_loc) reduction(min:Rc_min_loc) over the loop indices instead of kernels.

            !$acc kernels
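The explicit-reduction alternative raised in the three suppressed comments above could look roughly like the sketch below. It is only an illustration: the module name `m_reduction_sketch`, the routines `min_reduce_3d` and `max_min_reduce_3d`, the kind parameter `wp`, and the assumption that the fields are rank-3 arrays are all hypothetical and not taken from the MFC source. Whether this form compiles and performs well under NVHPC, or fits the project's macro layer, has not been verified here.

```fortran
! Sketch only: explicit OpenACC reductions as an alternative to wrapping
! maxval/minval in a kernels region. All names are hypothetical, not MFC's.
module m_reduction_sketch
    implicit none
    private
    integer, parameter, public :: wp = kind(1.0d0)
    public :: min_reduce_3d, max_min_reduce_3d
contains

    ! Minimum of a rank-3 field via an explicit reduction loop
    ! (compare dt_local = minval(max_dt) in the PR).
    function min_reduce_3d(field) result(fmin)
        real(wp), intent(in) :: field(:, :, :)
        real(wp) :: fmin
        integer :: j, k, l

        fmin = huge(1.0_wp)
        !$acc parallel loop collapse(3) reduction(min:fmin) copyin(field)
        do l = 1, size(field, 3)
            do k = 1, size(field, 2)
                do j = 1, size(field, 1)
                    fmin = min(fmin, field(j, k, l))
                end do
            end do
        end do
    end function min_reduce_3d

    ! Combined max/min reduction in a single loop nest
    ! (compare vcfl_max_loc and Rc_min_loc in the PR).
    ! Assumes a and b have the same shape.
    subroutine max_min_reduce_3d(a, b, amax, bmin)
        real(wp), intent(in) :: a(:, :, :), b(:, :, :)
        real(wp), intent(out) :: amax, bmin
        integer :: j, k, l

        amax = -huge(1.0_wp)
        bmin = huge(1.0_wp)
        !$acc parallel loop collapse(3) reduction(max:amax) reduction(min:bmin) copyin(a, b)
        do l = 1, size(a, 3)
            do k = 1, size(a, 2)
                do j = 1, size(a, 1)
                    amax = max(amax, a(j, k, l))
                    bmin = min(bmin, b(j, k, l))
                end do
            end do
        end do
    end subroutine max_min_reduce_3d

end module m_reduction_sketch
```

Without OpenACC enabled, the directives are treated as comments and the loops run serially, so the routines return the same values as `minval` and `maxval`. The `copyin` clauses are a simplification for a standalone example; in a code that already keeps its fields resident on the device, a `present` or `default(present)` clause would likely be more appropriate.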

qodo-merge-pro · 2025-07-08T14:08:38Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Separate reduction operations into individual kernels	Low

Consider using separate kernel regions for each reduction operation. Multiple reductions in a single kernel may not be optimal for GPU performance and could lead to synchronization issues.

src/simulation/m_data_output.fpp [323-326]

         !$acc kernels
         vcfl_max_loc = maxval(vcfl_sf)
        +!$acc end kernels
        +!$acc kernels
         Rc_min_loc = minval(Rc_sf)
         !$acc end kernels

Suggestion importance[1-10]: 5

Why: This is a valid performance consideration, as splitting multiple reduction operations into separate kernels can sometimes improve GPU performance, although the benefit is not guaranteed and depends on the compiler and hardware.

codecov · 2025-07-08T16:11:01Z

Codecov Report

Attention: Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 43.71%. Comparing base (0900648) to head (534a8a9).
Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
src/simulation/m_data_output.fpp	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #926      +/-   ##
==========================================
+ Coverage   43.68%   43.71%   +0.02%     
==========================================
  Files          68       68              
  Lines       18363    18360       -3     
  Branches     2295     2292       -3     
==========================================
+ Hits         8022     8026       +4     
+ Misses       8949     8945       -4     
+ Partials     1392     1389       -3


kernels fix

534a8a9

Copilot AI review requested due to automatic review settings July 8, 2025 14:07

sbryngelson requested a review from a team as a code owner July 8, 2025 14:07

qodo-merge-pro bot added the Review effort 2/5 label Jul 8, 2025

Copilot AI reviewed Jul 8, 2025


sbryngelson mentioned this pull request Jul 8, 2025

Metadirectives kernels fixup #927

Closed

sbryngelson merged commit 8026a1c into MFlowCode:master Jul 8, 2025
25 of 43 checks passed

sbryngelson deleted the kernels branch July 10, 2025 00:17

prathi-wind pushed a commit to prathi-wind/MFC-prathi that referenced this pull request Jul 13, 2025

kernels fix (MFlowCode#926)

9317f8d
