[Flang][OpenMP] Support for lowering task_reduction and in_reduction to MLIR #111155

kaviya2510 · 2024-10-04T13:47:54Z

Adds support for lowering task_reduction and in_reduction to MLIR.
Also fixes the issue below, which was observed while lowering in_reduction to MLIR.

The following test case does not generate the expected MLIR.

test.f90
subroutine omp_task_in_reduction()
   integer i
   i = 0
   !$omp task in_reduction(+:i)
   i = i + 1
   !$omp end task
end subroutine omp_task_in_reduction

The lowering adds the omp.private operation shown below to the generated MLIR. The issue is not observed when the in_reduction is enclosed in a parallel construct.
This issue has also been resolved as part of this PR.

  omp.private {type = firstprivate} @_QFomp_task_in_reductionEi_firstprivate_ref_i32 : !fir.ref<i32> alloc {
  ^bb0(%arg0: !fir.ref<i32>):
    %0 = fir.alloca i32 {bindc_name = "i", pinned, uniq_name = "_QFomp_task_in_reductionEi"}
    %1:2 = hlfir.declare %0 {uniq_name = "_QFomp_task_in_reductionEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
    omp.yield(%1#0 : !fir.ref<i32>)
  } copy {
  ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
    %0 = fir.load %arg0 : !fir.ref<i32>
    hlfir.assign %0 to %arg1 : i32, !fir.ref<i32>
    omp.yield(%arg1 : !fir.ref<i32>)
  }

llvmbot · 2024-10-04T13:48:46Z

@llvm/pr-subscribers-flang-semantics
@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-flang-fir-hlfir

Author: Kaviya Rajendiran (kaviya2510)

Changes

This patch supports lowering of task_reduction and in_reduction to MLIR

Patch is 22.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111155.diff

8 Files Affected:

(modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+70-1)
(modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+10)
(modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+41-13)
(modified) flang/lib/Lower/OpenMP/ReductionProcessor.cpp (+46-11)
(modified) flang/lib/Lower/OpenMP/ReductionProcessor.h (+2-1)
(added) flang/test/Lower/OpenMP/task_array_reduction.f90 (+50)
(added) flang/test/Lower/OpenMP/task_in_reduction.f90 (+48)
(added) flang/test/Lower/OpenMP/task_reduction.f90 (+43)

diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a4d2524bccf5c3..95ab51809dcf94 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -1063,7 +1063,7 @@ bool ClauseProcessor::processReduction(
         llvm::SmallVector<mlir::Attribute> reductionDeclSymbols;
         llvm::SmallVector<const semantics::Symbol *> reductionSyms;
         ReductionProcessor rp;
-        rp.addDeclareReduction(
+        rp.addDeclareReduction<omp::clause::Reduction>(
             currentLocation, converter, clause, reductionVars, reduceVarByRef,
             reductionDeclSymbols, outReductionSyms ? &reductionSyms : nullptr);
 
@@ -1085,6 +1085,75 @@ bool ClauseProcessor::processReduction(
       });
 }
 
+bool ClauseProcessor::processTaskReduction(
+    mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result,
+    llvm::SmallVectorImpl<mlir::Type> *outReductionTypes,
+    llvm::SmallVectorImpl<const semantics::Symbol *> *outReductionSyms) const {
+  return findRepeatableClause<omp::clause::TaskReduction>(
+      [&](const omp::clause::TaskReduction &clause, const parser::CharBlock &) {
+        llvm::SmallVector<mlir::Value> taskReductionVars;
+        llvm::SmallVector<bool> taskReductionByref;
+        llvm::SmallVector<mlir::Attribute> taskReductionDeclSymbols;
+        llvm::SmallVector<const semantics::Symbol *> taskReductionSyms;
+        ReductionProcessor rp;
+        rp.addDeclareReduction<omp::clause::TaskReduction>(
+            currentLocation, converter, clause, taskReductionVars, taskReductionByref,
+            taskReductionDeclSymbols, outReductionSyms ? &taskReductionSyms : nullptr);
+
+        // Copy local lists into the output.
+        llvm::copy(taskReductionVars, std::back_inserter(result.taskReductionVars));
+        llvm::copy(taskReductionByref, std::back_inserter(result.taskReductionByref));
+        llvm::copy(taskReductionDeclSymbols,
+                   std::back_inserter(result.taskReductionSyms));
+
+        if (outReductionTypes) {
+          outReductionTypes->reserve(outReductionTypes->size() +
+                                     taskReductionVars.size());
+          llvm::transform(taskReductionVars, std::back_inserter(*outReductionTypes),
+                          [](mlir::Value v) { return v.getType(); });
+        }
+
+        if (outReductionSyms)
+          llvm::copy(taskReductionSyms, std::back_inserter(*outReductionSyms));
+      });
+}
+
+bool ClauseProcessor::processInReduction(
+    mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result,
+    llvm::SmallVectorImpl<mlir::Type> *outReductionTypes,
+    llvm::SmallVectorImpl<const semantics::Symbol *> *outReductionSyms) const {
+  return findRepeatableClause<omp::clause::InReduction>(
+      [&](const omp::clause::InReduction &clause,
+          const parser::CharBlock &source) {
+        llvm::SmallVector<mlir::Value> inReductionVars;
+        llvm::SmallVector<bool> inReductionByref;
+        llvm::SmallVector<mlir::Attribute> inReductionDeclSymbols;
+        llvm::SmallVector<const semantics::Symbol *> inReductionSyms;
+        ReductionProcessor rp;
+        rp.addDeclareReduction<omp::clause::InReduction>(
+            currentLocation, converter, clause, inReductionVars,
+            inReductionByref, inReductionDeclSymbols,
+            outReductionSyms ? &inReductionSyms : nullptr);
+
+        // Copy local lists into the output.
+        llvm::copy(inReductionVars, std::back_inserter(result.inReductionVars));
+        llvm::copy(inReductionByref, std::back_inserter(result.inReductionByref));
+        llvm::copy(inReductionDeclSymbols,
+                   std::back_inserter(result.inReductionSyms));
+
+        if (outReductionTypes) {
+          outReductionTypes->reserve(outReductionTypes->size() +
+                                     inReductionVars.size());
+          llvm::transform(inReductionVars,
+                          std::back_inserter(*outReductionTypes),
+                          [](mlir::Value v) { return v.getType(); });
+        }
+
+        if (outReductionSyms)
+          llvm::copy(inReductionSyms, std::back_inserter(*outReductionSyms));
+      });
+}
+
 bool ClauseProcessor::processTo(
     llvm::SmallVectorImpl<DeclareTargetCapturePair> &result) const {
   return findRepeatableClause<omp::clause::To>(
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 0c8e7bd47ab5a6..04416a927a1c37 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -129,6 +129,16 @@ class ClauseProcessor {
       llvm::SmallVectorImpl<mlir::Type> *reductionTypes = nullptr,
       llvm::SmallVectorImpl<const semantics::Symbol *> *reductionSyms =
           nullptr) const;
+  bool processTaskReduction(
+      mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result,
+      llvm::SmallVectorImpl<mlir::Type> *taskReductionTypes = nullptr,
+      llvm::SmallVectorImpl<const semantics::Symbol *> *taskReductionSyms =
+          nullptr) const;
+  bool processInReduction(
+      mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result,
+      llvm::SmallVectorImpl<mlir::Type> *inReductionTypes = nullptr,
+      llvm::SmallVectorImpl<const semantics::Symbol *> *inReductionSyms =
+          nullptr) const;
   bool processTo(llvm::SmallVectorImpl<DeclareTargetCapturePair> &result) const;
   bool processUseDeviceAddr(
       lower::StatementContext &stmtCtx,
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 60c83586e468b6..850f32ff0bf030 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1244,29 +1244,34 @@ static void genTaskClauses(lower::AbstractConverter &converter,
                            semantics::SemanticsContext &semaCtx,
                            lower::StatementContext &stmtCtx,
                            const List<Clause> &clauses, mlir::Location loc,
-                           mlir::omp::TaskOperands &clauseOps) {
+                           mlir::omp::TaskOperands &clauseOps,
+                           llvm::SmallVectorImpl<mlir::Type> &inReductionTypes,
+                           llvm::SmallVectorImpl<const semantics::Symbol *> &inReductionSyms) {
   ClauseProcessor cp(converter, semaCtx, clauses);
   cp.processAllocate(clauseOps);
   cp.processDepend(clauseOps);
   cp.processFinal(stmtCtx, clauseOps);
   cp.processIf(llvm::omp::Directive::OMPD_task, clauseOps);
+  cp.processInReduction(loc, clauseOps, &inReductionTypes,
+                          &inReductionSyms);
   cp.processMergeable(clauseOps);
   cp.processPriority(stmtCtx, clauseOps);
   cp.processUntied(clauseOps);
   // TODO Support delayed privatization.
 
-  cp.processTODO<clause::Affinity, clause::Detach, clause::InReduction>(
+  cp.processTODO<clause::Affinity, clause::Detach>(
       loc, llvm::omp::Directive::OMPD_task);
 }
 
 static void genTaskgroupClauses(lower::AbstractConverter &converter,
-                                semantics::SemanticsContext &semaCtx,
-                                const List<Clause> &clauses, mlir::Location loc,
-                                mlir::omp::TaskgroupOperands &clauseOps) {
+    semantics::SemanticsContext &semaCtx,
+    const List<Clause> &clauses, mlir::Location loc,
+    mlir::omp::TaskgroupOperands &clauseOps,
+    llvm::SmallVectorImpl<mlir::Type> &taskReductionTypes,
+    llvm::SmallVectorImpl<const semantics::Symbol *> &taskReductionSyms) {
   ClauseProcessor cp(converter, semaCtx, clauses);
   cp.processAllocate(clauseOps);
-  cp.processTODO<clause::TaskReduction>(loc,
-                                        llvm::omp::Directive::OMPD_taskgroup);
+  cp.processTaskReduction(loc, clauseOps, &taskReductionTypes, &taskReductionSyms);
 }
 
 static void genTaskwaitClauses(lower::AbstractConverter &converter,
@@ -1866,13 +1871,26 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
           ConstructQueue::const_iterator item) {
   lower::StatementContext stmtCtx;
   mlir::omp::TaskOperands clauseOps;
-  genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps);
+  llvm::SmallVector<mlir::Type> inReductionTypes;
+  llvm::SmallVector<const semantics::Symbol *> inreductionSyms;
+  genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps,
+                 inReductionTypes, inreductionSyms);
+
+  auto reductionCallback = [&](mlir::Operation *op) {
+    genReductionVars(op, converter, loc, inreductionSyms, inReductionTypes);
+    return inreductionSyms;
+  };
 
-  return genOpWithBody<mlir::omp::TaskOp>(
+  auto taskOp = genOpWithBody<mlir::omp::TaskOp>(
       OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval,
                         llvm::omp::Directive::OMPD_task)
-          .setClauses(&item->clauses),
+          .setClauses(&item->clauses)
+          .setGenRegionEntryCb(reductionCallback),
       queue, item, clauseOps);
+  // Add reduction variables as arguments
+  llvm::SmallVector<mlir::Location> blockArgLocs(inReductionTypes.size(), loc);
+  taskOp->getRegion(0).addArguments(inReductionTypes, blockArgLocs);
+  return taskOp;
 }
 
 static mlir::omp::TaskgroupOp
@@ -1882,13 +1900,21 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
                const ConstructQueue &queue,
                ConstructQueue::const_iterator item) {
   mlir::omp::TaskgroupOperands clauseOps;
-  genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps);
+  llvm::SmallVector<mlir::Type> taskReductionTypes;
+  llvm::SmallVector<const semantics::Symbol *> taskReductionSyms;
+  genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps,
+                      taskReductionTypes, taskReductionSyms);
 
-  return genOpWithBody<mlir::omp::TaskgroupOp>(
+  auto taskgroupOp = genOpWithBody<mlir::omp::TaskgroupOp>(
       OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval,
                         llvm::omp::Directive::OMPD_taskgroup)
           .setClauses(&item->clauses),
       queue, item, clauseOps);
+
+  // Add reduction variables as arguments
+  llvm::SmallVector<mlir::Location> blockArgLocs(taskReductionSyms.size(), loc);
+  taskgroupOp->getRegion(0).addArguments(taskReductionTypes, blockArgLocs);
+  return taskgroupOp;
 }
 
 static mlir::omp::TaskwaitOp
@@ -2764,7 +2790,9 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
         !std::holds_alternative<clause::ThreadLimit>(clause.u) &&
         !std::holds_alternative<clause::Threads>(clause.u) &&
         !std::holds_alternative<clause::UseDeviceAddr>(clause.u) &&
-        !std::holds_alternative<clause::UseDevicePtr>(clause.u)) {
+        !std::holds_alternative<clause::UseDevicePtr>(clause.u) &&
+        !std::holds_alternative<clause::TaskReduction>(clause.u) &&
+        !std::holds_alternative<clause::InReduction>(clause.u)) {
       TODO(clauseLocation, "OpenMP Block construct clause");
     }
   }
diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index 9da15ba303a475..deb25b4fff3792 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -24,6 +24,7 @@
 #include "flang/Parser/tools.h"
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "llvm/Support/CommandLine.h"
+#include <type_traits>
 
 static llvm::cl::opt<bool> forceByrefReduction(
     "force-byref-reduction",
@@ -34,6 +35,38 @@ namespace Fortran {
 namespace lower {
 namespace omp {
 
+// explicit template declarations
+template void ReductionProcessor::addDeclareReduction<omp::clause::Reduction>(
+        mlir::Location currentLocation,
+        lower::AbstractConverter &converter,
+        const omp::clause::Reduction &reduction,
+        llvm::SmallVectorImpl<mlir::Value> &reductionVars,
+        llvm::SmallVectorImpl<bool> &reduceVarByRef,
+        llvm::SmallVectorImpl<mlir::Attribute> &reductionDeclSymbols,
+        llvm::SmallVectorImpl<const semantics::Symbol *> *reductionSymbols
+    );
+
+template void ReductionProcessor::addDeclareReduction<omp::clause::TaskReduction>(
+        mlir::Location currentLocation,
+        lower::AbstractConverter &converter,
+        const omp::clause::TaskReduction &reduction,
+        llvm::SmallVectorImpl<mlir::Value> &reductionVars,
+        llvm::SmallVectorImpl<bool> &reduceVarByRef,
+        llvm::SmallVectorImpl<mlir::Attribute> &reductionDeclSymbols,
+        llvm::SmallVectorImpl<const semantics::Symbol *> *reductionSymbols
+    );
+
+template void ReductionProcessor::addDeclareReduction<omp::clause::InReduction>(
+        mlir::Location currentLocation,
+        lower::AbstractConverter &converter,
+        const omp::clause::InReduction &reduction,
+        llvm::SmallVectorImpl<mlir::Value> &reductionVars,
+        llvm::SmallVectorImpl<bool> &reduceVarByRef,
+        llvm::SmallVectorImpl<mlir::Attribute> &reductionDeclSymbols,
+        llvm::SmallVectorImpl<const semantics::Symbol *> *reductionSymbols
+    );
+
+
 ReductionProcessor::ReductionIdentifier ReductionProcessor::getReductionType(
     const omp::clause::ProcedureDesignator &pd) {
   auto redType = llvm::StringSwitch<std::optional<ReductionIdentifier>>(
@@ -716,24 +749,26 @@ static bool doReductionByRef(mlir::Value reductionVar) {
   return false;
 }
 
-void ReductionProcessor::addDeclareReduction(
-    mlir::Location currentLocation, lower::AbstractConverter &converter,
-    const omp::clause::Reduction &reduction,
+template <class T>
+void ReductionProcessor::addDeclareReduction(mlir::Location currentLocation,
+    lower::AbstractConverter &converter,
+    const T &reduction,
     llvm::SmallVectorImpl<mlir::Value> &reductionVars,
     llvm::SmallVectorImpl<bool> &reduceVarByRef,
     llvm::SmallVectorImpl<mlir::Attribute> &reductionDeclSymbols,
     llvm::SmallVectorImpl<const semantics::Symbol *> *reductionSymbols) {
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-
-  if (std::get<std::optional<omp::clause::Reduction::ReductionModifier>>(
-          reduction.t))
-    TODO(currentLocation, "Reduction modifiers are not supported");
+  if constexpr (std::is_same<T, omp::clause::Reduction>::value) {
+      if (std::get<std::optional<typename T::ReductionModifier>>(
+            reduction.t))
+      TODO(currentLocation, "Reduction modifiers are not supported");
+    }
 
   mlir::omp::DeclareReductionOp decl;
-  const auto &redOperatorList{
-      std::get<omp::clause::Reduction::ReductionIdentifiers>(reduction.t)};
-  assert(redOperatorList.size() == 1 && "Expecting single operator");
-  const auto &redOperator = redOperatorList.front();
+    const auto &redOperatorList{
+      std::get<typename T::ReductionIdentifiers>(reduction.t)};
+    assert(redOperatorList.size() == 1 && "Expecting single operator");
+    const auto &redOperator = redOperatorList.front();
   const auto &objectList{std::get<omp::ObjectList>(reduction.t)};
 
   if (!std::holds_alternative<omp::clause::DefinedOperator>(redOperator.u)) {
diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.h b/flang/lib/Lower/OpenMP/ReductionProcessor.h
index 0ed5782e5da1b7..d34db0618c7cda 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.h
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.h
@@ -120,9 +120,10 @@ class ReductionProcessor {
 
   /// Creates a reduction declaration and associates it with an OpenMP block
   /// directive.
+  template <class T>
   static void addDeclareReduction(
       mlir::Location currentLocation, lower::AbstractConverter &converter,
-      const omp::clause::Reduction &reduction,
+      const T &reduction,
       llvm::SmallVectorImpl<mlir::Value> &reductionVars,
       llvm::SmallVectorImpl<bool> &reduceVarByRef,
       llvm::SmallVectorImpl<mlir::Attribute> &reductionDeclSymbols,
diff --git a/flang/test/Lower/OpenMP/task_array_reduction.f90 b/flang/test/Lower/OpenMP/task_array_reduction.f90
new file mode 100644
index 00000000000000..74693343744c26
--- /dev/null
+++ b/flang/test/Lower/OpenMP/task_array_reduction.f90
@@ -0,0 +1,50 @@
+! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s
+
+! CHECK-LABEL:  omp.declare_reduction @add_reduction_byref_box_Uxf32 : !fir.ref<!fir.box<!fir.array<?xf32>>> alloc {
+! [...]
+! CHECK:  omp.yield
+! CHECK-LABEL:  } init {
+! [...]
+! CHECK:  omp.yield
+! CHECK-LABEL:  } combiner {
+! [...]
+! CHECK:  omp.yield
+! CHECK-LABEL:  }  cleanup {
+! [...]
+! CHECK:  omp.yield
+! CHECK:  }
+
+! CHECK-LABEL:  func.func @_QPtaskreduction
+! CHECK-SAME:  (%[[VAL_0:.*]]: !fir.box<!fir.array<?xf32>> {fir.bindc_name = "x"}) {
+! CHECK:  %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK:  %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]]
+! CHECK-SAME  {uniq_name = "_QFtaskreductionEx"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> (!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
+! CHECK:  omp.parallel {
+! CHECK:  %[[VAL_3:.*]] = fir.alloca !fir.box<!fir.array<?xf32>>
+! CHECK:  fir.store %[[VAL_2]]#1 to %[[VAL_3]] : !fir.ref<!fir.box<!fir.array<?xf32>>>
+! CHECK:  omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] ->  %[[VAL_4:.*]]: !fir.ref<!fir.box<!fir.array<?xf32>>>) {
+! CHECK:  %[[VAL_5:.*]] = fir.alloca !fir.box<!fir.array<?xf32>>
+! CHECK:  fir.store %[[VAL_2]]#1 to %[[VAL_5]] : !fir.ref<!fir.box<!fir.array<?xf32>>>
+! CHECK:  omp.task in_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_5]] -> %[[VAL_6:.*]] : !fir.ref<!fir.box<!fir.array<?xf32>>>) {
+! [...]
+! CHECK:  omp.terminator
+! CHECK:  }
+! CHECK:  omp.terminator
+! CHECK:  }
+! CHECK:  omp.terminator
+! CHECK:  }
+! CHECK:  return
+! CHECK:  }
+
+subroutine taskReduction(x)
+   real, dimension(:) :: x
+   !$omp parallel
+   !$omp taskgroup task_reduction(+:x)
+   !$omp task in_reduction(+:x)
+   x = x + 1
+   !$omp end task
+   !$omp end taskgroup
+   !$omp end parallel
+end subroutine
+
diff --git a/flang/test/Lower/OpenMP/task_in_reduction.f90 b/flang/test/Lower/OpenMP/task_in_reduction.f90
new file mode 100644
index 00000000000000..26c079d5ac8aa5
--- /dev/null
+++ b/flang/test/Lower/OpenMP/task_in_reduction.f90
@@ -0,0 +1,48 @@
+! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s
+
+!CHECK-LABEL: omp.declare_reduction
+!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init {
+!CHECK: ^bb0(%{{.*}}: i32):
+!CHECK:  %[[C0_1:.*]] = arith.constant 0 : i32
+!CHECK:  omp.yield(%[[C0_1]] : i32)
+!CHECK: } combiner {
+!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32):
+!CHECK:  %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32
+!CHECK:  omp.yield(%[[RES]] : i32)
+!CHECK: }
+
+!CHECK-LABEL:  func.func @_QPin_reduction() {
+!CHECK:  %[[VAL_0:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFin_reductionEx"}
+!CHECK:  %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:  %[[VAL_2:.*]] = arith.constant 0 : i32
+!CHECK:  hlfir.assign %[[VAL_2]] to %[[VAL_1]]#0 : i32, !fir.ref<i32>
+!CHECK:  omp.parallel {
+!CHECK:  omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_3:.*]] : !fir.ref<i32>) {
+!CHECK:  omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_4:.*]] : !fir.ref<i32>) {
+!CHECK:  %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_4]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:  %[[VAL_6:.*]] = fir.load %[[VAL_5]]#0 : !fir.ref<i32>
+!CHECK:  %[[VAL_7:.*]] = arith.constant 1 : i32
+!CHECK:  %[[VAL_8:.*]] = arith.addi %[[VAL_6]], %[[VAL_7]] : i32
+!CHECK:  hlfir.assign %[[VAL_8]] to %[[VAL_5]]#0 : i32, !fir.ref<i32>
+!CHECK:  omp.terminator
+!CHECK:  }
+!CHECK:  omp.terminator
+!CHECK:  }
+!CHECK:  omp.terminator
+!CHECK:  }
+!CHECK:  return
+!CHECK:  }
+
+subroutine in_reduction
+   integer :: x
+   x = 0
+   !$omp parallel
+   !$omp taskgroup task_reduction(+:x)
+   !$omp task in_reduction(+:x)
+   x = x + 1
+   !$omp end task
+   !$omp end taskgroup
+   !$omp end parallel
+end subroutine
+
diff --git...
[truncated]
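
The templated addDeclareReduction in the diff above combines two C++17 techniques: an `if constexpr` branch that is compiled only for the clause type that carries a modifier, and explicit instantiations so that the template body can stay in the .cpp file while the header only declares it. A minimal standalone sketch of that pattern follows. The clause structs and `describeReduction` are hypothetical stand-ins, not Flang's types, and the real clause types keep their data in a tuple member `t` rather than named fields.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for omp::clause::Reduction, TaskReduction and
// InReduction. Only Reduction has a modifier in this sketch.
struct Reduction {
  std::optional<std::string> modifier;
  std::string op;
};
struct TaskReduction {
  std::string op;
};
struct InReduction {
  std::string op;
};

// One body shared by all three clause kinds. The modifier check is compiled
// only when T is Reduction, so the other types need no `modifier` member.
template <class T>
std::string describeReduction(const T &clause) {
  if constexpr (std::is_same_v<T, Reduction>) {
    if (clause.modifier)
      return "unsupported modifier: " + *clause.modifier;
  }
  return "reduce with " + clause.op;
}

// Explicit instantiation definitions: they make the compiler emit these three
// specializations in this translation unit, so other files can call them
// with only a declaration of the template in scope.
template std::string describeReduction<Reduction>(const Reduction &);
template std::string describeReduction<TaskReduction>(const TaskReduction &);
template std::string describeReduction<InReduction>(const InReduction &);
```

Without the `if constexpr` guard, instantiating the template for `TaskReduction` or `InReduction` would fail to compile, because those types have no modifier to query.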

github-actions · 2024-10-04T13:51:28Z

✅ With the latest revision this PR passed the C/C++ code formatter.

kiranchandramohan

Please add the translation from OpenMP dialect to LLVMIR before lowering to MLIR. Otherwise this will manifest as a crash.

kaviya2510 · 2024-10-04T18:56:19Z

> Please add the translation from OpenMP dialect to LLVMIR before lowering to MLIR. Otherwise this will manifest as a crash.

Sure, I will do it.

skatrak

Thank you @kaviya2510 for this work. The approach seems fine to me, just adding some comments to hopefully help you rebase this patch whenever it's time.

flang/lib/Lower/OpenMP/ClauseProcessor.cpp

flang/lib/Lower/OpenMP/ClauseProcessor.h

flang/lib/Lower/OpenMP/OpenMP.cpp

kaviya2510 · 2024-12-06T14:36:44Z

Thanks for the review @skatrak . I rebased and also addressed your comments.
Kindly take a look at it and let me know if it needs any other changes.

skatrak

Thank you Kaviya, I only have a few small comments.

skatrak · 2024-12-06T15:04:52Z

flang/lib/Lower/OpenMP/ClauseProcessor.h

@@ -104,6 +104,9 @@ class ClauseProcessor {
  bool processIsDevicePtr(
      mlir::omp::IsDevicePtrClauseOps &result,
      llvm::SmallVectorImpl<const semantics::Symbol *> &isDeviceSyms) const;
+  bool processInReduction(
+      mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result,
+      llvm::SmallVectorImpl<const semantics::Symbol *> &InReductionSyms) const;


Nit: I think it's better to specify that it's intended as an output rather than restating the full clause name. Feel free to disagree, but in that case please fix capitalization.

Suggested change

-      llvm::SmallVectorImpl<const semantics::Symbol *> &InReductionSyms) const;
+      llvm::SmallVectorImpl<const semantics::Symbol *> &outReductionSyms) const;

Thanks for the reviews @skatrak.
Yeah, I understand. I will modify it.

Looking at it again I changed my mind on this 😅, so feel free to use whichever way you prefer, as long as the first letter is lowercase.

flang/lib/Lower/OpenMP/ClauseProcessor.h

flang/lib/Lower/OpenMP/DataSharingProcessor.cpp

flang/lib/Lower/OpenMP/OpenMP.cpp

…o MLIR

skatrak

Sorry for the delay getting back to this, I hope I haven't blocked you too much. I think there is one minor problem at the moment, which will probably make you have to update some of the tests, but other than that I just have some small nits.

flang/lib/Lower/OpenMP/ClauseProcessor.h

skatrak · 2025-01-09T16:26:31Z

flang/lib/Lower/OpenMP/DataSharingProcessor.cpp

  converter.collectSymbolSet(eval, allSymbols, flag,
-                             /*collectSymbols=*/true,
+                             /*collectSymbols=*/collectSymbols,


Nit: The comment is redundant when passing a variable instead of a constant.

Suggested change

-                             /*collectSymbols=*/collectSymbols,
+                             collectSymbols,

skatrak · 2025-01-09T16:33:12Z

flang/lib/Lower/OpenMP/OpenMP.cpp

+  OpWithBodyGenInfo genInfo =
      OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval,
                        llvm::omp::Directive::OMPD_task)
          .setClauses(&item->clauses)
          .setDataSharingProcessor(&dsp)
-          .setGenRegionEntryCb(genRegionEntryCB),
-      queue, item, clauseOps);
+          .setGenRegionEntryCb(genRegionEntryCB);
+
+  auto taskOp =
+      genOpWithBody<mlir::omp::TaskOp>(genInfo, queue, item, clauseOps);
+  return taskOp;


Nit: It doesn't look like this change is necessary. Since the general convention in this file is to return the result of genOpWithBody directly and construct the OpWithBodyGenInfo structure parameter inside of the call whenever possible, I think this should be left as it was.

skatrak · 2025-01-09T16:34:57Z

flang/lib/Lower/OpenMP/OpenMP.cpp

+  OpWithBodyGenInfo genInfo =
      OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval,
                        llvm::omp::Directive::OMPD_taskgroup)
-          .setClauses(&item->clauses),
-      queue, item, clauseOps);
+          .setClauses(&item->clauses)
+          .setGenRegionEntryCb(genRegionEntryCB);
+
+  auto taskgroupOp =
+      genOpWithBody<mlir::omp::TaskgroupOp>(genInfo, queue, item, clauseOps);
+  return taskgroupOp;


Nit: Same comment as above. For consistency, return directly the result of genOpWithBody and construct the genInfo argument within the argument list.

skatrak · 2025-01-09T16:40:36Z

flang/lib/Lower/OpenMP/OpenMP.cpp

+    genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0));
+    return llvm::to_vector(taskgroupArgs.getSyms());


This is missing the binding of the symbols to the new entry block arguments:

Suggested change

-    genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0));
-    return llvm::to_vector(taskgroupArgs.getSyms());
+    genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0));
+    bindEntryBlockArgs(converter,
+                       llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(op),
+                       taskgroupArgs);
+    return llvm::to_vector(taskgroupArgs.getSyms());
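
To illustrate why the binding step matters, here is a toy model rather than Flang code: `Value`, `SymbolMap`, `Region`, `createEntryBlockArgs`, `bindArgs` and `lookup` are all hypothetical names. It only shows that creating entry block arguments does not by itself redirect symbol lookups. The symbol map has to be updated as well, otherwise the region body keeps referring to the outer value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model: an SSA-like value is represented by its name.
using Value = std::string;
// Maps a Fortran symbol name to the value that currently represents it.
using SymbolMap = std::map<std::string, Value>;

struct Region {
  std::vector<Value> entryArgs; // arguments of the region's entry block
};

// Step 1 (analogue of creating the entry block): add one block argument per
// reduction symbol. This does not change what the body will reference.
void createEntryBlockArgs(Region &region,
                          const std::vector<std::string> &syms) {
  for (const std::string &sym : syms)
    region.entryArgs.push_back("%arg_" + sym);
}

// Step 2 (analogue of binding): rebind each symbol to its block argument so
// that code generated for the region body uses the argument instead.
void bindArgs(SymbolMap &symTable, const Region &region,
              const std::vector<std::string> &syms) {
  for (std::size_t i = 0; i < syms.size(); ++i)
    symTable[syms[i]] = region.entryArgs[i];
}

// Body generation resolves a symbol through the current symbol map.
Value lookup(const SymbolMap &symTable, const std::string &sym) {
  return symTable.at(sym);
}
```

In this model, calling only `createEntryBlockArgs` leaves `lookup` returning the outer value, which mirrors a region body that still references the value defined outside the construct.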

skatrak · 2025-01-09T16:49:53Z

flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90

+! CHECK:            %[[VAL_3:.*]] = fir.alloca !fir.box<!fir.array<?xf32>>
+! CHECK:            fir.store %[[VAL_2]]#1 to %[[VAL_3]] : !fir.ref<!fir.box<!fir.array<?xf32>>>
+! CHECK:            omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] ->  %[[VAL_4:.*]]: !fir.ref<!fir.box<!fir.array<?xf32>>>) {
+! CHECK:              %[[VAL_5:.*]] = fir.alloca !fir.box<!fir.array<?xf32>>


This doesn't look right to me, but I might be wrong. Shouldn't we be passing the result of an hlfir.declare of VAL_4 to omp.task, rather than creating an alloca for it? Perhaps this is related to the missing binding of Fortran symbols to entry block arguments during the creation of the region for omp.taskgroup.

skatrak · 2025-01-09T16:52:56Z

flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90

+!CHECK:         %[[VAL_0:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskgroup_task_reductionEres"}
+!CHECK:         %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+!CHECK:         omp.taskgroup task_reduction(@[[RED_I32_NAME]]  %[[VAL_1]]#0 -> %[[VAL_2:.*]] : !fir.ref<i32>) {
+!CHECK:           %[[VAL_3:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref<i32>


The omp.taskgroup region shouldn't have references to VAL_1. References to that variable should be done through VAL_2.

skatrak · 2025-01-09T16:54:05Z

flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90

+!CHECK-LABEL:  func.func @_QPin_reduction() {
+!                [...]
+!CHECK:          omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[VAL_3:.*]] : !fir.ref<i32>) {
+!CHECK:          omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_4:.*]] : !fir.ref<i32>) {


Same problem here: VAL_1 shouldn't be used inside of the omp.taskgroup region.

skatrak · 2025-01-09T16:55:02Z

flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90

+! CHECK:             return
+! CHECK:           }
+
+subroutine taskReduction(x)


Nit: For consistency with other tests.

Suggested change

-subroutine taskReduction(x)
+subroutine task_reduction(x)

kaviya2510 · 2025-01-19T09:58:34Z

@skatrak It's fine. This PR depends on the translation patches for in_reduction and task_reduction being merged first, so no problem.
Thank you for the feedback again. I’ll review your comments and get back to you soon.

…uments in taskgroup construct and fixed testcases

kaviya2510 · 2025-04-23T09:31:33Z

@skatrak apologies for the delayed response. I thought that this patch depended on the LLVM translation patch for task reduction and in-reduction, and that it might crash without the translation part. So I decided to put it on hold and wait for the translation patch to be merged first.

Later, I came to know that you had added a patch which throws an error for unhandled clauses during LLVM translation. Realizing that this patch does not have a dependency, I resumed working on it.

kaviya2510 · 2025-04-23T09:33:13Z

I have addressed all your review comments in the latest patch. Could you please take a look at it and let me know if you have any comments?
Thank you.

tblah

Thanks for your work on this. Looks pretty good overall. Just some minor points.

flang/lib/Lower/OpenMP/OpenMP.cpp

flang/lib/Lower/OpenMP/ReductionProcessor.cpp


kaviya2510 · 2025-04-28T08:11:26Z

Thanks for the review @tblah.

tblah

LGTM, thanks!

kaviya2510 · 2025-04-29T02:46:38Z

Hi @skatrak, Could you please review the patch and provide your approval?

kaviya2510 · 2025-04-29T02:47:39Z

> LGTM, thanks!

Thank you for the approval @tblah

skatrak

Thank you Kaviya, LGTM! I have a minimal nit, but no need for another review by me before merging.

flang/lib/Lower/OpenMP/ClauseProcessor.cpp


kaviya2510 · 2025-05-05T08:51:27Z

According to the OpenMP specification, the rules for variables with implicitly determined data-sharing attributes are:

- In a parallel construct, if no default clause is present, these variables are shared.
- In a task generating construct, if no default clause is present, a variable for which the data-sharing attribute is not determined by the rules above and that in the enclosing context is determined to be shared by all implicit tasks bound to the current team is shared.
- In an orphaned task generating construct, if no default clause is present, dummy arguments are firstprivate.
- In a task generating construct, if no default clause is present, a variable for which the data-sharing attribute is not determined by the rules above is firstprivate.

subroutine omp_task_in_reduction()
   integer i
   i = 0
   !$omp task in_reduction(+:i)
   i = i + 1
   !$omp end task
end subroutine omp_task_in_reduction

In the above example, without any explicit data-sharing attribute, the variable i would normally be considered firstprivate in the task construct. However, since the variable i appears in the in_reduction clause, its data-sharing attribute should not be firstprivate.

With the current flow, the compiler incorrectly marks the data-sharing attribute of i as firstprivate, based on the implicit rule for task constructs, and as a result it generates the MLIR below during lowering.

 omp.private {type = firstprivate} @_QFomp_task_in_reductionEi_firstprivate_ref_i32 : !fir.ref<i32> alloc {
  ^bb0(%arg0: !fir.ref<i32>):
    %0 = fir.alloca i32 {bindc_name = "i", pinned, uniq_name = "_QFomp_task_in_reductionEi"}
    %1:2 = hlfir.declare %0 {uniq_name = "_QFomp_task_in_reductionEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
    omp.yield(%1#0 : !fir.ref<i32>)
  } copy {
  ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
    %0 = fir.load %arg0 : !fir.ref<i32>
    hlfir.assign %0 to %arg1 : i32, !fir.ref<i32>
    omp.yield(%arg1 : !fir.ref<i32>)
  }

I feel that my earlier fix in DataSharingProcessor.cpp was not correct. Instead of skipping symbol collection in DataSharingProcessor.cpp, I added a new change which detects that the DSA of variable i is in_reduction and skips marking it as firstprivate.
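The decision described here can be sketched as a small standalone model. This is only an illustration under stated assumptions: no default clause is present, and `Dsa`, `TaskContext` and `implicitTaskDsa` are hypothetical names, not the actual Flang semantics code changed in this PR.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified data-sharing attributes for this sketch.
enum class Dsa { Shared, FirstPrivate, InReduction };

// Hypothetical description of a task construct with no default clause.
struct TaskContext {
  // Variables listed in an in_reduction clause on the task.
  std::set<std::string> inReductionVars;
  // Variables that the enclosing context determines to be shared by all
  // implicit tasks bound to the current team.
  std::set<std::string> sharedInEnclosing;
};

// Decide the data-sharing attribute of `name` on the task construct.
Dsa implicitTaskDsa(const TaskContext &ctx, const std::string &name) {
  // A variable in in_reduction already has its attribute determined by the
  // clause, so the implicit firstprivate rule must not be applied to it.
  if (ctx.inReductionVars.count(name))
    return Dsa::InReduction;
  // Implicit rule: shared in the enclosing context stays shared.
  if (ctx.sharedInEnclosing.count(name))
    return Dsa::Shared;
  // Implicit fallback rule for task generating constructs.
  return Dsa::FirstPrivate;
}

// Usage mirroring the Fortran example: `i` appears in in_reduction(+:i).
void checkThreadExample() {
  TaskContext ctx;
  ctx.inReductionVars.insert("i");
  assert(implicitTaskDsa(ctx, "i") == Dsa::InReduction);
  // A variable named in no clause still falls back to firstprivate.
  assert(implicitTaskDsa(ctx, "j") == Dsa::FirstPrivate);
}
```

Checking the clause before the implicit rules is what keeps the firstprivate `omp.private` op shown above from being generated for i.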

@tblah, could you please take a look and review my changes?

tblah

The new approach sounds good. Sorry I didn't catch this on the last round of review.

flang/lib/Lower/OpenMP/DataSharingProcessor.cpp

kaviya2510 · 2025-05-06T14:13:20Z

Thanks for the review.

kaviya2510 · 2025-05-07T04:54:17Z

> Thank you Kaviya, LGTM! I have a minimal nit, but no need for another review by me before merging.

Thanks for the review @skatrak

github-actions · 2025-05-07T04:56:15Z

@kaviya2510 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

…o MLIR (llvm#111155)

This patch:
- Added support for lowering of task_reduction to MLIR
- Added support for lowering of in_reduction to MLIR
- Fixed incorrect DSA handling for variables in the presence of the 'in_reduction' clause.

kaviya2510 requested review from tblah, kiranchandramohan, harishch4, NimishMishra, kiranktp and Thirumalai-Shaktivel October 4, 2024 13:47

llvmbot added the flang, flang:fir-hlfir and flang:openmp labels Oct 4, 2024

kiranchandramohan reviewed Oct 4, 2024


kaviya2510 changed the title ~~Support for lowering task_reduction and in_reduction to MLIR~~ [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR Oct 4, 2024

kaviya2510 force-pushed the task_reduction branch 4 times, most recently from 8aeb419 to 711fe33 Compare October 7, 2024 07:54

skatrak reviewed Oct 23, 2024


flang/lib/Lower/OpenMP/ClauseProcessor.cpp

flang/lib/Lower/OpenMP/ClauseProcessor.h (outdated)

flang/lib/Lower/OpenMP/OpenMP.cpp (outdated)

kaviya2510 force-pushed the task_reduction branch 2 times, most recently from b401393 to e31a990 Compare December 6, 2024 13:13

skatrak reviewed Dec 6, 2024


kaviya2510 force-pushed the task_reduction branch from e31a990 to 00f3f5d Compare December 11, 2024 11:02

[Flang][OpenMP]Support for lowering task_reduction and in_reduction t…

60cbcc2

…o MLIR

kaviya2510 force-pushed the task_reduction branch 2 times, most recently from cef327b to 72d2230 Compare December 13, 2024 11:58

[Flang][OpenMP] Addressed review comments

1b5a47e

kaviya2510 force-pushed the task_reduction branch from 72d2230 to 1b5a47e Compare December 13, 2024 12:33

skatrak reviewed Jan 9, 2025


kaviya2510 added 2 commits April 11, 2025 12:28

Merge branch 'main' into task_reduction

3f53c5e

Addressed review comments: Binding Fortran symbols to entry block arg…

c2026f4

…uments in taskgroup construct and fixed testcases

tblah reviewed Apr 26, 2025

flang/lib/Lower/OpenMP/OpenMP.cpp (outdated)

flang/lib/Lower/OpenMP/ReductionProcessor.cpp

Addressed review comment: Replacing the call setGenRegionEntryCb() wi…

258259a

…th setEntryBlockArgs()

tblah approved these changes Apr 28, 2025


kaviya2510 requested a review from skatrak April 28, 2025 14:55

skatrak approved these changes Apr 30, 2025

flang/lib/Lower/OpenMP/ClauseProcessor.cpp

Fixed dsa of variables in task construct in the presence of in_reduct…

4954246

…ion clause

llvmbot added the flang:semantics label May 5, 2025

kaviya2510 requested a review from tblah May 5, 2025 14:18

tblah approved these changes May 6, 2025

flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (outdated)

Fix code formatting

32e59c0

kaviya2510 merged commit 9e7d529 into llvm:main May 7, 2025
11 checks passed

kaviya2510 deleted the task_reduction branch May 12, 2025 04:42

	llvm::SmallVectorImpl<const semantics::Symbol *> &InReductionSyms) const;
	llvm::SmallVectorImpl<const semantics::Symbol *> &outReductionSyms) const;

		genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0));
		return llvm::to_vector(taskgroupArgs.getSyms());
