Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[flang][fir] Add fir.local op for locality specifiers #138505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 7, 2025

Conversation

ergawy
Copy link
Member

@ergawy ergawy commented May 5, 2025

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels May 5, 2025
@llvmbot
Copy link
Member

llvmbot commented May 5, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Kareem Ergawy (ergawy)

Changes

Adds a new fir.local op to model local and local_init locality specifiers. This op is a clone of omp.private. In particular, this new op also models the privatization/localization logic of an SSA value in the fir dialect just like omp.private does for OpenMP.


Full diff: https://github.com/llvm/llvm-project/pull/138505.diff

4 Files Affected:

  • (modified) flang/include/flang/Optimizer/Dialect/FIRAttr.td (+19)
  • (modified) flang/include/flang/Optimizer/Dialect/FIROps.td (+131)
  • (modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+99)
  • (modified) flang/test/Fir/do_concurrent.fir (+19)
diff --git a/flang/include/flang/Optimizer/Dialect/FIRAttr.td b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
index 3ebc24951cfff..2845080030b92 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRAttr.td
+++ b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
@@ -200,4 +200,23 @@ def fir_OpenMPSafeTempArrayCopyAttr : fir_Attr<"OpenMPSafeTempArrayCopy"> {
   }];
 }
 
+def LocalitySpecTypeLocal : I32EnumAttrCase<"Local", 0, "local">;
+def LocalitySpecTypeLocalInit
+    : I32EnumAttrCase<"LocalInit", 1, "local_init">;
+
+def LocalitySpecifierType : I32EnumAttr<
+    "LocalitySpecifierType",
+    "Type of a locality specifier", [
+      LocalitySpecTypeLocal,
+      LocalitySpecTypeLocalInit
+    ]> {
+  let genSpecializedAttr = 0;
+  let cppNamespace = "::fir";
+}
+
+def LocalitySpecifierTypeAttr : EnumAttr<FIROpsDialect, LocalitySpecifierType,
+                                                "locality_specifier_type"> {
+  let assemblyFormat = "`{` `type` `=` $value `}`";
+}
+
 #endif // FIR_DIALECT_FIR_ATTRS
diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td
index 0ba985641069b..aea57d2e8dd71 100644
--- a/flang/include/flang/Optimizer/Dialect/FIROps.td
+++ b/flang/include/flang/Optimizer/Dialect/FIROps.td
@@ -3485,6 +3485,137 @@ def fir_BoxTotalElementsOp
   let hasCanonicalizer = 1;
 }
 
+def YieldOp : fir_Op<"yield",
+    [Pure, ReturnLike, Terminator,
+     ParentOneOf<["LocalitySpecifierOp"]>]> {
+  let summary = "loop yield and termination operation";
+  let description = [{
+    "fir.yield" yields SSA values from the fir dialect op region and
+    terminates the region. The semantics of how the values are yielded is
+    defined by the parent operation.
+  }];
+
+  let arguments = (ins Variadic<AnyType>:$results);
+
+  let builders = [
+    OpBuilder<(ins), [{ build($_builder, $_state, {}); }]>
+  ];
+
+  let assemblyFormat = "( `(` $results^ `:` type($results) `)` )? attr-dict";
+}
+
+def fir_LocalitySpecifierOp : fir_Op<"local", [IsolatedFromAbove]> {
+  let summary = "Provides declaration of [first]private logic.";
+  let description = [{
+    This operation provides a declaration of how to implement the
+    localization of a variable. The dialect users should provide
+    which type should be allocated for this variable. The allocated (usually by
+    alloca) variable is passed to the initialization region which does everything
+    else (e.g. initialization of Fortran runtime descriptors). Information about
+    how to initialize the copy from the original item should be given in the
+    copy region, and if needed, how to deallocate memory (allocated by the
+    initialization region) in the dealloc region.
+
+    Examples:
+
+    * `local(x)` would not need any regions because no initialization is
+      required by the standard for i32 variables and this is not firstprivate.
+    ```mlir
+    fir.local {type = local} @x.localizer : i32
+    ```
+
+    * `local_init(x)` would be emitted as:
+    ```mlir
+    fir.local {type = local_init} @x.localizer : i32 copy {
+    ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
+    // %arg0 is the original host variable.
+    // %arg1 represents the memory allocated for this private variable.
+    ... copy from host to the localized clone ....
+    fir.yield(%arg1 : !fir.ref<i32>)
+    }
+    ```
+
+    * `local(x)` for "allocatables" would be emitted as:
+    ```mlir
+    fir.local {type = local} @x.privatizer : !some.type init {
+    ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>):
+    // initialize %arg1, using %arg0 as a mold for allocations.
+    // For example if %arg0 is a heap allocated array with a runtime determined
+    // length and !some.type is a runtime type descriptor, the init region
+    // will read the array length from %arg0, and heap allocate an array of the
+    // right length and initialize %arg1 to contain the array allocation and
+    // length.
+    fir.yield(%arg1 : !some.pointer<!some.type>)
+    } dealloc {
+    ^bb0(%arg0: !some.pointer<!some.type>):
+    // ... deallocate memory allocated by the init region...
+    // In the example above, this will free the heap allocated array data.
+    fir.yield
+    }
+    ```
+
+    There are no restrictions on the body except for:
+    - The `dealloc` regions has a single argument.
+    - The `init` & `copy` regions have 2 arguments.
+    - All three regions are terminated by `fir.yield` ops.
+    The above restrictions and other obvious restrictions (e.g. verifying the
+    type of yielded values) are verified by the custom op verifier. The actual
+    contents of the blocks inside all regions are not verified.
+
+    Instances of this op would then be used by ops that model directives that
+    accept data-sharing attribute clauses.
+
+    The `sym_name` attribute provides a symbol by which the privatizer op can be
+    referenced by other dialect ops.
+
+    The `type` attribute is the type of the value being localized. This type
+    will be implicitly allocated in MLIR->LLVMIR conversion and passed as the
+    second argument to the init region. Therefore the type of arguments to
+    the regions should be a type which represents a pointer to `type`.
+
+    The `locality_specifier_type` attribute specifies whether the localized
+    corresponds to a `local` or a `local_init` specifier.
+  }];
+
+  let arguments = (ins SymbolNameAttr:$sym_name,
+                       TypeAttrOf<AnyType>:$type,
+                       LocalitySpecifierTypeAttr:$locality_specifier_type);
+
+  let regions = (region AnyRegion:$init_region,
+                        AnyRegion:$copy_region,
+                        AnyRegion:$dealloc_region);
+
+  let assemblyFormat = [{
+    $locality_specifier_type $sym_name `:` $type
+      (`init` $init_region^)?
+      (`copy` $copy_region^)?
+      (`dealloc` $dealloc_region^)?
+      attr-dict
+  }];
+
+  let builders = [
+    OpBuilder<(ins CArg<"mlir::TypeRange">:$result,
+                   CArg<"mlir::StringAttr">:$sym_name,
+                   CArg<"mlir::TypeAttr">:$type)>
+  ];
+
+  let extraClassDeclaration = [{
+    /// Get the type for arguments to nested regions. This should
+    /// generally be either the same as getType() or some pointer
+    /// type (pointing to the type allocated by this op).
+    /// This method will return Type{nullptr} if there are no nested
+    /// regions.
+    mlir::Type getArgType() {
+      for (mlir::Region *region : getRegions())
+        for (mlir::Type ty : region->getArgumentTypes())
+          return ty;
+      return nullptr;
+    }
+  }];
+
+  let hasRegionVerifier = 1;
+}
+
 def fir_DoConcurrentOp : fir_Op<"do_concurrent",
     [SingleBlock, AutomaticAllocationScope]> {
   let summary = "do concurrent loop wrapper";
diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp
index 05ef69169bae5..65ec730e134c2 100644
--- a/flang/lib/Optimizer/Dialect/FIROps.cpp
+++ b/flang/lib/Optimizer/Dialect/FIROps.cpp
@@ -4909,6 +4909,105 @@ void fir::BoxTotalElementsOp::getCanonicalizationPatterns(
   patterns.add<SimplifyBoxTotalElementsOp>(context);
 }
 
+//===----------------------------------------------------------------------===//
+// LocalitySpecifierOp
+//===----------------------------------------------------------------------===//
+
+llvm::LogicalResult fir::LocalitySpecifierOp::verifyRegions() {
+  mlir::Type argType = getArgType();
+  auto verifyTerminator = [&](mlir::Operation *terminator,
+                              bool yieldsValue) -> llvm::LogicalResult {
+    if (!terminator->getBlock()->getSuccessors().empty())
+      return llvm::success();
+
+    if (!llvm::isa<fir::YieldOp>(terminator))
+      return mlir::emitError(terminator->getLoc())
+             << "expected exit block terminator to be an `fir.yield` op.";
+
+    YieldOp yieldOp = llvm::cast<YieldOp>(terminator);
+    mlir::TypeRange yieldedTypes = yieldOp.getResults().getTypes();
+
+    if (!yieldsValue) {
+      if (yieldedTypes.empty())
+        return llvm::success();
+
+      return mlir::emitError(terminator->getLoc())
+             << "Did not expect any values to be yielded.";
+    }
+
+    if (yieldedTypes.size() == 1 && yieldedTypes.front() == argType)
+      return llvm::success();
+
+    auto error = mlir::emitError(yieldOp.getLoc())
+                 << "Invalid yielded value. Expected type: " << argType
+                 << ", got: ";
+
+    if (yieldedTypes.empty())
+      error << "None";
+    else
+      error << yieldedTypes;
+
+    return error;
+  };
+
+  auto verifyRegion = [&](mlir::Region &region, unsigned expectedNumArgs,
+                          llvm::StringRef regionName,
+                          bool yieldsValue) -> llvm::LogicalResult {
+    assert(!region.empty());
+
+    if (region.getNumArguments() != expectedNumArgs)
+      return mlir::emitError(region.getLoc())
+             << "`" << regionName << "`: "
+             << "expected " << expectedNumArgs
+             << " region arguments, got: " << region.getNumArguments();
+
+    for (mlir::Block &block : region) {
+      // MLIR will verify the absence of the terminator for us.
+      if (!block.mightHaveTerminator())
+        continue;
+
+      if (failed(verifyTerminator(block.getTerminator(), yieldsValue)))
+        return llvm::failure();
+    }
+
+    return llvm::success();
+  };
+
+  // Ensure all of the region arguments have the same type
+  for (mlir::Region *region : getRegions())
+    for (mlir::Type ty : region->getArgumentTypes())
+      if (ty != argType)
+        return emitError() << "Region argument type mismatch: got " << ty
+                           << " expected " << argType << ".";
+
+  mlir::Region &initRegion = getInitRegion();
+  if (!initRegion.empty() &&
+      failed(verifyRegion(getInitRegion(), /*expectedNumArgs=*/2, "init",
+                          /*yieldsValue=*/true)))
+    return llvm::failure();
+
+  LocalitySpecifierType dsType = getLocalitySpecifierType();
+
+  if (dsType == LocalitySpecifierType::Local && !getCopyRegion().empty())
+    return emitError("`local` specifiers do not require a `copy` region.");
+
+  if (dsType == LocalitySpecifierType::LocalInit && getCopyRegion().empty())
+    return emitError(
+        "`local_init` specifier require at least a `copy` region.");
+
+  if (dsType == LocalitySpecifierType::LocalInit &&
+      failed(verifyRegion(getCopyRegion(), /*expectedNumArgs=*/2, "copy",
+                          /*yieldsValue=*/true)))
+    return llvm::failure();
+
+  if (!getDeallocRegion().empty() &&
+      failed(verifyRegion(getDeallocRegion(), /*expectedNumArgs=*/1, "dealloc",
+                          /*yieldsValue=*/false)))
+    return llvm::failure();
+
+  return llvm::success();
+}
+
 //===----------------------------------------------------------------------===//
 // DoConcurrentOp
 //===----------------------------------------------------------------------===//
diff --git a/flang/test/Fir/do_concurrent.fir b/flang/test/Fir/do_concurrent.fir
index 8e80ffb9c7b0b..4e55777402428 100644
--- a/flang/test/Fir/do_concurrent.fir
+++ b/flang/test/Fir/do_concurrent.fir
@@ -90,3 +90,22 @@ func.func @dc_2d_reduction(%i_lb: index, %i_ub: index, %i_st: index,
 // CHECK:             fir.store %[[J_IV_CVT]] to %[[J]] : !fir.ref<i32>
 // CHECK:           }
 // CHECK:         }
+
+
+fir.local {type = local} @local_privatizer : i32
+
+// CHECK:   fir.local {type = local} @[[LOCAL_PRIV_SYM:local_privatizer]] : i32
+
+fir.local {type = local_init} @local_init_privatizer : i32 copy {
+^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
+    %0 = fir.load %arg0 : !fir.ref<i32>
+    fir.store %0 to %arg1 : !fir.ref<i32>
+    fir.yield(%arg1 : !fir.ref<i32>)
+}
+
+// CHECK:   fir.local {type = local_init} @[[LOCAL_INIT_PRIV_SYM:local_init_privatizer]] : i32
+// CHECK:   ^bb0(%[[ORIG_VAL:.*]]: !fir.ref<i32>, %[[LOCAL_VAL:.*]]: !fir.ref<i32>):
+// CHECK:      %[[ORIG_VAL_LD:.*]] = fir.load %[[ORIG_VAL]]
+// CHECK:      fir.store %[[ORIG_VAL_LD]] to %[[LOCAL_VAL]] : !fir.ref<i32>
+// CHECK:      fir.yield(%[[LOCAL_VAL]] : !fir.ref<i32>)
+// CHECK:   }

Adds support for lowering `do concurrent` nests from PFT to the new
`fir.do_concurrent` MLIR op as well as its special terminator
`fir.do_concurrent.loop` which models the actual loop nest.

To that end, this PR emits the allocations for the iteration variables
within the block of the `fir.do_concurrent` op and creates a region for
the `fir.do_concurrent.loop` op that accepts arguments equal in number
to the number of the input `do concurrent` iteration ranges.

For example, given the following input:
```fortran
   do concurrent(i=1:10, j=11:20)
   end do
```
the changes in this PR emit the following MLIR:
```mlir
    fir.do_concurrent {
      %22 = fir.alloca i32 {bindc_name = "i"}
      %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      %24 = fir.alloca i32 {bindc_name = "j"}
      %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) {
        %26 = fir.convert %arg1 : (index) -> i32
        fir.store %26 to %23#0 : !fir.ref<i32>
        %27 = fir.convert %arg2 : (index) -> i32
        fir.store %27 to %25#0 : !fir.ref<i32>
      }
    }
```
@ergawy ergawy force-pushed the users/ergawy/pft_to_do_concurrent_3 branch from 4374004 to 1211438 Compare May 5, 2025 11:13
Adds a new `fir.local` op to model `local` and `local_init` locality
specifiers. This op is a clone of `omp.private`. In particular, this new
op also models the privatization/localization logic of an SSA value in
the `fir` dialect just like `omp.private` does for OpenMP.
Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please could you add a test for the verifier failures.

@ergawy
Copy link
Member Author

ergawy commented May 6, 2025

Please could you add a test for the verifier failures.

On it ....

Copy link
Contributor

@bhandarkar-pranav bhandarkar-pranav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from the two nits and, more importantly, the tests that @tblah has requested

ParentOneOf<["LocalitySpecifierOp"]>]> {
let summary = "loop yield and termination operation";
let description = [{
"fir.yield" yields SSA values from the fir dialect op region and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: sed s/from the fir dialect op region/from a fir dialect op region/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


* `local(x)` for "allocatables" would be emitted as:
```
fir.local {type = local} @x.privatizer : !some.type init {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra Nit : sed s/@x.privatizer/@x.localizer/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ergawy
Copy link
Member Author

ergawy commented May 7, 2025

Please could you add a test for the verifier failures.

Done.

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates

Base automatically changed from users/ergawy/pft_to_do_concurrent_3 to main May 7, 2025 10:52
@ergawy ergawy merged commit a83bb35 into main May 7, 2025
7 checks passed
@ergawy ergawy deleted the users/ergawy/fir-dc-local-spec-0 branch May 7, 2025 12:00
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 7, 2025
…138505)

Adds a new `fir.local` op to model `local` and `local_init` locality
specifiers. This op is a clone of `omp.private`. In particular, this new
op also models the privatization/localization logic of an SSA value in
the `fir` dialect just like `omp.private` does for OpenMP.

PR stack:
- llvm/llvm-project#137928
- llvm/llvm-project#138505 (this PR)
- llvm/llvm-project#138506
- llvm/llvm-project#138512
- llvm/llvm-project#138534
- llvm/llvm-project#138816
@@ -94,10 +94,11 @@ struct IncrementLoopInfo {
template <typename T>
explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower,
const T &upper, const std::optional<T> &step,
bool isUnordered = false)
bool isConcurrent = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unordered is also used for array operation. how is this handled now?

@@ -120,7 +121,7 @@ struct IncrementLoopInfo {
const Fortran::lower::SomeExpr *upperExpr;
const Fortran::lower::SomeExpr *stepExpr;
const Fortran::lower::SomeExpr *maskExpr = nullptr;
bool isUnordered; // do concurrent, forall
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is forall treated as do concurrent?

Copy link
Contributor

@clementval clementval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some post commit questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants