[flang][fir] Add `fir.local` op for locality specifiers #138505
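@llvm/pr-subscribers-flang-fir-hlfir

Author: Kareem Ergawy (ergawy)

Changes

Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP.

Full diff: https://github.com/llvm/llvm-project/pull/138505.diff

4 Files Affected: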
diff --git a/flang/include/flang/Optimizer/Dialect/FIRAttr.td b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
index 3ebc24951cfff..2845080030b92 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRAttr.td
+++ b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
@@ -200,4 +200,23 @@ def fir_OpenMPSafeTempArrayCopyAttr : fir_Attr<"OpenMPSafeTempArrayCopy"> {
}];
}
+def LocalitySpecTypeLocal : I32EnumAttrCase<"Local", 0, "local">;
+def LocalitySpecTypeLocalInit
+ : I32EnumAttrCase<"LocalInit", 1, "local_init">;
+
+def LocalitySpecifierType : I32EnumAttr<
+ "LocalitySpecifierType",
+ "Type of a locality specifier", [
+ LocalitySpecTypeLocal,
+ LocalitySpecTypeLocalInit
+ ]> {
+ let genSpecializedAttr = 0;
+ let cppNamespace = "::fir";
+}
+
+def LocalitySpecifierTypeAttr : EnumAttr<FIROpsDialect, LocalitySpecifierType,
+ "locality_specifier_type"> {
+ let assemblyFormat = "`{` `type` `=` $value `}`";
+}
+
#endif // FIR_DIALECT_FIR_ATTRS
diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td
index 0ba985641069b..aea57d2e8dd71 100644
--- a/flang/include/flang/Optimizer/Dialect/FIROps.td
+++ b/flang/include/flang/Optimizer/Dialect/FIROps.td
@@ -3485,6 +3485,137 @@ def fir_BoxTotalElementsOp
let hasCanonicalizer = 1;
}
+def YieldOp : fir_Op<"yield",
+ [Pure, ReturnLike, Terminator,
+ ParentOneOf<["LocalitySpecifierOp"]>]> {
+ let summary = "loop yield and termination operation";
+ let description = [{
+ "fir.yield" yields SSA values from the fir dialect op region and
+ terminates the region. The semantics of how the values are yielded is
+ defined by the parent operation.
+ }];
+
+ let arguments = (ins Variadic<AnyType>:$results);
+
+ let builders = [
+ OpBuilder<(ins), [{ build($_builder, $_state, {}); }]>
+ ];
+
+ let assemblyFormat = "( `(` $results^ `:` type($results) `)` )? attr-dict";
+}
+
+def fir_LocalitySpecifierOp : fir_Op<"local", [IsolatedFromAbove]> {
+ let summary = "Provides declaration of [first]private logic.";
+ let description = [{
+ This operation provides a declaration of how to implement the
+ localization of a variable. The dialect users should provide
+ which type should be allocated for this variable. The allocated (usually by
+ alloca) variable is passed to the initialization region which does everything
+ else (e.g. initialization of Fortran runtime descriptors). Information about
+ how to initialize the copy from the original item should be given in the
+ copy region, and if needed, how to deallocate memory (allocated by the
+ initialization region) in the dealloc region.
+
+ Examples:
+
+ * `local(x)` would not need any regions because no initialization is
+ required by the standard for i32 variables and this is not firstprivate.
+ ```mlir
+ fir.local {type = local} @x.localizer : i32
+ ```
+
+ * `local_init(x)` would be emitted as:
+ ```mlir
+ fir.local {type = local_init} @x.localizer : i32 copy {
+ ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
+ // %arg0 is the original host variable.
+ // %arg1 represents the memory allocated for this private variable.
+ ... copy from host to the localized clone ....
+ fir.yield(%arg1 : !fir.ref<i32>)
+ }
+ ```
+
+ * `local(x)` for "allocatables" would be emitted as:
+ ```mlir
+ fir.local {type = local} @x.privatizer : !some.type init {
+ ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>):
+ // initialize %arg1, using %arg0 as a mold for allocations.
+ // For example if %arg0 is a heap allocated array with a runtime determined
+ // length and !some.type is a runtime type descriptor, the init region
+ // will read the array length from %arg0, and heap allocate an array of the
+ // right length and initialize %arg1 to contain the array allocation and
+ // length.
+ fir.yield(%arg1 : !some.pointer<!some.type>)
+ } dealloc {
+ ^bb0(%arg0: !some.pointer<!some.type>):
+ // ... deallocate memory allocated by the init region...
+ // In the example above, this will free the heap allocated array data.
+ fir.yield
+ }
+ ```
+
+ There are no restrictions on the body except for:
+ - The `dealloc` regions has a single argument.
+ - The `init` & `copy` regions have 2 arguments.
+ - All three regions are terminated by `fir.yield` ops.
+ The above restrictions and other obvious restrictions (e.g. verifying the
+ type of yielded values) are verified by the custom op verifier. The actual
+ contents of the blocks inside all regions are not verified.
+
+ Instances of this op would then be used by ops that model directives that
+ accept data-sharing attribute clauses.
+
+ The `sym_name` attribute provides a symbol by which the privatizer op can be
+ referenced by other dialect ops.
+
+ The `type` attribute is the type of the value being localized. This type
+ will be implicitly allocated in MLIR->LLVMIR conversion and passed as the
+ second argument to the init region. Therefore the type of arguments to
+ the regions should be a type which represents a pointer to `type`.
+
+ The `locality_specifier_type` attribute specifies whether the localized
+ corresponds to a `local` or a `local_init` specifier.
+ }];
+
+ let arguments = (ins SymbolNameAttr:$sym_name,
+ TypeAttrOf<AnyType>:$type,
+ LocalitySpecifierTypeAttr:$locality_specifier_type);
+
+ let regions = (region AnyRegion:$init_region,
+ AnyRegion:$copy_region,
+ AnyRegion:$dealloc_region);
+
+ let assemblyFormat = [{
+ $locality_specifier_type $sym_name `:` $type
+ (`init` $init_region^)?
+ (`copy` $copy_region^)?
+ (`dealloc` $dealloc_region^)?
+ attr-dict
+ }];
+
+ let builders = [
+ OpBuilder<(ins CArg<"mlir::TypeRange">:$result,
+ CArg<"mlir::StringAttr">:$sym_name,
+ CArg<"mlir::TypeAttr">:$type)>
+ ];
+
+ let extraClassDeclaration = [{
+ /// Get the type for arguments to nested regions. This should
+ /// generally be either the same as getType() or some pointer
+ /// type (pointing to the type allocated by this op).
+ /// This method will return Type{nullptr} if there are no nested
+ /// regions.
+ mlir::Type getArgType() {
+ for (mlir::Region *region : getRegions())
+ for (mlir::Type ty : region->getArgumentTypes())
+ return ty;
+ return nullptr;
+ }
+ }];
+
+ let hasRegionVerifier = 1;
+}
+
def fir_DoConcurrentOp : fir_Op<"do_concurrent",
[SingleBlock, AutomaticAllocationScope]> {
let summary = "do concurrent loop wrapper";
diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp
index 05ef69169bae5..65ec730e134c2 100644
--- a/flang/lib/Optimizer/Dialect/FIROps.cpp
+++ b/flang/lib/Optimizer/Dialect/FIROps.cpp
@@ -4909,6 +4909,105 @@ void fir::BoxTotalElementsOp::getCanonicalizationPatterns(
patterns.add<SimplifyBoxTotalElementsOp>(context);
}
+//===----------------------------------------------------------------------===//
+// LocalitySpecifierOp
+//===----------------------------------------------------------------------===//
+
+llvm::LogicalResult fir::LocalitySpecifierOp::verifyRegions() {
+ mlir::Type argType = getArgType();
+ auto verifyTerminator = [&](mlir::Operation *terminator,
+ bool yieldsValue) -> llvm::LogicalResult {
+ if (!terminator->getBlock()->getSuccessors().empty())
+ return llvm::success();
+
+ if (!llvm::isa<fir::YieldOp>(terminator))
+ return mlir::emitError(terminator->getLoc())
+ << "expected exit block terminator to be an `fir.yield` op.";
+
+ YieldOp yieldOp = llvm::cast<YieldOp>(terminator);
+ mlir::TypeRange yieldedTypes = yieldOp.getResults().getTypes();
+
+ if (!yieldsValue) {
+ if (yieldedTypes.empty())
+ return llvm::success();
+
+ return mlir::emitError(terminator->getLoc())
+ << "Did not expect any values to be yielded.";
+ }
+
+ if (yieldedTypes.size() == 1 && yieldedTypes.front() == argType)
+ return llvm::success();
+
+ auto error = mlir::emitError(yieldOp.getLoc())
+ << "Invalid yielded value. Expected type: " << argType
+ << ", got: ";
+
+ if (yieldedTypes.empty())
+ error << "None";
+ else
+ error << yieldedTypes;
+
+ return error;
+ };
+
+ auto verifyRegion = [&](mlir::Region &region, unsigned expectedNumArgs,
+ llvm::StringRef regionName,
+ bool yieldsValue) -> llvm::LogicalResult {
+ assert(!region.empty());
+
+ if (region.getNumArguments() != expectedNumArgs)
+ return mlir::emitError(region.getLoc())
+ << "`" << regionName << "`: "
+ << "expected " << expectedNumArgs
+ << " region arguments, got: " << region.getNumArguments();
+
+ for (mlir::Block &block : region) {
+ // MLIR will verify the absence of the terminator for us.
+ if (!block.mightHaveTerminator())
+ continue;
+
+ if (failed(verifyTerminator(block.getTerminator(), yieldsValue)))
+ return llvm::failure();
+ }
+
+ return llvm::success();
+ };
+
+ // Ensure all of the region arguments have the same type
+ for (mlir::Region *region : getRegions())
+ for (mlir::Type ty : region->getArgumentTypes())
+ if (ty != argType)
+ return emitError() << "Region argument type mismatch: got " << ty
+ << " expected " << argType << ".";
+
+ mlir::Region &initRegion = getInitRegion();
+ if (!initRegion.empty() &&
+ failed(verifyRegion(getInitRegion(), /*expectedNumArgs=*/2, "init",
+ /*yieldsValue=*/true)))
+ return llvm::failure();
+
+ LocalitySpecifierType dsType = getLocalitySpecifierType();
+
+ if (dsType == LocalitySpecifierType::Local && !getCopyRegion().empty())
+ return emitError("`local` specifiers do not require a `copy` region.");
+
+ if (dsType == LocalitySpecifierType::LocalInit && getCopyRegion().empty())
+ return emitError(
+ "`local_init` specifier require at least a `copy` region.");
+
+ if (dsType == LocalitySpecifierType::LocalInit &&
+ failed(verifyRegion(getCopyRegion(), /*expectedNumArgs=*/2, "copy",
+ /*yieldsValue=*/true)))
+ return llvm::failure();
+
+ if (!getDeallocRegion().empty() &&
+ failed(verifyRegion(getDeallocRegion(), /*expectedNumArgs=*/1, "dealloc",
+ /*yieldsValue=*/false)))
+ return llvm::failure();
+
+ return llvm::success();
+}
+
//===----------------------------------------------------------------------===//
// DoConcurrentOp
//===----------------------------------------------------------------------===//
diff --git a/flang/test/Fir/do_concurrent.fir b/flang/test/Fir/do_concurrent.fir
index 8e80ffb9c7b0b..4e55777402428 100644
--- a/flang/test/Fir/do_concurrent.fir
+++ b/flang/test/Fir/do_concurrent.fir
@@ -90,3 +90,22 @@ func.func @dc_2d_reduction(%i_lb: index, %i_ub: index, %i_st: index,
// CHECK: fir.store %[[J_IV_CVT]] to %[[J]] : !fir.ref<i32>
// CHECK: }
// CHECK: }
+
+
+fir.local {type = local} @local_privatizer : i32
+
+// CHECK: fir.local {type = local} @[[LOCAL_PRIV_SYM:local_privatizer]] : i32
+
+fir.local {type = local_init} @local_init_privatizer : i32 copy {
+^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
+ %0 = fir.load %arg0 : !fir.ref<i32>
+ fir.store %0 to %arg1 : !fir.ref<i32>
+ fir.yield(%arg1 : !fir.ref<i32>)
+}
+
+// CHECK: fir.local {type = local_init} @[[LOCAL_INIT_PRIV_SYM:local_init_privatizer]] : i32
+// CHECK: ^bb0(%[[ORIG_VAL:.*]]: !fir.ref<i32>, %[[LOCAL_VAL:.*]]: !fir.ref<i32>):
+// CHECK: %[[ORIG_VAL_LD:.*]] = fir.load %[[ORIG_VAL]]
+// CHECK: fir.store %[[ORIG_VAL_LD]] to %[[LOCAL_VAL]] : !fir.ref<i32>
+// CHECK: fir.yield(%[[LOCAL_VAL]] : !fir.ref<i32>)
+// CHECK: }
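
The region rules enforced by `verifyRegions()` in the diff above can be summarized in a small standalone model. This is a sketch only: `SpecKind`, `RegionModel`, `LocalOpModel`, and `verifyModel` are hypothetical names invented for this illustration and use no MLIR or flang APIs. Checks on the types of region arguments and yielded values are deliberately left out, and the diagnostics are simplified. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration only; not MLIR or flang types.
enum class SpecKind { Local, LocalInit };

struct RegionModel {
  unsigned numArgs;  // number of region (entry block) arguments
  bool yieldsValue;  // true if the terminating yield carries a value
};

struct LocalOpModel {
  SpecKind kind;
  std::optional<RegionModel> init;     // std::nullopt models an empty region
  std::optional<RegionModel> copy;
  std::optional<RegionModel> dealloc;
};

// Returns an empty string when the model is valid, otherwise a diagnostic.
// The checks run in the same order as in the verifier shown in the diff.
inline std::string verifyModel(const LocalOpModel &op) {
  auto checkRegion = [](const RegionModel &region, unsigned expectedArgs,
                        bool mustYieldValue,
                        const std::string &name) -> std::string {
    if (region.numArgs != expectedArgs)
      return name + ": unexpected number of region arguments";
    if (region.yieldsValue != mustYieldValue)
      return name + ": unexpected yield";
    return "";
  };

  // `init` is optional; when present it takes 2 arguments and yields a value.
  if (op.init) {
    std::string err = checkRegion(*op.init, 2, true, "init");
    if (!err.empty())
      return err;
  }

  // `local` must not have a `copy` region; `local_init` must have one.
  if (op.kind == SpecKind::Local && op.copy)
    return "local: copy region not allowed";
  if (op.kind == SpecKind::LocalInit && !op.copy)
    return "local_init: copy region required";

  // `copy` takes 2 arguments and yields a value.
  if (op.copy) {
    std::string err = checkRegion(*op.copy, 2, true, "copy");
    if (!err.empty())
      return err;
  }

  // `dealloc` is optional; it takes 1 argument and yields nothing.
  if (op.dealloc) {
    std::string err = checkRegion(*op.dealloc, 1, false, "dealloc");
    if (!err.empty())
      return err;
  }
  return "";
}
```

In this model, a `local` specifier with no regions and a `local_init` specifier with a two-argument `copy` region both verify, which matches the two cases exercised in `do_concurrent.fir` above.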
Adds support for lowering `do concurrent` nests from PFT to the new `fir.do_concurrent` MLIR op as well as its special terminator `fir.do_concurrent.loop` which models the actual loop nest.

To that end, this PR emits the allocations for the iteration variables within the block of the `fir.do_concurrent` op and creates a region for the `fir.do_concurrent.loop` op that accepts arguments equal in number to the number of the input `do concurrent` iteration ranges.

For example, given the following input:

```fortran
do concurrent(i=1:10, j=11:20)
end do
```

the changes in this PR emit the following MLIR:

```mlir
fir.do_concurrent {
  %22 = fir.alloca i32 {bindc_name = "i"}
  %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  %24 = fir.alloca i32 {bindc_name = "j"}
  %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) {
    %26 = fir.convert %arg1 : (index) -> i32
    fir.store %26 to %23#0 : !fir.ref<i32>
    %27 = fir.convert %arg2 : (index) -> i32
    fir.store %27 to %25#0 : !fir.ref<i32>
  }
}
```
Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP.
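
As a rough intuition for the two specifiers (not a description of how flang lowers them): `local` gives each iteration its own copy of a variable with no initial value, while `local_init` additionally initializes that copy from the variable outside the construct. The C++ sketch below mimics the `local_init` case for an integer; `runLocalInit` is a hypothetical helper written for this illustration and is not part of the PR.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: models `local_init(x)` for an integer variable.
// Each iteration gets a private copy initialized from the host value;
// the host variable itself is never written.
inline std::vector<int> runLocalInit(const int &host, int iterations) {
  std::vector<int> observed;
  for (int i = 0; i < iterations; ++i) {
    int priv = host;  // plays the role of the `copy` region
    priv += i;        // the loop body updates only the private copy
    observed.push_back(priv);
  }
  return observed;
}
```

Calling `runLocalInit(host, 3)` with `host == 10` records one private value per iteration and leaves `host` unchanged, which is the property the `copy` region of a `local_init` specifier is meant to set up.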
Please could you add a test for the verifier failures.
On it ....
LGTM aside from the two nits and, more importantly, the tests that @tblah has requested
ParentOneOf<["LocalitySpecifierOp"]>]> {
let summary = "loop yield and termination operation";
let description = [{
"fir.yield" yields SSA values from the fir dialect op region and
NIT: sed s/from the fir dialect op region/from a fir dialect op region/
Done.
* `local(x)` for "allocatables" would be emitted as:
```
fir.local {type = local} @x.privatizer : !some.type init {
Ultra Nit : sed s/@x.privatizer/@x.localizer/
Done.
Done.
Thanks for the updates
…138505)

Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP.

PR stack:
- llvm/llvm-project#137928
- llvm/llvm-project#138505 (this PR)
- llvm/llvm-project#138506
- llvm/llvm-project#138512
- llvm/llvm-project#138534
- llvm/llvm-project#138816
@@ -94,10 +94,11 @@ struct IncrementLoopInfo {
template <typename T>
explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower,
const T &upper, const std::optional<T> &step,
bool isUnordered = false)
bool isConcurrent = false)
`unordered` is also used for array operations. How is this handled now?
@@ -120,7 +121,7 @@ struct IncrementLoopInfo {
const Fortran::lower::SomeExpr *upperExpr;
const Fortran::lower::SomeExpr *stepExpr;
const Fortran::lower::SomeExpr *maskExpr = nullptr;
bool isUnordered; // do concurrent, forall
Is `forall` treated as `do concurrent`?
Some post commit questions.
Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP.

PR stack:
- `do concurrent` loop nests to `fir.do_concurrent` #137928
- `fir.local` op for locality specifiers #138505 (this PR)
- `fir.do_concurrent.loop` #138506
- `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` #138512