[Clang][OpenMP] Support for dispatch construct (Sema & Codegen) support #131838
…clauses depend, novariants & nocontext.
1) Changing comments from // to /// 2) Using STLExtras like llvm::is_contained(). 3) Instead of combining the function comments for 2 functions, provided them separately before the function definitions.
Instead of using nullptr to initialize StmtResult, using an uninitialized StmtResult. Unnecessary StmtResult & llvm::ArrayRef variables.
2) Changed comments in transformDispatchDirective() in SemaOpenMP.cpp 3) Adding checks in clang/test/OpenMP/dispatch_messages.cpp for a) depend clause b) nocontext, novariants & depend clauses occurring in the same line of dispatch construct. 4) Removing debugging statements (kept under #if 0 and #endif) in OMPContext.h.
nocontext(c1) novariants(c2) depend(inout:a)".
'nocontext' clause is ignored, only 'novariants' clause is applied to first occurring clause among 'nocontext' or 'novariants' is applied. This is being done because of usage of llvm::any_of & llvm::is_contained instead of if statements.
This PR is a continuation of the work in #117904.
Do you want me to move the changes from CodeGen back into SemaOpenMP.cpp and avoid AnnotateAttr? (In one of your review comments you had indicated that I should move the helper code to CodeGen.) With the changes that I have now, I was looking at:
The codegen for each (virtual) directive should live in codegen. But(!) Sema should emit separate (specific) CapturedStmts for each such virtual region. Check how the combined directives work: Sema defines multiple CapturedStmts for each combined directive, and then codegen emits the code for each (virtual) standalone directive, using the CapturedStmt for this particular (virtual) standalone directive. The list of captures of each sub-directive might be different, so Sema should provide a different CapturedStmt for each emitted sub-directive.
✅ With the latest revision this PR passed the C/C++ code formatter.
I have added an extra virtual region and 2 CapturedStmt-s are present.
modified: clang/lib/Sema/SemaOpenMP.cpp
modified: clang/lib/CodeGen/CGStmtOpenMP.cpp
Refer to target region codegen
clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -4528,6 +4528,115 @@ void CodeGenFunction::EmitOMPMasterDirective(const OMPMasterDirective &S) {
  emitMaster(*this, S);
}

static Expr *getInitialExprFromCapturedExpr(Expr *Cond) {
Why do you need it? CapturedExpr should be emitted as is
It is my mistake in naming the function getInitialExprFromCapturedExpr(). It should have been getCapturedExprFromImplicitCastExpr(). The function extracts the OMPCapturedExpr from within the clause.
clause->getCondition() does not give me the OMPCapturedExpr.
(gdb) NoContextC->getCondition()->dump()
ImplicitCastExpr 0x678640 'int' <LValueToRValue>
`-DeclRefExpr 0x678620 'int' lvalue OMPCapturedExpr 0x6785a0 '.capture_expr.' 'int'
NoContextC is of type OMPNocontextClause *.
If there is a simpler way to get the OMPCapturedExpr, please tell me.
For now, I will rename the function.
Why do you need to dig for the captured expr here?
While generating the If-else statement in EmitIfElse(), I am using
llvm::Value *CondValue = CGF->EvaluateExprAsBool(Condition);
and
CGF->Builder.CreateCondBr(CondValue, ThenBlock, ElseBlock);
EvaluateExprAsBool needs a CapturedExpr. Using NoContextC->getCondition() results in a core dump.
It means the context (CapturedStmt) does not know about this CapturedExpr, which means it is captured in the wrong context. You're not fixing the problem, you're working around it with this hack. The expression needs to be captured in the correct context and emitted in the correct context.
I use S.clauses() to get the list of all the clauses (type OMPClause) present in the dispatch directive. In the example, NoContextC of type OMPClause contains an ImplicitCastExpr. An OMPClause cannot be changed to contain a CapturedExpr. I do not understand where you want me to change it. Can you please clarify?
getInitialExprFromCapturedExpr() to getCapturedExprFromImplicitCastExpr() 2) Removing extra additions from clang/docs/ReleaseNotes.rst.
@@ -11899,6 +11899,9 @@ def err_omp_clause_requires_dispatch_construct : Error<
  "'%0' clause requires 'dispatch' context selector">;
def err_omp_append_args_with_varargs : Error<
  "'append_args' is not allowed with varargs functions">;
def warn_omp_dispatch_clause_novariants_nocontext : Warning<
Why is this a warning, not an error?
This message is to indicate to the user that if both novariants and nocontext are specified, then novariants is taken into account. It is meant to inform the user, so it is a warning and not an error.
Why is it not an error?
I do not need to stop compilation if novariants and nocontext occur together. For a dispatch directive like
#pragma omp dispatch nocontext(c1) novariants(c2)
I do not see a way to generate
if (condition)
  foo();
else
  foo_variant();
For the "condition" in the if statement, I do not know a way to use c1 & c2 together. Hence I preferred to indicate this to the user and use novariants(c2). I do not want to stop the compilation with an error message because the spec does not indicate what to do in this case.
Should it be just c1 || c2? What's the problem in emitting the combined expression?
I see this in the standard:
If do-not-use-variant evaluates to true, no function variant is selected for the target-call of the dispatch region associated with the novariants clause even if one would be selected normally.
Does it mean that novariants has higher priority than nocontext?
This relates to the novariants clause only. However, I can extrapolate the spec to mean what I have implemented. It would be good to provide a warning.
If you feel that the warning is not needed then I can remove it, but the user should not be left wondering why nocontext was not considered.
@dreachem thoughts?
The effect of the nocontext clause is to not add the dispatch construct to the construct trait set. It will only matter once we support dispatch in a construct trait selector (support for this is not required in 5.2, but is required in 6.0).
Assuming we support dispatch in a construct trait selector, then this is what the various combinations would mean:
novariants(0), nocontext(0): Allow selective variant substitution according to context match
novariants(0), nocontext(1): Allow selective variant substitution according to context match (NO MATCH for selectors that require **dispatch** trait)
novariants(1), nocontext(0): Always call the base function (shuts off any variant substitution for the call)
novariants(1), nocontext(1): Always call the base function (shuts off any variant substitution for the call)
In summary, novariants takes precedence and nocontext can be ignored ONLY if the do-not-use-variant argument evaluates to true.
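The four combinations above can be restated as a small decision function. This is only a sketch of the semantics described in this comment, not code from the patch; `DispatchMode` and `resolveDispatchMode` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical names for illustration only; nothing here exists in Clang.
enum class DispatchMode {
  BaseOnly,            // always call the base function
  MatchWithDispatch,   // context matching; 'dispatch' is in the construct trait set
  MatchWithoutDispatch // context matching; 'dispatch' is not in the construct trait set
};

// NoVariants and NoContext stand for the evaluated clause arguments
// (false when the corresponding clause is absent).
DispatchMode resolveDispatchMode(bool NoVariants, bool NoContext) {
  if (NoVariants)
    return DispatchMode::BaseOnly; // novariants takes precedence
  return NoContext ? DispatchMode::MatchWithoutDispatch
                   : DispatchMode::MatchWithDispatch;
}
```

Note that nocontext only changes the result when the novariants argument is false, which matches the summary above.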
Alexey,
I have added support for the clauses novariants and nocontext occurring on the same dispatch directive and removed the warning message.
Can you please review the code?
--Sunil
CGF->EmitBranchOnBoolExpr().
dispatch directive.
Hmm, this PR is much shorter than it used to be.
Changes to be committed: modified: clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
The initial PR where you had given your comments was #117904. I pushed to the wrong branch, and as a result #117904 was closed. I could not re-open it, so I ended up creating a new PR. The current PR has all the changes. I have included a pointer to #117904 at the beginning of this PR.
// }
// }
//
static void emitIfElse(CodeGenFunction *CGF, Stmt *AssociatedStmt,
I rather doubt it will work correctly with constructors/destructors.
Do you have an example of constructors/destructors where it will not work? The codegen for the nocontext & novariants clauses occurring together is present in the lit test case. I have written the code according to the truth table mentioned by Deepak:
novariants(0), nocontext(0): Allow selective variant substitution according to context match
novariants(0), nocontext(1): Allow selective variant substitution according to context match (NO MATCH for selectors that require **dispatch** trait)
novariants(1), nocontext(0): Always call the base function (shuts off any variant substitution for the call)
novariants(1), nocontext(1): Always call the base function (shuts off any variant substitution for the call)
Try to add tests with classes that evaluate to boolean: create an immediate (temporary) class instance and check that the destructors are called correctly.
@alexey-bataev Both @SunilKuravinakop and I are unclear what test you're asking for here.
When you say "test with classes", do you mean that the called function (and its variant) should have a local declaration of a variable of class type? Such as:
void f_v()
{
  MyType2 v; // MyType2 constructor called on v
  ...
  // MyType2 destructor called on v
}
#pragma omp declare variant(f_v) match(construct={dispatch})
void f()
{
  MyType v; // MyType constructor called on v
  ...
  // MyType destructor called on v
}
int main()
{
  #pragma omp dispatch
  f();
}
But then, what do you mean by "which evaluate to boolean"?
Can you provide a quick sketch of what this test should be doing?
No, I was thinking about the condition, which involves instantiation of a class, to check that the codegen correctly handles constructor/destructor calls (that's always a problem in C++). Is it possible to create a condition that includes a class, something like:
class AAA {
  ...
};
#pragma omp dispatch if(AAA()) <---- the constructor and then destructor is called here
...
Thanks for the clarification. So something like this?
#include <iostream>
int main()
{
  class IsPositive {
  public:
    IsPositive(int x) : _x(x) {
      std::cout << "constructing IsPositive(" << x << ") object\n";
    }
    ~IsPositive() {
      std::cout << "destructing IsPositive(" << _x << ") object\n";
    }
    operator bool() const { return _x > 0; }
  private:
    int _x;
  };
  // case 1: temporary created inside the if clause
  #pragma omp parallel if(IsPositive(10))
  {
    std::cout << "[1] in parallel\n";
  }
  // case 2: condition object created before the directive
  auto cond1 = IsPositive(5);
  #pragma omp parallel if(cond1)
  {
    std::cout << "[2] in parallel\n";
  }
  // case 3: temporary that evaluates to false
  #pragma omp parallel if(IsPositive(-10))
  {
    std::cout << "[3] in parallel\n";
  }
  // case 4: pre-built condition object that evaluates to false
  auto cond2 = IsPositive(-5);
  #pragma omp parallel if(cond2)
  {
    std::cout << "[4] in parallel\n";
  }
}
The output for the above with OMP_NUM_THREADS=2 is:
constructing IsPositive(10) object
[1] in parallel
[1] in parallel
destructing IsPositive(10) object
constructing IsPositive(5) object
[2] in parallel
[2] in parallel
constructing IsPositive(-10) object
[3] in parallel
destructing IsPositive(-10) object
constructing IsPositive(-5) object
[4] in parallel
destructing IsPositive(-5) object
destructing IsPositive(5) object
So, the destructor for the instantiated class object used in the if condition is called upon completing each parallel region. On the other hand, if I move the condition expression outside the directive (as for cases 2 and 4), then the destructor for the instantiated class is called at the end of the function.
Does this match the expected behavior?
Yes, something like this, but for dispatch. We need to handle class construction/destruction correctly. I just want to be sure that we don't miss anything here and avoid bugs that will require significant effort to fix/rework later.
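A dispatch analogue of the test above might look like the following sketch. It assumes that `novariants` and `nocontext` accept a class-type expression that is contextually converted to `bool`, which this thread does not confirm. All names (`IsPositive`, `f`, `f_variant`, the counters, `runDispatchCases`) are hypothetical. When the file is built without OpenMP support the pragmas are ignored, so the checks are written to hold in either configuration.

```cpp
#include <cassert>

// Counters used to check that every constructed condition object is destroyed.
int Constructed = 0;
int Destructed = 0;
int BaseCalls = 0;
int VariantCalls = 0;

// Hypothetical condition class: converts to true when the stored value is positive.
class IsPositive {
public:
  explicit IsPositive(int X) : X(X) { ++Constructed; }
  IsPositive(const IsPositive &Other) : X(Other.X) { ++Constructed; }
  ~IsPositive() { ++Destructed; }
  operator bool() const { return X > 0; }

private:
  int X;
};

void f_variant() { ++VariantCalls; }

#pragma omp declare variant(f_variant) match(construct={dispatch})
void f() { ++BaseCalls; }

void runDispatchCases() {
  // Assumption: a temporary in a clause argument is constructed and then
  // destroyed around the dispatch region, as in the 'parallel if' cases.
#pragma omp dispatch novariants(IsPositive(10))
  f();

#pragma omp dispatch nocontext(IsPositive(-10))
  f();

#pragma omp dispatch novariants(IsPositive(-1)) nocontext(IsPositive(1))
  f();

  // Whatever the codegen does, constructions and destructions must balance.
  assert(Constructed == Destructed);
}
```

A lit test would check the emitted IR for the constructor and destructor calls rather than count them at run time; the counters here only keep the sketch self-checking.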
static Expr *replaceWithNewTraitsOrDirectCall(CapturedDecl *CDecl,
                                              Expr *NewExpr) {
  Expr *CurrentCallExpr = nullptr;
  Stmt *CallExprStmt = CDecl->getBody();

  if (BinaryOperator *BinaryCopyOpr = dyn_cast<BinaryOperator>(CallExprStmt)) {
    CurrentCallExpr = BinaryCopyOpr->getRHS();
    BinaryCopyOpr->setRHS(NewExpr);
  } else {
    CurrentCallExpr = dyn_cast<Expr>(CallExprStmt);
    CDecl->setBody(NewExpr);
  }

  return CurrentCallExpr;
}

static Expr *transformCallInStmt(Stmt *StmtP, bool NoContext = false) {
  Expr *CurrentExpr = nullptr;
  if (auto *CptStmt = dyn_cast<CapturedStmt>(StmtP)) {
    CapturedDecl *CDecl = CptStmt->getCapturedDecl();

    CallExpr *NewCallExpr = nullptr;
    for (const auto *attr : CDecl->attrs()) {
      if (NoContext) {
        if (const auto *annotateAttr =
                llvm::dyn_cast<clang::AnnotateAttr>(attr);
            annotateAttr && annotateAttr->getAnnotation() == "NoContextAttr") {
          NewCallExpr = llvm::dyn_cast<CallExpr>(*annotateAttr->args_begin());
        }
      } else {
        if (const auto *annotateAttr =
                llvm::dyn_cast<clang::AnnotateAttr>(attr);
            annotateAttr && annotateAttr->getAnnotation() == "NoVariantsAttr") {
          NewCallExpr = llvm::dyn_cast<CallExpr>(*annotateAttr->args_begin());
        }
      }
    }

    CurrentExpr = replaceWithNewTraitsOrDirectCall(CDecl, NewCallExpr);
  }
  return CurrentExpr;
}
I still think these functions should not be needed. If something is required, it should be built in Sema and stored in the AST. If you need to replace some AST values with some LLVM IR values, use OpaqueValue nodes, which can be replaced in codegen by special RAIIs.
Can you please clarify your comment based on the following points?
- There is a pre-computed call trait (pre-computed in Sema) stored in AnnotateAttr. I am using those AnnotateAttrs in transformCallInStmt. In replaceWithNewTraitsOrDirectCall I am trying to set the call based on the BinaryExpr under the dispatch directive.
- If you want me to pre-compute and store it in Sema, I will have to clone the CDecl and store it. Earlier, in the helper functions (in Sema), I created new AST attributes, but according to your suggestion I will now have to store it in AnnotateAttr.
- An OpaqueValueExpr node copies the pointer of the SourceExpr. This does not help in noting the alternative call trait.
The problem with these transformations is that sometimes they do not work as expected and miss some cases. This happens especially with C++, which has lots of idioms like classes (especially inheritance), templates, implicit conversions (functions), etc. All these tree traversal techniques tend to be not fully correct in many cases. That's why the preferred way to build helper expressions is to do it in Sema, because Sema knows how to handle all these cases correctly, but codegen does not. That's why I'm asking about compatibility with C++: it has lots of corner cases, and it would be good to handle them in this patch rather than fixing them later.
@@ -0,0 +1,364 @@
// expected-no-diagnostics
Tests with classes and member functions are required, as well as tests with classes used in conditions.
Support for dispatch construct (Sema & Codegen) support. Support for clauses: depend, novariants & nocontext.