[clang] Mark some language options as benign. #131569

Open: matts1 wants to merge 1 commit into main from push-wnplruunuzny

Contributor

@matts1 matts1 commented Mar 17, 2025

I'm fairly certain that the options in this CL are benign, as I don't believe they affect the AST.

  • RTTI - shouldn't affect the AST, should only affect codegen
  • Trivial var init - also should only affect codegen
  • Stack protector - also codegen
  • Exceptions - exceptions do allow new things in the AST, but I'm pretty sure they can differ between parent and child safely, so I marked them as compatible instead.

I welcome any input from someone more familiar with this than me, as I might be wrong.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 17, 2025
@llvmbot
Member

llvmbot commented Mar 17, 2025

@llvm/pr-subscribers-clang

Author: Matt (matts1)

Full diff: https://github.com/llvm/llvm-project/pull/131569.diff

2 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/include/clang/Basic/LangOptions.def (+9-9)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 372a95c80717c..bcd5df2f2edc0 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -295,6 +295,7 @@ Bug Fixes to C++ Support
 - Clang no longer crashes when a coroutine is declared ``[[noreturn]]``. (#GH127327)
 - Clang now uses the parameter location for abbreviated function templates in ``extern "C"``. (#GH46386)
 - Clang now correctly parses ``if constexpr`` expressions in immediate function context. (#GH123524)
+- Clang modules now allow a module and its user to have a larger variety of configurations.
 
 Improvements to C++ diagnostics
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 383440ddbc0ea..beefc944959a1 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -139,9 +139,9 @@ ENUM_LANGOPT(AltivecSrcCompat, AltivecSrcCompatKind, 2,
 LANGOPT(ConvergentFunctions, 1, 1, "Assume convergent functions")
 LANGOPT(AltiVec           , 1, 0, "AltiVec-style vector initializers")
 LANGOPT(ZVector           , 1, 0, "System z vector extensions")
-LANGOPT(Exceptions        , 1, 0, "exception handling")
-LANGOPT(ObjCExceptions    , 1, 0, "Objective-C exceptions")
-LANGOPT(CXXExceptions     , 1, 0, "C++ exceptions")
+COMPATIBLE_LANGOPT(Exceptions        , 1, 0, "exception handling")
+COMPATIBLE_LANGOPT(ObjCExceptions    , 1, 0, "Objective-C exceptions")
+COMPATIBLE_LANGOPT(CXXExceptions     , 1, 0, "C++ exceptions")
 LANGOPT(EHAsynch          , 1, 0, "C/C++ EH Asynch exceptions")
 ENUM_LANGOPT(ExceptionHandling, ExceptionHandlingKind, 3,
              ExceptionHandlingKind::None, "exception handling")
@@ -149,8 +149,8 @@ LANGOPT(IgnoreExceptions  , 1, 0, "ignore exceptions")
 LANGOPT(ExternCNoUnwind   , 1, 0, "Assume extern C functions don't unwind")
 LANGOPT(AssumeNothrowExceptionDtor , 1, 0, "Assume exception object's destructor is nothrow")
 LANGOPT(TraditionalCPP    , 1, 0, "traditional CPP emulation")
-LANGOPT(RTTI              , 1, 1, "run-time type information")
-LANGOPT(RTTIData          , 1, 1, "emit run-time type information data")
+BENIGN_LANGOPT(RTTI              , 1, 1, "run-time type information")
+BENIGN_LANGOPT(RTTIData          , 1, 1, "emit run-time type information data")
 LANGOPT(MSBitfields       , 1, 0, "Microsoft-compatible structure layout")
 LANGOPT(MSVolatile        , 1, 0, "Microsoft-compatible volatile loads and stores")
 LANGOPT(Freestanding, 1, 0, "freestanding implementation")
@@ -397,13 +397,13 @@ BENIGN_ENUM_LANGOPT(ExternDeclNoDLLStorageClassVisibility, VisibilityFromDLLStor
 BENIGN_LANGOPT(SemanticInterposition        , 1, 0, "semantic interposition")
 BENIGN_LANGOPT(HalfNoSemanticInterposition, 1, 0,
                "Like -fno-semantic-interposition but don't use local aliases")
-ENUM_LANGOPT(StackProtector, StackProtectorMode, 2, SSPOff,
+BENIGN_ENUM_LANGOPT(StackProtector, StackProtectorMode, 2, SSPOff,
              "stack protector mode")
-ENUM_LANGOPT(TrivialAutoVarInit, TrivialAutoVarInitKind, 2, TrivialAutoVarInitKind::Uninitialized,
+BENIGN_ENUM_LANGOPT(TrivialAutoVarInit, TrivialAutoVarInitKind, 2, TrivialAutoVarInitKind::Uninitialized,
              "trivial automatic variable initialization")
-VALUE_LANGOPT(TrivialAutoVarInitStopAfter, 32, 0,
+BENIGN_VALUE_LANGOPT(TrivialAutoVarInitStopAfter, 32, 0,
              "stop trivial automatic variable initialization after the specified number of instances. Must be greater than 0.")
-VALUE_LANGOPT(TrivialAutoVarInitMaxSize, 32, 0,
+BENIGN_VALUE_LANGOPT(TrivialAutoVarInitMaxSize, 32, 0,
              "stop trivial automatic variable initialization if var size exceeds the specified size (in bytes). Must be greater than 0.")
 ENUM_LANGOPT(SignedOverflowBehavior, SignedOverflowBehaviorTy, 2, SOB_Undefined,
              "signed integer overflow handling")

Contributor

@cor3ntin cor3ntin left a comment

Changes to TrivialAuto* seem fine, everything else is observable through macros :(

@matts1
Contributor Author

matts1 commented Mar 17, 2025

Thanks for the review; what you said makes sense. It is extremely important for us to get this working, though, so I was wondering whether there is some way to work around it. I'll use the RTTI flag for all future examples.

Firstly, to check that I understand correctly: essentially the only thing that compiling a PCH file does is generate an AST and export some symbols, so the only difference between compiling a module with -frtti and with -fno-rtti is that the former would be parsed with __cpp_rtti defined.

So from what I can tell, you would end up with the following compilation steps:

clang++ ... -o mod_a.pcm
clang++ ... -fmodule-file=mod_a.pcm -o mod_b.pcm
clang++ -fmodule-file=mod_a.pcm -fmodule-file=mod_b.pcm -o <binary/object file>

The first and second's -frtti would decide whether the macro was enabled in their specific parts of the AST, and the third's -frtti would determine whether to actually enable RTTI in the codegen.

I believe it's entirely justified to disallow, by default, a module compiled with RTTI depending on one compiled without it (and vice versa). However, I'm trying to understand whether there's a specific harm in allowing this when you pass -Wno-module-file-config-mismatch. It seems to work just fine in our Chromium repo when compiling a bunch of code without exceptions that depends on the standard library built with exceptions (though I haven't gotten to running the code yet).

@matts1 matts1 force-pushed the push-wnplruunuzny branch from e32d927 to f53415b Compare March 17, 2025 13:54
@matts1
Contributor Author

matts1 commented Mar 17, 2025

For now I've removed everything but the TrivialAuto*

@atetubou
Contributor

Could you update the PR description too?

@cor3ntin
Contributor

The issue is that if we allow the preprocessor's state to differ across modules, then the resulting compiled units can be arbitrary (and subtly incompatible).

I wonder if a possible solution is to record which macros are used (i.e., are expanded or appear in #ifdef / defined) and ONLY serialize the set of macros that are used.
That way, if your program never uses __SSP__, for example, then your module would not be incompatible just because __SSP__ has been defined.

I think that's worth pondering a bit more @Bigcheese

@Bigcheese
Contributor

The general issue with changing this to benign is that it ends up being non-deterministic and buggy for implicitly built modules. I think what we want here is COMPATIBLE_LANGOPT. By default it will still feed into the context hash, but the compiler won't reject loading such modules. If Chrome plans to use explicitly built modules then this is fine.

There are quite a few options we could change to compatible, but the big issue is making sure that Clang won't crash when we do that. I think for RTTI it's fine, as that should only impact the predefine and codegen, not the shape of the AST. Sadly we have no good way to test this in general, just on a case-by-case basis when we hit issues.

@cor3ntin
Contributor

@Bigcheese marking just Trivial var init benign seems fine to me, right?

@matts1 matts1 force-pushed the push-wnplruunuzny branch from f53415b to c7839a7 Compare April 15, 2025 02:21
@matts1
Contributor Author

matts1 commented Apr 15, 2025

ping, I've just re-resolved the release notes conflicts

@matts1
Contributor Author

matts1 commented Apr 15, 2025

> The issue is that if we allow the preprocessor's state to differ across modules, then the resulting compiled units can be arbitrary (and subtly incompatible).
>
> I wonder if a possible solution is to record which macros are used (i.e., are expanded or appear in #ifdef / defined) and ONLY serialize the set of macros that are used. That way, if your program never uses __SSP__, for example, then your module would not be incompatible just because __SSP__ has been defined.

IIUC, you're saying that codegen options that create macros are benign when they are never read. This statement seems correct and would probably be a good optimization, however it won't solve all use cases.

In our use case, for example, we want our code built without exceptions to depend upon libc++ built with exceptions. As libc++ reads __cpp_exceptions, this won't work. I think that the correct thing to do would be to, similarly to other options which only affect a single macro, turn it into a compatible langopt, with a future potential optimization that if the macro is never read it can be turned into a benign langopt.

What do you think, @cor3ntin? I understand that potential subtle compatibility issues might be a concern for you, but I think there does need to be a way to achieve this. If you feel concerned, we could lock it behind a compiler flag, or an experimental compiler flag, to see whether there are any subtle issues before enabling it for everyone, or do whatever else would make you comfortable with this.

@mizvekov
Contributor

I think one option would be to allow differences in macros, and rely on the ODR checker to catch when that would cause problems.

@cor3ntin cor3ntin requested a review from AaronBallman April 15, 2025 07:51
@cor3ntin
Contributor

I think this looks fine but I'd like @Bigcheese @erichkeane @AaronBallman to look at it too.

@erichkeane
Collaborator

> I think this looks fine but I'd like @Bigcheese @erichkeane @AaronBallman to look at it too.

This looks fine as far as I can tell. Though, I'm not sure I have sufficient knowledge to be comfortable enough with the implications to approve this. @Bigcheese and @ChuanqiXu9 might be the most knowledgeable.

@atetubou
Contributor

atetubou commented May 7, 2025

Would it be better to update the description of this PR so that it only covers the trivial var init flags?

Comment on lines +401 to +405
+BENIGN_ENUM_LANGOPT(TrivialAutoVarInit, TrivialAutoVarInitKind, 2, TrivialAutoVarInitKind::Uninitialized,
              "trivial automatic variable initialization")
-VALUE_LANGOPT(TrivialAutoVarInitStopAfter, 32, 0,
+BENIGN_VALUE_LANGOPT(TrivialAutoVarInitStopAfter, 32, 0,
              "stop trivial automatic variable initialization after the specified number of instances. Must be greater than 0.")
-VALUE_LANGOPT(TrivialAutoVarInitMaxSize, 32, 0,
+BENIGN_VALUE_LANGOPT(TrivialAutoVarInitMaxSize, 32, 0,
Member

I feel this may not matter for header modules, but for named modules it may affect the generated code. Still, I feel we should have some other mechanism to detect and diagnose inconsistent configurations (at least a warning; ideally, diagnostics emitted only when needed). I have run into error diagnostics like this that stop people from using modules. So LGTM.

Contributor

The question here even for named modules is if it impacts the pcm file, not the .o. We already have the case where the pcm for a named module is built by an entirely different build system and compiler than the one that built the .o file associated with the module.

Contributor

@Bigcheese Bigcheese left a comment

lgtm with Chuanqi's release notes request.

Separately, I'd also like to see the rtti and exceptions stuff be marked compatible, although with some testing first of what happens when people do create AST differences with the macros. It's fine to error (as long as the diagnostic isn't terrible), I just want to know that we won't just crash.

@matts1 matts1 force-pushed the push-wnplruunuzny branch from c7839a7 to 670c08b Compare May 8, 2025 00:51
@matts1 matts1 requested a review from Bigcheese May 8, 2025 00:52