[BUG] NDK r25c clang generates invalid code/crashes when optimizing for size (-Os) #1862

@SanjaLV

Description

Android NDK r25c produces invalid code (and crashes if we mark certain functions as noinline) for arm64-v8a target architecture when compiling with -Os (optimize size) compiler flag.

Below is a minimized/stripped sample that shows the problem (it uses functions from libtomcrypt).

Link to the repository: https://github.com/SanjaLV/NDK_r25c_repro

Prerequisites:

  1. A Linux/macOS machine
  2. The ANDROID_HOME environment variable pointing to the Android SDK root.
  3. "ndk;25.2.9519653" / "ndk;25.1.8937393" installed via sdkmanager
  4. A system clang compiler with UBSan/ASan support.
  5. An arm64-v8a Android device/emulator connected to ADB (the emulator was tested only on an Apple M1 machine)

How to reproduce (invalid code):

  1. Run test_manual.sh
  2. Observe that REMOTE_0s_LOG.txt differs from REMOVE_02_LOG.txt

How to reproduce (compiler crash):

  1. Open REPRO.c in the editor of your choice
  2. Change define on line 10 to #define MAKE_COMPILER_CRASH 1
  3. Run test_manual.sh
  4. Observe that clang crashes while trying to compile REPRO.c with -Os
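
For readers who have not opened REPRO.c, a toggle like the one in step 2 could be wired up roughly as follows. This is only a sketch: the macro name MAKE_COMPILER_CRASH comes from the steps above, but REPRO_NOINLINE and add_words are hypothetical names, and the real file may use the define differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the toggle name is taken from the report; how REPRO.c
 * actually consumes it is a guess made for illustration. */
#ifndef MAKE_COMPILER_CRASH
#define MAKE_COMPILER_CRASH 0
#endif

#if MAKE_COMPILER_CRASH
/* Forbid inlining so the helper stays a separate symbol in the output. */
#define REPRO_NOINLINE __attribute__((noinline))
#else
#define REPRO_NOINLINE
#endif

/* Hypothetical helper standing in for a libtomcrypt routine:
 * multi-word addition with carry, returning the final carry. */
REPRO_NOINLINE static uint32_t add_words(const uint32_t *a, const uint32_t *b,
                                         uint32_t *out, int n)
{
    uint64_t carry = 0;
    for (int i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        out[i] = (uint32_t)t;   /* keep the low 32 bits */
        carry = t >> 32;        /* propagate the high bit */
    }
    return (uint32_t)carry;
}
```

With the macro set to 0 the compiler is free to inline the helper; with 1 the helper is kept as a standalone function, which is the configuration the report associates with the crash.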

Crash backtrace should look like:

Program received signal SIGSEGV, Segmentation fault.
0x00000000065bec3e in llvm::VPTransformState::get(llvm::VPValue*, llvm::VPIteration const&) ()
(gdb) bt
#0  0x00000000065bec3e in llvm::VPTransformState::get(llvm::VPValue*, llvm::VPIteration const&) ()
#1  0x00000000065be8b9 in llvm::InnerLoopVectorizer::scalarizeInstruction(llvm::Instruction*, llvm::VPReplicateRecipe*, llvm::VPIteration const&, bool, llvm::VPTransformState&) ()
#2  0x00000000065be760 in llvm::VPReplicateRecipe::execute(llvm::VPTransformState&) ()
#3  0x00000000065be31e in llvm::VPBasicBlock::execute(llvm::VPTransformState*) ()
#4  0x00000000065be047 in llvm::VPRegionBlock::execute(llvm::VPTransformState*) ()
#5  0x00000000065bdf6c in llvm::VPRegionBlock::execute(llvm::VPTransformState*) ()
#6  0x00000000066fabf1 in llvm::VPlan::execute(llvm::VPTransformState*) ()
#7  0x00000000062fd91e in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*) ()
#8  0x000000000663a48f in llvm::LoopVectorizePass::processLoop(llvm::Loop*) ()
#9  0x0000000005edebea in llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo&, llvm::TargetLibraryInfo*, llvm::DemandedBits&, llvm::AAResults&, llvm::AssumptionCache&, std::__1::function<llvm::LoopAccessInfo const& (llvm::Loop&)>&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo*)
    ()
#10 0x0000000005eddbb9 in llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) ()
#11 0x0000000005edd89b in ?? ()
#12 0x0000000005c6168a in llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) ()
#13 0x0000000005c61521 in clang::TemplateDeclInstantiator::VisitDecl(clang::Decl*) ()
#14 0x0000000005ec7112 in llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
#15 0x0000000005ec6dd1 in ?? ()
#16 0x00000000063538c6 in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
#17 0x00000000065d66c8 in clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) ()
#18 0x00000000060524d5 in ?? ()
#19 0x0000000005ea25a9 in clang::ParseAST(clang::Sema&, bool, bool) ()
#20 0x00000000063c128d in clang::FrontendAction::Execute() ()
#21 0x00000000063c112d in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) ()
#22 0x00000000063c1541 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) ()
#23 0x00000000066a9f54 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) ()
#24 0x00000000066a6de3 in ?? ()
#25 0x00000000066754a5 in main ()

Context:

We originally discovered that upgrading the NDK from r25b to r25c changes the return values of certain cryptographic functions. After some investigation, we found that the first function whose return value changes with NDK r25c is mp_montgomery_reduce. We then minimized the code by fixing the input arguments to mp_montgomery_reduce. (These values are not special: any valid random values will work too, as long as P is odd.)
We tried to compare the generated assembly, but clang inlines the majority of the calls, which complicates the investigation, so we tried applying the noinline attribute. That resulted in a compiler crash during loop vectorization.
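
As background on the "P is odd" requirement, here is a minimal single-word sketch of Montgomery reduction. It is not the libtomcrypt implementation (which works on multi-digit integers). The function names mont_neg_inverse and mont_reduce and the choice of R = 2^32 are assumptions made for illustration. The point is that the reduction needs the inverse of P modulo a power of two, which exists only when P is odd.

```c
#include <assert.h>
#include <stdint.h>

/* Returns -p^{-1} mod 2^32. Requires p odd, otherwise no inverse exists.
 * Hypothetical helper, not part of libtomcrypt. */
static uint32_t mont_neg_inverse(uint32_t p)
{
    uint32_t inv = p;              /* p*p == 1 (mod 8), so 3 bits are correct */
    for (int i = 0; i < 4; i++)    /* each Newton step doubles the correct bits */
        inv *= 2u - p * inv;
    return 0u - inv;
}

/* Returns t * 2^{-32} mod p, for odd p and t < p * 2^32. */
static uint32_t mont_reduce(uint64_t t, uint32_t p, uint32_t mp)
{
    uint32_t m = (uint32_t)t * mp;           /* makes t + m*p divisible by 2^32 */
    uint64_t prod = (uint64_t)m * p;
    /* The low words of t and prod sum to 0 or exactly 2^32; the latter
     * happens whenever the low word of t is non-zero. */
    uint64_t u = (t >> 32) + (prod >> 32) + ((uint32_t)t != 0u);
    if (u >= p)
        u -= p;                              /* single conditional subtraction */
    return (uint32_t)u;
}
```

A result r can be checked without hand-computed constants by verifying that r * 2^32 is congruent to t modulo p.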


Comparing clang_source_info.md of both NDK versions, I can spot a few patches regarding arm64 vectorization:

- [[AArch64] Use simd mov to materialize big fp constants](https://android.googlesource.com/toolchain/llvm_android/+/91fdeab43d29b1f228113859da8ee238bc8c2f16/patches/cherry/7a605ab7bfbc681c34335684f45b7da32d495db1.patch)
- [[AArch64] Emit vector FP cmp when LE is used with fast-math](https://android.googlesource.com/toolchain/llvm_android/+/91fdeab43d29b1f228113859da8ee238bc8c2f16/patches/cherry/bf268a05cd9294854ffccc3158c0e673069bed4a.patch)
- [Loop-Vectorizer-shouldMaximizeVectorBandwidth.patch](https://android.googlesource.com/toolchain/llvm_android/+/91fdeab43d29b1f228113859da8ee238bc8c2f16/patches/Loop-Vectorizer-shouldMaximizeVectorBandwidth.patch)

I don't have access to the patches, so I cannot verify this hypothesis. A clang build with debug assertions enabled might also provide additional information, but I don't know how to build the NDK clang.
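
Since the backtrace ends in LoopVectorizePass, one way to probe the vectorizer hypothesis without the patches might be to switch off loop vectorization for a suspect loop and see whether the behaviour changes. The sketch below uses clang's documented loop pragma. The function scale_words is a hypothetical stand-in and not code from the repro. Whether this changes the result on r25c is untested here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loop standing in for a hot loop in the repro. */
static void scale_words(uint32_t *dst, const uint32_t *src, size_t n, uint32_t k)
{
#if defined(__clang__)
/* Ask clang not to vectorize or interleave the loop that follows. */
#pragma clang loop vectorize(disable) interleave(disable)
#endif
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

Passing -fno-vectorize on the command line is the whole-file counterpart of the pragma.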

Feel free to ask for more information.

Many thanks,
Aleksandrs

Affected versions

r25

Canary version

No response

Host OS

Linux, Mac

Host OS version

Ubuntu 22.04

Affected ABIs

arm64-v8a

Build system

ndk-build

Other build system

No response

minSdkVersion

31 (not relevant)

Device API level

27
