[wasm-ld] Inconsistency: combineOutputSegments() merges InputChunks with differing COMDATs, but writeBody() asserts #134809

Open
anutosh491 opened this issue Apr 8, 2025 · 16 comments
Labels
crash Prefer [crash-on-valid] or [crash-on-invalid] lld:wasm

Comments

@anutosh491
Contributor

Context

This is part of the effort to run clang-repl in the browser (see xeus-cpp-lite).

In a cell block, I am trying to process multiple C++ definitions

const int var0 = 0;     // generates .rodata._ZL4var0
const C cvar0{0};       // generates .rodata._ZL5cvar0, in a different COMDAT

As a final part we have a linking step that produces the side module (incr_module_xx.wasm for each cell that is processed)

Problem :

Basically, we use wasm-ld for the linking step, with --emit-relocs passed as a flag.

When createOutputSegments runs, we end up with InputChunks in three different COMDATs (_ZL4var0, _ZL5cvar0, or empty).

When combineOutputSegments then runs, segments whose InputChunks have different COMDATs are combined without any assert or error being raised.

I am guessing this is justified because

void Writer::combineOutputSegments() {
  // With PIC code we currently only support a single active data segment since
  // we only have a single __memory_base to use as our base address. This pass
  // combines all data segments into a single .data segment.
  // This restriction does not apply when the extended const extension is
  // available: https://github.com/WebAssembly/extended-const

But then as we are using --emit-relocs, we end up calling LinkingSection::writeBody() after finalizeSections

And here we have the following

#ifndef NDEBUG
  for (const InputChunk *isec : inputSegments)
    assert(isec->getComdatName() == comdat);
#endif

Hence I end up with this error

Aborted(Assertion failed: isec->getComdatName() == comdat, at: /Users/anutosh491/work/llvm-project/lld/wasm/SyntheticSections.cpp,747,writeBody)

while trying to run expressions with const in xeus-cpp-lite

Disclaimer

i) This only happens when the check under #ifndef NDEBUG is compiled in; I'm building LLVM with MinSizeRel. I probably wouldn't have encountered it with a Release build.
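
The mismatch described above can be reproduced outside lld with a small standalone model. This is only a sketch: `Chunk`, `OutputSeg`, `firstComdatMismatch`, and the `.data.example` chunk name are hypothetical and not lld's types or names, and comparing every chunk against the first chunk's COMDAT is an assumption about what the quoted check does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for lld's InputChunk and OutputSegment.
struct Chunk {
  std::string name;
  std::string comdat; // empty means "not in any COMDAT group"
};

struct OutputSeg {
  std::string name;
  std::vector<Chunk> chunks;
};

// Returns the index of the first chunk whose COMDAT differs from that of the
// first chunk, or -1 when every chunk in the segment agrees.
int firstComdatMismatch(const OutputSeg &seg) {
  for (std::size_t i = 1; i < seg.chunks.size(); ++i)
    if (seg.chunks[i].comdat != seg.chunks[0].comdat)
      return static_cast<int>(i);
  return -1;
}

// Builds the situation from the report: chunks from three different COMDATs
// (two named, one empty) sitting in one combined segment.
OutputSeg makeCombinedSegment() {
  OutputSeg seg;
  seg.name = ".data";
  seg.chunks.push_back(Chunk{".rodata._ZL4var0", "_ZL4var0"});
  seg.chunks.push_back(Chunk{".rodata._ZL5cvar0", "_ZL5cvar0"});
  seg.chunks.push_back(Chunk{".data.example", ""}); // hypothetical chunk name
  return seg;
}

// A debug-only check in the spirit of the one quoted above. The assert is
// compiled out entirely when NDEBUG is defined.
void checkSegment(const OutputSeg &seg) {
  assert(firstComdatMismatch(seg) == -1 && "chunks with differing COMDATs");
  (void)seg; // avoids an unused-parameter warning when NDEBUG is defined
}
```

Calling `checkSegment(makeCombinedSegment())` aborts in a build with assertions enabled and does nothing when `NDEBUG` is defined, which matches the build-type dependence noted in the disclaimer.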

@anutosh491
Contributor Author

cc @sbc100

My questions here are:

  1. Should combineOutputSegments() check for COMDAT mismatches across InputChunks before merging them into one OutputSegment? Maybe only under #ifndef NDEBUG.

  2. Is it ever valid to merge input segments with differing COMDATs (under certain modes)? I guess that's what happens in release mode, isn't it? (The writeBody assert wouldn't be a factor there.)

@anutosh491
Contributor Author

anutosh491 commented Apr 8, 2025

I see this (73332d7#diff-e826be2acc8b58c5d040525dc8a509e90810d3edcd93190d4810e476919ef9aa)

which probably adds this

if (ctx.arg.relocatable && !segment->getComdatName().empty()) {
  s = createOutputSegment(name);
} else {
  if (segmentMap.count(name) == 0)
    segmentMap[name] = createOutputSegment(name);
  s = segmentMap[name];
}

But I'm not sure this is enough for my case: as can be seen in the linking step for clang-repl that I shared, PIC is enabled but relocatable is not, so the check doesn't affect my case.

DRY RUN

segment->name = ".rodata._ZL4var0";
segment->getComdatName() = "_ZL4var0";
ctx.arg.relocatable = false; 

StringRef name = getOutputDataSegmentName(*segment); 
// name becomes ".rodata"

if (ctx.arg.relocatable && !segment->getComdatName().empty()) {
    // SKIPPED because ctx.arg.relocatable == false
    s = createOutputSegment(name);
} else {
    // Enters this block
    if (segmentMap.count(name) == 0)
        segmentMap[name] = createOutputSegment(name);
    s = segmentMap[name];
}
s->addInputSegment(segment);

  • Both .rodata._ZL4var0 and .rodata._ZL5cvar0 get the same OutputSegment (i.e., .rodata).

  • These chunks are added to the same segment without checking whether their COMDATs differ.
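
The dry run above can be turned into a small standalone program. This is a sketch under stated assumptions: `outputNameFor` is a simplified, prefix-based stand-in for `getOutputDataSegmentName`, and `Chunk`, `OutputSeg`, and this `createOutputSegments` are hypothetical models, not lld's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Chunk {
  std::string name;
  std::string comdat;
};

struct OutputSeg {
  std::string name;
  std::vector<Chunk> chunks;
};

// Simplified, assumed name mapping: ".rodata._ZL4var0" -> ".rodata".
std::string outputNameFor(const std::string &inputName) {
  static const char *const prefixes[] = {".text", ".data", ".bss", ".rodata"};
  for (const char *prefix : prefixes) {
    const std::string withDot = std::string(prefix) + ".";
    if (inputName.compare(0, withDot.size(), withDot) == 0)
      return prefix;
  }
  return inputName;
}

// Mirrors the branch quoted above: only a relocatable link gives each COMDAT
// chunk its own output segment; otherwise chunks are grouped by name alone.
std::vector<OutputSeg> createOutputSegments(const std::vector<Chunk> &inputs,
                                            bool relocatable) {
  std::vector<OutputSeg> out;
  std::map<std::string, std::size_t> segmentMap; // output name -> index in out
  for (const Chunk &chunk : inputs) {
    const std::string name = outputNameFor(chunk.name);
    std::size_t index;
    if (relocatable && !chunk.comdat.empty()) {
      out.push_back(OutputSeg{name, {}});
      index = out.size() - 1;
    } else {
      std::map<std::string, std::size_t>::iterator it = segmentMap.find(name);
      if (it == segmentMap.end()) {
        out.push_back(OutputSeg{name, {}});
        it = segmentMap.emplace(name, out.size() - 1).first;
      }
      index = it->second;
    }
    out[index].chunks.push_back(chunk);
  }
  return out;
}

void dryRun() {
  std::vector<Chunk> inputs;
  inputs.push_back(Chunk{".rodata._ZL4var0", "_ZL4var0"});
  inputs.push_back(Chunk{".rodata._ZL5cvar0", "_ZL5cvar0"});
  // PIC but not relocatable: both chunks share one ".rodata" segment.
  std::vector<OutputSeg> merged = createOutputSegments(inputs, false);
  assert(merged.size() == 1 && merged[0].chunks.size() == 2);
  // Relocatable: each COMDAT chunk gets its own output segment.
  std::vector<OutputSeg> split = createOutputSegments(inputs, true);
  assert(split.size() == 2);
  (void)merged;
  (void)split;
}
```

In this model the non-relocatable path never looks at `comdat`, so nothing stops chunks from two COMDATs from landing in one segment.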

@EugeneZelenko EugeneZelenko added lld:wasm crash Prefer [crash-on-valid] or [crash-on-invalid] and removed new issue labels Apr 8, 2025
@llvmbot
Member

llvmbot commented Apr 8, 2025

@llvm/issue-subscribers-lld-wasm

Author: Anutosh Bhat (anutosh491)

@sbc100
Collaborator

sbc100 commented Apr 8, 2025

I think you are most likely just running into a bug / limitation of --emit-relocs. This flag doesn't get much testing and adds a fair amount of complexity, so it's not totally surprising that there are still issues with it.

We should fix the assert.

But also note that you can skip the section combining if you use the extended const feature:

// When outputting PIC code each segment lives at at fixes offset from the
// `__memory_base` import. Unless we support the extended const expression we
// can't do addition inside the constant expression, so we much combine the
// segments into a single one that can live at `__memory_base`.
if (ctx.isPic && !ctx.arg.extendedConst && !ctx.arg.sharedMemory) {
  // In shared memory mode all data segments are passive and initialized
  // via __wasm_init_memory.
  log("-- combineOutputSegments");
  combineOutputSegments();
}

To enable a feature like that you just need to build at least one of your object files with that feature enabled, or link with --extra-features=..
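
The condition quoted above can be isolated as a pure function to see which link modes trigger the combining pass. This is a minimal sketch: `LinkMode` and `needsCombinedDataSegment` are hypothetical names that only mirror the three flags in the quoted `if`.

```cpp
#include <cassert>

// Hypothetical mirror of the three flags tested in the quoted condition.
struct LinkMode {
  bool isPic;
  bool extendedConst;
  bool sharedMemory;
};

// True when all data segments would be folded into a single segment.
bool needsCombinedDataSegment(const LinkMode &mode) {
  return mode.isPic && !mode.extendedConst && !mode.sharedMemory;
}

void checkModes() {
  // PIC side module without extended-const: segments are combined.
  assert(needsCombinedDataSegment(LinkMode{true, false, false}));
  // Same link with extended-const enabled: combining is skipped.
  assert(!needsCombinedDataSegment(LinkMode{true, true, false}));
  // Shared memory: segments are passive, so combining is skipped.
  assert(!needsCombinedDataSegment(LinkMode{true, false, true}));
  // Non-PIC links never combine.
  assert(!needsCombinedDataSegment(LinkMode{false, false, false}));
}
```

Under this reading, enabling extended-const is enough to keep `.rodata` chunks from different COMDATs out of a single combined segment for a PIC link.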

@anutosh491
Contributor Author

Hey Sam,

Thanks a lot for your reply.

I would like to take two steps back and explain how I ended up trying --emit-relocs on top of the flags we already have in wasm.cpp (I wasn't facing this issue while using clang-repl in the browser with LLVM 19, but I see it with the latest LLVM 20 releases).

So when running wasm-ld in verbose mode (this can probably be added to the flags above), I see the following in

i) llvm 19.1.7

1: wasm-ld: Allowed feature: mutable-globals
1: wasm-ld: Allowed feature: reference-types
1: wasm-ld: Allowed feature: multivalue
1: wasm-ld: Allowed feature: sign-ext

ii) llvm 20.1.0

wasm-ld: Allowed feature: mutable-globals
wasm-ld: Allowed feature: call-indirect-overlong
wasm-ld: Allowed feature: nontrapping-fptoint
wasm-ld: Allowed feature: reference-types
wasm-ld: Allowed feature: bulk-memory
wasm-ld: Allowed feature: multivalue
wasm-ld: Allowed feature: sign-ext
wasm-ld: Allowed feature: bulk-memory-opt

This happens in the populateTargetFeatures call. So I guess some features are now enabled by default.

This brings me to

if (ctx.arg.emitRelocs ||
    (ctx.arg.memoryImport.has_value() && !allowed.count("bulk-memory")))
  ctx.emitBssSegments = true;

I had emitBssSegments set to true with LLVM 19, but now that bulk-memory is probably enabled by default, I don't get that until I either

i) enable --emit-relocs, or
ii) override the default features with my own.

Hence I started exploring --emit-relocs (because I don't really want to enable init_memory, and I'd rather not have any passive segments).
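
The effect of the feature change on that condition can be checked with a small model. This is a sketch, not lld code: `shouldEmitBssSegments` is a hypothetical helper that restates the quoted condition, the two feature sets are copied from the verbose logs above, and memory is assumed to be imported (as it would be for a side module).

```cpp
#include <cassert>
#include <set>
#include <string>

// Restates the quoted condition as a pure function.
bool shouldEmitBssSegments(bool emitRelocs, bool memoryImported,
                           const std::set<std::string> &allowed) {
  return emitRelocs || (memoryImported && allowed.count("bulk-memory") == 0);
}

// Allowed features as logged for LLVM 19.1.7.
std::set<std::string> featuresLlvm19() {
  return {"mutable-globals", "reference-types", "multivalue", "sign-ext"};
}

// Allowed features as logged for LLVM 20.1.0.
std::set<std::string> featuresLlvm20() {
  return {"mutable-globals", "call-indirect-overlong", "nontrapping-fptoint",
          "reference-types", "bulk-memory",            "multivalue",
          "sign-ext",        "bulk-memory-opt"};
}

void checkVersions() {
  const bool memoryImported = true; // assumption: side module imports memory
  // LLVM 19: no bulk-memory, so .bss segments are emitted.
  assert(shouldEmitBssSegments(false, memoryImported, featuresLlvm19()));
  // LLVM 20: bulk-memory is allowed, so they are not emitted by default.
  assert(!shouldEmitBssSegments(false, memoryImported, featuresLlvm20()));
  // LLVM 20 with --emit-relocs: emitted again.
  assert(shouldEmitBssSegments(true, memoryImported, featuresLlvm20()));
  (void)memoryImported;
}
```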

So basically I want to avoid this

if (hasPassiveInitializedSegments()) {
  WasmSym::initMemory = symtab->addSyntheticFunction(
      "__wasm_init_memory", WASM_SYMBOL_VISIBILITY_HIDDEN,
      make<SyntheticFunction>(nullSignature, "__wasm_init_memory"));
  WasmSym::initMemory->markLive();
  if (ctx.arg.sharedMemory) {
    // This global is assigned during __wasm_init_memory in the shared memory
    // case.
    WasmSym::tlsBase->markLive();

So I thought of using emit-relocs here.

@anutosh491
Contributor Author

We should fix the assert.

I can try helping here. If I understand correctly, what we need is a patch to combineOutputSegments (probably similar to the check we have in writeBody) that avoids merging segments whose InputChunks have different COMDATs.

Let me know if that's what we're looking for and I'll try adding a quick patch which you could review.

@anutosh491
Contributor Author

anutosh491 commented Apr 8, 2025

But also note that you can skip the section combining if you use the extended const feature:

Oh, I hadn't explored this yet (it is a bit new to me).

What is usually recommended in such cases?
Should we prefer combining output segments into one overall data segment?
Or is it not really necessary, so that we can skip combining segments entirely?

I ask this from a perspective of running clang-repl in the browser.

As you know while doing this
i) We have a main module (possibly clang-repl.wasm)
ii) And every cell or code block generates a side module (incr_module_xx.wasm) loaded on top of the main module.

So technically in such cases I guess I just need to focus on
i) the correctness/preciseness of the wasm side modules
ii) probably the efficiency with which they can load

Since this is an iterative process where we keep generating wasm modules and loading them, I probably need some guidance on what the go-to option is here!

@sbc100
Collaborator

sbc100 commented Apr 8, 2025

Why do you care if emitBssSegments is true or not? Ideally you would want it to be false since it makes the binaries smaller.

@sbc100
Collaborator

sbc100 commented Apr 8, 2025

What is usually recommended in such cases?
Should we prefer combining output segments into one overall data segment?
Or is it not really necessary, so that we can skip combining segments entirely?

Ideally the linker would not combine the output segments. Output section combining is only needed in certain cases due to the lack of the extended const proposal. Hopefully once that feature becomes more widespread we can enable it by default, and output section combining will be disabled by default at that point.

@anutosh491
Contributor Author

anutosh491 commented Apr 8, 2025

Why do you care if emitBssSegments is true or not? Ideally you would want it to be false since it makes the binaries smaller.

Maybe I can try explaining the problem at hand.

Firstly, the use case of running clang-repl in the browser is currently being tested in downstream projects like CppInterop and xeus-cpp. Now that we are moving to LLVM 20, I see the following errors.

Let's consider two tests for starters (abstractions of tests in CppInterop that work with LLVM 19)

// Test 1
Cpp::CreateInterpreter(); // creates the interpreter
Cpp::Process("namespace N {} class C{}; int I;");   // nothing but clang-repl's ParseAndExecute
// Test 2
Cpp::CreateInterpreter(); // creates the interpreter
Cpp::Process("some code");

Now what should happen (running Test 1)

i) CreateInterpreter: creates the interpreter, which basically creates incr_module_0.wasm (an initialization step). This module doesn't have any segments, so the hasPassiveInitializedSegments branch shouldn't do anything.
ii) Process: should create incr_module_1.wasm, which has some segments, so hasPassiveInitializedSegments eventually returns true and the following is executed

if (hasPassiveInitializedSegments()) {
  WasmSym::initMemory = symtab->addSyntheticFunction(
      "__wasm_init_memory", WASM_SYMBOL_VISIBILITY_HIDDEN,
      make<SyntheticFunction>(nullSignature, "__wasm_init_memory"));
  WasmSym::initMemory->markLive();
  if (ctx.arg.sharedMemory) {
    // This global is assigned during __wasm_init_memory in the shared memory
    // case.
    WasmSym::tlsBase->markLive();

Nothing wrong until now! But as soon as we get to Test 2, the tests fail with

1: Aborted(Assertion failed: hasPassiveInitializedSegments(), at: /Users/anutosh491/work/llvm-project/lld/wasm/Writer.cpp,1299,createInitMemoryFunction)

Reason:

Going back to CreateInterpreter: this should again create incr_module_0.wasm, and technically there is no segment here, so hasPassiveInitializedSegments should return false, which it does. But things go south because, I don't know how, this check inside run passes

    if (WasmSym::initMemory) {
      createInitMemoryFunction();
    }

I don't know how initMemory is still alive (I would guess we are completely in a new state space/dimension where a previously active initMemory shouldn't affect us)

So test 2 fails inside createInitMemoryFunction

void Writer::createInitMemoryFunction() {
  LLVM_DEBUG(dbgs() << "createInitMemoryFunction\n");
  assert(WasmSym::initMemory);
  assert(hasPassiveInitializedSegments());

So each test runs perfectly fine on its own. It's just that I don't understand why initMemory is still alive. Technically it shouldn't be; only Process should make it alive, I'd say. Is it not being reset properly or something? Not sure!

This is the reason I have been trying to avoid hasPassiveInitializedSegments as a whole.

@anutosh491
Contributor Author

anutosh491 commented Apr 8, 2025

This can be seen here inside xeus-cpp-lite too

[Image: screenshot of the error in xeus-cpp-lite]

Ignore the first log line there, which I added for debugging.

Apart from that, it is the same error.

Explaining what is happening here (kind of the opposite order: we process first, then make the create-interpreter call):

  1. When we run a cell we are processing stuff
  2. So this basically processes the creation of an interpreter
  3. Hence overall the cell would give us an incr_module_1.wasm
  4. And inside that incr_module_1.wasm there would be a CreateInterpreter function call
  5. Which should go and probably create an incr_module_0.wasm (which is the initialization that our Interpreter does, nothing special)
  6. But here we are doing the linking step in wasm.cpp from scratch again, so initMemory shouldn't be enabled. Yet hasPassiveInitializedSegments is false while initMemory is enabled!
  7. What was expected is that incr_module_0.wasm gets created and loaded on top of the main module, after which incr_module_1.wasm's CreateInterpreter call completes and that module is loaded on top of the main module.

Not sure why this is happening. The same doesn't happen with our current link based on LLVM 19 (obviously because there we're using emitBssSegments, as bulk memory is not enabled by default).

@anutosh491
Contributor Author

anutosh491 commented Apr 8, 2025

So technically the above case (#134809 (comment)) is something like this

  1. There is a main module : main.wasm
  2. There is a X.wasm side module we want to load on top of the main module
  3. X.wasm has a function that builds another module, Y.wasm, internally (and if that succeeds, X.wasm is formed correctly and can then be loaded on top of the main module)
  4. Now the state used to form X.wasm and the state used to form Y.wasm are totally different, correct? Obviously one is responsible for the other, but in both cases a fresh link takes place and each set of args should be respected individually.

So here hasPassiveInitializedSegments ends up being false, but initMemory is still active from the previous link.
I hope I am giving enough perspective and explanation for the error I see.

Because hasPassiveInitializedSegments is what enables initMemory.

So if I just possibly add this

-  if (WasmSym::initMemory) {
+  if (WasmSym::initMemory && hasPassiveInitializedSegments()) {
      createInitMemoryFunction();
    }

The failures I am facing go away. For starters, I should probably check above whether hasPassiveInitializedSegments is false and turn off initMemory based on that.

So everything basically boils down to clearing the state of the previous execution and having a clean slate when the next link runs. Do you know why this might not be happening?

@anutosh491
Contributor Author

If I am thinking correctly probably for the above case (and for relevant cases), we might not be resetting WasmSym or clearing it to start with a fresh run ?

cc @sbc100

@anutosh491
Contributor Author

anutosh491 commented Apr 9, 2025

Okay, I realized that WasmSym acts as a shared global singleton that keeps its identity across different modules, so once a symbol is marked live it remains live and affects the next module.

I made a PR trying to change this behaviour #134970
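
The failure sequence can be modeled in a few lines. This is a sketch only: `GlobalSyms` and `linkWouldAssert` are hypothetical stand-ins for the static `WasmSym` state and for one linker invocation, and the model tracks nothing but the `initMemory` flag.

```cpp
#include <cassert>

// Hypothetical stand-in for process-wide, WasmSym-style static state.
struct GlobalSyms {
  static bool initMemory;
};
bool GlobalSyms::initMemory = false;

// Models one link. Returns true when the run would take the
// createInitMemoryFunction() path although it has no passive segments,
// which is the state in which the reported assertion fires.
bool linkWouldAssert(bool hasPassiveSegments) {
  if (hasPassiveSegments)
    GlobalSyms::initMemory = true; // set here, never cleared between runs
  return GlobalSyms::initMemory && !hasPassiveSegments;
}

void demoStaleState() {
  GlobalSyms::initMemory = false;            // fresh process
  const bool first = linkWouldAssert(false); // incr_module_0: no segments
  const bool second = linkWouldAssert(true); // incr_module_1: has segments
  const bool third = linkWouldAssert(false); // next incr_module_0
  // Only the third link fails, because it inherits the flag from the second.
  assert(!first && !second && third);
  (void)first;
  (void)second;
  (void)third;
}
```

Each link is fine in isolation; the third one fails only because state set by the second survives in the same process.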

@anutosh491
Contributor Author

anutosh491 commented Apr 10, 2025

I think you are most likely just running into a bug / limitation of --emit-relocs. This flag doesn't get much testing and adds a fair amount of complexity, so it's not totally surprising that there are still issues with it.

We should fix the assert.

Do you think that, at least for starters, we should add a check in combineOutputSegments like the one in writeBody, so that differing COMDATs are asserted on when the #ifndef NDEBUG block is compiled in?

#ifndef NDEBUG
  for (const InputChunk *isec : inputSegments)
    assert(isec->getComdatName() == comdat);
#endif

This at least starts introducing some consistency. Otherwise we knowingly allow a possibly faulty combine and then later raise an assert saying it was wrong (whereas it probably should not be allowed in the first place, should it?).

Can make a quick patch addressing this as per what you think !
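
As a rough illustration of what such a pre-merge check could look like, here is a standalone sketch. `Chunk`, `OutputSeg`, `distinctComdats`, and `combineChecked` are hypothetical names, not lld code, and whether rejecting the merge (rather than relaxing the later assert) is the right fix is left open in this thread.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Chunk {
  std::string name;
  std::string comdat; // empty means "not in any COMDAT group"
};

struct OutputSeg {
  std::string name;
  std::vector<Chunk> chunks;
};

// Collects every COMDAT name (including the empty one) seen across segments.
std::set<std::string> distinctComdats(const std::vector<OutputSeg> &segs) {
  std::set<std::string> names;
  for (const OutputSeg &seg : segs)
    for (const Chunk &chunk : seg.chunks)
      names.insert(chunk.comdat);
  return names;
}

// Folds all segments into one ".data" segment. In builds with assertions
// enabled, the mismatch is reported here, at merge time, instead of later.
OutputSeg combineChecked(const std::vector<OutputSeg> &segs) {
  assert(distinctComdats(segs).size() <= 1 &&
         "combining chunks from different COMDATs");
  OutputSeg combined;
  combined.name = ".data";
  for (const OutputSeg &seg : segs)
    combined.chunks.insert(combined.chunks.end(), seg.chunks.begin(),
                           seg.chunks.end());
  return combined;
}
```

With the three COMDATs from the original report, `distinctComdats` returns three names, so the check would fire during combining rather than in `writeBody()`.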

MaskRay pushed a commit that referenced this issue Apr 25, 2025
…134970)

Towards
#134809 (comment)

This change moves WasmSym from a static global struct to an instance
owned by Ctx, allowing it to be reset cleanly between linker runs. This
enables safe support for multiple invocations of wasm-ld within the same
process

Changes done 

- Converted WasmSym from a static struct to a regular struct with
instance members.

- Added a std::unique_ptr<WasmSym> wasmSym field inside Ctx.

- Reset wasmSym in Ctx::reset() to clear state between links.

- Replaced all WasmSym:: references with ctx.wasmSym->.

- Removed global symbol definitions from Symbols.cpp that are no longer
needed.

Clearing wasmSym in ctx.reset() ensures a clean slate for each link
invocation, preventing symbol leakage across runs—critical when using
wasm-ld/lld as a reentrant library where global state can cause subtle,
hard-to-debug errors.

---------

Co-authored-by: Vassil Vassilev <[email protected]>
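
The shape of that change can be sketched in isolation. This is not the actual patch: `Syms` and `LinkCtx` are hypothetical stand-ins for `WasmSym` and `Ctx`, only the `initMemory` flag is modeled, and calling `reset()` between links is an assumption about how an embedding process would use it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for WasmSym as a regular struct with instance members.
struct Syms {
  bool initMemory = false;
};

// Hypothetical stand-in for Ctx: it owns the symbols and can recreate them.
struct LinkCtx {
  std::unique_ptr<Syms> syms;
  LinkCtx() : syms(new Syms()) {}
  void reset() { syms.reset(new Syms()); } // clears state between links
};

// Models one link. Returns true when the run would reach the init-memory
// path without any passive segments.
bool linkWouldAssert(LinkCtx &ctx, bool hasPassiveSegments) {
  if (hasPassiveSegments)
    ctx.syms->initMemory = true;
  return ctx.syms->initMemory && !hasPassiveSegments;
}

void demoReset() {
  LinkCtx ctx;
  const bool first = linkWouldAssert(ctx, true); // module with segments
  ctx.reset();                                   // clean slate for next link
  const bool second = linkWouldAssert(ctx, false); // module without segments
  assert(!first && !second);
  (void)first;
  (void)second;
}
```

Because the flag lives in an object owned by the context, replacing that object is enough to drop everything a previous link marked live.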
jyli0116 pushed a commit to jyli0116/llvm-project that referenced this issue Apr 28, 2025
@anutosh491
Contributor Author

Hey @sbc100 @MaskRay

I think something like the above (#134809 (comment)) could be added to ensure consistency.
Let me know if it makes sense. Shall address it and close the issue !

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this issue May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this issue May 6, 2025