Add WebAssembly Linker Backend (with WasmGC and Wasm ExceptionHandling)

This write-up is still a work in progress. It explores adding WebAssembly (WasmGC) support based on Scala.js.

## Overview
WebAssembly support in Scala.js was discussed in the presentation ["Scala.js and WebAssembly, a tale of the dangers of the sea"](https://www.youtube.com/watch?v=QsOHofFJpig) by Sébastien Doeraene.
The presentation pointed out that, in 2019, WebAssembly lacked several features needed to support Scala well. In particular, WasmGC was still at a very early stage (around phase 0 or 1) at that time.

In late 2023, the [WasmGC extension](https://github.com/WebAssembly/gc/blob/main/proposals/gc/Overview.md) became the default in Chrome (V8)[^chrome] and Firefox[^firefox].

The [Exception Handling proposal](https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md) is now available in many WebAssembly (Wasm) engines, including those embedded in JavaScript engines[^features]. Given this development, 2024 is an opportune moment to reconsider WebAssembly support for Scala. Notably, various garbage-collected languages such as [OCaml](https://github.com/ocaml-wasm/wasm_of_ocaml), [Kotlin](https://kotlinlang.org/docs/wasm-overview.html), [Java](https://github.com/google/j2cl/blob/master/docs/getting-started-j2wasm.md), and [Dart](https://github.com/dart-lang/sdk/blob/main/pkg/dart2wasm/README.md) already support WebAssembly by using WasmGC.


[^chrome]: [WebAssembly Garbage Collection (WasmGC) now enabled by default in Chrome  |  Blog  |  Chrome for Developers](https://developer.chrome.com/blog/wasmgc)
[^firefox]: [Firefox 120.0, See All New Features, Updates and Fixes](https://www.mozilla.org/en-US/firefox/120.0/releasenotes/).
[^features]: [Feature Extensions - WebAssembly](https://webassembly.org/features/)

This proposal suggests adding a new linker backend that compiles linked SJSIR modules into WebAssembly using:

- [WasmGC](https://github.com/WebAssembly/gc/blob/main/proposals/gc/Overview.md) (which depends on)
  - [Reference Types](https://github.com/WebAssembly/gc/blob/main/proposals/reference-types/Overview.md)
  - [Typed Function References](https://github.com/WebAssembly/gc/blob/main/proposals/function-references/Overview.md)
- [Exception handling](https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md)
- Probably [Threads and atomics](https://github.com/WebAssembly/threads) for concurrency.

## Why Wasm?
Wasm was initially designed to run code in web browsers at near-native speed. However, its use cases extend far beyond the browser, owing to its robust security features and portability. The introduction of [WASI](https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/) expands its range of use cases even further.

- Faster code execution in browser
- Plugins
- Cloud
- Edge
- IoT
- Interop with other languages

<details>

- Faster code execution in browser
  - Is Wasm faster than JavaScript? According to the [Kotlin/Wasm benchmark](https://github.com/Kotlin/kotlin-wasm-benchmarks), Kotlin/Wasm is faster than Kotlin/JS in many cases, although the gain is modest: only about **20-30%**.
  - It's worth noting that, for faster execution, it's important to rely on WasmGC and Wasm Exception Handling (EH). If we were to ship our own GC and EH code in the Wasm module, performance could decrease[^gc-and-eh].
- Plugins
  - Wasm is a good fit for plugin systems because **plugins (modules) can be loaded dynamically**, **modules are sandboxed for security**, and **plugins can be written in multiple languages**.
  - Use cases
    - [Istioldie 1.13 / Redefining extensibility in proxies - introducing WebAssembly to Envoy and Istio](https://istio.io/v1.13/blog/2020/wasm-announce/)
    - [Shopify Functions overview](https://shopify.dev/docs/apps/functions)
- Cloud
  - Wasm is also a good fit for microservices because of its fast code loading (modules can be stream-compiled) and its secure-by-default design.
  - [Fermyon Cloud](https://www.fermyon.com/) for wasm on cloud, and [Krustlet](https://krustlet.dev/) for wasm on k8s.
- Edge
  - A number of edge function platforms have started supporting Wasm: [Fastly Edge Cloud Platform](https://www.fastly.com/products), [Wasmer edge](https://wasmer.io/products/edge).
  - [WasmEdge](https://github.com/WasmEdge/WasmEdge) is a Wasm runtime dedicated to edge computing.
- IoT
  - [WAMR](https://github.com/bytecodealliance/wasm-micro-runtime) is a Wasm runtime for embedded, IoT, and edge use.
  - [Wasmachine: Bring IoT up to Speed with A WebAssembly OS | IEEE Conference Publication | IEEE Xplore](https://ieeexplore.ieee.org/document/9156135)
- Interop with other languages
  - The [WebAssembly Component Model](https://github.com/WebAssembly/component-model) proposal will make it possible to link Wasm modules written in different programming languages.


For more details: [Exploring WebAssembly outside the browser - Atamel.Dev](https://atamel.dev/posts/2023/06-20_explore_wasm_outside_browser/)

</details>

[^gc-and-eh]: TeaVM suffers from this problem (see [No, WASM is slow. As a developer of TeaVM I can claim this. First of all, JS eng... | Hacker News](https://news.ycombinator.com/item?id=19592837)), and its author expects that WasmGC and EH will improve the situation (see [Chrome is now released with Wasm GC enabled by default - TeaVM](https://groups.google.com/g/teavm/c/VkKMHCJB6Jw/m/ptoTExQaBwAJ)).


## Other ways to compile from Scala to Wasm?

Why do we propose compiling WebAssembly from Scala.js (SJSIR) when there are various methods to compile Scala to WebAssembly?

- Compile JVM to Wasm
  - [CheerpJ](https://cheerpj.com/), a browser-based JVM, compiles JVM to Wasm. While this is great for modernizing legacy JVM applications, the large size of modern JVMs might not be ideal for faster code execution and writing small executables like plugins or extensions. Also, JS-interop would not be easy.
- Compile Java bytecode to Wasm
  - [TeaVM](https://teavm.org/) AOT compiles Java bytecode to JavaScript and Wasm.
  - While the current implementation ships with its own full-blown GC and exception handling, which slows down execution[^gc-and-eh], it's definitely a promising project.
  - Other differences between Scala.js vs TeaVM would be the [same as 9 years ago](https://groups.google.com/g/scala-js/c/3jbX9ajIbHM)?
  - [sbt plugin](https://github.com/sbt-teavm/sbt-teavm) is also available.
- Compile JS engine to Wasm
  - [Javy](https://github.com/bytecodealliance/javy) compiles JS to a Wasm module by running the given JS code on a [QuickJS](https://bellard.org/quickjs/) engine embedded in the module.
  - At the moment, this is a good workaround for building a WASI module from Scala that supports high-level constructs such as GC, EH, and async.
  - Performance is slower due to the lack of JIT compilation, and the module size still tends to be larger.
- Compile LLVM IR to Wasm
  - [This has been explored in the Scala Native project](https://github.com/scala-native/scala-native/issues/603), which compiles Scala to LLVM IR (and then to a native binary); Emscripten or WASI-SDK can then compile the LLVM IR to WebAssembly.
  - However, it turned out that there is no way to express the WasmGC primitives in LLVM IR. This means we would need to ship a full-blown GC in the Wasm module, which would decrease performance (as TeaVM experiences) and make it tricky to interoperate with JS objects in a GC context.
  - > Can WasmGC adopt a similar toolchain model as WasmMVP, and in particular use LLVM? Unfortunately, no, since LLVM does not support WasmGC (some amount of support [has been explored](https://github.com/Igalia/ref-cpp), but it is hard to see how full support could even work). Also, many GC languages do not use LLVM–there is a wide variety of compiler toolchains in that space. And so we need something else for WasmGC. (from https://v8.dev/blog/wasm-gc-porting)
- Compile NIR to Wasm
  - After considering the above candidates, I had two choices in mind: compiling NIR (the Scala Native intermediate representation) to Wasm, or compiling SJSIR (the Scala.js intermediate representation) to Wasm.
  - To be honest, I haven't fully explored the NIR-to-Wasm option, but compiling SJSIR to Wasm seems easier for a few reasons.
    - Easier JS interop: although Wasm is moving beyond browser embeddings, easy JS interop is still a primary use case. Scala Native has good interop with native code at the LLVM layer, whereas the Scala.js API is better designed for JS interop.
    - NIR might be too low-level for compiling to WasmGC (not sure): NIR is basically a high-level LLVM IR, while WasmGC is a fairly high-level, Java-bytecode-like language. Although I haven't explored this enough, SJSIR seems high-level enough to compile to WasmGC.
    - Following the choice of other GC languages: J2CL, Kotlin/Wasm, and wasm_of_ocaml are all JS compilers customized to emit WasmGC.



## How?

Add a new implementation of [org.scalajs.linker.standard.LinkerBackend](https://github.com/scala-js/scala-js/blob/188b945d05d80bb2665b1ad52161535fc5147979/linker/shared/src/main/scala/org/scalajs/linker/standard/LinkerBackend.scala) that compiles to WasmGC.

This design is based on observing how Kotlin/Wasm and J2CL compile high-level constructs to WebAssembly. Note that the design might change during implementation.

A few notes on WasmGC and Kotlin/Wasm:

- [Exploring WAT Files Generated from Kotlin/Wasm | Rikito Taniguchi](https://tanishiking.github.io/posts/kotlin-wasm-deep-dive/#enum-and-pattern-match)
- [New types and instructions to be introduced in WasmGC | Rikito Taniguchi](https://tanishiking.github.io/posts/wasm-gc/)

### Class definition

```scala
class Base(p: Int):
  def foo(): Int = 1
```

The class definition will be represented as a `struct` type in WasmGC.

```wasm
(type $Base_t (sub $java.lang.Object (struct
    (field (ref $Base.vtable_t)) ;; vtable
    (field (ref null struct)) ;; itable
    (field (mut i32)) ;; typeInfo
    (field (mut i32)) ;; hashCode
    (field (mut i32)) ;; p
)))
```

- As in [Kotlin/Wasm](https://seb.deleuze.fr/introducing-kotlin-wasm/) and J2CL, each class instance holds a `vtable` and an `itable`.
- The `itable` is explained in more detail in the `Interface call` section.
- Following Kotlin/Wasm, instances also have `typeInfo` and `hashCode` fields, although they might turn out to be unnecessary. Let's see.
- Finally, there is one struct field for each field of the class.

The vtable contains the function references to the methods.

```wasm
;; Type definitions of vtable and methods
(type $Base.vtable_t (sub $java.lang.Object.vtable_t (struct
    (field (ref null $Base_foo_t)))))
(type $Base_foo_t (func (param (ref null $Base_t)) (result i32)))

;; the vtables will be defined as global struct.
(global $Base.vtable_g (ref $Base.vtable_t)
    ref.func $Base.foo_fun
    struct.new $Base.vtable_t)
(func $Base.foo_fun (type $Base_foo_t)
    (param $this (ref null $Base_t)) (result i32)
    i32.const 1
    return)
```

The constructor will set up the vtable and itable, and initialize the fields.

```wasm
global.get $Base.vtable_g ;; vtable
ref.null struct ;; itable (it's gonna be null reference because it doesn't implement any interfaces)
i32.const 0 ;; typeinfo (how to calculate it? TODO)
i32.const 0 ;; hashCode (will be calculated and cached when we call hashCode)
        
local.get $p
struct.new $Base_t
```

---

### Virtual call

```scala
class Base(p: Int):
  def foo(): Int = 1

class Derived(p: Int) extends Base(p):
  override def foo(): Int = 2

object Test:
  def box(): Unit =
    val d = new Derived(1)
    bar(d)
  def bar(f: Base): Int = f.foo()
```

The definition of `Derived` will be like

```wasm
;; Same as Base_t except that the super class is Base_t and the vtable is Derived.vtable
(type $Derived_t (sub $Base_t (struct
    (field (ref $Derived.vtable_t)) (field (ref null struct)) (field (mut i32)) (field (mut i32))
    (field (mut i32))))) ;; p, inherited from Base_t
(type $Derived.vtable_t (sub $Base.vtable_t (struct
    (field (ref null $Base_foo_t)))))
(type $Base_foo_t (func (param (ref null $Base_t)) (result i32)))
```
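WasmGC only accepts `$Derived_t` as a subtype of `$Base_t` if its fields start with the fields of `$Base_t`, so inherited fields such as `p` keep their index. A minimal Scala sketch of how a backend might compute such a layout follows; `ClassDef`, `headerFields`, `instanceFields`, and `layout` are hypothetical names used for illustration only, not part of Scala.js.

```scala
// Hypothetical model of a linked class: only what is needed to compute a layout.
final case class ClassDef(name: String, parent: Option[ClassDef], ownFields: List[String])

// Header slots shared by every object, in the order used in this proposal.
val headerFields: List[String] = List("vtable", "itable", "typeInfo", "hashCode")

// Inherited fields come first, so a subclass layout extends its parent's layout.
def instanceFields(c: ClassDef): List[String] =
  c.parent.map(instanceFields).getOrElse(Nil) ++ c.ownFields

def layout(c: ClassDef): List[String] = headerFields ++ instanceFields(c)

val base = ClassDef("Base", None, List("p"))
val derived = ClassDef("Derived", Some(base), Nil)
```

Here `layout(derived)` starts with `layout(base)`, so an access such as `struct.get $Base_t 4` would read `p` from either kind of instance.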

The `bar` method (that contains virtual call to `foo`) will be

```wasm
(type $bar_t (func (param (ref null $Base_t)) (result i32)))
(func $bar_fun (type $bar_t)
    (param $0_f (ref null $Base_t)) (result i32)
    ;; push the receiver (of type `Base`) twice:
    ;; one is for getting the function reference from the vtable,
    ;; the other is the receiver argument for the foo method
    local.get $0_f  ;; type: Base
    local.get $0_f  ;; type: Base
    struct.get $Base_t 0  ;; push vtable of Base to the stack
    struct.get $Base.vtable_t 0 ;; push function reference to foo
    call_ref (type $Base_foo_t) ;; call the function reference using `call_ref`
    return)
```
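The dispatch sequence in `bar` can be modeled in plain Scala to check the idea: each object carries a reference to its class's vtable, and an override simply puts a different function in the same slot. This is only a sketch; `VTable`, `Obj`, `newBase`, and `newDerived` are hypothetical names, and the header fields other than the vtable are omitted.

```scala
// Hypothetical model: a vtable is an array of function references,
// and slot indices are fixed at compile time (slot 0 holds `foo`).
final class VTable(val slots: Array[Obj => Int])
final class Obj(val vtable: VTable, val p: Int)

val baseVTable = VTable(Array[Obj => Int](_ => 1))    // Base.foo
val derivedVTable = VTable(Array[Obj => Int](_ => 2)) // Derived.foo overrides slot 0

def newBase(p: Int): Obj = Obj(baseVTable, p)
def newDerived(p: Int): Obj = Obj(derivedVTable, p)

// Mirrors `bar`: load the vtable, load slot 0, then call the
// function reference with the receiver as its argument (`call_ref`).
def bar(f: Obj): Int = {
  val fn = f.vtable.slots(0)
  fn(f)
}
```

With this model, `bar(newDerived(1))` runs the `Derived` implementation even though `bar` only knows the static type, which is the behavior the `call_ref` sequence is meant to provide.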

Why don't we use `call_indirect`, as [Rust does](https://fitzgeraldnick.com/2018/04/26/how-does-dynamic-dispatch-work-in-wasm.html)?

- WebAssembly's `table` is basically one big virtual table per module (in Wasm 1.0); it is an untyped alternative to typed function references.
- Even if we registered functions in a `table`, each class would still need a pointer (table index) to its methods. So there seems to be little point in using `call_indirect` with WasmGC.

---

### Interface call

```scala
trait Animal:
  def sound(): Unit

class Cat extends Animal:
  def sound(): Unit = {}

def baz(animal: Animal) = animal.sound()
```

The `Cat` class will have an itable:

```wasm
;; Cat's itable has a pointer to `Animal`'s itable
(type $Cat.classITable_t (struct
    (field (ref null $Animal.itable_t))))
(type $Animal.itable_t (struct (field (ref null $Animal_sound_t))))

(global $Cat.classITable_g (ref $Cat.classITable_t)
    ref.func $Cat_sound_fun ;; function ref to `Cat.sound` implementation
    struct.new $Animal.itable_t
    struct.new $Cat.classITable_t)
```

The interface call site (the `baz` method) will look like this:

```wasm
(func $baz_fun (type $baz_fun_t)
    ;; the static interface will be typed as `java.lang.Object`
    (param $0_b (ref null $java.lang.Object))
    ;; same as virtual call, one for get itable, and one for receiver
    local.get $0_b  ;; type: Animal
    local.get $0_b  ;; type: Animal
    
    struct.get $java.lang.Object 1 ;; get itable
    ref.cast $Cat.classITable_t ;; need to cast because the static type of the receiver is Object
    struct.get $Cat.classITable_t 0 ;; get the Animal.itable
    struct.get $Animal.itable_t 0 ;; get the function reference to `Cat.sound_fun`
    call_ref (type $Animal_sound_t)
    ;; ...
    return)
```

The method to call is resolved from its signature at compile time, so at run time we simply access the itables by index.
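The two-level lookup can be sketched in Scala as follows. All names (`AnimalITable`, `CatClassITable`, `Obj`, `newCat`) are hypothetical, and `sound` returns a `String` here only so that the result is observable; in the example above it returns `Unit`.

```scala
// Hypothetical model of the itable layout described above.
final class AnimalITable(val sound: Obj => String)   // one slot per Animal method
final class CatClassITable(val animal: AnimalITable) // one slot per implemented interface
final class Obj(val itable: AnyRef)                  // field 1 of the object, untyped

val catClassITable = CatClassITable(AnimalITable(_ => "meow"))
def newCat(): Obj = Obj(catClassITable)

// Mirrors `baz`: the receiver is statically just an object, so the itable
// must be cast before the interface slot and the method slot are read.
def baz(animal: Obj): String = {
  val classITable = animal.itable.asInstanceOf[CatClassITable] // ref.cast
  classITable.animal.sound(animal)                             // struct.get x2 + call_ref
}
```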

---

### Concurrency

I haven't delved into this area much yet, but the [WebAssembly native threads](https://github.com/WebAssembly/threads) feature is already [at phase 4](https://github.com/WebAssembly/proposals?tab=readme-ov-file#phase-4---standardize-the-feature-wg), and it is available in most popular runtimes, including wasmtime[^features], thanks to [wasi-threads](https://bytecodealliance.org/articles/wasi-threads).

<img width="750" alt="Screenshot 2024-01-10 at 21 03 51" src="https://github.com/scala-js/scala-js/assets/9353584/b15fdc8e-fb72-4bc1-be25-133fc8f2bd6f">

Image from [WebAssembly Threads - HTTP 203 - YouTube](https://www.youtube.com/watch?v=x9RP-M6q2Mg)

The plan is to focus on single-threaded execution first and to support multi-threading later on.

---

### Exception handling

This relies on [native Wasm exception handling](https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md).

It follows the Kotlin/Wasm lowering strategy: https://github.com/JetBrains/kotlin/blob/4786c945d933c82c9560a9923f33effc59a80093/compiler/ir/backend.wasm/src/org/jetbrains/kotlin/backend/wasm/lower/TryCatchCanonicalization.kt#L24-L67

For `try`/`catch`:
```kotlin
// From this:
//    try {
//        ...exprs
//    } catch (e: Foo) {
//        ...exprs
//    } catch (e: Bar) {
//        ...exprs
//    }
// We get this:
//    try {
//        ...exprs
//    } catch (e: Throwable) {
//        when (e) {
//            is Foo -> ...exprs
//            is Bar -> ...exprs
//        }
//    }
// https://github.com/JetBrains/kotlin/blob/4786c945d933c82c9560a9923f33effc59a80093/compiler/ir/backend.wasm/src/org/jetbrains/kotlin/backend/wasm/lower/TryCatchCanonicalization.kt#L24-L67
```

We'll have only one `exception tag` in the module, whose type is `java.lang.Throwable`. Every `throw` is compiled to `throw 0` with an operand of type (or a subtype of) `java.lang.Throwable`.

```wasm
(tag $tag (param (ref null $java.lang.Throwable))) ;; whose tag idx is 0
```

Likewise, a catch clause in Wasm always catches `java.lang.Throwable` with `catch 0` and then checks the exception's type. If none of the catch clauses (in Scala) matches the exception, it is rethrown.

#### try/catch

For example, given `ExceptionA extends Exception` and `ExceptionB extends Exception`:

```scala
try {
  throw Exception()
} catch {
  case e: ExceptionA => ()
  case e: ExceptionB => ()
}
```

This will be compiled to

```wasm
(local $0_merged_catch_param (ref null $java.lang.Throwable))
try
    call $java.lang.Exception.<init> ;; construct exception and push to the stack
    throw 0 ;; throw an exception
catch 0
    local.tee $0_merged_catch_param ;; thrown exception
    ref.test $ExceptionA_t ;; test if the thrown exception is a subtype of ExceptionA
    if ;; catch(e: ExceptionA) { ... }
        ;; ...
    else
        local.get $0_merged_catch_param
        ref.test $ExceptionB_t ;; test if the thrown exception is a subtype of ExceptionB
        if ;; catch (e: ExceptionB) { ... }
            ;; ...
        else ;; if none of the catch clauses matches the exception
            local.get $0_merged_catch_param
            throw 0 ;; rethrow it
        end
    end
end
```
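Expressed back in Scala, the canonicalized shape corresponds roughly to the sketch below: a single handler for `Throwable`, a chain of type tests, and a rethrow when nothing matches. The function name `lowered` and the string results are hypothetical and exist only to make the control flow observable.

```scala
class ExceptionA extends Exception
class ExceptionB extends Exception

// Lowered form: catch everything once (`catch 0`), then test the type
// of the caught value (`ref.test`), and rethrow if no clause matches.
def lowered(body: () => String): String =
  try body()
  catch {
    case e: Throwable =>
      if (e.isInstanceOf[ExceptionA]) "handled A"
      else if (e.isInstanceOf[ExceptionB]) "handled B"
      else throw e // none of the original catch clauses matched
  }
```

An exception that is neither an `ExceptionA` nor an `ExceptionB` leaves `lowered` unchanged, which matches the final `throw 0` in the Wasm code.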

#### finally

```kotlin
// With finally we transform this:
//    try {
//        ...exprs
//    } catch (e: Throwable) {
//        ...exprs
//    } finally {
//        ...<finally exprs>
//    }
// Into something like this (tmp variable is used only if we return some result):
//    val tmp = block { // this is where we return if we return from original try/catch with the result
//      try {
//        try {
//            return@block ...exprs
//        } catch (e: Throwable) {
//            return@block ...exprs
//        }
//     }
//     catch (e: Throwable) {
//       ...<finally exprs>
//       throw e // rethrow exception if it happened inside of the catch statement
//     }
//   }
//   ...<finally exprs>
//   tmp // result
// https://github.com/JetBrains/kotlin/blob/4786c945d933c82c9560a9923f33effc59a80093/compiler/ir/backend.wasm/src/org/jetbrains/kotlin/backend/wasm/lower/TryCatchCanonicalization.kt#L24-L67
```

- The inner `try/catch` is the normal try/catch handling described above.
- The outer `try/catch` handles exceptions that are not caught by any catch clause (or that are thrown inside one): it runs the finally clause and rethrows the exception.

```scala
try {
    throw Exception()
} catch {
    case e: Exception => ()
} finally { println("hello") }
```

```wasm
block (result (ref null $Unit))
    try
        try
            ;; construct exception
            call $Exception.<init>
            throw 0
        catch 0
            local.tee $0_merged_catch_param
            ref.test $Exception
            if
                ;; catch(e: Exception) { ... }
            else
                local.get $0_merged_catch_param
                throw 0
            end
            br 2 ;; jump to outside of the block
        end
        unreachable
    catch 0
        ;; println("hello")
        ;; push the caught exception to stack
        throw 0
    end
    unreachable
end
drop
drop

;; println("hello")
```
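The same transformation can be written as a small Scala helper to check that the finally code runs exactly once on every path. This is a sketch under assumptions: `loweredFinally` and its parameters are hypothetical names, and the original catch clauses are collapsed into a single `handler` function.

```scala
// Hypothetical helper that mirrors the lowered shape of try/catch/finally.
def loweredFinally[A](body: () => A, handler: Throwable => A, fin: () => Unit): A = {
  val tmp =
    try {
      // inner try/catch: the normal handling described above
      try body()
      catch { case e: Throwable => handler(e) }
    } catch {
      // outer catch: reached only if the handler itself threw (or rethrew)
      case e: Throwable =>
        fin()
        throw e
    }
  fin() // normal path: finally code after the block
  tmp
}

val log = scala.collection.mutable.ListBuffer.empty[String]
val result = loweredFinally[Int](
  () => throw new Exception("boom"),
  _ => 42,
  () => { log += "hello"; () }
)
```

Duplicating the finally code on both paths is the cost of this lowering; in exchange, no dedicated `finally` construct is needed in the generated Wasm.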



---

### JS interop

TBD

## Q&A
- Is WASI support in scope?
  - Yes, but I haven't yet explored how to support WASI. Let's focus on Wasm for JS embeddings first, and then WASI.

---

Any advice or questions are welcome, especially from anyone who knows more about Scala.js / SJSIR internals.

## Related

- https://github.com/scala-js/scala-js/issues/1747
