JS builtin naming guidelines

In the most recent CG meeting there was a discussion around a naming scheme for JS builtins.

Some previous discussion is [here](https://github.com/WebAssembly/js-text-encoding-builtins/issues/4) and [here](https://github.com/WebAssembly/esm-integration/issues/118).

Right now there is only one standardized builtin set, `wasm:js-string`, along with the following proposed additions:
```
// js-primitive-builtins
wasm:js-number
wasm:js-undefined
wasm:js-symbol
wasm:js-boolean

// custom-descriptors
wasm:js-prototypes

// js-text-encoder-builtins
wasm:text-encoding
wasm:text-decoding
```

There hasn't been a hard set of guidelines written for how to name builtins, and so it's been a bit free-form. Over the long term this will probably lead to a mess, so let's try to get consensus on one. The original JS-string-builtins proposal did have a section discussing when [to add builtins or not](https://github.com/WebAssembly/spec/blob/main/proposals/js-string-builtins/Overview.md#goals-for-builtins), but did not fully spell out the naming. That was my bad, and so I'll try to rectify that here.

We should have both of these guidelines written down and committed somewhere once we have consensus.

## Design questions

1. What scheme should we use?

Right now we have two schemes registered with IANA that we could use: `wasm` and `wasm-js`. We could use `wasm-js` for all JS builtins. However, I think it's likely there are other namespaces we may want to use in the future, and it'd be a big hurdle to have to register all of them with IANA. So instead, I'd propose we lean into URLs and namespace with paths:

e.g. we'd pick an overall format of:

`wasm:$namespace/$set` `$member`

There was a discussion around absolute/relative URIs in the meeting (i.e. should it be `wasm:/namespace/set`?). From others doing the URI/URL lawyering, it appears that `wasm:namespace/set` is unambiguously an absolute URI (in RFC 3986 terms) and also absolute in WHATWG URL terms. So I'd prefer to leave it without an initial slash.
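To make the proposed shape concrete, here is a minimal sketch of splitting a name into its parts. The `BuiltinSetName` interface and `parseBuiltinSetName` function are hypothetical names made up for illustration, not part of any spec or proposal. The sketch assumes the first path segment is the `$namespace` and everything after it is the `$set`, and it rejects the leading-slash form purely to mirror the preference stated above.

```typescript
// Hypothetical shape for a parsed builtin set name (illustration only).
interface BuiltinSetName {
  scheme: string;    // e.g. "wasm"
  namespace: string; // e.g. "js"
  set: string;       // e.g. "string"; may contain further "/" if nested
}

// Splits `scheme:namespace/set` into its parts, or returns null when the
// input does not fit the proposed format.
function parseBuiltinSetName(specifier: string): BuiltinSetName | null {
  const colon = specifier.indexOf(":");
  if (colon <= 0) return null; // no scheme present

  const scheme = specifier.slice(0, colon);
  const path = specifier.slice(colon + 1);

  // Assumption: mirror the preference for no initial slash.
  if (path.charAt(0) === "/") return null;

  const segments = path.split("/");
  // Need at least a namespace and a set, and no empty segments.
  if (segments.length < 2 || segments.some((s) => s.length === 0)) return null;

  return {
    scheme,
    namespace: segments[0],
    set: segments.slice(1).join("/"),
  };
}
```

Under these assumptions `parseBuiltinSetName("wasm:js/string")` yields the namespace `js` and the set `string`, while the already-shipped `wasm:js-string` has no path separator and does not fit the new format, which is why it would need to be kept as an alias.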

2. What namespace should we use?

A related but distinct question is what to use as the `$namespace`. Right now we're using `js`. This makes sense for all the things defined by the JS language, but gets a little odd for the `js-text-encoder-builtins`. Those are adapting features provided by the [WHATWG encoding standard](https://encoding.spec.whatwg.org/), not ECMA-262 (although as far as I'm aware, every production JS environment provides the API).

One option would be to use the standards body `whatwg` or maybe just `web`. However, there is precedent for features moving between standards bodies. Typed arrays were originally a Khronos WebGL feature before getting adopted by ECMA-262. The organization of standards bodies and the differences between web and JS are also not clear to end users.

I think the best option here is to just use `js` as a namespace because whether they're defined in a JS, WHATWG, or Khronos spec, they are defined in terms of JS language interfaces that are exposed on the host and adapted by the Wasm JS-API.

You could think of the namespace as being for an individual embedder of the core spec. In the future we could introduce other namespaces like `c-api` or `component` for other embedder specifications if they wanted to define builtins.

3. What should the name/organization of sets be?

The next question is what the `$set` name should be, and when we need a new name vs reusing one.

From the original js-string-builtins proposal there was a goal of:
```
In the cases the primitive already has a name, we should re-use it and not invent a new one
```
I think that's what we should follow here.

For cases where we are adapting a concrete JS interface that has a name on the global, I think we should just pick that name. So if we're adapting the `String` interface on the global we use `string`. For the text-encoding ones we would use `text-encoder` or `text-decoder`. Casing will be discussed later.

There are two benefits to this:
1. we don't need to bikeshed names
2. exposed interfaces have backwards-compat requirements, so it's unlikely they will be renamed

I think it's preferable to not create our own names or tie them to a wasm proposal's name. The names of wasm proposals are historical accidents and shouldn't be exposed to end users.

The weirdest case to apply this to is the `configureAll` builtin from custom-descriptors. I think we could possibly have it under a `js/object` interface because it really just applies properties to a bunch of JS objects. But more thought is needed.

4. Should we allow nesting of sets?

esm-integration chose `wasm:js/string/constants` as the name for the builtin set you can use to import any arbitrary JS string using the `importedStringConstants` feature.

This is a bit of a special case that I think we can allow, as the `importedStringConstants` feature is tied to the JS `String` interface, while being a special concept in the wasm JS-API.

The other case I could foresee is when JS namespace objects (like Math or WebAssembly) are involved. In those cases, we could also allow nesting. e.g. `wasm:js/webassembly/tag`

Other than that, I'd say we should disallow nesting.

5. How should we name members of a set?

For the actual members of a set, we should also try to re-use the exact naming of the equivalent JS API.

So for adapting String.prototype.concat we should just use `concat` and not 'fix' it and use `concatenate`.

This is tricky for cases where JS has operators (e.g. equals/compare) or where the builtin is defining a JS-Wasm data conversion (e.g. fromCharCodeArray). In those cases I'd say it's okay to choose your own name, but you should look for historical precedent in previous builtins.

6. What casing should we use?

Finally, what casing should we use for the `$namespace`, `$set`, and `$member` components?

- For the namespace, I'd lean towards lowercase/kebab-case as that matches how most URLs are styled.
- For the set, we have been doing lowercase/kebab-case (e.g. `text-encoding`). However, I realize now that if we're trying to match a specific JS builtin's name on the global, we probably should match the case too (just like we do with members). So it should probably be `TextEncoder`, not `text-encoder`.
- For the member, the same logic applies and we should continue doing camelCase like we are right now.

This is probably the least important/most debatable question.
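As a rough illustration of the casing leanings above, the following sketch checks the three components against simple patterns. The regular expressions and the `hasProposedCasing` helper are assumptions made up for this example; they only cover a single, un-nested set segment and ASCII names, and they are not a proposed normative grammar.

```typescript
// Hypothetical patterns approximating the casing leanings (illustration only).
const NAMESPACE_CASE = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/; // lowercase/kebab-case, e.g. "js", "c-api"
const SET_CASE = /^[A-Z][A-Za-z0-9]*$/;                 // matches the JS global's name, e.g. "TextEncoder"
const MEMBER_CASE = /^[a-z][A-Za-z0-9]*$/;              // camelCase, e.g. "fromCharCodeArray"

// Returns true when all three components follow the casing sketched above.
function hasProposedCasing(namespace: string, set: string, member: string): boolean {
  return (
    NAMESPACE_CASE.test(namespace) &&
    SET_CASE.test(set) &&
    MEMBER_CASE.test(member)
  );
}
```

With these patterns, `js` / `String` / `concat` passes, while `js` / `text-encoder` / `concat` fails on the set component, which is the change in direction described for sets.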

7. What name should we select a builtin with in the JS-API?

The CompileOptions dictionary takes a list of names of builtin sets to enable. I'd recommend that we take the `$namespace/$set` string (e.g. `js/string`) as the key.
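A small sketch of how that key could be derived from the import module name, assuming the key is simply the name with the `wasm:` scheme removed. The helpers `compileOptionsKey` and `builtinsOption` are hypothetical, and the `builtins` field name reflects my understanding of the existing JS-API option that accepts `js-string` today; treat both as assumptions rather than spec text.

```typescript
const SCHEME_PREFIX = "wasm:";

// Derives the compile-options key from an import module name by dropping
// the scheme, e.g. "wasm:js/string" -> "js/string". Returns null for names
// that are not under the `wasm:` scheme.
function compileOptionsKey(moduleName: string): string | null {
  if (moduleName.slice(0, SCHEME_PREFIX.length) !== SCHEME_PREFIX) return null;
  const key = moduleName.slice(SCHEME_PREFIX.length);
  return key.length > 0 ? key : null;
}

// Builds an options object listing the builtin sets to enable, skipping
// any module names that are not builtin sets.
function builtinsOption(moduleNames: string[]): { builtins: string[] } {
  const builtins: string[] = [];
  for (const name of moduleNames) {
    const key = compileOptionsKey(name);
    if (key !== null) builtins.push(key);
  }
  return { builtins };
}
```

One property of this choice is that the shipped name keeps working unchanged: `wasm:js-string` still maps to the `js-string` key, and `wasm:js/string` maps to `js/string`.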

## Conclusion

The only hard backwards-compat constraint right now is `wasm:js-string`, which has shipped in the JS-API. We would need to leave that as an alias. But going forward we could get them all aligned to the new scheme.

In summary we'd have:
```
// renamed js-string-builtins
wasm:js/String

// js-primitive-builtins
wasm:js/Number
wasm:js/Undefined
wasm:js/Symbol
wasm:js/Boolean

// custom-descriptors
wasm:js/Object

// js-text-encoder-builtins
wasm:js/TextEncoder
wasm:js/TextDecoder
```
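The renames implied by the two lists in this issue can be written down as a lookup table. This is only a sketch: the pairing of each old name with a new one is inferred from the per-proposal grouping (for instance, that `wasm:text-encoding` becomes `wasm:js/TextEncoder`), and `PROPOSED_RENAMES` and `canonicalSetName` are hypothetical names for illustration.

```typescript
// Old name -> proposed name, inferred from the per-proposal grouping above.
const PROPOSED_RENAMES: Record<string, string> = {
  "wasm:js-string": "wasm:js/String", // shipped; would remain as an alias
  "wasm:js-number": "wasm:js/Number",
  "wasm:js-undefined": "wasm:js/Undefined",
  "wasm:js-symbol": "wasm:js/Symbol",
  "wasm:js-boolean": "wasm:js/Boolean",
  "wasm:js-prototypes": "wasm:js/Object",
  "wasm:text-encoding": "wasm:js/TextEncoder",
  "wasm:text-decoding": "wasm:js/TextDecoder",
};

// Resolves an old name to its proposed replacement; names that are not in
// the table (including names already in the new scheme) pass through.
function canonicalSetName(name: string): string {
  const renamed = Object.prototype.hasOwnProperty.call(PROPOSED_RENAMES, name)
    ? PROPOSED_RENAMES[name]
    : undefined;
  return renamed === undefined ? name : renamed;
}
```

An engine or tool could use a table like this to accept the shipped `wasm:js-string` name while treating the new spelling as the canonical one.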

cc @sjrd @tlively @guybedford @daxpedda as champions of proposals that define builtins.

Begin bikeshed!
