Move ESM loaders off-thread #44710
Does each worker thread get its own loader thread? E.g. if an app launches two worker threads, will there be six threads total: one main, two workers, and three loaders? |
|
The current design puts all loaders in the same worker thread (e.g. there are a total of 2 threads: the main and the worker). Internal things are handled via the main thread; userland things are handled by the worker thread. |
|
I think that's slightly different than what I'm asking about. If a node app uses the worker threads API to launch worker threads, does each get its own loader thread? (where each loader thread may have one or more loaders running within it) Not does each loader get a thread, but does each user thread get a loader thread. |
I think the current state (before this PR) is that loaders always execute in the main thread, even if they’re “for” user code that is in a worker thread. So I would think that after this PR, loaders would run inside their own worker thread, regardless of whether they’re customizing main or worker thread user code. In other words, for an app using custom loaders whose user code spawns three workers, there are five threads total: main, loaders, and the three workers spawned by the user code. This is just my guess; others, please correct me if I’m mistaken. |
|
I don't know how it's implemented but I would expect that each Node.js thread (main and workers) has its own separate loaders thread, for at least two reasons:
- Worker threads are supposed to be as isolated as possible from the main thread and from each other.
- You can spawn a worker thread with a different set of --loader flags.
|
First pass this seems fine, but we should likely investigate a thread per loader URL or some other key. Having it per thread they instrument would be more costly if you spawn and tear down threads.
(Sent by email in reply to Michaël Zasso's comment above, Tue, Sep 27, 2022, 1:24 PM.)
|
|
What remains to be done? |
|
Something in this implementation is causing node to hang on startup. We suspect it's some kind of circular dependency, where the subsequent dependency smacks into the Atomics lock (preventing the rest of the flow from completing, which would release the initial lock). I think this circular dep is between ESMLoader and Worker (it's rather difficult to troubleshoot as output is swallowed). There is a working PoC, so we know this works in principle (and that the Atomics otherwise behave appropriately, so the problem lies with integrating it into node). This was on hold whilst I was on holiday. I'm working on it today and tomorrow (but will be away on a business trip next week). |
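To make the failure mode above concrete, here is a minimal sketch of the general pattern: one thread blocks on Atomics.wait until a worker releases a lock. This is not the code from this PR. The worker body, the memory layout, and the 5 second timeout are all assumptions chosen for illustration.

```javascript
// Sketch only: block one thread on Atomics.wait until a worker finishes async work.
const { Worker } = require('node:worker_threads');

// Shared memory (hypothetical layout): index 0 is the lock (0 = pending, 1 = done),
// index 1 holds the result.
const shared = new Int32Array(new SharedArrayBuffer(8));

// Hypothetical stand-in for a hooks thread; it is NOT the implementation in this PR.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const shared = new Int32Array(workerData);
  // Simulate an async hook that settles a little later.
  setTimeout(() => {
    Atomics.store(shared, 1, 42); // publish the result
    Atomics.store(shared, 0, 1);  // release the lock
    Atomics.notify(shared, 0);    // wake the waiting thread
  }, 10);
`;

const worker = new Worker(workerSource, { eval: true, workerData: shared.buffer });

// Blocks synchronously while the lock is still 0. If the worker can never reach
// Atomics.notify (for example because it needs something from this blocked thread),
// the wait only ends via the timeout, which is the kind of hang described above.
const status = Atomics.wait(shared, 0, 0, 5000);

const result = Atomics.load(shared, 1);
console.log(status, result);
worker.terminate();
```

The timeout is there only so the sketch cannot hang forever; a wait with no timeout would block indefinitely if the lock were never released.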
|
I found the circular dep:
|
|
Gah, it wasn't a circular dependency 🤦♂️ it was setting [embedded code: Lines 141 to 149 in 9836c67], which caused the worker to spawn with nothing in it. (I shouldn't have copied it over from the previous attempt at off-threading, which was later hard-coding the […] I'm not sure if the internal worker should go through the […]
There is (now was) a circular dependency issue after the empty worker issue is addressed, in node/lib/internal/main/worker_thread.js (Line 142 in 9836c67); that's now fixed too.
|
|
What is the recommended way to pass data from the main thread to a loader? Thanks in advance for any reply |
|
I see it is documented here (sorry for bothering) https://nodejs.org/api/esm.html#globalpreload I'll play around with this |
|
We cannot easily access the app path now that ESM loaders are off-threaded in v20. See nodejs/help#4190 |
|
Looks like loaders no longer have access to process.argv |
Correct; that is how worker threads work. There is a dedicated issue for this as well as a proposal to address (which is already linked). I'm locking this as most of the recent comments are telling us the sky is blue. |
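As a generic worker_threads illustration of the point above (not the loader API itself), values the other thread needs can be handed over explicitly instead of being read from process.argv there. The worker body and the mainArgv field name below are hypothetical.

```javascript
// Sketch only: pass the parent's argv to a worker explicitly via workerData.
const { Worker } = require('node:worker_threads');

// Hypothetical worker body standing in for code that runs off the main thread.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Use the values the parent chose to share rather than relying on process.argv here.
  parentPort.postMessage({
    entry: workerData.mainArgv[1] ?? null,
    count: workerData.mainArgv.length,
  });
`;

// Spawns the worker with a copy of the given argv and resolves with its reply.
function askWorker(mainArgv) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { mainArgv } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const reply = askWorker(process.argv);
reply.then((info) => console.log(info));
```

workerData is copied with the structured clone algorithm, so the worker sees a snapshot of the array taken when it was spawned.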
|
It looks like this should be marked as a |
|
Marking it as semver-major would be wrong I think, because it's an experimental API, but it could be labelled as |
Resolves #43658
To-dos:
- Module serialisation issue (Proxies aren’t serialisable)
- import.meta.resolve() to synchronous
Notable changes:
Custom ESM loader hooks run on dedicated thread
ESM hooks supplied via loaders (--experimental-loader=./foo.mjs) now run in a dedicated thread, isolated from the main thread. This provides a separate scope for loaders and ensures no cross-contamination between loaders and application code. A few things to know:
globalPreload hook’s port. Global variables are not shared between scopes.
Synchronous import.meta.resolve()
In alignment with browser behavior, this function now returns synchronously. Despite this, user loader resolve hooks can still be defined as async functions (or as sync functions, if the author prefers). Even when there are async resolve hooks loaded, import.meta.resolve will still return synchronously for application code.
Contributed by Anna Henningsen, Antoine du Hamel, Geoffrey Booth, Guy Bedford, Jacob Smith, and Michaël Zasso in #44710