[DRAFT] stream: prototype for new stream implementation #62066
jasnell wants to merge 3 commits into nodejs:main
Conversation
ronag left a comment
Super impressed! This is amazing.
One note. Since this is supposed to be "web compatible", it looks to me like everything is based on Uint8Array, which is a bit unfortunate for Node. Could the Node implementation use Buffer? It would still be compatible; it's just that we could access the Buffer prototype methods without doing hacks like Buffer.prototype.write.call(...).
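For illustration, a hedged sketch of the alternative being asked for: a Buffer can wrap the same bytes a web-compatible stream hands out as Uint8Array, making the Buffer prototype methods available without borrowing them (the variable names here are invented, not the PR's API):

```js
// A Uint8Array chunk as a web-compatible stream might hand out
// (illustrative data only).
const chunk = new Uint8Array([104, 101, 108, 108, 111]); // "hello"

// Buffer.from(arrayBuffer, byteOffset, length) creates a zero-copy view:
// same underlying memory, but with the full Buffer API available.
const buf = Buffer.from(chunk.buffer, chunk.byteOffset, chunk.byteLength);

console.log(buf.toString('utf8'));        // 'hello'
console.log(buf.buffer === chunk.buffer); // true: no bytes were copied
```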
Also, could you do some mitata-based benchmarks so that we can see the GC and memory pressure relative to Node streams?
Another thing: in the async generator case, can we pass an optional AbortSignal?
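To make the suggestion concrete, here is one possible shape (entirely hypothetical; no such API is defined in this PR): the source generator receives an options bag carrying the signal, so it can stop producing when the stream is aborted.

```js
// Hypothetical source signature: an options bag with an optional AbortSignal.
async function* source({ signal } = {}) {
  let i = 0;
  while (!signal?.aborted && i < 3) {
    yield `chunk ${i++}`; // a real source would yield Uint8Array chunks
  }
}

const ac = new AbortController();
const received = [];

// Minimal driver standing in for the stream machinery:
const done = (async () => {
  for await (const chunk of source({ signal: ac.signal })) {
    received.push(chunk);
  }
})();
// calling ac.abort() would stop the source at its next loop check
```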
This makes me a bit nervous for code portability. If someone starts working with this in Node.js, they would end up writing code that depends on the values being Buffer instances.
benjamingr left a comment
just to explore implementation feasibility, performance, etc
Sounds fine, as this isn't exposed outside at this time.
```js
// Buffer is full
switch (this._backpressure) {
  case 'strict':
```
I'm not sure 'strict' should be the default here rather than 'block'.
That'll be a big part of the discussion around this. A big part of the challenge with web streams is that backpressure can be fully ignored. One of the design principles for this new approach is to apply it strictly by default. We'll need to debate this. Recommend opening an issue at https://github.com/jasnell/new-streams
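A toy sketch of the difference between the two policies under discussion (the class and option names are invented for illustration, not the PR's API):

```js
// 'strict': a write beyond the high water mark is rejected outright.
// 'block': the write is accepted and the queue grows; the caller is
// expected to await the returned promises to self-regulate.
class ToyWriter {
  constructor({ highWaterMark = 2, backpressure = 'strict' } = {}) {
    this.queue = [];
    this.highWaterMark = highWaterMark;
    this.backpressure = backpressure;
  }
  write(chunk) {
    if (this.queue.length >= this.highWaterMark &&
        this.backpressure === 'strict') {
      return Promise.reject(new Error('backpressure: queue is full'));
    }
    this.queue.push(chunk);
    return Promise.resolve();
  }
}

const w = new ToyWriter({ highWaterMark: 1, backpressure: 'strict' });
w.write('a');                                      // accepted
w.write('b').catch((e) => console.log(e.message)); // 'backpressure: queue is full'
```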
```js
  return this._bytesWritten;
}
```

```js
this._writerState = 'closed';
```
A lot of these state variables can be optimized into const numbers (or one bit map overall) to make these classes smaller per instance, which matters especially when there are many small streams.
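A sketch of what that could look like (the flag names are invented): several string-valued state slots collapse into a single small integer per instance.

```js
// Each state becomes a bit in one SMI-sized field instead of a separate
// string-valued property per instance.
const kReadable = 1 << 0;
const kWritable = 1 << 1;
const kClosed   = 1 << 2;

class StreamState {
  #flags = kReadable | kWritable;

  close() {
    // clear readable/writable and set closed in one integer operation
    this.#flags = (this.#flags & ~(kReadable | kWritable)) | kClosed;
  }
  get writable() { return (this.#flags & kWritable) !== 0; }
  get closed()   { return (this.#flags & kClosed) !== 0; }
}

const state = new StreamState();
console.log(state.writable, state.closed); // true false
state.close();
console.log(state.writable, state.closed); // false true
```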
Yeah, I haven't yet taken an optimization pass over all this. Wanted to focus on correctness and feasibility first, but noted!
```js
// PushQueue - Internal Queue with Chunk-Based Backpressure
// =============================================================================

class PushQueue {
```
Come to think of it, I'm wondering why a push stream should buffer at all; almost all other implementations of push streams I'm aware of don't do this.
Another good discussion ;-) ... Worth opening an issue at https://github.com/jasnell/new-streams
that said... it buffers because throughput matters. A zero-buffer rendezvous (producer blocks until consumer takes each chunk) means the producer and consumer are perfectly lock-stepped and neither can work while the other is working. That's correct but slow. The buffer decouples them so work can overlap:
- Producer writes chunk 1 into a slot, immediately starts producing chunk 2
- Consumer reads chunk 1, starts processing it
- Producer finishes chunk 2, drops it in the second slot
- Both sides are working in parallel
Especially in JavaScript, at least some degree of buffering is required.
The way it works in the new-streams prototype is straightforward. Imagine a bucket being filled by a pipe. The highWaterMark defines how many slots are in the bucket AND how many slots are in the pipe. With highWaterMark: 2, the bucket has 2 slots and the pipe has 2 slots. Backpressure is signaled when both the bucket and pipeline slots are full. In strict mode (the default), backpressure is signaled by rejecting additional writes. In block mode, the pipeline just keeps growing and you have to manually pay attention to the write promises.
I've tested out a ton of different strategies, and this one has consistently proven to be the easiest to reason about and the easiest to optimize around.
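The bucket-and-pipe description above can be sketched as a toy queue (all names invented; this is not the PushQueue in the PR):

```js
// highWaterMark slots in the "bucket" (buffered chunks) plus highWaterMark
// slots in the "pipe" (writes accepted but not yet buffered). Backpressure
// is signaled only once both are full.
class ToyPushQueue {
  constructor(highWaterMark = 2) {
    this.hwm = highWaterMark;
    this.bucket = []; // chunks waiting for the consumer
    this.pipe = [];   // accepted writes not yet settled into the bucket
  }
  push(chunk) {
    if (this.bucket.length >= this.hwm) {
      if (this.pipe.length >= this.hwm) return false; // backpressure
      this.pipe.push(chunk);
      return true;
    }
    this.bucket.push(chunk);
    return true;
  }
  shift() {
    const chunk = this.bucket.shift();
    // a freed bucket slot immediately drains one chunk from the pipe
    if (this.pipe.length > 0) this.bucket.push(this.pipe.shift());
    return chunk;
  }
}

const q = new ToyPushQueue(2);
console.log(q.push(1), q.push(2), q.push(3), q.push(4)); // true true true true
console.log(q.push(5));  // false: bucket (2) and pipe (2) are both full
console.log(q.shift());  // 1 -- frees a slot; chunk 4 stays queued in the pipe
```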
```js
 * @yields {Uint8Array}
 */
async function* flattenTransformYieldAsync(value) {
  if (value instanceof Uint8Array) {
```
All these instanceof checks don't work cross-realm
Yep, I've got a note on this locally already. If we decide to move forward with this that'll be one of the outstanding issues to address
```js
    return;
  }
  // Check for async iterable first
  if (isAsyncIterable(value)) {
```
Are you sure you want the timing to be different between sync and async here? Async iterables normalize this (you can for await a sync iterable) but the perf might suffer
That's something that's still to be fully evaluated. I've run maybe a few dozen scenarios through analysis and haven't encountered one yet where it causes an actual problem but it needs to be fully explored.
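The timing difference under discussion can be demonstrated directly: `for await` accepts a sync iterable, but inserts at least one microtask tick per element, whereas plain `for..of` stays fully synchronous.

```js
const order = [];
queueMicrotask(() => order.push('microtask'));

for (const v of [1, 2]) order.push(`of:${v}`);
// for..of finished before the queued microtask could run:
console.log(order.join(',')); // of:1,of:2

const done = (async () => {
  const order2 = [];
  queueMicrotask(() => order2.push('microtask'));
  for await (const v of [1, 2]) order2.push(`await:${v}`);
  // the first await yielded to the microtask queue before element 1 arrived:
  console.log(order2.join(',')); // microtask,await:1,await:2
  return order2;
})();
```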
benjamingr left a comment
Sorry, meant to approve. Regardless of design changes/suggestions regarding timing and a lot of other stuff, as experimental this is fine.
I would maybe update the docs to emphasize the experimental status even further than normal.
Refactors the cancelation per updates in the design doc
@ronag ... implemented a couple of mitata benchmarks.

Memory Benchmark Results

Environment: Node 25.6.0, Intel Xeon w9-3575X, --expose-gc, mitata with .gc('inner')

Per-Operation Allocations (New Streams vs Web Streams): pipeline scenarios (pull, pipeTo) show the biggest gains, 16-25x less heap, because transforms are inline function calls, not stream-to-stream pipes with internal queues. Push is faster but uses slightly more heap due to batch iteration (Uint8Array[]). Broadcast/tee are comparable at this scale.

Sustained Load (97.7 MB volume): pipeTo and broadcast show the largest sustained-load heap difference. Web Streams' pipeThrough chain buffers ~50% of total volume in flight; new streams' pipeTo pulls synchronously through the transform. Broadcast's shared ring buffer (0.5 MB) vs tee's per-branch queues (42.8 MB). Zero retained memory for both APIs after completion -- no leaks.
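To illustrate the "inline function calls" point above (a toy sketch, not the PR's implementation): pulling chunks through an array of plain functions involves no intermediate streams, queues, or per-stage promise machinery.

```js
// Hypothetical transform stages as plain functions:
const transforms = [
  (chunk) => chunk.toUpperCase(),
  (chunk) => `<${chunk}>`,
];

// Each stage is just a call frame, not a queued hop between stream objects.
function* pull(source, fns) {
  for (const chunk of source) {
    yield fns.reduce((acc, fn) => fn(acc), chunk);
  }
}

console.log([...pull(['a', 'b'], transforms)]); // [ '<A>', '<B>' ]
```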
@ronag passing a signal to an async generator allows the underlying source to abort it, but we're lacking a built-in way for the consumer iterating the async generator to safely cancel the stream. It can break out of the loop, but barring an improvement at the language level, the consumer can only safely cancel the underlying source if it has a reference to an AbortController. WHATWG Streams don't have this problem if the consumer cancels through the reader. Happy to create examples to reproduce this if it's not clear what I'm talking about.
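A sketch of the gap being described (illustrative code only): breaking out of the loop finalizes the generator, but actually tearing down the underlying source requires the consumer to hold its own AbortController.

```js
const events = [];
const ac = new AbortController();

async function* source(signal) {
  try {
    let i = 0;
    while (!signal.aborted) yield i++;
  } finally {
    // finally runs on break, but by itself cannot distinguish "consumer is
    // done iterating" from "consumer wants the source torn down"
    events.push('finalized');
  }
}

const done = (async () => {
  for await (const v of source(ac.signal)) {
    if (v === 2) {
      ac.abort(); // the consumer must hold the controller to cancel the source
      break;
    }
  }
  events.push(`aborted:${ac.signal.aborted}`);
})();
```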
Opening this for discussion. Not intending to land this yet. It adds an implementation of the "new streams" to core and adds support to FileHandle, with tests and benchmarks, just to explore implementation feasibility, performance, etc.

It's worth noting that the performance of the FileHandle benchmark added, which reads files, converts them to upper case, and then compresses them, is on par with Node.js streams and twice as fast as web streams (tho... web streams are not perf optimized in any way, so take that 2x with a grain of salt). The majority of the perf cost in the benchmark is due to compression overhead. Without the compression transform, the new stream can be up to 15% faster than reading the file with classic Node.js streams.

The main thing this shows is that the new streams impl can (a) perform reasonably and (b) sit comfortably alongside the existing impls without any backwards compat concerns.
Benchmark runs: