Commit cbdb0d7

fix: make walkDOM iterative to prevent stack overflow (GHSA-2v35-w6hq-6mfw)
Replace the recursive `walkDOM` implementation with an explicit stack of frames, each carrying `{node, context, phase}`. The `phase` field is one of two constants added to the function, `walkDOM.ENTER` (call `enter`, schedule children) and `walkDOM.EXIT` (call `exit`), which makes the two-phase lifecycle of each frame self-documenting. Children are pushed by traversing from `lastChild` via `previousSibling`, which naturally places `firstChild` on top without requiring an intermediate array.

The public API, the `walkDOM.STOP` sentinel, and all five callers (`_visitNode`, `getTextContent`, `cloneNode`, `importNode`, `serializeToString`) are unchanged.

The test script is updated to cap the Node.js stack at 256 KB via `--stack-size=256`, lowering the recursive-overflow threshold to ~2,600 frames. A new `test/recursion-regression.test.js` consolidates all deep-tree regression guards (GHSA-2v35-w6hq-6mfw) and enforces the constrained test environment via invariant checks. Depth tests covering all five callers at a 3,000-node tree (which exceeds the constrained overflow threshold) confirm the fix and guard against regression.

`docs/walk-dom.md` is carried over from the 0.9.x branch.

GHSA-2v35-w6hq-6mfw
1 parent 0b543d3 commit cbdb0d7

4 files changed

Lines changed: 243 additions & 18 deletions

docs/walk-dom.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
# `walkDOM`: iterative tree traversal

**Source**: [`lib/dom.js`](../lib/dom.js)

`walkDOM` visits every node in a DOM subtree in depth-first order, calling
caller-supplied `enter` and `exit` callbacks. It uses an explicit stack instead
of recursion, so it is safe on arbitrarily deep trees.

## Stack frame

Each item on the work stack is a frame with three fields:

| Field     | Purpose                                              |
|-----------|------------------------------------------------------|
| `node`    | The DOM node to process                              |
| `context` | Opaque value threaded through the traversal          |
| `phase`   | `ENTER`: call `enter`, schedule children and exit    |
|           | `EXIT`: call `exit` for the matching `enter`         |

Each frame starts its life as `ENTER` and, after being processed, causes an
`EXIT` frame for the same node to be pushed. The `EXIT` frame fires only after
all descendants have been fully processed, which is exactly the post-order
guarantee.

## Control flow

```mermaid
flowchart TD
    start([push ENTER frame for root]) --> loop

    loop{stack empty?} -- no --> pop[pop frame]
    loop -- yes --> done([return])

    pop --> phase{frame.phase}

    phase -- ENTER --> callEnter["childContext = enter(node, context)"]
    callEnter --> check{childContext?}

    phase -- EXIT --> callExit["call exit(node, context)\nif provided"]
    callExit --> loop

    check -- STOP --> ret([return STOP])
    check -- otherwise --> pushExit["push EXIT frame\n{node, childContext}"]
    pushExit --> descend{childContext\nnull or undefined?}
    descend -- yes\nskip children --> loop
    descend -- no --> pushChildren["push ENTER frames\nfrom lastChild via previousSibling\n(firstChild lands on top)"]
    pushChildren --> loop
```

`EXIT` is always pushed, even when children are skipped, so `exit` is called
once for every `enter` that does not return `STOP`.

## Why pushing EXIT before children guarantees post-order

The stack is last-in-first-out. When an `ENTER` frame is processed, the following are
pushed in this order:

1. `EXIT` frame for the current node
2. `ENTER` frames for children, traversed from `lastChild` via `previousSibling`, so `firstChild` lands on top and is processed first

Because children are on top, they are processed first. The parent's `EXIT`
frame sits below all of them and fires only after the last descendant finishes.

### Stack trace

```xml
<root>
  <A>
    <A1/>
  </A>
  <B/>
</root>
```

| Step | Actions | Stack after <br/>(list order, last item = next to pop) |
|------|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------|
| init | push `ENTER(root)` | `ENTER(root)` |
| 1 | pop `ENTER(root)`,<br/> `enter(root)`,<br/> push `EXIT(root)`, `ENTER(B)`, `ENTER(A)` | `EXIT(root)`,<br/> `ENTER(B)`,<br/> `ENTER(A)` |
| 2 | pop `ENTER(A)`,<br/> `enter(A)`,<br/> push `EXIT(A)`, `ENTER(A1)` | `EXIT(root)`,<br/> `ENTER(B)`,<br/> `EXIT(A)`,<br/> `ENTER(A1)` |
| 3 | pop `ENTER(A1)`,<br/> `enter(A1)`,<br/> push `EXIT(A1)` | `EXIT(root)`,<br/> `ENTER(B)`,<br/> `EXIT(A)`,<br/> `EXIT(A1)` |
| 4 | pop `EXIT(A1)`,<br/> `exit(A1)` | `EXIT(root)`,<br/> `ENTER(B)`,<br/> `EXIT(A)` |
| 5 | pop `EXIT(A)`,<br/> `exit(A)` | `EXIT(root)`,<br/> `ENTER(B)` |
| 6 | pop `ENTER(B)`,<br/> `enter(B)`,<br/> push `EXIT(B)` | `EXIT(root)`,<br/> `EXIT(B)` |
| 7 | pop `EXIT(B)`,<br/> `exit(B)` | `EXIT(root)` |
| 8 | pop `EXIT(root)`,<br/> `exit(root)` | _(empty)_ |

Call order: `enter(root)` → `enter(A)` → `enter(A1)` → `exit(A1)` → `exit(A)` → `enter(B)` → `exit(B)` → `exit(root)`.
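
The call order in the trace can be checked with a small self-contained sketch. It does not use the library: `makeNode` and `walk` are hypothetical stand-ins that model only the `lastChild` and `previousSibling` links and the two-phase frame logic described above, and `STOP` handling is left out for brevity.

```javascript
// Illustration only: `makeNode` and `walk` are hypothetical names, not library API.
var ENTER = 0;
var EXIT = 1;

// Builds a minimal node exposing just the links the walker reads.
function makeNode(name, children) {
  var node = { nodeName: name, lastChild: null, previousSibling: null };
  var prev = null;
  (children || []).forEach(function (child) {
    child.previousSibling = prev;
    prev = child;
  });
  node.lastChild = prev;
  return node;
}

function walk(root, context, callbacks) {
  var stack = [{ node: root, context: context, phase: ENTER }];
  while (stack.length > 0) {
    var frame = stack.pop();
    if (frame.phase === EXIT) {
      if (callbacks.exit) callbacks.exit(frame.node, frame.context);
      continue;
    }
    var childContext = callbacks.enter(frame.node, frame.context);
    // EXIT is pushed first, so it sits below every child frame.
    stack.push({ node: frame.node, context: childContext, phase: EXIT });
    if (childContext === null || childContext === undefined) continue;
    // Walk backwards so the first child ends up on top of the stack.
    for (var child = frame.node.lastChild; child; child = child.previousSibling) {
      stack.push({ node: child, context: childContext, phase: ENTER });
    }
  }
}

var tree = makeNode('root', [makeNode('A', [makeNode('A1')]), makeNode('B')]);
var calls = [];
walk(tree, true, {
  enter: function (node, ctx) {
    calls.push('enter(' + node.nodeName + ')');
    return ctx;
  },
  exit: function (node) {
    calls.push('exit(' + node.nodeName + ')');
  },
});
console.log(calls.join(' → '));
// prints: enter(root) → enter(A) → enter(A1) → exit(A1) → exit(A) → enter(B) → exit(B) → exit(root)
```

Swapping the two `push` steps (children first, then `EXIT`) would make `exit` fire before any child is entered, which is why the order of pushes matters.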

## Context isolation

The value returned by `enter` becomes `childContext`. It is stored in:

- the `EXIT` frame, so `exit` receives exactly what `enter` returned
- all children's `ENTER` frames, so each child starts with the parent's output

Siblings share the same `childContext` reference from their common parent.
Callers that need per-element isolation must produce a fresh value inside
`enter` (e.g. `namespaces.slice()` in `serializeToString`).
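
The effect of sharing one reference can be seen in a stripped-down sketch. `makeNode` and `walkEnterOnly` are hypothetical helpers, not library code; the walker here pushes only `ENTER`-style frames (no `exit`, no `STOP`) because context propagation is all that is being shown.

```javascript
// Illustration only: hypothetical helpers, not library API.
function makeNode(name, children) {
  var node = { nodeName: name, lastChild: null, previousSibling: null };
  var prev = null;
  (children || []).forEach(function (child) {
    child.previousSibling = prev;
    prev = child;
  });
  node.lastChild = prev;
  return node;
}

// Enter-only walker: every child frame receives the parent's return value.
function walkEnterOnly(root, context, enter) {
  var stack = [{ node: root, context: context }];
  while (stack.length > 0) {
    var frame = stack.pop();
    var childContext = enter(frame.node, frame.context);
    if (childContext === null || childContext === undefined) continue;
    for (var child = frame.node.lastChild; child; child = child.previousSibling) {
      stack.push({ node: child, context: childContext });
    }
  }
}

var tree = makeNode('root', [makeNode('A', [makeNode('A1')]), makeNode('B')]);

// Shared: each enter mutates the array it received and passes the same array on.
var seenShared = {};
walkEnterOnly(tree, [], function (node, path) {
  path.push(node.nodeName);
  seenShared[node.nodeName] = path.join('/');
  return path;
});

// Isolated: each enter copies before extending, as with `namespaces.slice()`.
var seenIsolated = {};
walkEnterOnly(tree, [], function (node, path) {
  var own = path.slice();
  own.push(node.nodeName);
  seenIsolated[node.nodeName] = own.join('/');
  return own;
});

console.log(seenShared.B); // root/A/A1/B (entries from A's subtree leaked in)
console.log(seenIsolated.B); // root/B
```

In the shared variant, `B` sees what `A` and `A1` appended because all frames point at one array; copying inside `enter` gives each element its own view.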

## DOM mutation during traversal

Only mutations to a node's **own children** inside its `enter` callback are
supported. Because `lastChild` and `previousSibling` are read after `enter` returns, any
children added or removed there are correctly reflected when the walker
schedules the next level of frames.

Mutating anything else (siblings of the current node, ancestors, or unrelated
subtrees) produces unpredictable results. Nodes already queued on the stack
are visited regardless of subsequent DOM changes; nodes inserted outside the
current child list are never queued and therefore never visited. Neither
`enter` nor `exit` is guaranteed to be called for such nodes.

## `walkDOM.STOP`

Returning `walkDOM.STOP` from `enter` causes the function to return `STOP`
immediately, discarding the rest of the stack. No further `enter` or `exit`
calls are made, including any pending `EXIT` frames for ancestors.
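
A minimal sketch of the early-abort behaviour follows. It is not the library code: `STOP`, `makeNode`, and `walkWithStop` are hypothetical stand-ins, with `STOP` playing the role of `walkDOM.STOP` and a boolean `exiting` flag standing in for the phase constants.

```javascript
// Illustration only: hypothetical names, not library API.
var STOP = Symbol('STOP'); // stand-in for walkDOM.STOP

function makeNode(name, children) {
  var node = { nodeName: name, lastChild: null, previousSibling: null };
  var prev = null;
  (children || []).forEach(function (child) {
    child.previousSibling = prev;
    prev = child;
  });
  node.lastChild = prev;
  return node;
}

function walkWithStop(root, context, callbacks) {
  var stack = [{ node: root, context: context, exiting: false }];
  while (stack.length > 0) {
    var frame = stack.pop();
    if (frame.exiting) {
      if (callbacks.exit) callbacks.exit(frame.node, frame.context);
      continue;
    }
    var childContext = callbacks.enter(frame.node, frame.context);
    if (childContext === STOP) {
      // Everything still on the stack is dropped, ancestor exit frames included.
      return STOP;
    }
    stack.push({ node: frame.node, context: childContext, exiting: true });
    if (childContext === null || childContext === undefined) continue;
    for (var child = frame.node.lastChild; child; child = child.previousSibling) {
      stack.push({ node: child, context: childContext, exiting: false });
    }
  }
}

var tree = makeNode('root', [makeNode('A', [makeNode('A1')]), makeNode('B')]);
var log = [];
var result = walkWithStop(tree, true, {
  enter: function (node, ctx) {
    log.push('enter(' + node.nodeName + ')');
    return node.nodeName === 'A1' ? STOP : ctx;
  },
  exit: function (node) {
    log.push('exit(' + node.nodeName + ')');
  },
});

console.log(result === STOP); // true
console.log(log.join(', ')); // enter(root), enter(A), enter(A1)
```

Since no `exit` runs for `A` or `root` here, a caller that pairs setup in `enter` with cleanup in `exit` would presumably need to handle the `STOP` return value itself.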

lib/dom.js

Lines changed: 49 additions & 17 deletions
@@ -617,40 +617,58 @@ function _visitNode(node, callback) {
  *    sibling traversal continues normally.
  * 3. If `enter` returns `walkDOM.STOP`, the entire traversal is aborted immediately; no
  *    further `enter` or `exit` calls are made.
- * 4. `firstChild` is read **after** `enter` returns, so `enter` may safely modify the node's
- *    child list before the walker descends.
+ * 4. `lastChild` and `previousSibling` are read **after** `enter` returns, so `enter` may
+ *    safely modify the node's own child list before the walker descends. Modifying siblings of
+ *    the current node or any other part of the tree produces unpredictable results: nodes already
+ *    queued on the stack are visited regardless of DOM changes, and newly inserted nodes outside
+ *    the current child list are never visited.
  * 5. Calls `callbacks.exit(node, context)` (if provided) after all of a node's children have
  *    been visited, passing the same `context` that `enter`
  *    returned for that node.
  *
- * Note: this implementation is recursive. It will be converted to an iterative form in a later
- * step to eliminate stack-overflow risk on very deep trees.
+ * This implementation uses an explicit stack and does not recurse; it is safe on arbitrarily
+ * deep trees.
  *
  * @param {Node} node
  * Root of the subtree to walk.
  * @param {*} context
  * Initial context value passed to the root node's `enter`.
  * @param {{ enter: function(Node, *): *, exit?: function(Node, *): void }} callbacks
  * @returns {void | walkDOM.STOP}
+ * @see ../docs/walk-dom.md.
  */
 function walkDOM(node, context, callbacks) {
-	var childContext = callbacks.enter(node, context);
-	if (childContext === walkDOM.STOP) {
-		return walkDOM.STOP;
-	}
-	if (childContext !== null && childContext !== undefined) {
-		var child = node.firstChild;
-		while (child) {
-			var next = child.nextSibling;
-			if (walkDOM(child, childContext, callbacks) === walkDOM.STOP) {
+	// Each stack frame is {node, context, phase}:
+	//   walkDOM.ENTER: call enter, then push children
+	//   walkDOM.EXIT: call exit
+	var stack = [{ node: node, context: context, phase: walkDOM.ENTER }];
+	while (stack.length > 0) {
+		var frame = stack.pop();
+		if (frame.phase === walkDOM.ENTER) {
+			var childContext = callbacks.enter(frame.node, frame.context);
+			if (childContext === walkDOM.STOP) {
 				return walkDOM.STOP;
 			}
-			child = next;
+			// Push exit frame before children so it fires after all children are processed (Last In First Out)
+			stack.push({ node: frame.node, context: childContext, phase: walkDOM.EXIT });
+			if (childContext === null || childContext === undefined) {
+				continue; // skip children
+			}
+			// lastChild is read after enter returns, so enter may modify the child list.
+			var child = frame.node.lastChild;
+			// Traverse from lastChild backwards so that pushing onto the stack
+			// naturally yields firstChild on top (processed first).
+			while (child) {
+				stack.push({ node: child, context: childContext, phase: walkDOM.ENTER });
+				child = child.previousSibling;
+			}
+		} else {
+			// frame.phase === walkDOM.EXIT
+			if (callbacks.exit) {
+				callbacks.exit(frame.node, frame.context);
+			}
 		}
 	}
-	if (callbacks.exit) {
-		callbacks.exit(node, childContext);
-	}
 }
 
 /**
@@ -660,6 +678,20 @@ function walkDOM(node, context, callbacks) {
  * @type {symbol}
  */
 walkDOM.STOP = Symbol('walkDOM.STOP');
+/**
+ * Phase constant for a stack frame that has not yet been visited.
+ * The `enter` callback is called and children are scheduled.
+ *
+ * @type {number}
+ */
+walkDOM.ENTER = 0;
+/**
+ * Phase constant for a stack frame whose subtree has been fully visited.
+ * The `exit` callback is called.
+ *
+ * @type {number}
+ */
+walkDOM.EXIT = 1;
 
 function Document(){
 	this.ownerDocument = this;

package.json

Lines changed: 4 additions & 1 deletion
@@ -27,6 +27,9 @@
 		"index.d.ts",
 		"lib"
 	],
+	"config": {
+		"test_stack_size": 256
+	},
 	"scripts": {
 		"lint": "eslint lib test",
 		"format": "prettier --write test",
@@ -35,7 +38,7 @@
 		"start": "nodemon --watch package.json --watch lib --watch test --exec 'npm --silent run test && npm --silent run lint'",
 		"stryker": "stryker run",
 		"stryker:dry-run": "stryker run -m '' --reporters progress",
-		"test": "jest",
+		"test": "node --stack-size=$npm_package_config_test_stack_size ./node_modules/.bin/jest",
 		"testrelease": "npm test && eslint lib",
 		"version": "./changelog-has-version.sh",
 		"release": "np --no-yarn --test-script testrelease --branch release-0.8.x patch"

test/recursion-regression.test.js

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
'use strict'

const { describe, test, expect, beforeAll } = require('@jest/globals')
const { DOMImplementation, walkDOM } = require('../lib/dom')
const { XMLSerializer } = require('../lib')
const pkgJson = require('../package.json')

// Must exceed the recursive-overflow threshold at the configured stack size
// (~2,600 frames at 256 KB) so that re-introducing any recursive tree walk
// causes these tests to fail.
const DEEP_TREE_DEPTH = 3000

test('npm_package_config_test_stack_size env var matches package.json config.test_stack_size', () => {
	expect(process.env.npm_package_config_test_stack_size).toBe(
		`${pkgJson.config.test_stack_size}`
	)
})
test('test script uses $npm_package_config_test_stack_size', () => {
	expect(pkgJson.scripts.test).toMatch(
		' --stack-size=$npm_package_config_test_stack_size'
	)
})
test('recursive function overflows within DEEP_TREE_DEPTH frames', () => {
	function throwsAtLevel(lvl) {
		var nextLvl = (lvl || 0) + 1
		try {
			return throwsAtLevel(nextLvl)
		} catch (e) {
			return nextLvl
		}
	}
	expect(throwsAtLevel()).toBeLessThanOrEqual(DEEP_TREE_DEPTH)
})

describe('deep tree stack overflow guard (GHSA-2v35-w6hq-6mfw)', () => {
	var deepRoot
	beforeAll(() => {
		var doc = new DOMImplementation().createDocument(null, 'root')
		var current = doc.documentElement
		for (var i = 0; i < DEEP_TREE_DEPTH; i++) {
			var child = doc.createElement('n')
			current.appendChild(child)
			current = child
		}
		deepRoot = doc.documentElement
	})

	test('walkDOM', () => {
		expect(() =>
			walkDOM(deepRoot, null, {
				enter: function () {
					return 'ctx'
				},
			})
		).not.toThrow()
	})
	test('getElementsByTagName', () => {
		expect(() => deepRoot.getElementsByTagName('n')).not.toThrow()
	})
	test('textContent', () => {
		expect(() => void deepRoot.textContent).not.toThrow()
	})
	test('serializeToString', () => {
		expect(() => new XMLSerializer().serializeToString(deepRoot)).not.toThrow()
	})
	test('cloneNode(true)', () => {
		expect(() => deepRoot.cloneNode(true)).not.toThrow()
	})
	test('importNode(node, true)', () => {
		var destDoc = new DOMImplementation().createDocument(null, 'dest')
		expect(() => destDoc.importNode(deepRoot, true)).not.toThrow()
	})
})
