While auditing a project depending on pulldown-cmark, we ran the dos-fuzzer included in this repository against a generated corpus. We identified multiple input patterns that trigger superlinear (quadratic or worse) CPU growth, reproducible across independent runs.
All data below is unmodified fuzzer output. A payload under 5 KB is sufficient to produce a significant CPU spike on any service parsing untrusted Markdown.
Methodology
We ran dos-fuzzer over 600,000+ patterns (~9,000 patterns/sec). We report only the six highest-scoring results here.
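To illustrate what "superlinear" means in measurable terms, here is a minimal sketch of a doubling check. It is not a description of how dos-fuzzer scores patterns; the function name `growth_exponent` and the assumed cost model (cost ≈ c · n^k) are ours, chosen only for illustration.

```rust
/// Estimates the exponent k in `cost ≈ c * n^k` from two measurements.
/// Hypothetical helper for illustration; not part of dos-fuzzer.
/// A result near 1.0 suggests linear growth, near 2.0 quadratic.
fn growth_exponent(n1: f64, cost1: f64, n2: f64, cost2: f64) -> f64 {
    // Taking logs of cost2/cost1 = (n2/n1)^k isolates k.
    (cost2 / cost1).ln() / (n2 / n1).ln()
}
```

Doubling the input size while the cost quadruples gives an exponent of about 2, the signature of quadratic behavior.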
Findings
Every high-scoring pattern shares a common structural property involving a specific Markdown construct combined with line boundary characters. This holds consistently across both runs. Specific patterns and timing data are being withheld until a fix is available.
Root cause hypothesis
We have not audited the source in depth, and you are better placed to confirm the cause, but the empirical pattern is unambiguous: every high-scoring case involves [^ at or near a line boundary within an unclosed structure. Our hypothesis is that the parser retries the footnote reference match from each newline position, producing O(n) retries that each cost O(n), for O(n^2) total work. Unclosed HTML tags in the surrounding context may compound this by preventing early termination.
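The hypothesized cost model can be sketched with a toy scanner. This is not pulldown-cmark's code and does not reproduce the withheld patterns; `naive_retry_steps` and `unclosed_input` are hypothetical names, and the scanner only counts how many bytes are examined when a search for a closing ] restarts from every line start.

```rust
/// Toy model only, NOT pulldown-cmark's implementation.
/// Counts bytes examined when a scan for `]` is retried from each line start.
fn naive_retry_steps(input: &str) -> usize {
    let bytes = input.as_bytes();
    let mut steps = 0;
    let mut start = 0;
    while start < bytes.len() {
        // Hypothetical retry: scan forward from this line start for `]`.
        for &b in &bytes[start..] {
            steps += 1;
            if b == b']' {
                break;
            }
        }
        // Advance to the start of the next line (not counted as steps).
        match bytes[start..].iter().position(|&b| b == b'\n') {
            Some(off) => start += off + 1,
            None => break,
        }
    }
    steps
}

/// Builds `lines` copies of "[^a\n": an opener that is never closed.
fn unclosed_input(lines: usize) -> String {
    "[^a\n".repeat(lines)
}
```

In this model, each of the n line starts scans to the end of the input, so doubling the number of lines roughly quadruples the step count.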
Suggested mitigations
- Bail out of footnote reference parsing after a maximum number of backtrack steps
- Or memoize failed match positions to avoid retrying known-failing positions
- Short-term: cap nesting/repetition depth in the footnote reference parser
- Consider adding dos-fuzzer to CI to catch regressions
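The memoization idea from the list above can be sketched on a toy scanner. This is an assumption-laden illustration, not a patch: `memoized_retry_steps` is a hypothetical name, and the scanner only models a search for a closing ] that restarts at each line start. Once one scan reaches the end of input without finding ], every later start position is known to fail and is skipped.

```rust
/// Toy model only, NOT pulldown-cmark's implementation.
/// Counts bytes examined when failed scans for `]` are remembered.
fn memoized_retry_steps(input: &str) -> usize {
    let bytes = input.as_bytes();
    let mut steps = 0;
    let mut start = 0;
    // Earliest position from which a scan is known to reach the end
    // of input without finding `]`.
    let mut known_failure_from: Option<usize> = None;
    while start < bytes.len() {
        let already_failed = matches!(known_failure_from, Some(p) if start >= p);
        if !already_failed {
            let mut found = false;
            for &b in &bytes[start..] {
                steps += 1;
                if b == b']' {
                    found = true;
                    break;
                }
            }
            if !found {
                // Any later start would also fail, so remember this position.
                known_failure_from = Some(start);
            }
        }
        // Advance to the start of the next line (not counted as steps).
        match bytes[start..].iter().position(|&b| b == b'\n') {
            Some(off) => start += off + 1,
            None => break,
        }
    }
    steps
}
```

On an input of n unclosed lines, this version examines each byte once, so the step count grows linearly with input size. Whether the same invariant holds in the real footnote reference parser is something only a source audit can confirm.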
Reproduction
Corpus files are available privately on request to maintainers.
Impact
Any application using pulldown-cmark to parse untrusted Markdown is potentially affected (wikis, comment systems, documentation platforms, CI log renderers). A single < 5 KB request is sufficient to spike a CPU core.
Notes
We are happy to test proposed patches against our fuzzer corpus. Let us know if you'd prefer to continue in a private channel.