Thanks for this pull request.
It would be nice if the byte size were included in the message of the 'raise' statement, so that one can at least adjust @@entity_expansion_text_limit accordingly, e.g.:
"entity expansion has grown too large: size: XY exceeded @@entity_expansion_text_limit"
It'll be helpful, but let's work on it in a separate PR.
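For whoever picks this up in the follow-up PR, the proposed message could be built roughly like this. It is only a sketch: `expansion_limit_message` and `check_expansion_size!` are hypothetical names, not REXML methods, and appending the limit value in parentheses is an assumption on top of the wording suggested above.

```ruby
# Hypothetical helper (not part of REXML): formats an error message that
# reports the observed byte size next to the limit it exceeded.
def expansion_limit_message(size, limit)
  "entity expansion has grown too large: " \
    "size: #{size} exceeded @@entity_expansion_text_limit (#{limit})"
end

# Hypothetical check mirroring the `sum > limit` comparison used in
# #unnormalize; returns the size when it is within the limit.
def check_expansion_size!(size, limit)
  raise expansion_limit_message(size, limit) if size > limit
  size
end

begin
  check_expansion_size!(12_000, 10_240)
rescue RuntimeError => e
  puts e.message
end
```

With the size in the message, a caller can see how far above the limit the input was before deciding whether to raise the limit.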
This fixes the @naitoh The current Could you use the following for
diff --git a/test/test_sax.rb b/test/test_sax.rb
index 5e3ad75..aeef268 100644
--- a/test/test_sax.rb
+++ b/test/test_sax.rb
@@ -145,17 +145,19 @@ module REXMLTests
</member>
XML
+ REXML::Security.entity_expansion_limit = 100000
sax = REXML::Parsers::SAX2Parser.new(source)
- assert_raise(RuntimeError.new("number of entity expansions exceeded, processing aborted.")) do
- sax.parse
- end
+ sax.parse
+ assert_equal(11111, sax.entity_expansion_count)
- REXML::Security.entity_expansion_limit = 100
+ REXML::Security.entity_expansion_limit = @default_entity_expansion_limit
sax = REXML::Parsers::SAX2Parser.new(source)
assert_raise(RuntimeError.new("number of entity expansions exceeded, processing aborted.")) do
sax.parse
end
- assert_equal(101, sax.entity_expansion_count)
+ assert do
+ sax.entity_expansion_count > @default_entity_expansion_limit
+ end
end
def test_with_default_entity
I think that we need another approach, something like the following:
diff --git a/lib/rexml/parsers/baseparser.rb b/lib/rexml/parsers/baseparser.rb
index 28810bf..699ed91 100644
--- a/lib/rexml/parsers/baseparser.rb
+++ b/lib/rexml/parsers/baseparser.rb
@@ -547,22 +547,31 @@ module REXML
[Integer(m)].pack('U*')
}
matches.collect!{|x|x[0]}.compact!
+ if filter
+ matches.reject! do |entity_reference|
+ filter.include?(entity_reference)
+ end
+ end
if matches.size > 0
sum = 0
- matches.each do |entity_reference|
- unless filter and filter.include?(entity_reference)
- entity_value = entity( entity_reference, entities )
- if entity_value
- re = Private::DEFAULT_ENTITIES_PATTERNS[entity_reference] || /&#{entity_reference};/
- rv.gsub!( re, entity_value )
- sum += rv.bytesize
- if sum > Security.entity_expansion_text_limit
- raise "entity expansion has grown too large"
- end
- else
- er = DEFAULT_ENTITIES[entity_reference]
- rv.gsub!( er[0], er[2] ) if er
+ matches.tally.each do |entity_reference, n|
+ entity_expansion_count_before = @entity_expansion_count
+ entity_value = entity( entity_reference, entities )
+ entity_expansion_count_delta =
+ @entity_expansion_count - entity_expansion_count_before
+ if n > 1
+ record_entity_expansion(entity_expansion_count_delta * (n - 1))
+ end
+ if entity_value
+ re = Private::DEFAULT_ENTITIES_PATTERNS[entity_reference] || /&#{entity_reference};/
+ rv.gsub!( re, entity_value )
+ sum += rv.bytesize
+ if sum > Security.entity_expansion_text_limit
+ raise "entity expansion has grown too large"
end
+ else
+ er = DEFAULT_ENTITIES[entity_reference]
+ rv.gsub!( er[0], er[2] ) if er
end
end
rv.gsub!( Private::DEFAULT_ENTITIES_PATTERNS['amp'], '&' )
@@ -572,8 +581,8 @@ module REXML
private
- def record_entity_expansion
- @entity_expansion_count += 1
+ def record_entity_expansion(delta=1)
+ @entity_expansion_count += delta
if @entity_expansion_count > Security.entity_expansion_limit
raise "number of entity expansions exceeded, processing aborted."
end
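The accounting in the diff above can be sketched outside REXML as follows. This is an illustration under stated assumptions, not the parser's code: `ExpansionCounter`, `count_expansions` and the `cost` hash (how many expansions resolving an entity once records) are hypothetical stand-ins, and `Array#tally` needs Ruby 2.7 or later.

```ruby
# Stand-in for the parser's counter; the raise message mirrors the diff,
# but the class itself is hypothetical.
class ExpansionCounter
  attr_reader :count

  def initialize(limit)
    @limit = limit
    @count = 0
  end

  def record(delta = 1)
    @count += delta
    if @count > @limit
      raise "number of entity expansions exceeded, processing aborted."
    end
  end
end

# Resolve each distinct reference once, then charge the remaining
# occurrences with the same cost instead of resolving them again.
def count_expansions(references, cost, counter)
  references.tally.each do |name, n|
    before = counter.count
    counter.record(cost.fetch(name))           # resolve the entity once
    delta = counter.count - before
    counter.record(delta * (n - 1)) if n > 1   # account for the other n - 1
  end
  counter.count
end

counter = ExpansionCounter.new(1_000)
references = ["a"] * 3 + ["b"] * 2
puts count_expansions(references, { "a" => 4, "b" => 1 }, counter) # 3*4 + 2*1 = 14
```

The total is the same as if every occurrence had been resolved separately, so the limit check keeps its meaning while the loop runs once per distinct reference.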
Could you use this?
diff --git a/test/test_sax.rb b/test/test_sax.rb
index 5e3ad75..d2bc231 100644
--- a/test/test_sax.rb
+++ b/test/test_sax.rb
@@ -102,10 +102,12 @@ module REXMLTests
class EntityExpansionLimitTest < Test::Unit::TestCase
def setup
@default_entity_expansion_limit = REXML::Security.entity_expansion_limit
+ @default_entity_expansion_text_limit = REXML::Security.entity_expansion_text_limit
end
def teardown
REXML::Security.entity_expansion_limit = @default_entity_expansion_limit
+ REXML::Security.entity_expansion_text_limit = @default_entity_expansion_text_limit
end
class GeneralEntityTest < self
@@ -124,6 +126,17 @@ module REXMLTests
</member>
XML
+ REXML::Security.entity_expansion_limit = 100_000
+ REXML::Security.entity_expansion_text_limit = 1_000_000_000
+ sax = REXML::Parsers::SAX2Parser.new(source)
+ text_size = nil
+ sax.listen(:characters, ["member"]) do |text|
+ text_size = text.size
+ end
+ sax.parse
+ assert_equal(300002, text_size)
+
+ REXML::Security.entity_expansion_text_limit = @default_entity_expansion_text_limit
sax = REXML::Parsers::SAX2Parser.new(source)
assert_raise(RuntimeError.new("entity expansion has grown too large")) do
sax.parse
Could you do a similar one for
* Reject filtered matches earlier in the loop
* Improve `#unnormalize` by removing redundant calls to `rv.gsub!`
* Improve `entity_expansion_limit` tests

Co-Authored-By: Sutou Kouhei <[email protected]>
4fd8b6b to 83be597
Ah, I see, you are right!
I've added your suggested changes to 83be597, but I moved the calculation of
Since issue #193 was handled by #195, I'll edit the PR description to match the current situation.
Title changed from "… BaseParser#unnormalize and fix sum calculation" to "… BaseParser#unnormalize"
Ah,
diff --git a/lib/rexml/parsers/baseparser.rb b/lib/rexml/parsers/baseparser.rb
index 342f948..0ac243a 100644
--- a/lib/rexml/parsers/baseparser.rb
+++ b/lib/rexml/parsers/baseparser.rb
@@ -8,6 +8,22 @@ require "strscan"
module REXML
module Parsers
+ unless [].respond_to?(:tally)
+ module EnumerableTally
+ refine Enumerable do
+ def tally
+ counts = {}
+ each do |item|
+ counts[item] ||= 0
+ counts[item] += 1
+ end
+ counts
+ end
+ end
+ end
+ using EnumerableTally
+ end
+
if StringScanner::Version < "3.0.8"
module StringScannerCaptures
refine StringScanner do
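The counting logic of the fallback can be checked on its own, independent of the refinement. In this sketch `tally_fallback` is a hypothetical free-standing method; the patch above installs the same loop as `Enumerable#tally` through a refinement instead.

```ruby
# Hypothetical stand-alone version of the fallback: counts how often each
# element occurs, like Enumerable#tally does on Ruby 2.7 and later.
def tally_fallback(enumerable)
  counts = {}
  enumerable.each do |item|
    counts[item] ||= 0
    counts[item] += 1
  end
  counts
end

references = %w[amp lt amp amp]
puts tally_fallback(references) == { "amp" => 3, "lt" => 1 } # true

# Where the built-in exists, both should agree.
if references.respond_to?(:tally)
  puts tally_fallback(references) == references.tally # true
end
```

Using a refinement rather than reopening `Enumerable` keeps the fallback visible only where `using EnumerableTally` is in effect, so other code on Ruby 2.5 and 2.6 is unaffected.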
`#tally` doesn't exist in Ruby 2.5 and 2.6

* Refine `Enumerable` to support `#tally` in `REXML::Parsers`

Co-Authored-By: Sutou Kouhei <[email protected]>
Thanks for the patch, added in b0949d8
@naitoh Could you review this before we merge this?
I have checked this PR.
Thanks.
The current implementation of `#unnormalize` iterates over matched entity references that have already been substituted. With these changes we will reduce the number of redundant calls to `rv.gsub!`.

* Improve `#unnormalize` by removing redundant calls to `rv.gsub!`
* Improve `entity_expansion_limit` tests

Example:

Before this PR, the example above would require 100 iterations. After this PR, 1 iteration.
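A rough sketch of that difference, assuming an input in which a single entity reference (the hypothetical `&a;`) appears 100 times. `substitute_each` is an illustrative helper, not REXML code; it only counts how many `gsub!` passes are made.

```ruby
# Illustrative helper: runs one gsub! pass per name and reports how many
# passes were made along with the substituted text.
def substitute_each(text, names, value)
  result = text.dup
  passes = 0
  names.each do |name|
    result.gsub!(/&#{name};/, value) # the first pass already replaces every occurrence
    passes += 1
  end
  [result, passes]
end

text = "&a;" * 100
matches = text.scan(/&(\w+);/).flatten # 100 entries, all "a"

before_text, before_passes = substitute_each(text, matches, "x")
after_text, after_passes = substitute_each(text, matches.uniq, "x")

puts before_passes            # 100
puts after_passes             # 1
puts before_text == after_text # true
```

Both variants produce the same text; only the number of passes over the string changes.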