Comparing changes
base repository: ruby/rexml
base: v3.4.1
head repository: ruby/rexml
compare: master
- 10 commits
- 13 files changed
- 5 contributors
Commits on Feb 16, 2025
Commit b97e454
Commits on Mar 2, 2025
Improve CDATA parse performance (#244)
## Why?

GitHub: fix #243

## Benchmark (Comparison with rexml 3.4.1)

```
$ benchmark-driver benchmark/parse_cdata.yaml
Calculating -------------------------------------
                     rexml 3.4.1      master  3.4.1(YJIT)  master(YJIT)
                 dom     648.361      1.178k      591.590        1.046k i/s -     100.000 times in 0.154235s 0.084913s 0.169036s 0.095627s
                 sax     699.061      1.378k      651.148        1.196k i/s -     100.000 times in 0.143049s 0.072549s 0.153575s 0.083611s
                pull     699.271      1.379k      660.275        1.210k i/s -     100.000 times in 0.143006s 0.072527s 0.151452s 0.082622s
              stream     701.725      1.383k      659.483        1.228k i/s -     100.000 times in 0.142506s 0.072307s 0.151634s 0.081455s

Comparison:
                              dom
              master:      1177.7 i/s
        master(YJIT):      1045.7 i/s - 1.13x  slower
         rexml 3.4.1:       648.4 i/s - 1.82x  slower
         3.4.1(YJIT):       591.6 i/s - 1.99x  slower

                              sax
              master:      1378.4 i/s
        master(YJIT):      1196.0 i/s - 1.15x  slower
         rexml 3.4.1:       699.1 i/s - 1.97x  slower
         3.4.1(YJIT):       651.1 i/s - 2.12x  slower

                             pull
              master:      1378.8 i/s
        master(YJIT):      1210.3 i/s - 1.14x  slower
         rexml 3.4.1:       699.3 i/s - 1.97x  slower
         3.4.1(YJIT):       660.3 i/s - 2.09x  slower

                           stream
              master:      1383.0 i/s
        master(YJIT):      1227.7 i/s - 1.13x  slower
         rexml 3.4.1:       701.7 i/s - 1.97x  slower
         3.4.1(YJIT):       659.5 i/s - 2.10x  slower
```

- YJIT=ON : 1.76x - 1.83x faster
- YJIT=OFF : 1.82x - 1.97x faster

Reported by Masamune. Thanks!!!

Co-authored-by: Sutou Kouhei <[email protected]>
Commit 64a709e
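For orientation, the `dom` and `pull` rows in the benchmark correspond to two of REXML's parser front ends. The following is a minimal sketch of reading a CDATA section through each of them; the sample XML string is an assumption made for illustration and is not taken from `benchmark/parse_cdata.yaml`.

```ruby
require "rexml/document"
require "rexml/parsers/pullparser"

# Hypothetical sample input: one element wrapping a single CDATA section.
xml = "<root><![CDATA[1 < 2 && 3 > 2]]></root>"

# DOM: build the whole tree, then pick out the CDATA node.
doc = REXML::Document.new(xml)
cdata = doc.root.children.find { |node| node.is_a?(REXML::CData) }
dom_text = cdata.value

# Pull: walk the parser events and keep the payload of each CDATA event.
pull_texts = []
parser = REXML::Parsers::PullParser.new(xml)
while parser.has_next?
  event = parser.pull
  pull_texts << event[0] if event.cdata?
end

puts dom_text
puts pull_texts.inspect
```

Both paths should yield the CDATA content verbatim, without entity escaping.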
Commits on Mar 3, 2025
Improve comment parse performance (#245)
## Benchmark (Comparison with rexml 3.4.1)

```
$ benchmark-driver benchmark/parse_comment.yaml
Calculating -------------------------------------
                     rexml 3.4.1      master  3.4.1(YJIT)  master(YJIT)
           top_level     999.440      5.058k      922.416        3.340k i/s -     100.000 times in 0.100056s 0.019770s 0.108411s 0.029936s
          in_doctype      1.063k      4.890k      980.498        3.341k i/s -     100.000 times in 0.094116s 0.020449s 0.101989s 0.029927s
       after_doctype     638.321      1.304k      603.952        1.153k i/s -     100.000 times in 0.156661s 0.076710s 0.165576s 0.086748s

Comparison:
                        top_level
              master:      5058.2 i/s
        master(YJIT):      3340.5 i/s - 1.51x  slower
         rexml 3.4.1:       999.4 i/s - 5.06x  slower
         3.4.1(YJIT):       922.4 i/s - 5.48x  slower

                       in_doctype
              master:      4890.2 i/s
        master(YJIT):      3341.5 i/s - 1.46x  slower
         rexml 3.4.1:      1062.5 i/s - 4.60x  slower
         3.4.1(YJIT):       980.5 i/s - 4.99x  slower

                    after_doctype
              master:      1303.6 i/s
        master(YJIT):      1152.8 i/s - 1.13x  slower
         rexml 3.4.1:       638.3 i/s - 2.04x  slower
         3.4.1(YJIT):       604.0 i/s - 2.16x  slower
```

- YJIT=ON : 1.90x - 3.62x faster
- YJIT=OFF : 2.04x - 5.06x faster
Commit 4349091
Commits on Mar 4, 2025
Improve CDATA and comment parse performance (#246)
## Why?

Since `<a><!a` and `<a><!a>` are malformed nodes, they do not need to be checked before comments and CDATA.

## Benchmark : comment (after_doctype)

```
$ benchmark-driver benchmark/parse_comment.yaml
Calculating -------------------------------------
                          before       after  before(YJIT)  after(YJIT)
       after_doctype      1.306k      5.586k        1.152k       3.569k i/s -     100.000 times in 0.076563s 0.017903s 0.086822s 0.028020s

Comparison:
                    after_doctype
               after:      5585.7 i/s
         after(YJIT):      3568.9 i/s - 1.57x  slower
              before:      1306.1 i/s - 4.28x  slower
        before(YJIT):      1151.8 i/s - 4.85x  slower
```

- YJIT=ON : 3.09x faster
- YJIT=OFF : 4.28x faster

## Benchmark : CDATA

```
$ benchmark-driver benchmark/parse_cdata.yaml
Calculating -------------------------------------
                          before       after  before(YJIT)  after(YJIT)
                 dom      1.269k      5.548k        1.053k       3.072k i/s -     100.000 times in 0.078808s 0.018026s 0.094976s 0.032553s
                 sax      1.399k      8.244k        1.220k       4.460k i/s -     100.000 times in 0.071458s 0.012130s 0.081958s 0.022422s
                pull      1.411k      8.319k        1.260k       4.806k i/s -     100.000 times in 0.070883s 0.012021s 0.079335s 0.020809s
              stream      1.420k      8.320k        1.254k       4.728k i/s -     100.000 times in 0.070406s 0.012019s 0.079738s 0.021149s

Comparison:
                              dom
               after:      5547.5 i/s
         after(YJIT):      3071.9 i/s - 1.81x  slower
              before:      1268.9 i/s - 4.37x  slower
        before(YJIT):      1052.9 i/s - 5.27x  slower

                              sax
               after:      8244.0 i/s
         after(YJIT):      4459.9 i/s - 1.85x  slower
              before:      1399.4 i/s - 5.89x  slower
        before(YJIT):      1220.1 i/s - 6.76x  slower

                             pull
               after:      8318.8 i/s
         after(YJIT):      4805.6 i/s - 1.73x  slower
              before:      1410.8 i/s - 5.90x  slower
        before(YJIT):      1260.5 i/s - 6.60x  slower

                           stream
               after:      8320.2 i/s
         after(YJIT):      4728.4 i/s - 1.76x  slower
              before:      1420.3 i/s - 5.86x  slower
        before(YJIT):      1254.1 i/s - 6.63x  slower
```

- YJIT=ON : 2.91x - 3.80x faster
- YJIT=OFF : 4.37x - 5.90x faster

Co-authored-by: Sutou Kouhei <[email protected]>
Commit a5f31c4
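The two inputs named in the "Why?" section can be tried directly. This is a rough sketch that only records which exception class the parser raises for each input; the exact error message is not shown because it may differ between REXML releases.

```ruby
require "rexml/document"

# The two malformed inputs mentioned in the commit message.
inputs = ["<a><!a", "<a><!a>"]

# Map each input to the class of the exception raised, or nil if it parsed.
raised = inputs.to_h do |xml|
  begin
    REXML::Document.new(xml)
    [xml, nil]
  rescue REXML::ParseException => e
    [xml, e.class]
  end
end

raised.each { |xml, klass| puts "#{xml.inspect}: #{klass.inspect}" }
```

Neither input is a comment or a CDATA section, so both are expected to be rejected as parse errors rather than parsed.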
Commits on Mar 5, 2025
Raise appropriate exception when failing to match start tag in DOCTYPE (#247)
## Why?

Added exception to make the process easier to understand.
Commit a85203e
Commits on Apr 3, 2025
Commit 5d2606a
Commits on May 3, 2025
NEWS.md : Fix the mentioned of the PR in CVE-2024-35176. (#253)
I think the PR mentioned for CVE-2024-35176 in NEWS.md is incorrect.

```
- Improved parse performance when an attribute has many <s.
  - GH-126
```

#126 appears to fix the issue with attribute values that contain multiple '>' characters.
Commit d944fa4
Fix reverse sort in xpath_parser (#251)
The code below was failing with `REXML::XPathParser#sort': undefined method '-@' for an instance of Array`

```ruby
d = REXML::Document.new("<a><b><c/><d/><x/></b><b><e/><x/></b></a>")
matches = REXML::XPath.match(d, "a/b/x/preceding-sibling::node()")
# Before: error
# After: [<e/>, <d/>, <c/>]
```

This pull request will fix it.
Commit de6f40e
Commits on May 6, 2025
Fix duplicate responses in XPath following, following-sibling, preceding, preceding-sibling (#255)
## Why?

See: #251 (comment)

## Expected values

- XPath : a/d/preceding::* => ["d", "c", "b"]

```xml
<a>
  <b/> <!-- a/d/preceding::b -->
  <c/> <!-- a/d/preceding::c -->
  <d/> <!-- a/d/preceding::d -->
  <d/> <!-- self -->
  <e/>
  <f/>
</a>
```

- XPath : a/d/following::* => ["d", "e", "f"]

```xml
<a>
  <b/>
  <c/>
  <d/> <!-- self -->
  <d/> <!-- a/d/following::d -->
  <e/> <!-- a/d/following::e -->
  <f/> <!-- a/d/following::f -->
</a>
```

- XPath : a/b/x/following-sibling::* => ["c", "d", "e"]

```xml
<a>
  <b>
    <x/> <!-- self -->
    <c/> <!-- a/b/x/following-sibling::c -->
    <d/> <!-- a/b/x/following-sibling::d -->
  </b>
  <b>
    <x/> <!-- self -->
    <e/> <!-- a/b/x/following-sibling::e -->
  </b>
</a>
```

- XPath : a/b/x/following-sibling::* => ["c", "d", "x", "e"]

```xml
<a>
  <b>
    <x/> <!-- self -->
    <c/> <!-- a/b/x/following-sibling::c -->
    <d/> <!-- a/b/x/following-sibling::d -->
    <x/> <!-- a/b/x/following-sibling::x -->
    <e/> <!-- a/b/x/following-sibling::e -->
  </b>
</a>
```

- XPath : a/b/x/preceding-sibling::* => ["e", "d", "c"]

```xml
<a>
  <b>
    <c/> <!-- a/b/x/preceding-sibling::c -->
    <d/> <!-- a/b/x/preceding-sibling::d -->
    <x/> <!-- self -->
  </b>
  <b>
    <e/> <!-- a/b/x/preceding-sibling::e -->
    <x/> <!-- self -->
  </b>
</a>
```

- XPath : a/b/x/preceding-sibling::* => ["e", "x", "d", "c"]

```xml
<a>
  <b>
    <c/> <!-- a/b/x/preceding-sibling::c -->
    <d/> <!-- a/b/x/preceding-sibling::d -->
    <x/> <!-- a/b/x/preceding-sibling::x -->
    <e/> <!-- a/b/x/preceding-sibling::e -->
    <x/> <!-- self -->
  </b>
</a>
```

- XPath : //a/following-sibling::*[1] => ["w", "x", "y", "z"]

```xml
<div>
  <div>
    <a/> <!-- self -->
    <w/> <!-- //a/following-sibling::*[1] -->
  </div>
  <a/> <!-- self -->
  <x/> <!-- //a/following-sibling::*[1] -->
  <a/> <!-- self -->
  <y/> <!-- //a/following-sibling::*[1] -->
  <a/> <!-- self -->
  <z/> <!-- //a/following-sibling::*[1] -->
</div>
```
Commit 249d770
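To compare an installed REXML against the expected values listed in #255, the forward axes can be queried as in the sketch below. The documents are the commit message's samples with whitespace removed. The reverse axes (`preceding`, `preceding-sibling`) are left out deliberately, since releases without the #251 fix may raise when several context nodes are involved.

```ruby
require "rexml/document"

# Sample documents from the commit message, with whitespace removed.
flat = REXML::Document.new("<a><b/><c/><d/><d/><e/><f/></a>")
nested = REXML::Document.new("<a><b><x/><c/><d/></b><b><x/><e/></b></a>")

following = REXML::XPath.match(flat, "a/d/following::*").map(&:name)
siblings = REXML::XPath.match(nested, "a/b/x/following-sibling::*").map(&:name)

# On a release that includes the fix, each matching node should be listed once.
puts "following:         #{following.inspect}"
puts "following-sibling: #{siblings.inspect}"
```

According to the commit message, the results should be `["d", "e", "f"]` and `["c", "d", "e"]`; older releases may print different output.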
Commits on May 7, 2025
Deprecate accepting array as an element in XPath.match, first and each (#252)
`XPath.match`, `XPath.first`, `XPath.each`, `XPathParser#parse` and `XPathParser#match` accepted a nodeset as the element. This pull request changes the first parameter of these methods to be an element instead of a nodeset. Passing a nodeset will be deprecated.

```ruby
# Documented usage. OK
REXML::XPath.match(element, xpath)

# Undocumented usage. Deprecate in this pull request
nodeset = [element]
REXML::XPath.match(nodeset, xpath)
```

### Background

#249 will introduce a temporary cache.

```ruby
def parse path, nodeset
  path_stack = @parser.parse( path )
  nodeset.first.document.send(:enable_cache) do
    match( path_stack, nodeset )
  end
end
```

But the signature `XPathParser#match(path, nodeset)` does not guarantee that all nodes in the nodeset have the same root document, so the cache does not work in the code below. It's still slow.

```ruby
REXML::XPath.match(2.times.map { REXML::Document.new('<a>'*400+'</a>'*400) }, 'a//a')
```

The interface is holding us back, so I propose to drop accepting an array as the element. This change is backward incompatible, but it only drops an undocumented feature. I think only the test code was unintentionally using this feature.

### XPath.match with array

XPath.match only traverses the first element of the array for some selectors.

```ruby
nodeset = [REXML::Document.new("<a><b/></a>"), REXML::Document.new("<a><c/></a>")]
REXML::XPath.match(nodeset, "a/*")
#=> [<b/>, <c/>]
REXML::XPath.match(nodeset, "//a/*")
#=> [<b/>] # I expect [<b/>, <c/>] but the second document is ignored
```

It indicates that `XPath.match` is not designed to search inside multiple nodes/documents.

---------

Co-authored-by: Sutou Kouhei <[email protected]>
Commit cd575a1
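Code that currently passes an array to `XPath.match` could be migrated along the following lines. This is one possible sketch, not a migration path prescribed by the pull request: it queries each document separately through the documented single-element form and concatenates the results.

```ruby
require "rexml/document"

# The two documents from the "XPath.match with array" example.
docs = [
  REXML::Document.new("<a><b/></a>"),
  REXML::Document.new("<a><c/></a>"),
]

# Query each document on its own instead of passing the array
# itself as the first argument, then merge the matches.
matches = docs.flat_map { |doc| REXML::XPath.match(doc, "//a/*") }
names = matches.map(&:name)

puts names.inspect # => ["b", "c"]
```

Because every call sees exactly one document, the second document is no longer skipped for selectors such as `//a/*`.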
You can try running this command locally to see the comparison on your machine:
git diff v3.4.1...master