Deprecate accepting array as an element in XPath.match, first and each #252

tompng · 2025-04-30T19:29:25Z

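XPath.match, XPath.first, XPath.each, XPathParser#parse and XPathParser#match accepted a nodeset in place of an element.
This pull request changes the node parameter of these methods (the first argument of XPath.match, XPath.first and XPath.each; the second argument of XPathParser#parse and XPathParser#match) to be an element instead of a nodeset.
Passing a nodeset is deprecated.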

# Documented usage. OK
REXML::XPath.match(element, xpath)

# Undocumented usage. Deprecated in this pull request
nodeset = [element]
REXML::XPath.match(nodeset, xpath)
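
For callers that currently pass an array, one possible migration is to query each element separately and concatenate the results. The sketch below is only an illustration: `match_all` is a hypothetical helper name, not part of REXML, and concatenating per-element results is not guaranteed to reproduce the old array behavior for every XPath expression.

```ruby
require "rexml/document"

# Hypothetical helper (not part of REXML): accepts one node or an array of
# nodes and always calls REXML::XPath.match with a single node.
def match_all(node_or_nodes, xpath)
  # Array(node) is avoided on purpose: REXML elements respond to to_a,
  # so Array(element) would expand to the element's children.
  nodes = node_or_nodes.is_a?(Array) ? node_or_nodes : [node_or_nodes]
  nodes.flat_map { |node| REXML::XPath.match(node, xpath) }
end

doc = REXML::Document.new("<root><item id='1'/><item id='2'/></root>")
items = match_all(doc.root, "item")
puts items.map { |item| item.attributes["id"] }.inspect
```

Existing call sites that pass `[element]` could call the helper unchanged while they are migrated to the documented single-element form.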

Background

#249 will introduce a temporary cache.

def parse path, nodeset
  path_stack = @parser.parse( path )
  nodeset.first.document.send(:enable_cache) do
    match( path_stack, nodeset )
  end
end

But the signature XPathParser#match(path, nodeset) does not guarantee that all nodes in the nodeset have the same root document.
So the cache does not work for the code below. It's still slow.

REXML::XPath.match(2.times.map { REXML::Document.new('<a>'*400+'</a>'*400) }, 'a//a')
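
To make the problem concrete, here is a minimal sketch of checking whether the nodes in an array share one root document. `same_document?` is a hypothetical helper written for this illustration and is not part of REXML; it relies only on the `document` method that REXML nodes provide.

```ruby
require "rexml/document"

# Hypothetical check (not part of REXML): true when every node in the array
# belongs to the same root document object.
def same_document?(nodes)
  root = nodes.first&.document
  nodes.all? { |node| node.document.equal?(root) }
end

doc = REXML::Document.new("<a><b/><c/></a>")
siblings = doc.root.elements.to_a                        # child elements of one document
separate = 2.times.map { REXML::Document.new("<a/>") }   # two unrelated documents

puts same_document?(siblings)
puts same_document?(separate)
```

A cache enabled on `nodeset.first.document` only covers the first case; in the second case the other document never has the cache enabled.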

The interface is holding us back, so I propose to stop accepting an array as the element.
This change is backward incompatible, but it only drops an undocumented feature. I think only the test code was using this feature, and unintentionally.

XPath.match with array

For some selectors, XPath.match only traverses the first element of the array.

nodeset = [REXML::Document.new("<a><b/></a>"), REXML::Document.new("<a><c/></a>")]

REXML::XPath.match(nodeset, "a/*")
#=> [<b/>, <c/>]

REXML::XPath.match(nodeset, "//a/*")
#=> [<b/>] # I expect [<b/>, <c/>] but the second document is ignored

It indicates that XPath.match is not designed to search inside multiple nodes/documents.
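
If results from several documents are really needed, one option is to query each document on its own and join the results. This is a sketch of that idea using the same two documents as above, not something this pull request adds to REXML.

```ruby
require "rexml/document"

docs = [
  REXML::Document.new("<a><b/></a>"),
  REXML::Document.new("<a><c/></a>"),
]

# Query each document separately instead of passing the array itself,
# so the second document is not skipped for "//" selectors.
children = docs.flat_map { |doc| REXML::XPath.match(doc, "//a/*") }
puts children.map(&:name).inspect
```

Each call receives a single document, which is the documented usage, so the result no longer depends on how the parser treats a multi-node context.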

XPath.match, first and each accepted an array as an element. This behavior is not documented, and it makes the code hard to optimize and refactor. The second argument of XPathParser#parse and XPathParser#match is also changed from a nodeset to a node.

naitoh · 2025-05-03T07:38:29Z

But the signature XPathParser#match(path, nodeset) does not guarantee that all nodes in the nodeset have the same root document. So the cache does not work for the code below. It's still slow.

In this case, I made further improvements to #249, which eliminated the slowness.

Drop accepting array as an element in XPath.match, first and each

Instead of removing the feature, how about displaying a deprecation message when an array is passed?

What do you think @kou?

kou

I'm OK with dropping support for nodeset because we don't have enough resources to complete nodeset support.

In general, we want to keep backward compatibility as much as possible. But we can remove the feature without keeping backward compatibility because:

It's not documented
It doesn't work in some cases

But could you report a warning, as @naitoh suggested? Something like:

diff --git a/lib/rexml/xpath_parser.rb b/lib/rexml/xpath_parser.rb
index 5eb1e5a..a2b2ef5 100644
--- a/lib/rexml/xpath_parser.rb
+++ b/lib/rexml/xpath_parser.rb
@@ -136,11 +136,12 @@ module REXML
     end
 
 
-    def match(path_stack, nodeset)
-      nodeset = nodeset.collect.with_index do |node, i|
-        position = i + 1
-        XPathNode.new(node, position: position)
+    def match(path_stack, node)
+      if node.is_a?(Array)
+        warn("REXML::XPath.XXX dropped support for nodeset...", uplevel: N)
+        node = node.first
       end
+      nodeset = [XPathNode.new(node, position: 1)]
       result = expr(path_stack, nodeset)
       case result
       when Array # nodeset
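
The pattern in the diff above can be tried in isolation. The following is a minimal standalone sketch, not REXML's actual code: `single_node` is a hypothetical method name and the message text is made up for illustration. `Kernel#warn` with `uplevel:` prefixes the message with the caller's file and line, so the warning points at the code that passed the array.

```ruby
# Hypothetical shim (not REXML's actual code): keep accepting an array for
# now, warn at the caller's location, and fall back to the first element.
def single_node(node)
  if node.is_a?(Array)
    warn("passing an array is deprecated; pass a single node", uplevel: 1)
    node = node.first
  end
  node
end

puts single_node([:first, :second]).inspect
puts single_node(:only).inspect
```

Choosing the `uplevel` value depends on how many internal frames sit between the public API and the check, which is presumably why the suggestion leaves it as `N`.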


naitoh

LGTM
Thanks!

naitoh · 2025-05-06T07:36:53Z

@tompng
Can you update this PR's description?

tompng · 2025-05-06T07:52:09Z

@naitoh
Updated. Changed the title from "Drop" to "Deprecate" and mentioned the deprecation in the description.

naitoh · 2025-05-07T12:02:40Z

Thanks!

voxik · 2025-10-31T13:41:20Z

Just FTR, this likely breaks vagrant-libvirt:

https://github.com/vagrant-libvirt/vagrant-libvirt/blob/a94ce0d7b6c90129a38435698ca97b364130313d/lib/vagrant-libvirt/action/resolve_disk_settings.rb#L53-L62

I have no idea ATM how to fix this :/

REXML 3.4.2+ deprecated accepting array as an element in `XPath.match` [1]. This led to test errors such as:

~~~
  3) VagrantPlugins::ProviderLibvirt::Action::ResolveDiskSettings#call when vm box is in use when box metadata is not available when multiple volumes in domain config should populate domain volumes with devices
     Failure/Error:
       expect(env[:domain_volumes]).to match(
         [
           hash_including(
             device: 'vda',
             absolute_path: '/var/lib/libvirt/images/vagrant-test_default.img'
           ),
           hash_including(
             device: 'vdb',
             absolute_path: '/var/lib/libvirt/images/vagrant-test_default_1.img'
           ),

       expected [{absolute_path: "/var/lib/libvirt/images/vagrant-test_default.img", bus: "virtio", cache: "default", device: "vda", name: "vagrant-test_default.img"}] to match [#<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00007fff921b3db8 @expected={device: "vda", absolute_path: "/var/lib/libvirt/images/vagrant-test_default.img"}>, #<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00007fff921b3d40 @expected={device: "vdb", absolute_path: "/var/lib/libvirt/images/vagrant-test_default_1.img"}>, #<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00007fff921b3cc8 @expected={device: "vdc", absolute_path: "/var/lib/libvirt/images/vagrant-test_default_2.img"}>]
       Diff:
       @@ -1,4 +1,6 @@
       -[hash_including(device: "vda", absolute_path: "/var/lib/libvirt/images/vagrant-test_default.img"),
       - hash_including(device: "vdb", absolute_path: "/var/lib/libvirt/images/vagrant-test_default_1.img"),
       - hash_including(device: "vdc", absolute_path: "/var/lib/libvirt/images/vagrant-test_default_2.img")]
       +[{absolute_path: "/var/lib/libvirt/images/vagrant-test_default.img",
       +  bus: "virtio",
       +  cache: "default",
       +  device: "vda",
       +  name: "vagrant-test_default.img"}]

     # ./spec/unit/action/resolve_disk_settings_spec.rb:200:in 'block (6 levels) in <top (required)>'
     # ./spec/support/unit_context.rb:51:in 'block (3 levels) in <top (required)>'
     # ./spec/support/unit_context.rb:43:in 'block (2 levels) in <top (required)>'
     # ./spec/support/unit_context.rb:51:in 'block (3 levels) in <top (required)>'
     # ./spec/support/unit_context.rb:43:in 'block (2 levels) in <top (required)>'
~~~

This changes the logic so that XPath matches against the whole XML document instead of an array of XML elements.

[1]: ruby/rexml#252

kou reviewed May 3, 2025

kou mentioned this pull request May 3, 2025

Improve using // in XPath performance #249

Merged

tompng and others added 2 commits May 5, 2025 02:33

Apply suggestions of fixing test

4de2ac1

Co-authored-by: Sutou Kouhei <[email protected]>

Ensure matching nodes for valueOf assertion to be always present

eccf0f5

kou approved these changes May 4, 2025

Add a deprecation warning for REXML::XPath.each/find/match with nodeset

e2968dc

tompng force-pushed the xpath_no_array branch from 6a02a69 to e2968dc Compare May 5, 2025 06:20

naitoh approved these changes May 6, 2025

tompng changed the title ~~Drop accepting array as an element in XPath.match, first and each~~ Deprecate accepting array as an element in XPath.match, first and each May 6, 2025

naitoh merged commit cd575a1 into ruby:master May 7, 2025
66 of 67 checks passed

tompng deleted the xpath_no_array branch May 7, 2025 12:55

voxik mentioned this pull request Oct 31, 2025

Fix REXML 3.4.2+ compatibility vagrant-libvirt/vagrant-libvirt#1861

Open
