Optimization of REXML::XPathParser#step#310
Conversation
| new_nodes = {} | ||
| nodeset.each do |node| | ||
| new_nodeset = [] | ||
| new_nodes = {} |
There was a problem hiding this comment.
With this change, the result of this xpath has changed.
REXML::XPath.match(REXML::Document.new('<a id="a1"><b id="b1"><a id="a2"><b id="b2"><b id="b3"></b></b></a></b></a>'), '//a/descendant::b[1]')
# => [<b id='b1'> ... </>, <b id='b2'> ... </>] Expected (in master branch)
# => [<b id='b1'> ... </>] Actual (this pull request)Reference:
document.body.innerHTML='<a id="a1"><b id="b1"><a id="a2"><b id="b2"><b id="b3"></b></b></a></b></a>'
snapshot = document.evaluate('//a/descendant::b[1]', document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
[...new Array(snapshot.snapshotLength)].map((_, i) => snapshot.snapshotItem(i))
// => [b#b1, b#b2]There was a problem hiding this comment.
I'm sorry.
Since this change has unintended consequences, I will cancel the change to #descendant.
Thanks!
| next if seen.key?(raw_node.object_id) | ||
| seen[raw_node.object_id] = true |
There was a problem hiding this comment.
object_id will lazily assign a new id when it is called. We can avoid this by using seen = {}.compare_by_identity instead.
There was a problem hiding this comment.
Thanks for pointing that out.
I’ll fix it.
There was a problem hiding this comment.
Using compare_by_identity, Hash will compare by its identity, (same as comparing by object_id, but without assigning actual object_id), so seen.key?(raw_node) and seen[raw_node] = true is enough.
These two code gives the same result
h = {}
h[key.object_id] = value
h[key.object_id] #=> value
h[key.dup.object_id] #=> nilh = {}.compare_by_identity
h[key] = value
h[key] #=> value
h[key.dup] #=> nil## Benchmark
```
$ benchmark-driver benchmark/xpath.yaml
before after before(YJIT) after(YJIT)
REXML::XPath.match(REXML::Document.new(xml), 'a//a') 7.182k 7.026k 4.507k 4.436k i/s - 100.000 times in 0.013924s 0.014233s 0.022190s 0.022544s
REXML::XPath.match(REXML::Document.new(xml), '//a//a') 567.537 2.663k 947.724 2.884k i/s - 100.000 times in 0.176200s 0.037556s 0.105516s 0.034670s
Comparison:
REXML::XPath.match(REXML::Document.new(xml), 'a//a')
before: 7181.8 i/s
after: 7025.9 i/s - 1.02x slower
before(YJIT): 4506.5 i/s - 1.59x slower
after(YJIT): 4435.8 i/s - 1.62x slower
REXML::XPath.match(REXML::Document.new(xml), '//a//a')
after(YJIT): 2884.3 i/s
after: 2662.7 i/s - 1.08x slower
before(YJIT): 947.7 i/s - 3.04x slower
before: 567.5 i/s - 5.08x slower
```
- YJIT=ON : 0.98x - 3.04x faster
- YJIT=OFF : 0.97x - 4.69x faster
5b3e01a to
2d3346f
Compare
Benchmark