WIP XML.jl v0.4: Rewrite of internals, streaming tokenizer, XPath support, and bug fixes #54
joshday wants to merge 18 commits into JuliaComputing:main
Hey @joshday. I've only had a very superficial look so far but it looks great. Thanks! In terms of impact on XLSX.jl, I think it looks significant. It isn't just More of a challenge will be the removal of These obviously aren't insuperable, but will likely need a bit of time while I get to grips with Thanks, Tim
Hi @joshday, I've been a bit distracted recently by transferring XLSX.jl to JuliaData and subsequently making a v0.11 release, but my attention will be back on this again after the Easter break. I have to say I'd welcome any PR you could make on XLSX.jl to help facilitate this upgrade. Thanks!
Summary of Changes
I revived an old rewrite I had halfway finished with the help of Claude Code. It produced some good results!
- src/XMLTokenizer.jl module for speedy tokenization
- Node{T} now parameterized by the string storage type, enabling quick reads via SubString or StringViews.jl
- XML.mmap("file.xml", LazyNode) for memory-mapped parsing of very large files
- xpath(node, path) with a practical subset of XPath 1.0
Downstream
@TimG1964 you are likely the most impacted by these changes. The Downstream.yml action does indicate a failure in XLSX.jl tests related to Raw no longer existing. I'd appreciate your review here! I'm happy to submit a PR for a fix in XLSX.jl so that it's ready to go before this gets merged.
Addressed Issues
Benchmarks: See benchmarks/compare.jl
Here (SS) refers to using SubString{String} as storage type.
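For readers unfamiliar with why the storage type matters, here is a minimal sketch using only Base Julia. It is not XML.jl code: the sample string and the variable names (`xml`, `sub`, `owned`) are made up for illustration. It shows that a SubString{String} is a view into its parent string, whereas converting to String allocates an independent copy.

```julia
# Illustration with Base Julia only; `xml`, `sub`, and `owned` are hypothetical names,
# and nothing here is taken from XML.jl's internals or its benchmarks.
xml = "<a><b>text</b></a>"

r = findfirst("text", xml)                 # index range of the match inside `xml`
sub = SubString(xml, first(r), last(r))    # a view into `xml`; no bytes are copied
owned = String(sub)                        # materializes a new, independent String

println(typeof(sub))           # SubString{String}
println(parent(sub) === xml)   # true: `parent` returns the string the view was taken from
println(sub == owned)          # true: same contents, separate storage
```

Presumably the (SS) variants in the benchmarks gain from skipping this kind of per-string copy, though the numbers in benchmarks/compare.jl are the authority there.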