Commit d79d8be

cigrainger and claude committed
Zero-copy parse, eliminate string copies, optimize hot paths
Performance optimizations addressing PyO3 overhead analysis:

1. Zero-copy parse for bytes input (#6): DocumentOwner enum uses PyBackedBytes to borrow directly from Python bytes object's internal buffer, avoiding a full memcpy of the XML document. str input still copies (Python str -> UTF-8 encoding required).

2. Eliminate String intermediaries (#4): All text-returning methods (xpath_text, xpath_string, .text, .tail, .get, .keys, .items, itertext, text_content, tostring) now return Py<PyString> built directly from &str slices. Skips Rust String allocation that PyO3 would then copy again into Python.

3. interned_tag_fast (#3): Hot paths (child_tags, descendant_tags, make_element_borrowed, make_elements) now accept &IndexWithMeta directly, avoiding redundant borrow_dependent() calls in tight loops.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 30b0702 commit d79d8be

2 files changed: +186 -129 lines changed

python/simdxml/_core.pyi

Lines changed: 2 additions & 2 deletions
@@ -166,8 +166,8 @@ class CompiledXPath:
 def parse(data: bytes | str) -> Document:
     """Parse XML into a Document.
 
-    Accepts ``bytes`` or ``str``. Returns a Document that can be queried
-    with XPath or traversed element-by-element.
+    Accepts ``bytes`` or ``str``. For bytes input, the buffer is used
+    directly (zero-copy). For str input, the string is encoded to UTF-8.
     """
     ...
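
The bytes/str distinction drawn in the updated docstring can be illustrated in plain Python. This is a minimal sketch using only the standard library; it is not simdxml's implementation (which, per the commit message, borrows the buffer on the Rust side via PyBackedBytes). `as_utf8_buffer` is a hypothetical helper introduced here for illustration, and `memoryview` stands in for borrowing a buffer without copying it.

```python
from typing import Union


def as_utf8_buffer(data: Union[bytes, str]) -> bytes:
    """Hypothetical helper mirroring the documented behavior of parse()."""
    if isinstance(data, bytes):
        # bytes input: hand back the very same object, so no copy is made.
        return data
    # str input: encoding to UTF-8 allocates a new bytes object.
    return data.encode("utf-8")


xml_bytes = b"<root><item>hello</item></root>"
xml_str = "<root><item>hello</item></root>"

same_object = as_utf8_buffer(xml_bytes) is xml_bytes
encoded = as_utf8_buffer(xml_str)

# A memoryview borrows the buffer of the bytes object; slicing it yields
# another view over the same memory rather than a copied substring.
view = memoryview(xml_bytes)
item_text = view[12:17]

print(same_object)                  # True
print(encoded == xml_bytes)         # True
print(item_text.obj is xml_bytes)   # True
print(bytes(item_text))             # b'hello'
```

In practice this suggests passing `bytes` to `parse` when the document is already available in that form (for example, read from a file opened in binary mode), since a `str` has to be encoded before it can be parsed.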
