fix: replacing specific unicode characters with all unicode letters by HackZers7 · Pull Request #30 · chrisbottin/xml-parser

HackZers7 · 2026-05-14T08:00:47Z

What Changed

The closing-tag parsing regex was updated from:

^<\/[\w-:.\u00C0-\u00FF]+\s*>

to:

^<\/[\p{L}\w\-:.]+\s*>/u

The previous range \u00C0-\u00FF covers only the accented letters of the Latin-1 Supplement block (À through ÿ), so closing tags whose names use any other non-ASCII script failed to match.
The new pattern uses \p{L} with the Unicode flag (u), which matches any Unicode letter and so lets tag names written in other scripts parse correctly.
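
To illustrate the difference, here is a minimal sketch that runs both patterns against a few closing tags. The patterns are transcribed from the description above (with the hyphen escaped in the old one for clarity); the helper `startsWithClosingTag` and the sample tag names are hypothetical and not taken from xml-parser itself.

```typescript
// Sketch only: the helper and samples below are illustrative assumptions,
// not code from xml-parser.

// Old pattern: \w, '-', ':', '.', plus the Latin-1 range U+00C0-U+00FF.
const oldClosingTag = new RegExp("^<\\/[\\w\\-:.\\u00C0-\\u00FF]+\\s*>");

// New pattern: \p{L} matches any Unicode letter and requires the `u` flag.
const newClosingTag = new RegExp("^<\\/[\\p{L}\\w\\-:.]+\\s*>", "u");

// Hypothetical helper: does `text` start with a closing tag under `pattern`?
function startsWithClosingTag(pattern: RegExp, text: string): boolean {
  return pattern.test(text);
}

const samples: string[] = [
  "</name>",               // ASCII only
  "</caf\u00E9>",          // Latin-1 letter, inside the old range
  "</\u0438\u043C\u044F>", // Cyrillic
  "</\u540D\u524D>",       // CJK
];

for (const tag of samples) {
  // Prints the tag, then whether the old and the new pattern accept it.
  console.log(
    tag,
    startsWithClosingTag(oldClosingTag, tag),
    startsWithClosingTag(newClosingTag, tag),
  );
}
```

Under these assumptions the old pattern accepts only the first two samples, while the new pattern accepts all four.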


chrisbottin · 2026-05-18T07:42:01Z

Thanks @HackZers7 for your contribution.

Can you also add a test similar to https://github.com/chrisbottin/xml-parser/blob/master/test/index.ts#L383 to cover the support of additional unicode letters?

HackZers7 · 2026-05-19T07:26:51Z

Added a test to check the parsing of various languages. An error that occurred for Oriental languages has also been fixed.

The test data is synthetic, generated using GPT.

chrisbottin · 2026-05-21T21:32:00Z

Thanks @HackZers7 for your contribution, the PR is merged and a new version 4.1.6 has been published.

fix: For the place of specific unicode characters, all characters of …

a430ee3

…their letter section are allowed.

HackZers7 added 2 commits May 19, 2026 10:19

chore: Added test to check individual languages for parsing.

2ca5442

fix: Fixed reading of some Oriental languages.

1a6c1bd

chrisbottin self-requested a review May 21, 2026 20:53

chrisbottin approved these changes May 21, 2026


chrisbottin merged commit 2a118b2 into chrisbottin:master May 21, 2026
1 check passed

chrisbottin merged 3 commits into chrisbottin:master from HackZers7:fix/expanded-unicode
