Fix emphasis delimiters inside link destinations (destination-only)#81
dereuromark wants to merge 1 commit into master
Simpler alternative to PR #80 that only protects delimiters inside link destinations `](...)`, without the full link lookahead.

Problem: Emphasis delimiters (`_`, `*`) inside link destinations were matched as closers, breaking link formation:

`_[link](http://example.com?foo_bar=1), more text_` → `<em>[link](http://example.com?foo</em>bar=1), more text_`

Fix: When scanning for emphasis closers in `parseDelimited()`, skip over link destinations by detecting `](` and finding the matching `)`. This prevents delimiters inside URLs from closing emphasis.

This is a targeted fix for the common real-world issue (URLs with underscores in query parameters). It intentionally does NOT fix the bracket-text case (`_[foo_](url)`), which remains emphasis; this keeps the implementation simple, per the discussion in jgm/djot#375.

Refs: jgm/djot#375
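The closer scan described above can be illustrated with a minimal, language-neutral sketch in Python. This is not the PR's code: the function names `skip_link_destination` and `find_closer` are hypothetical, and balancing nested parentheses is an assumption about what "finding the matching `)`" means here.

```python
def skip_link_destination(text: str, i: int) -> int:
    """If text[i:] starts with '](', return the index just past the matching ')'.

    Returns i unchanged when there is no '](' at i or no matching ')'.
    Balancing nested parentheses is an assumption of this sketch.
    """
    if not text.startswith("](", i):
        return i
    depth = 0
    j = i + 1  # index of the opening '('
    while j < len(text):
        if text[j] == "(":
            depth += 1
        elif text[j] == ")":
            depth -= 1
            if depth == 0:
                return j + 1  # first index after the destination
        j += 1
    return i  # unbalanced: treat '](' as plain text


def find_closer(text: str, start: int, delim: str) -> int:
    """Return the index of the first `delim` at or after `start` that is
    not inside a link destination, or -1 if there is none."""
    i = start
    while i < len(text):
        after = skip_link_destination(text, i)
        if after != i:
            i = after  # jump over '](...)' without inspecting its content
            continue
        if text[i] == delim:
            return i
        i += 1
    return -1
```

On `_[link](http://example.com?foo_bar=1), more text_`, a plain search for `_` starting after the opener stops at the underscore in `foo_bar`, while `find_closer(text, 1, "_")` returns the index of the final `_`. On `_[foo_](url)` it still returns the underscore before `]`, matching the intentional bracket-text behavior.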
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #81 +/- ##
=========================================
Coverage 93.85% 93.85%
- Complexity 2115 2130 +15
=========================================
Files 77 77
Lines 5675 5699 +24
=========================================
+ Hits 5326 5349 +23
- Misses 349 350 +1
Performance Benchmark
Ran benchmarks comparing master (no fix) vs this PR:
Regular test cases (1000 iterations each)
Scaling test (pathological input
| n | Master | With Fix |
|---|---|---|
| 100 (301 chars) | 0.432 ms | 0.500 ms |
| 500 (1501 chars) | 7.578 ms | 7.603 ms |
Analysis
Counterintuitive finding: The fix is often faster for URLs with underscores because:
- Without fix: underscore closes emphasis early → remaining text reparsed → more work
- With fix: emphasis closes correctly at the end → single clean pass
Overhead:
- ~20% slower for `long_text` (50 links with constant skipping over `](...)`)
- ~2% slower for pathological brackets
- No measurable difference for content without links
Comparison with #80 (full lookahead):
| | This PR | #80 |
|---|---|---|
| Trigger condition | `](` only | Every `[` and |
| Code added | ~50 lines | ~85 lines |
| Covers bracket-text case | No | Yes |
This PR has lower overhead because it only triggers on `](`, not every `[`.
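As a rough illustration of the trigger-condition difference, the sketch below counts how many positions in an input would start extra work under each strategy. The helper name `count_triggers` is hypothetical, and counting only `[` is a lower bound for the full-lookahead approach, since the comparison above lists a second trigger that is not reproduced here.

```python
def count_triggers(text: str) -> dict:
    """Count candidate trigger positions for each strategy (illustrative only)."""
    return {
        "destination_only": text.count("]("),  # this PR: only '](' starts a skip
        "every_bracket": text.count("["),      # lower bound for full lookahead
    }


sample = "[a] and [b](http://example.com?x_y=1) and [c]"
print(count_triggers(sample))
```

For text with many brackets that are not links (footnote-like or literal brackets), the first count can be far smaller than the second, which is consistent with the lower overhead claimed above.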
Performance Benchmark (Updated 2026-02-17)
Ran benchmarks comparing master vs this PR (5 rounds, 500 iterations each, median times).
Regular test cases
Note: Variations under 0.005 ms (~15%) are within noise range for these micro-benchmarks.
Scaling test (pathological input
| n | Chars | Master | With Fix | Change |
|---|---|---|---|---|
| 100 | 400 | 0.88 ms | 0.80 ms | 9% faster |
| 200 | 800 | 2.95 ms | 2.85 ms | 3% faster |
| 500 | 2000 | 16.28 ms | 15.74 ms | 3% faster |
| 1000 | 4000 | 59.75 ms | 59.12 ms | ~same |
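The methodology stated above (several rounds, a fixed number of iterations per round, median of the rounds) and the "Change" column can be sketched as follows. This is a hedged illustration, not the script used for these numbers: `median_ms` and `percent_change` are hypothetical helpers, and the callable being timed stands in for whatever parser entry point was actually benchmarked.

```python
import statistics
import time


def median_ms(fn, rounds: int = 5, iterations: int = 500) -> float:
    """Median per-call time in milliseconds across `rounds` timing rounds."""
    per_call = []
    for _ in range(rounds):
        t0 = time.perf_counter()
        for _ in range(iterations):
            fn()
        elapsed = time.perf_counter() - t0
        per_call.append(elapsed / iterations * 1000.0)
    return statistics.median(per_call)


def percent_change(master_ms: float, fix_ms: float) -> float:
    """Relative change of the fix versus master; negative means faster."""
    return (fix_ms - master_ms) / master_ms * 100.0


# Reproducing the Change column from the table above.
for master, fix in [(0.88, 0.80), (2.95, 2.85), (16.28, 15.74), (59.75, 59.12)]:
    print(f"{master} ms -> {fix} ms: {percent_change(master, fix):+.1f}%")
```

Taking the median of rounds rather than a single mean helps damp one-off spikes, which matters at the sub-millisecond scale where the note above puts the noise floor.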
Analysis
Key finding: The fix is faster for URLs with underscores:
- Without fix: underscore closes emphasis early → remaining text reparsed → more work
- With fix: emphasis closes correctly at the end → single clean pass
Performance characteristics:
- Significant improvement (30-44% faster) for links with underscores in URLs
- 39% faster for long_text with many links
- No measurable overhead for pathological bracket cases
- Simple cases without links show no meaningful difference
Comparison with #80 (full lookahead):
| | This PR | #80 |
|---|---|---|
| Trigger condition | `](` only | Every `[` and |
| Code added | ~50 lines | ~85 lines |
| Covers bracket-text case | No | Yes |
This PR has lower overhead because it only triggers on `](`, not every `[`.
Summary
Simpler alternative to #80 that only protects delimiters inside link destinations `](...)`, without the full link lookahead complexity.
- `_[link](url_bar)_` now works
- `_[foo_](url)` remains emphasis (intentional)

Problem
Emphasis delimiters (`_`, `*`) inside link destinations were matched as closers, breaking link formation: `_[link](http://example.com?foo_bar=1), more text_` → `<em>[link](http://example.com?foo</em>bar=1), more text_`

Solution
When scanning for emphasis closers in `parseDelimited()`, detect `](` and skip over the parentheses content. This protects delimiters inside URLs from closing emphasis.

Why destination-only?
Per discussion in jgm/djot#375, the two cases are separable:
- `_[foo](bar_baz)_` (delimiter in the destination)
- `_[foo_](bar)` (delimiter in the bracket text)

The destination case is the common real-world issue (URLs with query params). The bracket-text case is rare, and the current behavior (`_[foo_]` → emphasis) is arguably intuitive.

This approach:
Refs: jgm/djot#375, #80