Fix infinite loop in parse_md and optimize CJK processing #2

Open
patdelphi wants to merge 1 commit into lovstudio:main from patdelphi:fix/parse-md-infinite-loop-and-perf

@patdelphi

Summary

  • Fixed a critical infinite loop in parse_md() that caused timeouts on large documents: when a line matched no pattern and plines was empty, i was never incremented, so the loop never advanced
  • Optimized CJK detection (bisect binary search), font wrapping (batch processing), inline markdown (pre-compiled regex + caching), and text measurement (shared _split_cjk_segs helper)
  • Added Performance section to README with benchmark results
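
To illustrate the "pre-compiled regex + caching" item above, here is a minimal sketch, not the code from this PR. The function name `render_inline_sketch`, the three patterns, and the output tags are all assumptions made for the example. The idea is that patterns are compiled once at import time, and repeated inline strings (common in tables and lists) are served from an LRU cache.

```python
import re
from functools import lru_cache
from html import escape

# Compiled once at import time instead of on every call.
# These three patterns are illustrative, not the project's actual set.
_CODE = re.compile(r"`([^`]+)`")
_BOLD = re.compile(r"\*\*(.+?)\*\*")
_ITALIC = re.compile(r"\*(.+?)\*")


@lru_cache(maxsize=4096)
def render_inline_sketch(text):
    """Convert a small subset of inline markdown to simple markup (hypothetical tags)."""
    out = escape(text, quote=False)
    out = _CODE.sub(r"<code>\1</code>", out)
    out = _BOLD.sub(r"<b>\1</b>", out)   # bold before italic so ** is not read as two *
    out = _ITALIC.sub(r"<i>\1</i>", out)
    return out
```

Because the function is pure (same input, same output), caching is safe here; that would not hold if rendering depended on mutable state such as the current font.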

Test plan

  • xin1.md (200KB, 3700 lines) converts in 1.5s (previously timed out after more than 600s)
  • xin2.md converts in 0.24s
  • Verify existing test files (01-06) still render correctly
  • Visual check that PDF output matches previous quality

🤖 Generated with Claude Code

The paragraph parsing loop in parse_md() would hang indefinitely on lines
that matched no pattern (e.g. certain metadata lines), because `i` was
never incremented when `plines` was empty. This caused timeouts on any
document with such lines — including the 200KB xin1.md test file.
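
The failure mode described above can be reproduced in a small sketch. This is an assumed reconstruction, not the actual parse_md() code: `parse_md_sketch`, `HEADING`, and `PARA_STOP` are hypothetical names, and only the variables `i` and `plines` come from the description. A line such as `#tag` stops the paragraph collector but is not a valid heading, so without the final `else` branch nothing would consume it.

```python
import re

# Hypothetical patterns for illustration; the real parser handles more block types.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
PARA_STOP = re.compile(r"^(#|>)")  # lines that terminate a paragraph


def parse_md_sketch(text):
    lines = text.split("\n")
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1
            continue
        m = HEADING.match(line)
        if m:
            blocks.append(("heading", m.group(2)))
            i += 1
            continue
        # Paragraph: collect consecutive lines up to a blank line or block marker.
        plines = []
        while i < len(lines) and lines[i].strip() and not PARA_STOP.match(lines[i]):
            plines.append(lines[i])
            i += 1
        if plines:
            blocks.append(("para", " ".join(plines)))
        else:
            # Guard: the line matched no pattern and the collector consumed
            # nothing. Without this branch, i never changes and the loop hangs.
            blocks.append(("para", line))
            i += 1
    return blocks
```

In this sketch the unmatched line is kept as literal paragraph text; the actual fix may simply skip it. Either way, the invariant is that every iteration of the outer loop advances `i` by at least one.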

Also optimizes CJK detection (bisect), font wrapping (batch processing),
inline markdown (pre-compiled regex + caching), and text measurement
(shared _split_cjk_segs helper). Result: xin1.md converts in 1.5s
instead of timing out.
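
As a rough illustration of the bisect-based CJK detection and a shared segment splitter, consider the sketch below. The range table is an illustrative subset and the names `is_cjk_sketch` and `split_cjk_segs_sketch` are hypothetical; the PR's own `_split_cjk_segs` helper may differ in both signature and behavior.

```python
from bisect import bisect_right
from itertools import groupby

# Sorted, non-overlapping code point ranges (an illustrative subset, not exhaustive).
_CJK_RANGES = [
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms
]
_STARTS = [lo for lo, _ in _CJK_RANGES]


def is_cjk_sketch(ch):
    """Binary-search the range table instead of testing every range in turn."""
    cp = ord(ch)
    idx = bisect_right(_STARTS, cp) - 1  # last range whose start is <= cp
    return idx >= 0 and cp <= _CJK_RANGES[idx][1]


def split_cjk_segs_sketch(text):
    """Split text into (is_cjk, run) pairs so each run can be handled as a batch."""
    return [(flag, "".join(chars)) for flag, chars in groupby(text, key=is_cjk_sketch)]
```

For example, `split_cjk_segs_sketch("PDF 输出 ok")` yields three runs, which lets font wrapping and width measurement work per run rather than per character.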

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>