fix: _get_offset_tokenizer immune to global fastokens patch (concurrent-pool race)#86
Merged
Merged
Conversation
load_tokenizer toggles a process-global fastokens monkeypatch per pool-slot load, holding _FASTOKENS_PATCH_LOCK only around the patch/unpatch calls, not the load. Under the concurrent renderer pool, _get_offset_tokenizer's 'vanilla' reload could race an open patch window, get an offset-less fastokens-backed tokenizer, and cache it — permanently breaking offset attribution (renderers using attribute_text_segments then raise 'fastokens does not track character offsets'). Reload with the patch forced off under _FASTOKENS_PATCH_LOCK and re-probe before caching, so a poisoned (non-offset) tokenizer is never returned or cached. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ApprovabilityVerdict: Approved This is a well-scoped bug fix for a race condition in tokenizer loading. The changes add defensive detection and retry logic using existing lock infrastructure, with clear intent and limited scope to a single function. You can customize Macroscope's approvability policy. Learn more. |
hallerite
approved these changes
Jun 14, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
load_tokenizer(use_fastokens=True)installs a process-global fastokens monkeypatch for the duration of each pool-slot'sfrom_pretrained, holding_FASTOKENS_PATCH_LOCKonly around the patch/unpatch calls — not around the load itself. Under the concurrent renderer pool (create_renderer_poolfans loads across a thread pool),_get_offset_tokenizer's "vanilla" reload (_load_tokenizer_via_auto→AutoTokenizer.from_pretrained) can run inside another slot's open patch window, come back with the offset-less fastokens shim, and get cached. From then on every offset attribution raisesNotImplementedError: fastokens does not track character offsets(surfaced downstream as a 502 from the interception server).Reproduced: a bare
AutoTokenizer.from_pretrainedissued during an open patch window returns an offset-less tokenizer.Fix
In
_get_offset_tokenizer, after the first load, verify offsets; if missing, reload with fastokens force-unpatched under_FASTOKENS_PATCH_LOCK(restoring the prior patch state) and re-probe before caching — so a non-offset tokenizer is never returned or cached. Verified: a reload landing in a patch window now returns an offset-capable tokenizer.Note
Fix
_get_offset_tokenizerto retry with fastokens patch disabled under a lock_has_offsets(tok)helper in renderers/base.py that verifies true offset support by checkingis_fastand attempting tokenization withreturn_offsets_mapping=True.is_fastcheck with_has_offsetsfor both initial load and post-reload validation._has_offsets, retries under_FASTOKENS_PATCH_LOCKby temporarily unpatchingfastokens, reloading the tokenizer, then restoring the patch — suppressing stdout during patch/unpatch._has_offsets; raisesRuntimeErrorotherwise.Macroscope summarized 057f008.
Note
Medium Risk
Touches tokenizer loading and global fastokens patch coordination on a hot path for hand-coded renderers; incorrect locking or restore logic could regress attribution or pool startup, but scope is narrow to
_get_offset_tokenizer.Overview
Fixes a concurrent renderer pool race where
_get_offset_tokenizercould load and cache a fastokens shim tokenizer (nooffset_mapping) ifAutoTokenizer.from_pretrainedran while another slot had the process-global fastokens patch active—breaking hand-coded renderer body/scaffold attribution downstream._get_offset_tokenizernow uses a_has_offsetsprobe (fast tokenizer + successfulreturn_offsets_mapping=True) instead of relying onis_fastalone. After the first vanilla load, if offsets are missing it reloads under_FASTOKENS_PATCH_LOCKwith fastokens temporarily unpatch (restoring prior patch state), then refuses to cache until offsets are verified; otherwise it raises a clearerRuntimeError.Reviewed by Cursor Bugbot for commit 057f008. Bugbot is set up for automated code reviews on this repo. Configure here.