Skip to content

[fix] hooks/taylorseer_cache: fix off-by-one in post-warmup compute interval#13806

Open
Dev-X25874 wants to merge 2 commits into
huggingface:mainfrom
Dev-X25874:fix/taylorseer-compute-interval-off-by-one
Open

[fix] hooks/taylorseer_cache: fix off-by-one in post-warmup compute interval#13806
Dev-X25874 wants to merge 2 commits into
huggingface:mainfrom
Dev-X25874:fix/taylorseer-compute-interval-off-by-one

Conversation

@Dev-X25874
Copy link
Copy Markdown
Contributor

What does this PR do?

Fixes an off-by-one error in TaylorSeerCacheHook._measure_should_compute
that caused the first post-warmup denoising step to always use a cached
prediction instead of performing a fresh forward pass.

Root cause: The interval formula contained a spurious - 1:

# before (buggy)
is_compute_interval = (current_step - self.disable_cache_before_step - 1) % self.cache_interval == 0

# after (fixed)
is_compute_interval = (current_step - self.disable_cache_before_step) % self.cache_interval == 0

In Python, -1 % cache_interval is cache_interval - 1 (never zero), so
the step at current_step == disable_cache_before_step — which is documented
as the first step where caching begins — was always routed to state.predict()
rather than a real forward pass. This shifted the entire post-warmup compute
cadence by one step throughout inference.

Concrete example with disable_cache_before_step=3, cache_interval=5:

Step Before (buggy) After (fixed)
3 (first post-warmup) ❌ predict ✅ compute
4 ✅ compute ❌ predict (cache reuse)
8 ❌ predict ✅ compute (next refresh)
9 ✅ compute ❌ predict (cache reuse)

A new test file tests/hooks/test_taylorseer_cache.py is added that directly
exercises _measure_should_compute step-by-step and will fail on the original
code at step 3.

Before submitting

Who can review?

@sayakpaul

@github-actions github-actions Bot added size/M PR with diff < 200 LOC tests hooks and removed size/M PR with diff < 200 LOC labels May 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant