fix: defer heavy imports in inference CLI for vLLM compat by slacki-ai · Pull Request #53 · longtermrisk/openweights

slacki-ai · 2026-03-26T14:13:04Z

Summary

Move torch, vllm, transformers, huggingface_hub, and openweights.client imports from module top-level into the __main__ guard
This allows monkey-patches to be applied before vLLM is imported. vLLM captures tqdm and tokenizer behaviour at import time, so patching after import has no effect
Adds tqdm noise reduction (rate-limited updates) and a compat patch for transformers.PreTrainedTokenizerBase.all_special_tokens_extended, needed with newer transformers versions
Changes main() signature from main(config_json: str) to main(cfg, conversations) — config parsing and data loading now happen in __main__ before vLLM import
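
For readers unfamiliar with why the ordering matters, here is a minimal sketch of the general Python mechanism, not the actual patch in cli.py. It uses only the standard library: `fake_progress` and the `consumer_*` modules are hypothetical stand-ins for tqdm and vLLM, and the "consumer" binds a function with `from ... import ...` when it is first imported. Whether vLLM captures its dependencies in exactly this way is an assumption taken from the description above.

```python
import sys
import types

# Hypothetical stand-in for a library such as tqdm: it exposes one function.
provider = types.ModuleType("fake_progress")
provider.render = lambda: "noisy"
sys.modules["fake_progress"] = provider

# Hypothetical stand-in for a heavy library that binds the provider's
# function at import time via `from ... import ...`.
CONSUMER_SOURCE = """
from fake_progress import render  # binds whatever object exists right now

def run():
    return render()
"""


def import_consumer(name):
    """Simulate importing the consumer module by executing its source."""
    module = types.ModuleType(name)
    exec(CONSUMER_SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


def quiet():
    return "quiet"


# Case 1: import first, patch afterwards. The consumer keeps its old binding.
early = import_consumer("consumer_early")
provider.render = quiet
result_patch_after = early.run()

# Restore the unpatched behaviour before the second case.
provider.render = lambda: "noisy"

# Case 2: patch first, import afterwards. The consumer binds the patched function.
provider.render = quiet
late = import_consumer("consumer_late")
result_patch_before = late.run()

print(result_patch_after, result_patch_before)
```

In the sketch, patching after the import leaves the consumer calling the original function, while patching before the import takes effect. That is the reason the heavy imports have to run after the patches rather than at module top level.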

Changes

openweights/jobs/inference/cli.py — restructured import order and main() signature

Test plan

AST-based unit tests verify import structure (11 tests): heavy imports not at top level, stdlib imports preserved, main() signature correct, __main__ guard contains deferred imports
Integration: run an inference job end-to-end to verify the new import order works with vLLM
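
A rough sketch of what an AST-based deferred-import check can look like, assuming the tests parse the source rather than import it (so none of the heavy libraries need to be installed). The helper names and the `SAMPLE` source are hypothetical and are not copied from the PR's test file.

```python
import ast

# Libraries that, per the summary, should no longer be imported at top level.
HEAVY = ("torch", "vllm", "transformers", "huggingface_hub", "openweights.client")


def top_level_modules(source):
    """Return dotted module names imported by statements directly in the module body."""
    names = set()
    for node in ast.parse(source).body:  # only direct children, not nested blocks
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


def heavy_at_top_level(source):
    """Return the heavy libraries that are imported at module top level.

    Note: `from openweights import client` is not detected by this simple
    prefix match; a fuller check would also inspect the imported names.
    """
    found = set()
    for name in top_level_modules(source):
        for heavy in HEAVY:
            if name == heavy or name.startswith(heavy + "."):
                found.add(heavy)
    return found


# Hypothetical module source with the layout described in this PR.
SAMPLE = '''
import json
import sys

def main(cfg, conversations):
    pass

if __name__ == "__main__":
    import torch
    from vllm import LLM
    main({}, [])
'''

print(heavy_at_top_level(SAMPLE))
```

Because imports inside the `if __name__ == "__main__":` block are children of the `ast.If` node rather than of the module body, they are not reported, which is the property the deferred-import tests rely on.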

🤖 Generated with Claude Code

Move torch, vLLM, transformers, and huggingface_hub imports from module top-level to the __main__ guard. This allows monkey-patches (tqdm rate limiting, tokenizer compat) to be applied BEFORE vLLM is imported, since vLLM captures tqdm and tokenizer behaviour at import time.

Also changes main() signature to accept pre-parsed (cfg, conversations) instead of a raw JSON string, and adds tqdm noise reduction and transformers all_special_tokens_extended compat patch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Keep 5 deferred-import checks (one per heavy library). Remove 6 tests: stdlib presence check, function signature checks, __main__ guard existence, and AST-dump string matching for imports inside the guard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nielsrolf · 2026-04-01T09:30:06Z

Why do we need this? I strongly prefer imports at module level. Modifying tqdm behavior is imo probably not worth such a change. Why does the tokenizer need any monkeypatching?

slacki-ai and others added 2 commits March 26, 2026 10:12

slacki-ai wants to merge 2 commits into longtermrisk:v0.9 from slacki-ai:fix/inference_cli_deferred_imports
