[HOTFIX] Use importlib.util.find_spec for pluggable worker discovery#1918
[HOTFIX] Use importlib.util.find_spec for pluggable worker discovery#1918chandrasekharan-zipstack wants to merge 3 commits intov0.163.2-hotfixfrom
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
| Filename | Overview |
|---|---|
| workers/shared/infrastructure/config/builder.py | Replaces filesystem .py check with importlib.util.find_spec(); ValueError now caught alongside ImportError; logic and exception handling are correct. |
| workers/worker.py | Core-worker directory derivation now uses to_import_path().rsplit() instead of a blind replace, correctly resolving api-deployment's hyphenated on-disk path. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["_verify_pluggable_worker_exists(worker_type)"] --> B{is_pluggable?}
B -- No --> C[return True]
B -- Yes --> D["module_path = 'pluggable_worker.{worker_type}.worker'"]
D --> E["importlib.util.find_spec(module_path)"]
E -- "returns None" --> F["logger.error\nreturn False"]
E -- "raises ValueError" --> G["except ImportError | ValueError\nreturn False"]
E -- "spec found" --> H["importlib.import_module(module_path)"]
H -- "ImportError / ValueError" --> G
H -- "OSError / AttributeError" --> I["except OSError | AttributeError\nreturn False"]
H -- success --> J["logger.debug\nreturn True"]
Reviews (4): Last reviewed commit: "[FIX] Resolve api-deployment worker dire..." | Re-trigger Greptile
_verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0f25e54 to
94bd763
Compare
Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| # module representation (source .py, Nuitka/Cython .so, .pyc, | ||
| # namespace packages, zipimports) — unlike a filesystem check | ||
| # for a specific file extension, which breaks for compiled plugins. | ||
| module_path = f"pluggable_worker.{worker_type.value}.worker" |
There was a problem hiding this comment.
Hope this not break running worker locally insated of docker
There was a problem hiding this comment.
Tested locally — doesn't break local runs. find_spec() is actually a superset of the previous filesystem check.
Local runs go through workers/run-worker.sh, which sets export PYTHONPATH="$WORKERS_DIR:${PYTHONPATH:-}" (line 702) before launching celery. So import pluggable_worker.<name>.worker resolves the same way the launcher itself uses the dotted module name at line 444 (celery_app_module="pluggable_worker.${worker_type}.worker").
Coverage comparison vs old code:
| Scenario | Old (Path(...)/worker.py) |
New (find_spec) |
|---|---|---|
Local .py via run-worker.sh |
✅ | ✅ (PYTHONPATH set by script) |
| Docker with uncompiled plugin | ✅ | ✅ |
Docker with compiled .so (Cython/Nuitka) |
❌ (checks .py) |
✅ |
| OSS with no pluggable plugins | n/a — is_pluggable() returns False, early return |
same |
Bare python worker.py from repo root, no PYTHONPATH |
❌ | ❌ (and would fail at actual import too) |
The only way someone could get a false negative from the new code but not the old is if they ran a pluggable worker outside run-worker.sh without any PYTHONPATH setup — in that case the old check returned True but celery would have crashed on the real import a few lines later. So net change: we fail earlier with a cleaner log instead of getting a stack trace.
Also confirmed on snapshot.180420262 locally: pluggable workers (agentic_callback + bulk_download worker.so) register and run fine — previously they crash-looped because Path(...)/worker.py returned False for compiled modules.
worker.py:452 did worker_type.value.replace("-", "_") to derive the
on-disk dir name. All WorkerType enum values already use underscores,
so the replace was a no-op; for API_DEPLOYMENT whose dir is
"api-deployment" (hyphen), it resolved to "api_deployment" and the
os.path.exists() check failed. Boot then logged a spurious
"❌ Worker directory not found: /app/api_deployment" at ERROR level.
The task registration path (builder + celery autodiscover via
to_import_path) is unaffected, so this was purely log noise — but
noise at ERROR level that masks real failures in log scans.
Fix: derive the directory from the authoritative to_import_path()
which already handles the hyphen case (api_deployment -> api-deployment).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|



What
Replaces the filesystem
worker.pyexistence check inWorkerBuilder._verify_pluggable_worker_exists()withimportlib.util.find_spec().Why
The pre-import check at
workers/shared/infrastructure/config/builder.py:338hardcodedworker.pyas the file to look for. This breaks any deployment where the pluggable worker has been compiled to an extension module (e.g. Nuitka or Cython produces aworker.cpython-312-x86_64-linux-gnu.sothat replacesworker.py) — the module is perfectly importable, but the hard.pyexistence check rejects the worker beforeimportlib.import_module()ever runs.Observed error pattern (paraphrased):
This is not specific to any one obfuscation tool — the check would also reject plugins distributed as Cython
.so, bytecode-only.pyc, namespace packages, or any future loader that doesn't materialize a.pyfile on disk.How
Use Python's standard "is this module importable?" primitive instead of filesystem introspection:
find_spec()consults every registered finder (source.py, compiled.so, bytecode.pyc, namespace packages, zipimports) and returnsNonecleanly if nothing can produce the module. The subsequentimport_module()still catches findable-but-broken modules (e.g. ImportError inside the loaded module).The only semantic change: when a parent package exists but an intermediate subpackage is missing (
pluggable_worker/exists butpluggable_worker/<name>/does not),find_specraisesModuleNotFoundError. That's a subclass ofImportErrorand is already handled by the existingexcept ImportErrorblock, so the function correctly returnsFalsejust like before.Can this PR break any existing features?
No. Behavior preserved across all deployment scenarios, verified via a synthetic fixture that compiles a test worker with Nuitka and runs
_verify_pluggable_worker_exists()against it:.pypresent.sopresentFalse(file not found)False(ImportError)TrueTrueFalse← bugTrue← fixedRelevant Docs
importlib.util.find_spec