fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249
fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249theCyberTech wants to merge 2 commits into
Conversation
input_path/output_path were already allow-list validated, but when compressing a directory the per-file members were not. zipfile.write() dereferences symlinks, so a symlink inside an allowed directory pointing at an out-of-tree secret (e.g. ~/.ssh/id_rsa, /etc/passwd) had its contents copied into the archive — a confinement bypass / info-disclosure. - _compress_zip: validate each walked file against the allow-list; skip and record any whose resolved path escapes it. - _compress_tar: filter drops symlink/hardlink members (tar stores links rather than dereferencing, so this prevents shipping an out-of-tree link that resolves at extraction time). - _run: surface a "N item(s) skipped for safety" note rather than swallowing the exclusions. - tests: regression tests asserting the secret content never lands in the zip or tar archive. Also fix a pre-existing broken import that made the whole compressor test module error at collection (empty package __init__; aligned to the module-path import used by sibling tools). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Summary: This PR hardens FileCompressorTool by validating walked ZIP members and excluding tar symlink/hardlink entries to prevent out-of-tree file inclusion. No exploitable security vulnerabilities were identified in the added code.
Risk: Low risk. The changes reduce an existing path-confinement exposure without adding new authentication, authorization, external integration, or public API surfaces.
…pth) Corridor scan flagged output_path reaching the archive writer. _run does validate output_path before dispatch, but _compress_zip/_compress_tar are staticmethods reachable independently of _run — called directly they would write unvalidated. Validate at the sink so every call site is confined to the allow-list, not just the _run entrypoint. Also decouple the symlink tests from the configurable-allow-list feature (PR #6248): chdir into the working dir so the allowed root is cwd, instead of setting CREWAI_TOOLS_ALLOWED_DIRS (which this branch, off main, does not yet support). Adds direct-call tests asserting the sink rejects an out-of-root output path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Security Issues
-
Time-of-check to time-of-use (TOCTOU) on zip member writes
During directory compression, each file is validated via validate_file_path(full_path) but the returned resolved path is discarded and zipf.write(full_path, arcname) uses the original path. An attacker can swap a symlink target between validation and write, causing out-of-tree content to be archived despite the check. -
Missing allow-list validation for single-file inputs (zip and tar helpers)
When input_path is a single file and the helper methods are called directly (bypassing _run), input_path is never validated against the allowed directories. This allows archiving arbitrary files outside the allowed directory, violating the path-confinement policy.
Recommendations
- Always use the resolved path returned by validate_file_path(...) for subsequent I/O to eliminate TOCTOU windows.
- Validate input_path with validate_file_path(...) in the single-file branches of both _compress_zip and _compress_tar, and use the resolved value in write/add calls.
There was a problem hiding this comment.
During directory compression, the code validates each file path but discards the resolved, canonical path returned by validate_file_path(...). The subsequent write uses the original full_path, creating a TOCTOU window where a symlink can be swapped between validation and use, allowing out-of-tree content to be archived.
try:
validate_file_path(full_path)
except ValueError:
skipped.append(full_path)
continue
arcname = os.path.relpath(full_path, start=input_path)
zipf.write(full_path, arcname)Project guardrail 59 requires using the canonical path returned by validate_file_path() for the actual I/O to prevent TOCTOU races.
Remediation: Use the resolved path from validate_file_path(...) for the write operation.
| zipf.write(validate_file_path(full_path), arcname) |
For more details, see the finding in Corridor.
Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.
There was a problem hiding this comment.
When input_path points to a single file and _compress_zip is called directly (bypassing _run), the code writes the file to the archive without validating it against the allowed directory set. This permits archiving an arbitrary out-of-tree file, violating the path-confinement policy.
if os.path.isfile(input_path):
zipf.write(input_path, os.path.basename(input_path))Remediation: Validate input_path and use the returned resolved path for the write per guardrail 59.
| zipf.write(validate_file_path(input_path), os.path.basename(input_path)) |
For more details, see the finding in Corridor.
Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.
There was a problem hiding this comment.
In _compress_tar, when input_path is a single file or directory and this helper is called directly (bypassing _run), input_path is not validated against the allow-list before being added to the archive. Although link members are filtered, unvalidated regular files can still be archived from outside the allowed directory.
with tarfile.open(output_path, mode) as tarf: # type: ignore[call-overload]
arcname = os.path.basename(input_path)
tarf.add(input_path, arcname=arcname, filter=_drop_links)Remediation: Validate input_path and use the resolved path for tarf.add(...) in line with guardrail 59.
| tarf.add(validate_file_path(input_path), arcname=arcname, filter=_drop_links) |
For more details, see the finding in Corridor.
Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.
Problem
FileCompressorToolallow-list-validatesinput_path/output_path, but the per-file members of a compressed directory were never validated.zipfile.write()dereferences symlinks, so a symlink inside an allowed directory pointing at an out-of-tree secret (e.g.~/.ssh/id_rsa,/etc/passwd) had its contents copied into the archive — a path-confinement bypass / information disclosure. Threat: an agent told to "compress my project folder" exfiltrates arbitrary file contents via a planted symlink.Changes
_compress_zip): validate each walked file against the allow-list; skip + record any whose resolved path escapes it._compress_tar):filter=drops symlink/hardlink members. tar stores links rather than dereferencing, so the compress-time leak doesn't apply — but this prevents shipping an out-of-tree link that resolves at extraction time._run: surfaces anN item(s) skipped for safetynote instead of silently swallowing exclusions.__init__+ a package-level class import that never resolved), so these tests were not running in CI. Aligned the import to the module path used by sibling tool tests.Testing
pytest tests/tools/files_compressor_tool_test.py→ 17 passed, including new zip/tar regression tests asserting the secret content never lands in the archive. ruff + mypy clean.Note
Companion to #6248 (path-confinement hardening for the other file tools). Independent of it — branched off
main.🤖 Generated with Claude Code