Skip to content

fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249

Draft
theCyberTech wants to merge 2 commits into
mainfrom
fix/compressor-symlink-leak
Draft

fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249
theCyberTech wants to merge 2 commits into
mainfrom
fix/compressor-symlink-leak

Conversation

@theCyberTech

Copy link
Copy Markdown
Member

Problem

FileCompressorTool allow-list-validates input_path/output_path, but the per-file members of a compressed directory were never validated. zipfile.write() dereferences symlinks, so a symlink inside an allowed directory pointing at an out-of-tree secret (e.g. ~/.ssh/id_rsa, /etc/passwd) had its contents copied into the archive — a path-confinement bypass / information disclosure. Threat: an agent told to "compress my project folder" exfiltrates arbitrary file contents via a planted symlink.

Changes

  • zip (_compress_zip): validate each walked file against the allow-list; skip + record any whose resolved path escapes it.
  • tar (_compress_tar): filter= drops symlink/hardlink members. tar stores links rather than dereferencing, so the compress-time leak doesn't apply — but this prevents shipping an out-of-tree link that resolves at extraction time.
  • _run: surfaces an N item(s) skipped for safety note instead of silently swallowing exclusions.
  • Pre-existing test bug: the compressor test module errored at collection (empty package __init__ + a package-level class import that never resolved), so these tests were not running in CI. Aligned the import to the module path used by sibling tool tests.

Testing

pytest tests/tools/files_compressor_tool_test.py → 17 passed, including new zip/tar regression tests asserting the secret content never lands in the archive. ruff + mypy clean.

Note

Companion to #6248 (path-confinement hardening for the other file tools). Independent of it — branched off main.

🤖 Generated with Claude Code

input_path/output_path were already allow-list validated, but when
compressing a directory the per-file members were not. zipfile.write()
dereferences symlinks, so a symlink inside an allowed directory pointing
at an out-of-tree secret (e.g. ~/.ssh/id_rsa, /etc/passwd) had its
contents copied into the archive — a confinement bypass / info-disclosure.

- _compress_zip: validate each walked file against the allow-list; skip
  and record any whose resolved path escapes it.
- _compress_tar: filter drops symlink/hardlink members (tar stores links
  rather than dereferencing, so this prevents shipping an out-of-tree
  link that resolves at extraction time).
- _run: surface a "N item(s) skipped for safety" note rather than
  swallowing the exclusions.
- tests: regression tests asserting the secret content never lands in the
  zip or tar archive. Also fix a pre-existing broken import that made the
  whole compressor test module error at collection (empty package
  __init__; aligned to the module-path import used by sibling tools).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 488a13e0-da97-4a5e-ba7d-5eed1349d568

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/compressor-symlink-leak

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@corridor-security corridor-security Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: This PR hardens FileCompressorTool by validating walked ZIP members and excluding tar symlink/hardlink entries to prevent out-of-tree file inclusion. No exploitable security vulnerabilities were identified in the added code.

Risk: Low risk. The changes reduce an existing path-confinement exposure without adding new authentication, authorization, external integration, or public API surfaces.

…pth)

Corridor scan flagged output_path reaching the archive writer. _run does
validate output_path before dispatch, but _compress_zip/_compress_tar are
staticmethods reachable independently of _run — called directly they would
write unvalidated. Validate at the sink so every call site is confined to
the allow-list, not just the _run entrypoint.

Also decouple the symlink tests from the configurable-allow-list feature
(PR #6248): chdir into the working dir so the allowed root is cwd, instead
of setting CREWAI_TOOLS_ALLOWED_DIRS (which this branch, off main, does not
yet support). Adds direct-call tests asserting the sink rejects an
out-of-root output path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@corridor-security corridor-security Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issues

  • Time-of-check to time-of-use (TOCTOU) on zip member writes
    During directory compression, each file is validated via validate_file_path(full_path) but the returned resolved path is discarded and zipf.write(full_path, arcname) uses the original path. An attacker can swap a symlink target between validation and write, causing out-of-tree content to be archived despite the check.

  • Missing allow-list validation for single-file inputs (zip and tar helpers)
    When input_path is a single file and the helper methods are called directly (bypassing _run), input_path is never validated against the allowed directories. This allows archiving arbitrary files outside the allowed directory, violating the path-confinement policy.

Recommendations

  • Always use the resolved path returned by validate_file_path(...) for subsequent I/O to eliminate TOCTOU windows.
  • Validate input_path with validate_file_path(...) in the single-file branches of both _compress_zip and _compress_tar, and use the resolved value in write/add calls.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During directory compression, the code validates each file path but discards the resolved, canonical path returned by validate_file_path(...). The subsequent write uses the original full_path, creating a TOCTOU window where a symlink can be swapped between validation and use, allowing out-of-tree content to be archived.

try:
    validate_file_path(full_path)
except ValueError:
    skipped.append(full_path)
    continue
arcname = os.path.relpath(full_path, start=input_path)
zipf.write(full_path, arcname)

Project guardrail 59 requires using the canonical path returned by validate_file_path() for the actual I/O to prevent TOCTOU races.

Remediation: Use the resolved path from validate_file_path(...) for the write operation.

Suggested change
zipf.write(validate_file_path(full_path), arcname)

For more details, see the finding in Corridor.

Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When input_path points to a single file and _compress_zip is called directly (bypassing _run), the code writes the file to the archive without validating it against the allowed directory set. This permits archiving an arbitrary out-of-tree file, violating the path-confinement policy.

if os.path.isfile(input_path):
    zipf.write(input_path, os.path.basename(input_path))

Remediation: Validate input_path and use the returned resolved path for the write per guardrail 59.

Suggested change
zipf.write(validate_file_path(input_path), os.path.basename(input_path))

For more details, see the finding in Corridor.

Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In _compress_tar, when input_path is a single file or directory and this helper is called directly (bypassing _run), input_path is not validated against the allow-list before being added to the archive. Although link members are filtered, unvalidated regular files can still be archived from outside the allowed directory.

with tarfile.open(output_path, mode) as tarf:  # type: ignore[call-overload]
    arcname = os.path.basename(input_path)
    tarf.add(input_path, arcname=arcname, filter=_drop_links)

Remediation: Validate input_path and use the resolved path for tarf.add(...) in line with guardrail 59.

Suggested change
tarf.add(validate_file_path(input_path), arcname=arcname, filter=_drop_links)

For more details, see the finding in Corridor.

Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant