Skip to content

Fix inverted path traversal logic and enhance torch.load security#1479

Open
RinZ27 wants to merge 1 commit intoNVIDIA:mainfrom
RinZ27:security/fix-path-traversal-and-torch-load
Open

Fix inverted path traversal logic and enhance torch.load security#1479
RinZ27 wants to merge 1 commit intoNVIDIA:mainfrom
RinZ27:security/fix-path-traversal-and-torch-load

Conversation

@RinZ27
Copy link

@RinZ27 RinZ27 commented Mar 8, 2026

PhysicsNeMo Pull Request

Description

A critical logic error in _safe_members was identified where the path traversal protection was inverted, allowing unsafe members while blocking legitimate ones. This fix correctly yields only safe members during tarball extraction.

Security of model loading is also improved by enabling weights_only=True in all core torch.load calls. This mitigates the risk of arbitrary code execution from malicious pickles in model checkpoints.

Checklist

  • Familiar with the Contributing Guidelines.
  • Verified logic correctness with local PoC for path traversal.
  • Documentation updated.
  • CHANGELOG.md updated.
  • Issue linked.

Dependencies

None.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 8, 2026

Greptile Summary

This PR fixes two security issues in physicsnemo/core/module.py: a critical logic inversion in _safe_members that was previously yielding unsafe tar members (those with .. or absolute paths) and blocking safe ones, and absent weights_only=True guards on torch.load calls that could allow arbitrary code execution from malicious pickles. The CHANGELOG has been updated to document both changes, including the potential breaking change for legacy checkpoints.

Key changes:

  • _safe_members: The if/else yield branches are swapped, adding not before the startswith check so only members whose resolved path falls inside local_path are extracted.
  • A trailing separator (os.path.join(os.path.realpath(local_path), "")) is used for resolved_local_path, correctly preventing a prefix-collision bypass where a sibling directory named with a common prefix (e.g., /tmp/tmpABCDEFmalicious/) would satisfy a bare startswith check.
  • print(...) replaced with logging.getLogger("core.module").warning(...), routing security-relevant events through the standard logging system.
  • All four torch.load calls in module.py updated to include weights_only=True.
  • CHANGELOG updated with a breaking-change note for checkpoints containing non-standard types.

Important Files Changed

Filename Overview
physicsnemo/core/module.py Fixes the inverted _safe_members path traversal logic (now correctly yields only safe members), replaces print with logging, adds a trailing separator to resolved_local_path to prevent prefix-collision bypasses, and adds weights_only=True to all four torch.load calls. Core logic is correct; minor notes below.
CHANGELOG.md Adds two bullet points under the Security section documenting the path traversal fix and the potentially breaking weights_only=True change. Appropriate and complete.

Last reviewed commit: 4f75549

Comment on lines +267 to 269
or not os.path.realpath(os.path.join(local_path, member.name)).startswith(
resolved_local_path
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startswith check missing trailing separator

The path containment check uses startswith(resolved_local_path) without a trailing separator. If the temp directory is /tmp/tmpABCDEF, a path like /tmp/tmpABCDEFmalicious/payload would incorrectly pass this check since it starts with the same prefix.

While the earlier ".." in member.name and os.path.isabs(member.name) guards prevent exploitation in the common attack vectors, this third check on its own is subtly incomplete. The canonical fix is:

Suggested change
or not os.path.realpath(os.path.join(local_path, member.name)).startswith(
resolved_local_path
)
or not os.path.realpath(os.path.join(local_path, member.name)).startswith(
resolved_local_path + os.sep
)
and os.path.realpath(os.path.join(local_path, member.name)) != resolved_local_path

A cleaner approach would be:

resolved = os.path.realpath(os.path.join(local_path, member.name))
outside_dir = not (resolved == resolved_local_path or resolved.startswith(resolved_local_path + os.sep))

so the condition covers both the exact-directory case and properly-bounded subdirectory paths.

):
yield member
else:
print(f"Skipping potentially malicious file: {member.name}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use logging.warning instead of print

Security-relevant events like skipping a potentially malicious file should go through the standard logging system rather than print(). This ensures warnings are surfaced consistently (respecting log levels, handlers, and formatters) and are not silenced or missed in non-interactive environments.

Since _safe_members is a @staticmethod, consider passing a logger or using the module-level logger:

Suggested change
print(f"Skipping potentially malicious file: {member.name}")
logging.warning(f"Skipping potentially malicious file: {member.name}")

(Make sure logging is already imported at the top of the file — it is used elsewhere in module.py.)

Comment on lines +751 to +753
model_dict = torch.load(
io.BytesIO(model_bytes), map_location=device, weights_only=True
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weights_only=True may break backward-compatible checkpoint loading

weights_only=True is a great security improvement for new checkpoints, but be aware it restricts deserialization to basic types (tensors, dicts, lists, scalars, etc.). If any existing checkpoint's model.pt was saved with custom Python objects embedded in the state dict (e.g., from a third-party integration or an older PhysicsNeMo version), loading will raise a pickle.UnpicklingError or RuntimeError.

Looking at the save code (torch.save(self.state_dict(), ...)), standard state dicts only contain tensors and should be fine. However, if subclasses override state_dict() to include custom objects, this will silently break for those users. Consider documenting this behavioral change in the changelog and/or providing a fallback mechanism for users with legacy checkpoints that contain non-standard types.

@RinZ27 RinZ27 force-pushed the security/fix-path-traversal-and-torch-load branch from e311a68 to 4f75549 Compare March 8, 2026 03:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant