-
Notifications
You must be signed in to change notification settings - Fork 434
fix(sandbox/bootstrap): GPU Landlock baseline paths and CDI spec missing diagnosis #710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
pimlock
wants to merge
6
commits into
main
Choose a base branch
from
fix/cdi-gpu-tweaks
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+267
−2
Open
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
0f7e193
fix(sandbox): add GPU device nodes and nvidia-persistenced to landloc…
pimlock eed1ae2
fix(sandbox): narrow /run/nvidia-persistenced to read-only in GPU bas…
pimlock 978de1d
fix(bootstrap): diagnose CDI specs missing when GPU passthrough fails
pimlock 5f7b8e5
fix(bootstrap): add missing CDI error patterns from observed Docker 5…
pimlock f05cdf0
fix(sandbox): add /proc and /dev/urandom to Landlock baselines for GP…
pimlock e1fb27d
refactor(sandbox): remove unnecessary virtual filesystem skip in prep…
pimlock File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -877,13 +877,97 @@ pub(crate) fn spawn_route_refresh( | |
|
|
||
| /// Minimum read-only paths required for a proxy-mode sandbox child process to | ||
| /// function: dynamic linker, shared libraries, DNS resolution, CA certs, | ||
| /// Python venv, and openshell logs. | ||
| const PROXY_BASELINE_READ_ONLY: &[&str] = &["/usr", "/lib", "/etc", "/app", "/var/log"]; | ||
| /// Python venv, openshell logs, process info, and random bytes. | ||
| /// | ||
| /// `/proc` and `/dev/urandom` are included here for the same reasons they | ||
| /// appear in `restrictive_default_policy()`: virtually every process needs | ||
| /// them. Before the Landlock per-path fix (#677) these were effectively free | ||
| /// because a missing path silently disabled the entire ruleset; now they must | ||
| /// be explicit. | ||
| const PROXY_BASELINE_READ_ONLY: &[&str] = &[ | ||
| "/usr", | ||
| "/lib", | ||
| "/etc", | ||
| "/app", | ||
| "/var/log", | ||
| "/proc", | ||
| "/dev/urandom", | ||
| ]; | ||
|
|
||
| /// Minimum read-write paths required for a proxy-mode sandbox child process: | ||
| /// user working directory and temporary files. | ||
| const PROXY_BASELINE_READ_WRITE: &[&str] = &["/sandbox", "/tmp"]; | ||
|
|
||
| /// Fixed read-only paths required when a GPU is present. | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This whole section is similar to the baseline stuff, but activated when the GPU passthrough is on. This could benefit from the mechanism we will have for providers bringing in their policies, though this is different as it applies sandbox wide and is not limited by binaries. |
||
| /// | ||
| /// `/run/nvidia-persistenced`: NVML tries to connect to the persistenced | ||
| /// socket at init time. If the directory exists but landlock denies traversal | ||
| /// (EACCES vs ECONNREFUSED), NVML returns NVML_ERROR_INSUFFICIENT_PERMISSIONS | ||
| /// even though the daemon is optional. Only read/traversal access is needed — | ||
| /// NVML connects to the existing socket but does not create or modify files. | ||
| const GPU_BASELINE_READ_ONLY_FIXED: &[&str] = &["/run/nvidia-persistenced"]; | ||
|
|
||
| /// Fixed read-write paths required when a GPU is present. | ||
| /// | ||
| /// `/dev/nvidiactl`, `/dev/nvidia-uvm`, `/dev/nvidia-uvm-tools`, | ||
| /// `/dev/nvidia-modeset`: control and UVM devices injected by CDI. | ||
| /// Landlock READ_FILE/WRITE_FILE restricts open(2) on device files even | ||
| /// when DAC permissions would otherwise allow it. Device nodes need | ||
| /// read-write because NVML/CUDA opens them with O_RDWR. | ||
| /// | ||
| /// `/proc`: CUDA writes to `/proc/<pid>/task/<tid>/comm` during `cuInit()` | ||
| /// to set thread names. Without write access, `cuInit()` returns error 304 | ||
| /// (`cudaErrorOperatingSystem`). We must use `/proc` (not `/proc/self/task`) | ||
| /// because Landlock rules bind to inodes: `/proc/self/task` in the pre_exec | ||
| /// hook resolves to the shell's PID, but child processes (python, etc.) | ||
| /// have different PIDs and thus different procfs inodes. Security impact | ||
| /// is limited — writable proc entries (`oom_score_adj`, etc.) are already | ||
| /// kernel-restricted for non-root users; Landlock is defense-in-depth. | ||
| /// | ||
| /// Per-GPU device files (`/dev/nvidia0`, `/dev/nvidia1`, …) are enumerated | ||
| /// at runtime by `gpu_baseline_read_write_paths()` since the count varies. | ||
| const GPU_BASELINE_READ_WRITE_FIXED: &[&str] = &[ | ||
| "/dev/nvidiactl", | ||
| "/dev/nvidia-uvm", | ||
| "/dev/nvidia-uvm-tools", | ||
| "/dev/nvidia-modeset", | ||
| "/proc", | ||
| ]; | ||
|
|
||
| /// Returns true if GPU devices are present in the container. | ||
| fn has_gpu_devices() -> bool { | ||
| std::path::Path::new("/dev/nvidiactl").exists() | ||
| } | ||
|
|
||
| /// Collect all GPU read-only paths (currently just the persistenced directory). | ||
| fn gpu_baseline_read_only_paths() -> Vec<std::path::PathBuf> { | ||
| GPU_BASELINE_READ_ONLY_FIXED | ||
| .iter() | ||
| .map(|p| std::path::PathBuf::from(p)) | ||
| .collect() | ||
| } | ||
|
|
||
| /// Collect all GPU read-write paths: fixed devices + per-GPU `/dev/nvidiaX`. | ||
| fn gpu_baseline_read_write_paths() -> Vec<std::path::PathBuf> { | ||
| let mut paths: Vec<std::path::PathBuf> = GPU_BASELINE_READ_WRITE_FIXED | ||
| .iter() | ||
| .map(|p| std::path::PathBuf::from(p)) | ||
| .collect(); | ||
|
|
||
| // Enumerate per-GPU device nodes injected by CDI (nvidia0, nvidia1, …). | ||
| if let Ok(entries) = std::fs::read_dir("/dev") { | ||
| for entry in entries.flatten() { | ||
| let name = entry.file_name(); | ||
| let name = name.to_string_lossy(); | ||
| if name.starts_with("nvidia") && name[6..].chars().all(|c| c.is_ascii_digit()) { | ||
| paths.push(entry.path()); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| paths | ||
| } | ||
|
|
||
| /// Ensure a proto `SandboxPolicy` includes the baseline filesystem paths | ||
| /// required for proxy-mode sandboxes. Paths are only added if missing; | ||
| /// user-specified paths are never removed. | ||
|
|
@@ -935,6 +1019,46 @@ fn enrich_proto_baseline_paths(proto: &mut openshell_core::proto::SandboxPolicy) | |
| } | ||
| } | ||
|
|
||
| if has_gpu_devices() { | ||
| for path in gpu_baseline_read_only_paths() { | ||
| let path_str = path.to_string_lossy(); | ||
| if !fs.read_only.iter().any(|p| p.as_str() == path_str.as_ref()) | ||
| && !fs | ||
| .read_write | ||
| .iter() | ||
| .any(|p| p.as_str() == path_str.as_ref()) | ||
| { | ||
| if !path.exists() { | ||
| debug!( | ||
| path = %path.display(), | ||
| "GPU baseline read-only path does not exist, skipping enrichment" | ||
| ); | ||
| continue; | ||
| } | ||
| fs.read_only.push(path_str.into_owned()); | ||
| modified = true; | ||
| } | ||
| } | ||
| for path in gpu_baseline_read_write_paths() { | ||
| let path_str = path.to_string_lossy(); | ||
| if !fs | ||
| .read_write | ||
| .iter() | ||
| .any(|p| p.as_str() == path_str.as_ref()) | ||
| { | ||
| if !path.exists() { | ||
| debug!( | ||
| path = %path.display(), | ||
| "GPU baseline read-write path does not exist, skipping enrichment" | ||
| ); | ||
| continue; | ||
| } | ||
| fs.read_write.push(path_str.into_owned()); | ||
| modified = true; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if modified { | ||
| info!("Enriched policy with baseline filesystem paths for proxy mode"); | ||
| } | ||
|
|
@@ -982,6 +1106,37 @@ fn enrich_sandbox_baseline_paths(policy: &mut SandboxPolicy) { | |
| } | ||
| } | ||
|
|
||
| if has_gpu_devices() { | ||
| for p in gpu_baseline_read_only_paths() { | ||
| if !policy.filesystem.read_only.contains(&p) | ||
| && !policy.filesystem.read_write.contains(&p) | ||
| { | ||
| if !p.exists() { | ||
| debug!( | ||
| path = %p.display(), | ||
| "GPU baseline read-only path does not exist, skipping enrichment" | ||
| ); | ||
| continue; | ||
| } | ||
| policy.filesystem.read_only.push(p); | ||
| modified = true; | ||
| } | ||
| } | ||
| for p in gpu_baseline_read_write_paths() { | ||
| if !policy.filesystem.read_write.contains(&p) { | ||
| if !p.exists() { | ||
| debug!( | ||
| path = %p.display(), | ||
| "GPU baseline read-write path does not exist, skipping enrichment" | ||
| ); | ||
| continue; | ||
| } | ||
| policy.filesystem.read_write.push(p); | ||
| modified = true; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if modified { | ||
| info!("Enriched policy with baseline filesystem paths for proxy mode"); | ||
| } | ||
|
|
||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This follows the policy from https://github.com/NVIDIA/nv-agent-env/blob/e8950e624c7e98e60e624d25e7d57d0003a8b61c/crates/openshell-policy/src/lib.rs#L432-L433