security: reject chain-symlink path traversal in tar extraction#92
security: reject chain-symlink path traversal in tar extraction#92renecannao merged 2 commits intomasterfrom
Conversation
The CVE-2020-26277 patch in unpackTarFiles only inspected each symlink in isolation: depth of slashes in linkname vs name, a FileExists probe relative to CWD, and a prefix check for absolute targets. None of these model the attack of chaining several "dirN -> dirN-1/.." symlinks so the cumulative realpath of a later symlink escapes the extraction directory, after which a regular-file entry can be written through the escaping symlink. The PoC disclosed by Wind (Ubuntu 24.04) reproduced against master and wrote /tmp/Exp.txt. Replace the per-entry heuristic with a filesystem-resolution check: resolveInsideExtractDir calls filepath.EvalSymlinks on the target path (or on its deepest existing ancestor, when the target does not exist yet) and rejects anything that resolves outside extractAbsDir. This catches the chain attack because EvalSymlinks walks through intermediate symlinks before evaluating "..", unlike filepath.Join which collapses ".." lexically. The same check now validates each entry's parent directory, so a regular-file entry whose path would pass through a previously-created escaping symlink is also rejected. Additionally refuse to overwrite an existing symlink with a regular file, and canonicalize extractAbsDir via EvalSymlinks so prefix comparisons are correct on macOS where /tmp is a symlink to /private/tmp. Regression tests cover the reported chain PoC, a single-hop relative escape, an absolute-target escape, a legitimate same-directory symlink (the pattern found in real MySQL tarballs), and the symlink-overwrite case. The now-unused pathDepth helper and its test are removed.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 28 minutes and 22 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThese changes enhance tar extraction security by implementing canonical path resolution and per-entry validation. The extract directory is canonicalized using Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request enhances the security of tar extraction by replacing a path-depth heuristic with robust filesystem-aware path resolution using filepath.EvalSymlinks. It introduces resolveInsideExtractDir to prevent symlink-chain traversal attacks and adds a check to prevent overwriting existing symlinks with regular files. Review feedback highlights a regression for relative destination paths due to os.Chdir and portability issues on Windows with manual path traversal in resolveInsideExtractDir.
| extractAbsDir, err := filepath.EvalSymlinks(extractDir) | ||
| if err != nil { | ||
| return fmt.Errorf("error defining the absolute path of '%s': %s", extractDir, err) | ||
| extractAbsDir, err = filepath.Abs(extractDir) | ||
| if err != nil { | ||
| return fmt.Errorf("error defining the absolute path of '%s': %s", extractDir, err) | ||
| } | ||
| } |
There was a problem hiding this comment.
This logic introduces a regression for relative destination paths. UnpackTar and UnpackXzTar call os.Chdir(destination) before invoking unpackTarFiles. If extractDir (which is the destination string) is relative (e.g., "out"), filepath.EvalSymlinks will look for "out/out" relative to the new working directory, which will fail or point to the wrong location. Since the process is already inside the target directory, you should resolve the absolute path of "." to establish the canonical base for security checks.
| extractAbsDir, err := filepath.EvalSymlinks(extractDir) | |
| if err != nil { | |
| return fmt.Errorf("error defining the absolute path of '%s': %s", extractDir, err) | |
| extractAbsDir, err = filepath.Abs(extractDir) | |
| if err != nil { | |
| return fmt.Errorf("error defining the absolute path of '%s': %s", extractDir, err) | |
| } | |
| } | |
| extractAbsDir, err := filepath.Abs(".") | |
| if err != nil { | |
| return fmt.Errorf("error defining the absolute path of the extraction directory: %s", err) | |
| } | |
| if resolved, err := filepath.EvalSymlinks(extractAbsDir); err == nil { | |
| extractAbsDir = resolved | |
| } |
| idx := strings.LastIndex(parent, string(os.PathSeparator)) | ||
| if idx < 0 { | ||
| return "", fmt.Errorf("cannot resolve ancestors of '%s'", target) | ||
| } | ||
| if suffix == "" { | ||
| suffix = parent[idx+1:] | ||
| } else { | ||
| suffix = parent[idx+1:] + string(os.PathSeparator) + suffix | ||
| } | ||
| parent = parent[:idx] | ||
| if parent == "" { | ||
| parent = string(os.PathSeparator) | ||
| } | ||
| if resolved, err := filepath.EvalSymlinks(parent); err == nil { | ||
| if !pathInside(resolved, extractAbsDir) { | ||
| return "", fmt.Errorf("ancestor of '%s' resolves to '%s' outside extraction directory '%s'", target, resolved, extractAbsDir) | ||
| } | ||
| combined := filepath.Join(resolved, suffix) | ||
| if !pathInside(combined, extractAbsDir) { | ||
| return "", fmt.Errorf("path '%s' would resolve to '%s' outside extraction directory '%s'", target, combined, extractAbsDir) | ||
| } | ||
| return combined, nil | ||
| } | ||
| if parent == string(os.PathSeparator) { | ||
| return "", fmt.Errorf("cannot resolve any ancestor of '%s'", target) | ||
| } | ||
| } |
There was a problem hiding this comment.
The manual path traversal using strings.LastIndex with os.PathSeparator is not portable and will fail on Windows for volume roots (e.g., C:\). When parent reaches C:\, LastIndex finds the separator at index 2, and parent[:idx] results in C:, which filepath.EvalSymlinks treats as the current directory on that drive rather than the root. It is safer to use filepath.Dir and terminate the loop when filepath.Dir(parent) == parent.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unpack/unpack_test.go`:
- Line 101: Tests call UnpackTar which changes the process working directory and
never restores it, leaking a deleted cwd into other tests; add a small test
helper (e.g., UnpackTarRestoreCWD) that saves os.Getwd(), calls
UnpackTar(tarPath, dest, SILENT), and then restores the original cwd with
os.Chdir in a defer so it always runs, and update each test call (the ones
invoking UnpackTar in unpack_test.go) to use that helper instead of calling
UnpackTar directly.
In `@unpack/unpack.go`:
- Around line 176-181: The code canonicalizes extractDir after callers call
os.Chdir, causing relative destinations to be resolved against the changed cwd;
fix by resolving the destination to an absolute, symlink-evaluated path before
changing directories: call filepath.EvalSymlinks(extractDir) (falling back to
filepath.Abs) to compute extractAbsDir prior to any os.Chdir, then pass that
canonical extractAbsDir into unpackTarFiles; apply the same change to
UnpackXzTar so both unpack functions compute and use absolute paths before
performing os.Chdir.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f01da99-a4ab-4eb7-a910-188cf7244e2a
📒 Files selected for processing (2)
unpack/unpack.gounpack/unpack_test.go
- Canonicalize the extraction destination before os.Chdir. UnpackTar
and UnpackXzTar do os.Chdir(destination) before invoking
unpackTarFiles, so resolving the path inside the inner function was
a regression for relative destinations: EvalSymlinks would look for
"dest" relative to the already-changed cwd. A new canonicalExtractDir
helper produces an absolute, symlink-resolved path up front; both
outer functions pass that canonical path to unpackTarFiles, which no
longer re-resolves. A Test_relativeDestination case covers this.
- Make resolveInsideExtractDir's ancestor walk portable. The previous
strings.LastIndex loop would mis-handle Windows volume roots ("C:\"
would become "C:" and be interpreted as per-drive cwd). Walk with
filepath.Dir and terminate at the fixed point so "/" and "C:\" are
both handled correctly. Also use filepath.Rel/Join for the suffix
instead of manual separator string-building.
- Restore cwd after UnpackTar in tests. UnpackTar does not restore the
working directory it chdirs into; combined with t.TempDir() cleanup,
tests would leave the process pointing at a deleted directory and
leak that state into subsequent tests. A new unpackTarForTest helper
registers a t.Cleanup to os.Chdir back to the original cwd, and all
UnpackTar call sites in the tests route through it.
Promote the Unreleased entries (mysqlsh version gate, install-doc tweak) into 2.2.3 and add a SECURITY section covering the chain-symlink path-traversal fix from PR #92.
Summary
A researcher reported an arbitrary-file-write vulnerability in
unpackTarFiles(unpack/unpack.go): a malicious tarball can chain relativedirN -> dirN-1/..symlinks so the cumulative realpath of a later symlink escapes the extraction directory, then pivot via a symlink whose path depth equals the entry name (so the existingpathDepthheuristic does not fire), then write a regular-file entry through the escaping symlink. I reproduced the PoC againstmasterand it wrote a file to/tmpoutside the destination.The current checks (depth-of-slashes comparison,
common.FileExistsprobe relative to CWD,strings.HasPrefixon absolute paths) inspect each symlink in isolation and do not model cumulative filesystem-level resolution across previously-created symlinks. The CVE-2020-26277 patch does not address this.What changes
unpackTarFileswith a filesystem-resolution check:resolveInsideExtractDircallsfilepath.EvalSymlinkson the full target (or its deepest existing ancestor, when the target does not yet exist) and rejects anything outsideextractAbsDir.EvalSymlinkswalks through intermediate symlinks before evaluating.., unlikefilepath.Joinwhich collapses..lexically — that is what catches the chain attack.os.Create.os.Createwould follow it).extractAbsDirviaEvalSymlinksso prefix comparisons work where the destination sits under a symlinked mount (e.g. macOS/tmp->/private/tmp).pathDepthhelper and its test.Test plan
go test ./unpack/passesTest_symlinkChainEscapereproduces the reported PoC and asserts both the error and that no file was written outside the extraction dirTest_symlinkSingleHopEscape— relative../../target is rejected (previously caught bypathDepth, now caught by resolution)Test_symlinkAbsoluteTarget— absolute symlink outside destdir is rejectedTest_legitimateSymlinkPreserved— same-directory symlink (the pattern in real MySQL tarballs, e.g.libssl.dylib -> libssl.1.0.0.dylib) still worksTest_refuseOverwriteSymlinkWithRegular— regular-file entry cannot overwrite a pre-existing symlink/tmp) against the patched binary: rejected atdir2 -> dir1/..before any write occursSummary by CodeRabbit