Skip to content

feat: add symlinks capabilities for S3-compatible systems, and fix multipart copy source-size handling#1

Merged
alexsavio merged 39 commits into
devfrom
symlinks
Mar 2, 2026
Merged

feat: add symlinks capabilities for S3-compatible systems, and fix multipart copy source-size handling#1
alexsavio merged 39 commits into
devfrom
symlinks

Conversation

@alexsavio
Copy link
Copy Markdown
Collaborator

@alexsavio alexsavio force-pushed the symlinks branch 2 times, most recently from 92afd1c to ca440fb Compare January 27, 2026 10:20
@alexsavio alexsavio changed the base branch from master to dev February 19, 2026 13:22
@alexsavio alexsavio changed the title Symlinks feat: add symlinks capabilities for S3-compatible systems, and fix multipart copy source-size handling Feb 19, 2026
alexsavio and others added 24 commits February 23, 2026 13:20
…n save failure

updateSymlinksFile() was mutating the shared symlinksCache in place
before confirming the S3 save succeeded. If SaveSymlinksFileWithRetry
failed, the in-memory cache would contain changes never persisted,
causing phantom symlinks or missed deletions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loadSymlinksCache() and updateSymlinksFile() were holding parent.mu
during S3 network calls, blocking all concurrent FUSE operations on the
same directory. Follow the existing GeeseFS convention (loadListing,
listObjectsFlat) of temporarily releasing the lock during I/O.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loadSymlinksCache() was issuing a conditional GET on every call with no
TTL guard. An ls -la on 100 files could trigger 100+ S3 requests for
the same symlinks file. Now skips the network call if the cache was
loaded within StatCacheTTL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The --symlink-attr flag was accidentally removed when adding the new
--enable-symlinks-file flags. This left SymlinkAttr as "" for all
CLI-launched mounts, breaking symlink detection via the old metadata
path. Restore the flag and wire it back into PopulateFlags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
isNotExist, isNotModified, and isPreconditionFailed were using
strings.Contains on error messages (e.g., "404", "412"), which could
false-match unrelated errors. Now check awserr.RequestFailure status
codes first, with tight fallbacks for non-AWS backends.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit changed error detection to use awserr.RequestFailure
status codes, but existing tests only exercised the fallback paths via
the mock backend. Add tests covering the real S3 paths (awserr typed
errors) and verify no false positives on unrelated error messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Renaming a virtual symlink now properly removes the entry from the source
directory's .geesefs_symlinks and adds it to the destination's, ensuring
cross-mount visibility. Adds docker integration tests for both cases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… symlinks

The previous check (userMetadata[SymlinkAttr] != nil) could not distinguish
virtual symlinks (stored in .geesefs_symlinks, no S3 object) from S3-backed
symlinks (created via the old metadata path). This could cause orphaned S3
objects on unlink or incorrect behavior on rename.

Adds isVirtualSymlink bool to Inode, set at all virtual symlink creation
sites and checked in Unlink, Rename, and listing cleanup paths.
Extract newVirtualSymlinkInode() and applyVirtualSymlinkAttrs() helpers
to replace 9 duplicate virtual symlink creation/update blocks.
Prevents a race where another mount adds a symlink between our empty
check and the delete. The If-Match conditional write fails with 412
if the file was modified, letting the retry logic merge correctly.
Avoids map lookups, mutex lock, and string formatting on every file
lookup when debug logging is disabled.
Neither the struct nor the function were referenced anywhere.
alexsavio and others added 14 commits February 23, 2026 13:20
Randomizes sleep duration in [backoff/2, backoff) to avoid thundering
herd when multiple mounts contend on the same symlinks file.
Replace json.MarshalIndent with json.Marshal to reduce S3 storage
and bandwidth for .geesefs_symlinks files.
…ability

Use an explicit capability flag instead of hardcoded backend name
matching to determine conditional write support.
Add tests for corrupted symlinks file loading, recovery after
corruption, missing symlinks field, and forward version compatibility.
When a file grows beyond its original S3 size, copyUnmodifiedParts was
clipping part ranges to Attributes.Size (the new local size) instead of
knownSize (the source S3 object size). This caused S3 to return 400
errors for copy ranges exceeding the source object bounds.
Each symlink create/delete previously triggered an immediate S3 PUT to
.geesefs_symlinks. Creating 10 symlinks = 10 PUTs. This batches them
into a single PUT per directory using a configurable delay timer
(--symlinks-batch-delay, default 100ms).

The in-memory cache is updated eagerly so local readers see changes
immediately, but the S3 PUT is deferred. Uses per-directory
time.AfterFunc timers matching the existing ScheduleRetryFlush pattern.

Flush is triggered by: timer expiry, sync/fsync, or unmount. On
conflict, the merge function replays all batched changes. On failure,
changes are re-queued for retry.
completeMultipart() was reading Attributes.Size after its goroutine
re-acquired inode.mu, but concurrent writes could extend the file in
the gap between goroutine launch and lock acquisition. This caused
finalSize to include parts not yet uploaded, leading to knownSize being
set larger than the actual S3 object after CompleteMultipartUpload
(which skips nil parts). The wrong knownSize then triggered spurious
conflict detection and EINVAL errors on subsequent flushes.

Capture finalSize in the caller while the lock is held and canComplete
has verified all dirty parts are flushed.
knownSize grows as parts are flushed via updateFromFlush(), so by the
time copyUnmodifiedParts runs it no longer reflects the actual S3 source
object size. Capture knownSize at MPU begin as mpuSourceSize and clip
copy ranges to that instead, preventing HTTP 400 (InvalidArgument) when
copying ranges beyond the source object.
The previous approach (mpuSourceSize captured from knownSize at MPU
begin) fails when knownSize diverges from the actual S3 object size.
This happens when updateFromFlush sets knownSize to the local file
size after a completed MPU, but the resulting S3 object is smaller
(e.g. due to clipped copies in a prior cycle). The next MPU then
uses the wrong size for range clipping, causing S3 400 InvalidArgument.

Instead, do a single HeadBlob at the start of copyUnmodifiedParts to
get the ground-truth source object size from S3, and clip copy ranges
to that. This adds one HEAD request per flush cycle that needs copies
— negligible compared to the UploadPartCopy operations that follow.
Track only directories with pending symlink changes for sync/shutdown flushes and remove redundant fallocate extension zero-fill path.

Co-Authored-By: Warp <agent@warp.dev>
Capture finalSize under lock, use HeadBlob source-size validation, and fail fast on impossible unmodified-copy ranges to avoid copy thrash.

Co-Authored-By: Warp <agent@warp.dev>
Co-Authored-By: Warp <agent@warp.dev>
@alexsavio alexsavio merged commit 3a92537 into dev Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant