feat: add symlinks capabilities for S3-compatible systems, and fix multipart copy source-size handling#1
Merged
Conversation
92afd1c to
ca440fb
Compare
…n save failure updateSymlinksFile() was mutating the shared symlinksCache in place before confirming the S3 save succeeded. If SaveSymlinksFileWithRetry failed, the in-memory cache would contain changes never persisted, causing phantom symlinks or missed deletions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loadSymlinksCache() and updateSymlinksFile() were holding parent.mu during S3 network calls, blocking all concurrent FUSE operations on the same directory. Follow the existing GeeseFS convention (loadListing, listObjectsFlat) of temporarily releasing the lock during I/O. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loadSymlinksCache() was issuing a conditional GET on every call with no TTL guard. An ls -la on 100 files could trigger 100+ S3 requests for the same symlinks file. Now skips the network call if the cache was loaded within StatCacheTTL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The --symlink-attr flag was accidentally removed when adding the new --enable-symlinks-file flags. This left SymlinkAttr as "" for all CLI-launched mounts, breaking symlink detection via the old metadata path. Restore the flag and wire it back into PopulateFlags. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
isNotExist, isNotModified, and isPreconditionFailed were using strings.Contains on error messages (e.g., "404", "412"), which could false-match unrelated errors. Now check awserr.RequestFailure status codes first, with tight fallbacks for non-AWS backends. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit changed error detection to use awserr.RequestFailure status codes, but existing tests only exercised the fallback paths via the mock backend. Add tests covering the real S3 paths (awserr typed errors) and verify no false positives on unrelated error messages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Renaming a virtual symlink now properly removes the entry from the source directory's .geesefs_symlinks and adds it to the destination's, ensuring cross-mount visibility. Adds docker integration tests for both cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… symlinks The previous check (userMetadata[SymlinkAttr] != nil) could not distinguish virtual symlinks (stored in .geesefs_symlinks, no S3 object) from S3-backed symlinks (created via the old metadata path). This could cause orphaned S3 objects on unlink or incorrect behavior on rename. Adds isVirtualSymlink bool to Inode, set at all virtual symlink creation sites and checked in Unlink, Rename, and listing cleanup paths.
Extract newVirtualSymlinkInode() and applyVirtualSymlinkAttrs() helpers to replace 9 duplicate virtual symlink creation/update blocks.
Prevents a race where another mount adds a symlink between our empty check and the delete. The If-Match conditional write fails with 412 if the file was modified, letting the retry logic merge correctly.
Avoids map lookups, mutex lock, and string formatting on every file lookup when debug logging is disabled.
Neither the struct nor the function were referenced anywhere.
Randomizes sleep duration in [backoff/2, backoff) to avoid thundering herd when multiple mounts contend on the same symlinks file.
Replace json.MarshalIndent with json.Marshal to reduce S3 storage and bandwidth for .geesefs_symlinks files.
…ability Use an explicit capability flag instead of hardcoded backend name matching to determine conditional write support.
Add tests for corrupted symlinks file loading, recovery after corruption, missing symlinks field, and forward version compatibility.
When a file grows beyond its original S3 size, copyUnmodifiedParts was clipping part ranges to Attributes.Size (the new local size) instead of knownSize (the source S3 object size). This caused S3 to return 400 errors for copy ranges exceeding the source object bounds.
Each symlink create/delete previously triggered an immediate S3 PUT to .geesefs_symlinks. Creating 10 symlinks = 10 PUTs. This batches them into a single PUT per directory using a configurable delay timer (--symlinks-batch-delay, default 100ms). The in-memory cache is updated eagerly so local readers see changes immediately, but the S3 PUT is deferred. Uses per-directory time.AfterFunc timers matching the existing ScheduleRetryFlush pattern. Flush is triggered by: timer expiry, sync/fsync, or unmount. On conflict, the merge function replays all batched changes. On failure, changes are re-queued for retry.
completeMultipart() was reading Attributes.Size after its goroutine re-acquired inode.mu, but concurrent writes could extend the file in the gap between goroutine launch and lock acquisition. This caused finalSize to include parts not yet uploaded, leading to knownSize being set larger than the actual S3 object after CompleteMultipartUpload (which skips nil parts). The wrong knownSize then triggered spurious conflict detection and EINVAL errors on subsequent flushes. Capture finalSize in the caller while the lock is held and canComplete has verified all dirty parts are flushed.
knownSize grows as parts are flushed via updateFromFlush(), so by the time copyUnmodifiedParts runs it no longer reflects the actual S3 source object size. Capture knownSize at MPU begin as mpuSourceSize and clip copy ranges to that instead, preventing HTTP 400 (InvalidArgument) when copying ranges beyond the source object.
The previous approach (mpuSourceSize captured from knownSize at MPU begin) fails when knownSize diverges from the actual S3 object size. This happens when updateFromFlush sets knownSize to the local file size after a completed MPU, but the resulting S3 object is smaller (e.g. due to clipped copies in a prior cycle). The next MPU then uses the wrong size for range clipping, causing S3 400 InvalidArgument. Instead, do a single HeadBlob at the start of copyUnmodifiedParts to get the ground-truth source object size from S3, and clip copy ranges to that. This adds one HEAD request per flush cycle that needs copies — negligible compared to the UploadPartCopy operations that follow.
Track only directories with pending symlink changes for sync/shutdown flushes and remove redundant fallocate extension zero-fill path. Co-Authored-By: Warp <agent@warp.dev>
Capture finalSize under lock, use HeadBlob source-size validation, and fail fast on impossible unmodified-copy ranges to avoid copy thrash. Co-Authored-By: Warp <agent@warp.dev>
Co-Authored-By: Warp <agent@warp.dev>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
https://dectris.atlassian.net/wiki/spaces/DCSDN/pages/1423114306/ADR0015+Symlinks+support+for+GeeseFS+against+AWS+S3