fix(drive): handle duplicate remote sync paths#803
fix(drive): handle duplicate remote sync paths#803fangshuyu-768 wants to merge 2 commits intomainfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughPreserves duplicate remote Drive entries in listings, detects duplicate file rel_paths, adds deterministic selection/rename helpers, introduces --on-duplicate-remote policies (fail/rename/newest/oldest) for status/pull/push, creates deduplicated views for command logic, adds tests, and updates docs. ChangesDuplicate Remote File Handling in Drive Sync Commands
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #803 +/- ##
==========================================
+ Coverage 65.46% 65.49% +0.03%
==========================================
Files 510 510
Lines 47129 47259 +130
==========================================
+ Hits 30851 30952 +101
- Misses 13607 13631 +24
- Partials 2671 2676 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@54fb4b664e40804040e444c27bbfaa647d5ecb54🧩 Skill updatenpx skills add larksuite/cli#feat/drive-duplicate-remote-paths -y -g |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
shortcuts/drive/drive_pull.go (2)
145-155: 💤 Low valueDuplicate scan runs unconditionally; minor wasted work outside
fail.
duplicateRemoteFilePaths(entries)always runs even when--on-duplicate-remote=newest|oldest|rename, where its result is discarded anddrivePullRemoteViewsre-groups the same entries. The waste is O(n) and irrelevant for typical folder sizes, but you can cheaply gate it on the fail policy.♻️ Optional micro-refactor
- if duplicates := duplicateRemoteFilePaths(entries); len(duplicates) > 0 && duplicateRemote == driveDuplicateRemoteFail { - return duplicateRemotePathError(duplicates) + if duplicateRemote == driveDuplicateRemoteFail { + if duplicates := duplicateRemoteFilePaths(entries); len(duplicates) > 0 { + return duplicateRemotePathError(duplicates) + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/drive_pull.go` around lines 145 - 155, The code always computes duplicateRemoteFilePaths(entries) even when duplicateRemote != driveDuplicateRemoteFail; move the call to duplicateRemoteFilePaths(entries) inside the conditional that checks duplicateRemote == driveDuplicateRemoteFail so duplicates is only computed when needed, and keep the existing return duplicateRemotePathError(duplicates) behavior; leave the subsequent call to drivePullRemoteViews(entries, duplicateRemote) unchanged.
312-357: 💤 Low value
drivePullRemoteViewsbuilds the two views correctly and rename behavior is stable.
- Pinning the oldest entry (sorted by
CreatedTimeasc) to the original rel_path is the right choice for rename mode: as long as the oldest Drive duplicate isn't deleted, the canonical local target stays stable across reruns.- Both
remoteFilesandremotePathsare populated for every target rel_path, so--delete-localwon't orphan the renamed siblings on a follow-up pull.- Folder/online-doc entries land in
remotePathsonly, preserving the existing "online doc shadows a same-named local file" no-orphan guarantee.One small belt-and-suspenders thought: a
default:arm in the duplicate switch that writes nothing (or returns an error) would harden against future enum drift, since a typo in a flag value today would silently drop the whole duplicate group. Framework enum validation already guards this in practice, so this is purely defensive.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/drive_pull.go` around lines 312 - 357, In drivePullRemoteViews, the switch on duplicateRemote (which currently handles driveDuplicateRemoteRename, driveDuplicateRemoteNewest and driveDuplicateRemoteOldest) can silently drop a group if duplicateRemote has an unexpected value; add a default: arm to the switch that defensively fails (for example by returning an error or panicking with a clear message) or at minimum logs the unexpected duplicateRemote and skips/returns so the caller can't silently lose entries—update the function to handle/propagate that failure path consistently with its callers.shortcuts/drive/list_remote.go (2)
193-214: 💤 Low valueTime-field sort relies on lexicographic ordering of string timestamps.
sortRemoteFilescomparesModifiedTime/CreatedTimevia>/<on raw strings. Drive returns these as Unix-epoch strings of constant width (10-digit seconds, or 13-digit ms), so lexicographic order matches numeric order in practice — and the deterministicFileTokentiebreaker keeps results stable. Worth a one-liner doc note that this assumes equal-width string timestamps so future contributors don't try to switch the API to a mixed format.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/list_remote.go` around lines 193 - 214, Add a concise one-line comment above the sortRemoteFiles function documenting the assumption that ModifiedTime and CreatedTime are fixed-width Unix-epoch strings (e.g., 10- or 13-digit) so lexicographic comparison is equivalent to numeric ordering; reference the function name sortRemoteFiles and the fields ModifiedTime, CreatedTime and FileToken, and state that this invariant is relied on for correct ordering and stability so future contributors don't alter the comparison to mixed-format timestamps.
216-227: 💤 Low value
relPathWithFileTokenSuffixworks but has a couple of edge-case quirks worth knowing.
- For dotfiles like
.gitignore,path.Extreturns the whole basename, so the result becomes__lark_<token>.gitignore(stem empty). Functional but unusual.- For
archive.tar.gz, the suffix lands asarchive.tar__lark_<token>.gz(only.gzis recognized as the extension).Both are inherent to using
path.Extonce and aren't bugs given how--on-duplicate-remote=renameis documented, but consider a brief comment so the rename-shape is intentional and discoverable. Also worth noting:chooseRemoteFileindexescandidates[0]and would panic on empty input — currently safe because callers guard withlen(files) > 1, but a defensive zero-value return would harden it against future call-site changes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/list_remote.go` around lines 216 - 227, chooseRemoteFile currently assumes files is non-empty and returns candidates[0], which will panic if callers change; make it defensive by returning a zero-value driveRemoteEntry when len(files)==0 (update the chooseRemoteFile function). relPathWithFileTokenSuffix intentionally uses path.Ext once so dotfiles (`.gitignore` -> `__lark_<token>.gitignore`) and multi-part extensions (`archive.tar.gz` -> `archive.tar__lark_<token>.gz`) behave that way; add a concise comment above relPathWithFileTokenSuffix describing these edge cases and that the single path.Ext usage is intentional to document the rename shape.shortcuts/drive/drive_duplicate_remote_test.go (1)
109-114: 💤 Low valueStdout assertion is brittle on JSON formatting.
strings.Contains(out,"rel_path": "dup__lark_tok_second.txt")depends on the runtime emitting indented JSON with a space after the colon. Ifruntime.Outever switches to compact JSON, the assertion fails for a purely cosmetic reason. Decoding stdout into a struct and checkingitems[]field-by-field would be more durable.♻️ Suggested refactor sketch
var payload struct { Items []struct { RelPath string `json:"rel_path"` Action string `json:"action"` } `json:"items"` } if err := json.Unmarshal(stdout.Bytes(), &payload); err != nil { t.Fatalf("decode stdout: %v\n%s", err, stdout.String()) } found := false for _, it := range payload.Items { if it.RelPath == "dup__lark_tok_second.txt" && it.Action == "downloaded" { found = true break } } if !found { t.Fatalf("expected renamed download item, got: %s", stdout.String()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/drive_duplicate_remote_test.go` around lines 109 - 114, The test's stdout assertion in drive_duplicate_remote_test.go is brittle because it relies on JSON spacing; instead parse stdout.Bytes() as JSON into a payload struct with Items []{RelPath string `json:"rel_path"`; Action string `json:"action"`}, check that one item has RelPath "dup__lark_tok_second.txt" and Action "downloaded", and fail with the full stdout on unmarshal or if no matching item is found; replace the strings.Contains check against stdout.String() with this robust json.Unmarshal + item-by-item validation (use the existing stdout variable and keep mustReadFile checks).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@skills/lark-drive/references/lark-drive-status.md`:
- Around line 17-19: The documentation's example for the duplicate_remote[]
detail is inconsistent with the actual serialized shape; update the JSON
example(s) around the "远端同名文件冲突" section (and lines 60–79) to either include the
optional fields produced by driveDuplicateRemoteFile (size, created_time,
modified_time) or add an explicit ".../omitempty" hint so readers know those
fields may appear; locate references to duplicates_remote[] and
driveDuplicateRemoteFile in the markdown and make the example(s) reflect the
real shape (file_token, name, size, created_time, modified_time) or annotate
them as optional.
In `@skills/lark-drive/SKILL.md`:
- Line 241: The Markdown table row containing the inline option
`--on-duplicate-remote newest|oldest` breaks the table because the unescaped
pipe is parsed as a column separator; fix it by escaping the pipe (change to
`--on-duplicate-remote newest\|oldest`) or by splitting into two code spans
(e.g., `--on-duplicate-remote newest` and `--on-duplicate-remote oldest`) so the
table cell no longer contains an unescaped `|`. Update the SKILL.md entry where
the `+push` row mentions `--on-duplicate-remote` to use one of these forms.
---
Nitpick comments:
In `@shortcuts/drive/drive_duplicate_remote_test.go`:
- Around line 109-114: The test's stdout assertion in
drive_duplicate_remote_test.go is brittle because it relies on JSON spacing;
instead parse stdout.Bytes() as JSON into a payload struct with Items []{RelPath
string `json:"rel_path"`; Action string `json:"action"`}, check that one item
has RelPath "dup__lark_tok_second.txt" and Action "downloaded", and fail with
the full stdout on unmarshal or if no matching item is found; replace the
strings.Contains check against stdout.String() with this robust json.Unmarshal +
item-by-item validation (use the existing stdout variable and keep mustReadFile
checks).
In `@shortcuts/drive/drive_pull.go`:
- Around line 145-155: The code always computes
duplicateRemoteFilePaths(entries) even when duplicateRemote !=
driveDuplicateRemoteFail; move the call to duplicateRemoteFilePaths(entries)
inside the conditional that checks duplicateRemote == driveDuplicateRemoteFail
so duplicates is only computed when needed, and keep the existing return
duplicateRemotePathError(duplicates) behavior; leave the subsequent call to
drivePullRemoteViews(entries, duplicateRemote) unchanged.
- Around line 312-357: In drivePullRemoteViews, the switch on duplicateRemote
(which currently handles driveDuplicateRemoteRename, driveDuplicateRemoteNewest
and driveDuplicateRemoteOldest) can silently drop a group if duplicateRemote has
an unexpected value; add a default: arm to the switch that defensively fails
(for example by returning an error or panicking with a clear message) or at
minimum logs the unexpected duplicateRemote and skips/returns so the caller
can't silently lose entries—update the function to handle/propagate that failure
path consistently with its callers.
In `@shortcuts/drive/list_remote.go`:
- Around line 193-214: Add a concise one-line comment above the sortRemoteFiles
function documenting the assumption that ModifiedTime and CreatedTime are
fixed-width Unix-epoch strings (e.g., 10- or 13-digit) so lexicographic
comparison is equivalent to numeric ordering; reference the function name
sortRemoteFiles and the fields ModifiedTime, CreatedTime and FileToken, and
state that this invariant is relied on for correct ordering and stability so
future contributors don't alter the comparison to mixed-format timestamps.
- Around line 216-227: chooseRemoteFile currently assumes files is non-empty and
returns candidates[0], which will panic if callers change; make it defensive by
returning a zero-value driveRemoteEntry when len(files)==0 (update the
chooseRemoteFile function). relPathWithFileTokenSuffix intentionally uses
path.Ext once so dotfiles (`.gitignore` -> `__lark_<token>.gitignore`) and
multi-part extensions (`archive.tar.gz` -> `archive.tar__lark_<token>.gz`)
behave that way; add a concise comment above relPathWithFileTokenSuffix
describing these edge cases and that the single path.Ext usage is intentional to
document the rename shape.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1889e61a-f684-4400-b399-854a7395d63b
📒 Files selected for processing (9)
shortcuts/drive/drive_duplicate_remote_test.goshortcuts/drive/drive_pull.goshortcuts/drive/drive_push.goshortcuts/drive/drive_status.goshortcuts/drive/list_remote.goskills/lark-drive/SKILL.mdskills/lark-drive/references/lark-drive-pull.mdskills/lark-drive/references/lark-drive-push.mdskills/lark-drive/references/lark-drive-status.md
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
shortcuts/drive/list_remote.go (1)
68-118:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHonor cancellation inside the recursive listing.
The
ctxparameter is threaded through this helper but never observed, so a large Drive tree keeps paging and descending even after the command has been canceled. Addctx.Err()checks before each page fetch and before recursing to bail out immediately on cancellation.Alternatively, consider refactoring to use
RuntimeContext.StreamPages()orRuntimeContext.PaginateAll(), which are context-aware pagination helpers that handle cancellation internally and may eliminate manual pagination logic.💡 Minimal fix
func listRemoteFolderEntries(ctx context.Context, runtime *common.RuntimeContext, folderToken, relBase string) ([]driveRemoteEntry, error) { var out []driveRemoteEntry pageToken := "" for { + if err := ctx.Err(); err != nil { + return nil, err + } params := map[string]interface{}{ "folder_token": folderToken, "page_size": fmt.Sprint(driveListRemotePageSize), } if pageToken != "" { params["page_token"] = pageToken } result, err := runtime.CallAPI("GET", "/open-apis/drive/v1/files", params, nil) if err != nil { return nil, err } rawFiles, _ := result["files"].([]interface{}) for _, item := range rawFiles { f, ok := item.(map[string]interface{}) if !ok { continue } fType := common.GetString(f, "type") fName := common.GetString(f, "name") fToken := common.GetString(f, "token") if fName == "" || fToken == "" { continue } rel := joinRelDrive(relBase, fName) out = append(out, driveRemoteEntry{ FileToken: fToken, Name: fName, Size: int64(common.GetFloat(f, "size")), Type: fType, CreatedTime: common.GetString(f, "created_time"), ModifiedTime: common.GetString(f, "modified_time"), RelPath: rel, }) if fType == driveTypeFolder { + if err := ctx.Err(); err != nil { + return nil, err + } sub, err := listRemoteFolderEntries(ctx, runtime, fToken, rel) if err != nil { return nil, err } out = append(out, sub...) } } hasMore, nextToken := common.PaginationMeta(result) if !hasMore || nextToken == "" { break } pageToken = nextToken }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@shortcuts/drive/list_remote.go` around lines 68 - 118, The recursive lister listRemoteFolderEntries ignores ctx so cancellation is not honored; before each runtime.CallAPI page fetch and before recursing into listRemoteFolderEntries, check ctx.Err() and if non-nil return that error immediately to stop paging/recursion; alternatively replace the manual pagination loop with the context-aware helpers RuntimeContext.StreamPages() or RuntimeContext.PaginateAll() to handle cancellation for you.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@shortcuts/drive/list_remote.go`:
- Around line 207-211: The function relPathWithFileTokenSuffix currently turns
dotfiles like ".env" into "__lark_<token>.env" (losing the leading dot); fix by
detecting hidden files where base starts with '.' and stem=="" (i.e., ext ==
base), and in that case append the suffix after the whole base: return dir +
base + "__lark_"+fileToken; otherwise keep the existing logic using
path.Split/path.Ext/stem to build dir+stem+"__lark_"+fileToken+ext.
---
Outside diff comments:
In `@shortcuts/drive/list_remote.go`:
- Around line 68-118: The recursive lister listRemoteFolderEntries ignores ctx
so cancellation is not honored; before each runtime.CallAPI page fetch and
before recursing into listRemoteFolderEntries, check ctx.Err() and if non-nil
return that error immediately to stop paging/recursion; alternatively replace
the manual pagination loop with the context-aware helpers
RuntimeContext.StreamPages() or RuntimeContext.PaginateAll() to handle
cancellation for you.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bf0b1c43-7f75-4cd6-92dd-4c4a0c2139f0
📒 Files selected for processing (4)
shortcuts/drive/drive_duplicate_remote_test.goshortcuts/drive/list_remote.goskills/lark-drive/references/lark-drive-pull.mdskills/lark-drive/references/lark-drive-status.md
✅ Files skipped from review due to trivial changes (1)
- skills/lark-drive/references/lark-drive-status.md
🚧 Files skipped from review as they are similar to previous changes (2)
- skills/lark-drive/references/lark-drive-pull.md
- shortcuts/drive/drive_duplicate_remote_test.go
3e21374 to
74fa3a0
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
skills/lark-drive/SKILL.md (1)
241-241:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEscape the pipe in the inline option list to avoid breaking the table
The unescaped
|in`--on-duplicate-remote newest|oldest`is parsed as a column separator, breaking the table structure.Suggested fix
-| [`+push`](references/lark-drive-push.md) | Mirror a local directory onto a Drive folder (local → Drive). Duplicate remote `type=file` entries fail by default before upload / overwrite / delete; use `--on-duplicate-remote newest|oldest` only when explicitly targeting one existing remote file. Supports `--if-exists` (overwrite/skip) and `--delete-remote` for one-way mirror sync; the destructive `--delete-remote` requires `--yes`. `--local-dir` is bounded to cwd by CLI path validation; tell the user to switch the agent's working directory if the source is outside cwd. | +| [`+push`](references/lark-drive-push.md) | Mirror a local directory onto a Drive folder (local → Drive). Duplicate remote `type=file` entries fail by default before upload / overwrite / delete; use `--on-duplicate-remote newest\|oldest` only when explicitly targeting one existing remote file. Supports `--if-exists` (overwrite/skip) and `--delete-remote` for one-way mirror sync; the destructive `--delete-remote` requires `--yes`. `--local-dir` is bounded to cwd by CLI path validation; tell the user to switch the agent's working directory if the source is outside cwd. |🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/lark-drive/SKILL.md` at line 241, The table row contains an inline option list ``--on-duplicate-remote newest|oldest`` whose unescaped pipe breaks the Markdown table; update the inline code for the option in SKILL.md (the `+push` table row) to escape the pipe (use a backslash before the `|` or an HTML entity) so the string becomes `--on-duplicate-remote newest\|oldest` (or equivalent) to preserve the table structure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@skills/lark-drive/SKILL.md`:
- Line 241: The table row contains an inline option list ``--on-duplicate-remote
newest|oldest`` whose unescaped pipe breaks the Markdown table; update the
inline code for the option in SKILL.md (the `+push` table row) to escape the
pipe (use a backslash before the `|` or an HTML entity) so the string becomes
`--on-duplicate-remote newest\|oldest` (or equivalent) to preserve the table
structure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b195d2b8-627b-42ee-b1c3-ad455508594b
📒 Files selected for processing (9)
shortcuts/drive/drive_duplicate_remote_test.goshortcuts/drive/drive_pull.goshortcuts/drive/drive_push.goshortcuts/drive/drive_status.goshortcuts/drive/list_remote.goskills/lark-drive/SKILL.mdskills/lark-drive/references/lark-drive-pull.mdskills/lark-drive/references/lark-drive-push.mdskills/lark-drive/references/lark-drive-status.md
✅ Files skipped from review due to trivial changes (2)
- skills/lark-drive/references/lark-drive-status.md
- skills/lark-drive/references/lark-drive-pull.md
🚧 Files skipped from review as they are similar to previous changes (6)
- shortcuts/drive/drive_status.go
- shortcuts/drive/drive_pull.go
- shortcuts/drive/drive_duplicate_remote_test.go
- shortcuts/drive/list_remote.go
- shortcuts/drive/drive_push.go
- skills/lark-drive/references/lark-drive-push.md
Summary
Tests
Summary by CodeRabbit
New Features
Tests
Documentation