fix(client): retry transient content-chunk PUTs on wasm + native (0.6.10) #42
Merged
….10)

Large (chunked) uploads could fail on a single sporadic chunk-PUT drop (ERR_CONNECTION_CLOSED / transient 5xx under the 16-wide concurrent burst). The content-chunk PUT path (put_object_chunked_internal -> put_object_with_metadata -> request()) had NO retry on either target: the blob-backend retry only wraps __fula_forest_v7_nodes/, and retry_idempotent only wrapped the S3 multipart path AND collapsed to a single attempt on wasm for lack of a sleep primitive. Small (1-chunk) files almost never hit a drop; large ones did, worst on the web (FxFiles #50).

- multipart::retry_idempotent: unify the loop across targets; add a wasm backoff sleep via gloo-timers (TimeoutFuture); make it pub(crate).
- is_transient: un-gate for wasm (body already wasm-aware) so the unified retry can classify transient errors there.
- put_object_chunked_internal: wrap each chunk PUT in retry_idempotent (4 attempts, exponential backoff capped at 5s). Safe: chunk keys are content addressed (idempotent). Per-attempt clones are cheap (Arc / Bytes / small structs).
- add gloo-timers (wasm32 target, futures feature); bump workspace version 0.6.9 -> 0.6.10.

Tests:
- chunk_put_retries_transient (new): RED before / GREEN after -- wiremock injects one transient chunk 503 and asserts the chunked upload survives via the per-chunk retry.
- chunk_retry_real_server_e2e (new, #[ignore]): 3 MiB chunked upload + byte-identical round-trip against the live master (Mode A creds).
- full fula-client suite green (208 lib + 4 blob-backend retry); cargo check --target wasm32-unknown-unknown clean.

Scope note: the per-file INDEX-object PUT (written after the chunks) is the same idempotent, currently-unretried pattern. Left out of scope here (the reported failure and the test both target the content chunks); a one-line follow-up can wrap it in the same retry_idempotent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Problem
Large (chunked) uploads could fail on a single sporadic chunk-PUT drop (ERR_CONNECTION_CLOSED / transient 5xx under the 16-wide concurrent burst). The content-chunk PUT path (put_object_chunked_internal → put_object_with_metadata → request()) had no retry on either target: the blob-backend retry only wraps __fula_forest_v7_nodes/, and retry_idempotent only wrapped the S3-multipart path and collapsed to a single attempt on wasm for lack of a sleep primitive. Small (1-chunk) files almost never hit a drop; large ones did, worst of all on the web (FxFiles #50).

Fix
- multipart::retry_idempotent: unify the loop across targets; add a wasm backoff sleep via gloo-timers (TimeoutFuture); make it pub(crate).
- is_transient: un-gate for wasm (its body is already wasm-aware) so the unified retry can classify transient errors there.
- put_object_chunked_internal: wrap each chunk PUT in retry_idempotent (4 attempts, exponential backoff capped at 5s). This is safe because chunk keys are content-addressed (idempotent), and per-attempt clones are cheap (Arc / Bytes / small structs).
- Add gloo-timers (wasm32 target, futures feature); bump workspace version 0.6.9 → 0.6.10.

Tests / verification
- chunk_put_retries_transient (new): wiremock injects one transient chunk 503; asserts the chunked upload survives via retry. RED before the fix, GREEN after (verified).
- chunk_retry_real_server_e2e (new, #[ignore]): 3 MiB chunked upload + byte-identical round-trip against the live master with Mode A creds; passed.
- fula-client suite green (208 lib + 4 blob-backend retry + 1 chunk-retry, 0 failed).
- cargo check --target wasm32-unknown-unknown -p fula-client clean.

What's proven vs inferred (honest)
The native unit test proves the retry logic on native. The wasm side is compile-verified (cargo check for wasm32): fula-api has no #[wasm_bindgen_test] infra, so wasm runtime retry is inferred from the now-shared loop (same code path, only the sleep differs by target). The real-server E2E proves the upload path is intact end-to-end on native.

Scope note
The per-file index-object PUT (written after the chunks) is the same idempotent, currently-unretried pattern. Left out of scope here (the reported failure and the test both target the content chunks); a one-line follow-up can wrap it in the same retry_idempotent.

Release
Closes the SDK side of FxFiles #50. Going live on the FxFiles web needs the 0.6.10 publish + flutter-wasm-pkg.zip release asset, after which FxFiles bumps fula_client + runs tools/sync-wasm-pkg.ps1.

🤖 Generated with Claude Code
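
Illustrative sketch

The retry policy described under Fix (4 attempts, exponential backoff capped at 5s, retry only on transient errors) can be illustrated with a small synchronous sketch. This is not the crate's retry_idempotent, which is async and sleeps via gloo-timers on wasm. The names retry_sketch, backoff_delay and PutError, the injected sleep callback, and the 250 ms base delay are all assumptions made up for illustration; only the attempt count and the 5s cap come from this PR.

```rust
use std::time::Duration;

// From the PR: 4 attempts, backoff capped at 5s.
const MAX_ATTEMPTS: u32 = 4;
const MAX_DELAY: Duration = Duration::from_secs(5);
// Assumption: the PR does not state the base delay.
const BASE_DELAY: Duration = Duration::from_millis(250);

/// Hypothetical error type standing in for the client's real error.
#[derive(Debug, PartialEq)]
enum PutError {
    Transient(u16), // e.g. a 503 or a dropped connection
    Permanent(u16), // e.g. a 403; retrying would not help
}

/// Delay before retry number `retry` (0-based): base * 2^retry, capped.
fn backoff_delay(retry: u32) -> Duration {
    let factor = 2u32.saturating_pow(retry);
    BASE_DELAY.saturating_mul(factor).min(MAX_DELAY)
}

/// Runs `op` up to MAX_ATTEMPTS times, sleeping between attempts.
/// Only errors accepted by `is_transient` are retried. The sleep is
/// injected so each target (or a test) can supply its own primitive.
fn retry_sketch<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    is_transient: impl Fn(&E) -> bool,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempt + 1 < MAX_ATTEMPTS && is_transient(&e) => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
            // Permanent error, or attempts exhausted: give up.
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // One transient 503, then success: the upload survives.
    let mut calls = 0;
    let mut slept = Vec::new();
    let ok = retry_sketch(
        || {
            calls += 1;
            if calls == 1 { Err(PutError::Transient(503)) } else { Ok("stored") }
        },
        |e| matches!(e, PutError::Transient(_)),
        |d| slept.push(d),
    );
    println!("{:?} after {} calls, slept {:?}", ok, calls, slept);

    // A permanent error is returned immediately, without sleeping.
    let denied: Result<(), PutError> = retry_sketch(
        || Err(PutError::Permanent(403)),
        |e| matches!(e, PutError::Transient(_)),
        |_| {},
    );
    println!("{:?}", denied);
}
```

Retrying is only sound here because the operation is idempotent. As noted above, chunk keys are content-addressed, so repeating a PUT that may have partially succeeded writes the same bytes to the same key.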