Symptom
On the web app, small files upload fine; large files fail with FulaApiException: Failed to upload large file: AnyhowException(HTTP error: error sending request). Console shows a chunk PUT …/videos-v8/Qm… net::ERR_CONNECTION_CLOSED (plus a benign 409 bucket-exists).
Root cause (confirmed in fula-client 0.6.9 source + the live gateway config)
- Large content is chunked into many <1 MB block PUTs (should_use_chunked → put_object_chunked_internal), uploaded 16 at a time (buffer_unordered(MAX_CONCURRENT_CHUNK_UPLOADS = 16)).
- The blob-backend PUT retry loop is compiled out on wasm32: BLOB_BACKEND_MAX_ATTEMPTS = 4 and the retry/backoff constants are all #[cfg(not(target_arch = "wasm32"))]. Native makes up to 4 attempts at a transiently failing PUT, with backoff between them; web makes a single put_object call.
- The gateway is not the limiter (ruled out by reading the nginx config): the s3.cloud.fx.land:443 block allows limit_conn fula_conn 100, limit_req 600 r/s, client_max_body_size 5G, so 16 concurrent <1 MB PUTs are well within limits. (The conn_limit_per_ip 10 / fula_conn 10 limits belong to other server blocks, not the upload host.)
So it's a sporadic connection drop on one chunk (the upstream recycling a connection under the burst), and web's missing retry turns a single drop into a total upload failure. Small files (1 chunk) almost never hit a drop — hence small-OK / large-fail.
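The small-OK / large-fail pattern follows from simple arithmetic, sketched below. The per-attempt drop probability and the assumption that drops are independent are illustrative only; nothing here is measured from the gateway, and the function name is hypothetical.

```rust
/// Probability that an upload of `chunks` block PUTs fails when every chunk
/// gets `attempts` tries and each try drops independently with probability
/// `p_drop`. Independence and the value of `p_drop` are assumptions.
fn upload_failure_probability(p_drop: f64, chunks: u32, attempts: u32) -> f64 {
    // A chunk is lost only if all of its attempts fail.
    let p_chunk_lost = p_drop.powi(attempts as i32);
    // The upload fails if at least one chunk is lost.
    1.0 - (1.0 - p_chunk_lost).powi(chunks as i32)
}
```

With a hypothetical 0.1% drop rate and no retry, a 1-chunk file fails about 0.1% of the time, while a file split into 1000 chunks fails about 63% of the time. With 4 attempts per chunk the same 1000-chunk upload fails with probability on the order of 1e-9.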
Fix (web-only)
Re-add the retry the SDK omits on web: retryAsync around the web upload (web_upload_manager), retrying transient errors only (error sending request / connection-closed / 5xx) with ~300 ms–1.2 s backoff, up to 4 attempts. Not retried: 4xx and the benign 409. putFlat is idempotent (content-addressed chunks; path-keyed forest entry) and a chunk failure happens before the forest upsert/flush (the SDK also deletes the uploaded chunks on failure), so re-running is safe — no duplicates, no dirty-forest state.
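The retry policy described above can be written down as two pure functions. This is a sketch in Rust (to match the SDK source quoted earlier), not the Dart helper itself: the UploadError type, the function names, and the exact doubling schedule are assumptions chosen to fit the stated "~300 ms–1.2 s, up to 4 attempts".

```rust
use std::time::Duration;

/// Hypothetical error shape used only for this sketch.
#[derive(Debug)]
enum UploadError {
    /// Transport-level failure, e.g. "error sending request" or a closed connection.
    Transport(String),
    /// The server answered with this HTTP status.
    Http(u16),
}

/// Only transport failures and 5xx responses are retried.
/// Every 4xx, including the benign 409, is returned to the caller at once.
fn is_transient(err: &UploadError) -> bool {
    match err {
        UploadError::Transport(_) => true,
        UploadError::Http(status) => (500..=599).contains(status),
    }
}

/// Delay before retry number `retry` (1-based): 300 ms, 600 ms, 1200 ms.
/// Four attempts leave room for three retries, so the schedule is capped at 1.2 s.
fn backoff_delay(retry: u32) -> Duration {
    let exponent = retry.saturating_sub(1).min(2);
    Duration::from_millis(300 * 2u64.pow(exponent))
}
```

Keeping classification and backoff free of I/O is what makes them unit-testable without a network.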
Tests
lib/core/utils/upload_retry.dart pure helpers — 8 unit tests (transient classification incl. "409 is not retried"; retry / backoff / exhaustion / immediate-rethrow). flutter analyze clean; flutter build web --release green.
Verification (gates close)
Unit tests prove the retry logic, not that the upload now succeeds — that needs a real large upload. This stays open until a real large file uploads on web. (If it still fails after the retry, the failure is deterministic, not sporadic, and escalates to the SDK fix below.)
Follow-up (SDK — proper fix, affects native parity)
The faithful fix is in fula-client: enable the blob-backend retry on wasm32 (per-chunk retry matching native; this needs a wasm-friendly async sleep) and/or lower the wasm chunk concurrency to soften the burst that appears to trigger the upstream connection recycling (the gateway limits themselves leave ample headroom). The Dart per-file retry here is a coarse stopgap: it re-uploads the whole file on any single-chunk failure.
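A per-chunk retry of the kind proposed for the SDK might look like the following. It is a minimal synchronous sketch, not fula-client code: retry_transient and its parameters are hypothetical, and the sleep is injected as a callback because a blocking sleep is generally not available in the browser. A real wasm32 version would be async and await a timer future instead.

```rust
use std::time::Duration;

/// Runs `op` up to `max_attempts` times, calling `sleep` between tries.
/// Non-transient errors and the final failed attempt are returned unchanged.
fn retry_transient<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
    is_transient: impl Fn(&E) -> bool,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only transient errors, and only while attempts remain.
            Err(err) if attempt < max_attempts && is_transient(&err) => {
                // Assumed schedule: 300 ms, 600 ms, then 1200 ms for all later retries.
                let exponent = (attempt - 1).min(2);
                sleep(Duration::from_millis(300 * 2u64.pow(exponent)));
                attempt += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

Wrapping each chunk PUT this way means a single dropped connection costs one extra <1 MB request rather than a re-upload of the whole file.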