Web: large-file uploads fail (ERR_CONNECTION_CLOSED) — fula-client blob retry is compiled out on wasm32 #50

@ehsan6sha

Description

Symptom

On the web app, small files upload fine; large files fail with FulaApiException: Failed to upload large file: AnyhowException(HTTP error: error sending request). Console shows a chunk PUT …/videos-v8/Qm… net::ERR_CONNECTION_CLOSED (plus a benign 409 bucket-exists).

Root cause (confirmed in fula-client 0.6.9 source + the live gateway config)

  • Large content is chunked into many <1 MB block PUTs (should_use_chunked, then put_object_chunked_internal), uploaded 16 at a time (buffer_unordered(MAX_CONCURRENT_CHUNK_UPLOADS = 16)).
  • The blob-backend PUT retry loop is compiled out on wasm32: BLOB_BACKEND_MAX_ATTEMPTS = 4 and the retry/backoff constants are all gated behind #[cfg(not(target_arch = "wasm32"))]. Native makes up to 4 attempts with backoff when a PUT fails transiently; web makes a single put_object call with no retry.
  • The gateway is not the limiter (ruled out by reading the nginx config): the s3.cloud.fx.land:443 block allows limit_conn fula_conn 100, limit_req 600 r/s, client_max_body_size 5G — so 16 concurrent <1 MB PUTs are well within limits. (The conn_limit_per_ip 10 / fula_conn 10 belong to other server blocks, not the upload host.)
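The cfg gating described above can be pictured with a minimal sketch. This is not the fula-client source: only the constant name and its value of 4 come from this issue, and put_attempts_for_target is a hypothetical helper added for illustration.

```rust
// Hypothetical illustration; NOT the actual fula-client source.
// Only BLOB_BACKEND_MAX_ATTEMPTS and its value (4) are taken from the issue.

/// Native targets: the retry budget exists.
#[cfg(not(target_arch = "wasm32"))]
const BLOB_BACKEND_MAX_ATTEMPTS: u32 = 4;

/// Number of attempts one chunk PUT gets on a native target.
#[cfg(not(target_arch = "wasm32"))]
pub fn put_attempts_for_target() -> u32 {
    BLOB_BACKEND_MAX_ATTEMPTS
}

/// wasm32: the constant above is not compiled in, so the only
/// behaviour left is a single attempt with no backoff.
#[cfg(target_arch = "wasm32")]
pub fn put_attempts_for_target() -> u32 {
    1
}
```

Because the gate is applied at compile time, nothing at runtime can switch the retry back on for a wasm32 build; the web build simply does not contain that code path.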

So it's a sporadic connection drop on one chunk (the upstream recycling a connection under the burst), and web's missing retry turns a single drop into a total upload failure. Small files (1 chunk) almost never hit a drop — hence small-OK / large-fail.
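The small-OK / large-fail pattern follows from simple arithmetic, sketched below. The model is an assumption, not a measurement: each PUT attempt is treated as dropping independently with the same probability, and the function name and the example rate are hypothetical.

```rust
/// Probability that a whole upload fails, assuming every PUT attempt drops
/// independently with probability `p_drop` (a simplifying assumption).
/// `chunks` is the number of chunk PUTs; `attempts_per_chunk` is 1 on web
/// today and 4 on native according to the issue.
pub fn upload_failure_probability(p_drop: f64, chunks: u32, attempts_per_chunk: u32) -> f64 {
    // A chunk is lost only if every one of its attempts drops.
    let chunk_fail = p_drop.powi(attempts_per_chunk as i32);
    // The upload fails if at least one chunk is lost.
    1.0 - (1.0 - chunk_fail).powi(chunks as i32)
}
```

With an illustrative drop rate of 0.1% per attempt, a single-chunk file fails about 0.1% of the time, while a 1000-chunk file with no retry fails roughly 63% of the time. Giving each chunk 4 attempts brings the same 1000-chunk upload down to about one failure in a billion under this model.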

Fix (web-only)

Re-add the retry the SDK omits on web: wrap the web upload (web_upload_manager) in retryAsync, retrying transient errors only (error sending request / connection-closed / 5xx) with ~300 ms–1.2 s backoff, up to 4 attempts. Not retried: 4xx, including the benign 409. Re-running is safe for two reasons. First, putFlat is idempotent (content-addressed chunks; path-keyed forest entry). Second, a chunk failure happens before the forest upsert/flush, and the SDK also deletes the uploaded chunks on failure. The result is no duplicates and no dirty-forest state.
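The retry policy above can be sketched as follows. The actual helper lives in Dart (retryAsync), so this Rust version only illustrates the logic. UploadError, is_transient, backoff_delay and retry_upload are hypothetical names, and the doubling schedule of 300 ms, 600 ms, 1200 ms is one assumed reading of "~300 ms–1.2 s". The sleep is injected so the sketch stays synchronous and testable.

```rust
use std::time::Duration;

/// Hypothetical error shape; the real app classifies Dart exceptions instead.
#[derive(Debug, Clone, PartialEq)]
pub enum UploadError {
    /// Connection-level failure, e.g. "error sending request".
    Transport(String),
    /// An HTTP response carrying this status code.
    Status(u16),
}

/// Transient means a transport failure or a 5xx response.
/// Any other status (4xx, including the benign 409) is not retried.
pub fn is_transient(err: &UploadError) -> bool {
    match err {
        UploadError::Transport(_) => true,
        UploadError::Status(code) => (500..=599).contains(code),
    }
}

/// Delay before the next try after `failed_tries` failures:
/// 300 ms, 600 ms, then capped at 1200 ms (assumed schedule).
pub fn backoff_delay(failed_tries: u32) -> Duration {
    Duration::from_millis(300u64 << failed_tries.saturating_sub(1).min(2))
}

/// Runs `attempt` until it succeeds, fails with a non-transient error,
/// or `max_attempts` tries are used up. `attempt` always runs at least once.
pub fn retry_upload<T>(
    max_attempts: u32,
    mut attempt: impl FnMut() -> Result<T, UploadError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, UploadError> {
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt() {
            Ok(value) => return Ok(value),
            // Transient failure with budget left: back off, then loop again.
            Err(err) if tries < max_attempts && is_transient(&err) => {
                sleep(backoff_delay(tries))
            }
            // Non-transient failure or budget exhausted: surface the error.
            Err(err) => return Err(err),
        }
    }
}
```

Called with max_attempts = 4, a 409 is returned after one try, while a persistent 503 is tried four times with three sleeps in between before the last error is returned.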

Tests

  • lib/core/utils/upload_retry.dart pure helpers — 8 unit tests (transient classification incl. "409 is not retried"; retry / backoff / exhaustion / immediate-rethrow). flutter analyze clean; flutter build web --release green.

Verification (gates close)

Unit tests prove the retry logic, not that the upload now succeeds — that needs a real large upload. This stays open until a real large file uploads on web. (If it still fails after the retry, the failure is deterministic, not sporadic, and escalates to the SDK fix below.)

Follow-up (SDK — proper fix, affects native parity)

The faithful fix is in fula-client: enable the blob-backend retry on wasm32 (per-chunk retry matching native — needs a wasm-friendly async sleep) and/or lower the wasm chunk concurrency below the gateway's per-IP headroom. The Dart per-file retry here is a coarse stopgap (it re-uploads the whole file on any single-chunk failure).

    Labels

    bug (Something isn't working)