feat(media): stamp upload attribution on media objects and sidecars #1507
baxen wants to merge 4 commits
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 71b5ee4d7f
Media uploads now carry upload attribution so operators and out-of-band consumers can attribute any stored object without relay internals:

- S3 object metadata on the blob and thumbnail PUTs: x-amz-meta-buzz-uploader-id (authenticated Blossom uploader pubkey, hex) and x-amz-meta-buzz-community-id (host-resolved community UUID), readable from a bare HEAD on the object.
- The same fields on the BlobMeta sidecar (uploader_id / community_id), nullable with serde defaults so older sidecars still parse.

The community always comes from the server-resolved TenantContext (row-zero host binding), never from client input. All three upload paths are covered: image, generic file, and streaming video (the video path attaches metadata via bucket extra_headers so it survives multipart uploads).

Note: blobs are shared content-addressed storage across communities, so a re-upload of identical bytes under another tenant overwrites the object metadata with the most recent uploader; the community-scoped sidecar remains the authoritative per-tenant record.
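The cross-tenant overwrite described in the note can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyStore`, `Attribution`, and `upload` are hypothetical names, the hash and IDs are placeholders, and same-community re-uploads are deliberately not modeled. It shows a shared blob keyed by content hash next to sidecars keyed by (community, content hash).

```rust
use std::collections::HashMap;

/// Attribution stamp (hypothetical shape, for illustration only).
#[derive(Clone, Debug, PartialEq)]
struct Attribution {
    uploader_id: String,
    community_id: String,
}

/// Toy model, not the real storage layer: blob metadata is keyed by content
/// hash alone, sidecars are keyed by (community_id, content hash).
#[derive(Default)]
struct ToyStore {
    blob_meta: HashMap<String, Attribution>,
    sidecars: HashMap<(String, String), Attribution>,
}

impl ToyStore {
    /// Models one upload. Same-community re-uploads are not modeled here.
    fn upload(&mut self, sha256: &str, uploader_id: &str, community_id: &str) {
        let stamp = Attribution {
            uploader_id: uploader_id.to_string(),
            community_id: community_id.to_string(),
        };
        // Shared blob: a later PUT of identical bytes replaces the metadata.
        self.blob_meta.insert(sha256.to_string(), stamp.clone());
        // Community-scoped sidecar: other tenants never touch this record.
        self.sidecars
            .insert((community_id.to_string(), sha256.to_string()), stamp);
    }
}

fn main() {
    let mut store = ToyStore::default();
    store.upload("abc123", "alice", "community-a");
    store.upload("abc123", "bob", "community-b");

    // The blob-level stamp now reflects the most recent uploader.
    println!("blob: {:?}", store.blob_meta["abc123"]);
    // The first tenant's sidecar still names its own uploader.
    let key = ("community-a".to_string(), "abc123".to_string());
    println!("sidecar: {:?}", store.sidecars[&key]);
}
```

In this model a consumer that reads only the blob stamp sees the second uploader, which is why the sidecar is treated as the per-tenant record.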
71b5ee4 to 783c549
tlongwell-block left a comment
Reviewed against the #1321 tenant-fence and #1444 Blossom-auth refactors (with Wren) — this is clean: attribution derives only from the authenticated kind:24242 pubkey and the server-resolved TenantContext, never client input; sidecars stay community-scoped; all three upload paths plus derived thumbnails are covered. Verified in rust-s3 0.37 source that bucket-level x-amz-meta-* extra headers are forwarded to InitiateMultipartUpload (request_trait.rs headers()), so the >8 MiB multipart claim holds.
Two non-blocking asks:

- Hoist the metadata key names. "buzz-uploader-id" / "buzz-community-id" are repeated as string literals at three call sites in upload.rs. A pair of consts + a small attribution_meta(uploader_id, community_id) helper would keep the key names from drifting and shrink the call sites.
- Prove the read side. The PR writes HEAD-readable metadata, but head_with_metadata / BlobHeadMeta still only surface size; nothing in-tree demonstrates x-amz-meta-buzz-uploader-id actually round-trips through our storage wrapper. Extending BlobHeadMeta with the metadata map (or a MinIO integration assertion) would bless the S3-HEAD moderation use case properly. Fine as a fast-follow if moderation tooling will HEAD S3 directly.
One semantic worth keeping in the PR description for moderation consumers: blob metadata is last-writer-wins across tenants (shared CAS), and the idempotent short-circuit means same-community re-uploads keep the first uploader's stamp. So blob HEAD is an attribution hint; the per-community sidecar and the MediaUploaded audit log (actor_pubkey/object_id) remain the authoritative records. The in-code comment already says this — 👍.
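For the first ask, the hoisted key names and helper might look like the minimal sketch below. The key strings and the x-amz-meta- prefix come from this PR; the const names, the return type, and the exact shape of `attribution_meta` are assumptions for illustration, not the code in upload.rs.

```rust
/// Metadata key names, defined once so call sites cannot drift.
/// The const names are hypothetical; the key strings are from the PR.
const META_UPLOADER_ID: &str = "buzz-uploader-id";
const META_COMMUNITY_ID: &str = "buzz-community-id";
/// Prefix S3 uses for user-defined object metadata headers.
const AMZ_META_PREFIX: &str = "x-amz-meta-";

/// Builds the attribution headers as (header name, value) pairs.
/// Returning a Vec of pairs is an assumption; a real caller might need a
/// header map type from its HTTP or S3 client instead.
fn attribution_meta(uploader_id: &str, community_id: &str) -> Vec<(String, String)> {
    vec![
        (
            format!("{}{}", AMZ_META_PREFIX, META_UPLOADER_ID),
            uploader_id.to_string(),
        ),
        (
            format!("{}{}", AMZ_META_PREFIX, META_COMMUNITY_ID),
            community_id.to_string(),
        ),
    ]
}

fn main() {
    // Placeholder values; real callers would pass the authenticated pubkey
    // and the server-resolved community UUID.
    for (name, value) in attribution_meta("<uploader pubkey hex>", "<community uuid>") {
        println!("{}: {}", name, value);
    }
}
```

With a helper like this, the blob, video, and thumbnail writes would all call the same function instead of repeating the string literals.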
Hoist upload attribution metadata keys and expose S3 user metadata from BlobHeadMeta so HEAD callers can verify the uploader/community stamp round-trips.
Co-authored-by: Bradley Axen <baxen@squareup.com>
Signed-off-by: Bradley Axen <baxen@squareup.com>

Stamp uploader display name and tenant host alias alongside authoritative uploader and community IDs so moderation HEAD metadata is easier to read.
Co-authored-by: Bradley Axen <baxen@squareup.com>
Signed-off-by: Bradley Axen <baxen@squareup.com>

Replace the community alias label with the full server-resolved tenant hostname in S3 metadata and sidecars.
Co-authored-by: Bradley Axen <baxen@squareup.com>
Signed-off-by: Bradley Axen <baxen@squareup.com>
Actually will need another approach, CAS complicates this.
Context
This PR persists upload attribution at upload time.
What changed
S3 object metadata on the blob PUT (all three upload paths: image, generic file, streaming video):

- x-amz-meta-buzz-uploader-id: authenticated Blossom uploader pubkey (hex, matching what the relay stores)
- x-amz-meta-buzz-community-id: host-resolved community UUID

A HEAD on any newly uploaded blob now returns attribution without touching relay internals.

Sidecar (BlobMeta): the same fields as nullable uploader_id / community_id with serde defaults, so pre-attribution sidecars still deserialize and absent values are omitted (not null).

MediaStorage: new put_with_metadata (builder-based single PUT) and put_file_with_metadata. The streaming video path attaches metadata via bucket-level extra_headers rather than the stream builder because rust-s3 only forwards builder headers on the small-file branch; extra_headers are also applied to InitiateMultipartUpload, so metadata survives files above the 8 MiB chunk threshold.

Tests

- x-amz-meta-* header construction: prefixing, control-character value rejection, invalid key rejection
- cargo test -p buzz-media (44 passed), cargo clippy -p buzz-media -p buzz-relay --all-targets clean

Update (2026-07-03)
Addressed review feedback:

- Hoisted the metadata key names behind an attribution_meta(...) helper so the uploader/community keys cannot drift between blob, video, and thumbnail writes.
- Extended BlobHeadMeta to include the S3 user metadata map returned by head_with_metadata, with a unit test proving buzz-uploader-id / buzz-community-id surface from the rust-s3 HeadObjectResult and a MinIO integration assertion for live round-trip coverage.

Moderation note: blob object metadata is an attribution hint. Because blobs are shared CAS across tenants, object metadata is last-writer-wins across tenant re-uploads, and the same-community idempotent short-circuit keeps the first uploader's stamp. The community-scoped sidecar and MediaUploaded audit log (actor_pubkey / object_id) remain authoritative.

Additional local verification:

- cargo test -p buzz-media (45 passed, 1 ignored)
- cargo clippy -p buzz-media --all-targets -- -D warnings

Update (2026-07-04)
Added human-readable labels for moderation consumers:

- x-amz-meta-buzz-uploader-name / sidecar uploader_name from the uploader's configured profile display name, when known.
- x-amz-meta-buzz-community-host / sidecar community_host from the full server-resolved tenant host (for example moderation.example.com).

These labels are sanitized/bounded header-safe readability hints only. The authoritative fields remain buzz-uploader-id, buzz-community-id, the community-scoped sidecar, and the MediaUploaded audit log.

Additional local verification:

- cargo test -p buzz-media (46 passed, 1 ignored)
- cargo check -p buzz-relay -p buzz-media
- cargo clippy -p buzz-media -p buzz-relay --all-targets -- -D warnings

Update (2026-07-04, follow-up)
Changed the community readability label from host-prefix alias to the full server-resolved tenant hostname, per review:

- x-amz-meta-buzz-community-host
- community_host

Re-ran local verification after the change:

- cargo test -p buzz-media (46 passed, 1 ignored)
- cargo check -p buzz-relay -p buzz-media
- cargo clippy -p buzz-media -p buzz-relay --all-targets -- -D warnings
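The description calls the uploader-name and community-host labels "sanitized/bounded header-safe" without giving the rules. The sketch below shows one plausible way to do that, not the PR's implementation: the 64-byte bound, the '?' replacement character, and the name `header_safe_label` are all assumptions.

```rust
/// Upper bound on label length in bytes. The real bound is not stated in the
/// PR; 64 is an assumption for this sketch.
const MAX_LABEL_BYTES: usize = 64;

/// Reduces a free-form label to a bounded, header-safe value.
/// Printable ASCII and spaces are kept; every other character (control
/// characters, non-ASCII) is replaced with '?'. Returns None when nothing
/// remains after trimming surrounding whitespace.
fn header_safe_label(raw: &str) -> Option<String> {
    let mut out = String::new();
    for ch in raw.trim().chars() {
        // Every pushed char is ASCII, so byte length equals char count.
        if out.len() >= MAX_LABEL_BYTES {
            break;
        }
        let safe = if ch.is_ascii_graphic() || ch == ' ' { ch } else { '?' };
        out.push(safe);
    }
    if out.is_empty() {
        None
    } else {
        Some(out)
    }
}

fn main() {
    println!("{:?}", header_safe_label("moderation.example.com"));
    println!("{:?}", header_safe_label("Ali\nce"));
    println!("{:?}", header_safe_label("   "));
}
```

Returning `None` for an empty result fits the "when known" wording above: a label that sanitizes to nothing would simply be omitted, while the ID fields stay authoritative.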