Skip to content

Add synced app-defined metadata for remote resources#75

Draft
hahn-kev wants to merge 3 commits into
mainfrom
cursor/remote-resource-metadata
Draft

Add synced app-defined metadata for remote resources#75
hahn-kev wants to merge 3 commits into
mainfrom
cursor/remote-resource-metadata

Conversation

@hahn-kev

@hahn-kev hahn-kev commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add generic RemoteResource<TMetadata> with synced Metadata so clients can list media without downloading blobs
  • Introduce ResourceService<TMetadata>, HarmonyResource<TMetadata>, and AddCrdtRemoteResources<TMetadata>() DI registration
  • Add CRDT changes: SetRemoteResourceMetadataChange<TMetadata>, DeleteRemoteResourceChange<TMetadata> (stable delete:RemoteResource type name)
  • Make IRemoteResourceService<TMetadata> generic; UploadResult<TMetadata> can return server-authored metadata that overrides client-passed metadata on upload
  • Add NoMetadata marker type for apps that do not need synced metadata fields
  • Sample: MediaMetadata record in SIL.Harmony.Sample

Breaking changes

  • RemoteResource, ResourceService, HarmonyResource, and IRemoteResourceService are now generic-only
  • CrdtConfig.AddRemoteResourceEntity() (non-generic) removed — use AddCrdtRemoteResources<TMetadata>()
  • DeleteChange<RemoteResource> replaced by DeleteRemoteResourceChange<TMetadata>
  • SyncWithResourceUpload now requires <TMetadata> type argument

Summary by CodeRabbit

Release Notes

  • New Features

    • Resources now support optional metadata (e.g., filename, MIME type, file size) that syncs with remote storage.
    • Upload results include metadata returned from remote services.
    • Added ability to update resource metadata after creation.
    • Improved remote resource deletion handling.
  • Tests

    • Added comprehensive test coverage for resource metadata operations and synchronization.

hahn-kev and others added 3 commits June 17, 2026 10:02
…teResource<TMetadata>.

Apps opt in with ResourceService<TMetadata> and AddCrdtRemoteResources<TMetadata> so metadata participates in CRDT sync.

Co-authored-by: Cursor <cursoragent@cursor.com>
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae3404e8-affe-468f-b461-8f5f2c2da649

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cursor/remote-resource-metadata

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/SIL.Harmony/ResourceService.cs (1)

126-146: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Server-returned metadata not persisted during pending upload completion.

When UploadPendingResources processes pending uploads, the uploadResult may contain server-authored Metadata (similar to AddLocalResource at line 80). However, only RemoteResourceUploadedChange is created, which sets the RemoteId but doesn't persist any metadata returned by the server.

If the server can return authoritative metadata during upload (e.g., computed hashes, timestamps), this would be lost for resources that were initially created as pending uploads.

Consider adding a SetRemoteResourceMetadataChange when uploadResult.Metadata is not null:

Proposed fix
 foreach (var localResource in pendingUploads)
 {
     var uploadResult =
         await remoteResourceService.UploadResource(localResource.Id, localResource.LocalPath);
     changes.Add(new RemoteResourceUploadedChange<TMetadata>(localResource.Id, uploadResult.RemoteId));
+    if (uploadResult.Metadata is not null)
+    {
+        changes.Add(new SetRemoteResourceMetadataChange<TMetadata>(localResource.Id, uploadResult.Metadata));
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/SIL.Harmony/ResourceService.cs` around lines 126 - 146, In the
UploadPendingResources method, after creating the RemoteResourceUploadedChange
for each uploaded resource, check if uploadResult.Metadata is not null. When
metadata exists, also create a SetRemoteResourceMetadataChange with the
server-returned metadata and add it to the changes list alongside the
RemoteResourceUploadedChange. This ensures that any authoritative metadata
returned by the server during upload is persisted in the same way as handled in
the AddLocalResource method.
🧹 Nitpick comments (1)
src/SIL.Harmony/CrdtConfig.cs (1)

89-96: 💤 Low value

Consider using consistent JSON serialization options.

The Metadata property conversion uses (JsonSerializerOptions?)null (default options), while the rest of the CRDT system uses configured JsonSerializerOptions from CrdtConfig.JsonSerializerOptions. If custom type resolvers or converters are needed for metadata types, this could cause serialization mismatches.

If this is intentional to keep metadata serialization simple and independent, consider adding a brief comment explaining the design choice.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/SIL.Harmony/CrdtConfig.cs` around lines 89 - 96, The Metadata property
conversion is using default JSON serialization options (null) for both
JsonSerializer.Serialize and JsonSerializer.Deserialize calls, while the rest of
the CRDT system uses configured options from CrdtConfig.JsonSerializerOptions.
Either update both the Serialize and Deserialize conversions to use
CrdtConfig.JsonSerializerOptions instead of null to maintain consistency, or if
intentional, add a brief comment in the HasConversion block explaining why
Metadata serialization is kept simple and independent from the rest of the CRDT
configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/SIL.Harmony.Tests/ResourceTests/RemoteServiceMock.cs`:
- Around line 47-52: The _metadata dictionary is being written with normalized
keys using Path.GetFullPath(localPath) in the SetUploadMetadata method on line
52, but read with raw localPath in the TryGetValue call on line 47. This causes
relative and absolute path variants to have mismatched keys and fail lookups.
Fix this by normalizing the key in the TryGetValue call on line 47 to use
Path.GetFullPath(localPath) instead of raw localPath, ensuring consistency
between how metadata is stored in SetUploadMetadata and retrieved in the upload
result method.

In `@src/SIL.Harmony/Resource/RemoteResource.cs`:
- Around line 47-50: The CloneMetadata method lacks validation to ensure
TMetadata can be safely serialized and deserialized to JSON, which can cause
runtime crashes during copy operations when non-serializable types are used. Add
fail-fast validation in the CloneMetadata method to check whether the metadata
instance can be successfully serialized before attempting the JSON round-trip
serialization, and throw a clear, descriptive exception if serialization fails.
This ensures deterministic error handling and prevents silent failures during
sync flows.

In `@src/SIL.Harmony/ResourceService.cs`:
- Around line 157-164: The UploadPendingResource method does not persist server
metadata from the uploadResult when available. When creating the
RemoteResourceUploadedChange<TMetadata> object, you need to extract and include
the metadata from uploadResult (in addition to the localResource.Id and
uploadResult.RemoteId that are already being passed). Check the
RemoteResourceUploadedChange<TMetadata> constructor to understand what metadata
parameter it expects, then update the method to pass the appropriate metadata
from uploadResult to ensure server-provided metadata is properly persisted in
the change log.

---

Outside diff comments:
In `@src/SIL.Harmony/ResourceService.cs`:
- Around line 126-146: In the UploadPendingResources method, after creating the
RemoteResourceUploadedChange for each uploaded resource, check if
uploadResult.Metadata is not null. When metadata exists, also create a
SetRemoteResourceMetadataChange with the server-returned metadata and add it to
the changes list alongside the RemoteResourceUploadedChange. This ensures that
any authoritative metadata returned by the server during upload is persisted in
the same way as handled in the AddLocalResource method.

---

Nitpick comments:
In `@src/SIL.Harmony/CrdtConfig.cs`:
- Around line 89-96: The Metadata property conversion is using default JSON
serialization options (null) for both JsonSerializer.Serialize and
JsonSerializer.Deserialize calls, while the rest of the CRDT system uses
configured options from CrdtConfig.JsonSerializerOptions. Either update both the
Serialize and Deserialize conversions to use CrdtConfig.JsonSerializerOptions
instead of null to maintain consistency, or if intentional, add a brief comment
in the HasConversion block explaining why Metadata serialization is kept simple
and independent from the rest of the CRDT configuration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 40f5d8c9-4321-4140-a9d9-35674e8f033c

📥 Commits

Reviewing files that changed from the base of the PR and between 94ffcfa and b364dc8.

📒 Files selected for processing (20)
  • src/SIL.Harmony.Core/IRemoteResourceService.cs
  • src/SIL.Harmony.Sample/CrdtSampleKernel.cs
  • src/SIL.Harmony.Sample/MediaMetadata.cs
  • src/SIL.Harmony.Tests/DbContextTests.VerifyModel.verified.txt
  • src/SIL.Harmony.Tests/ResourceTests/RemoteResourcesMetadataTests.cs
  • src/SIL.Harmony.Tests/ResourceTests/RemoteResourcesTests.cs
  • src/SIL.Harmony.Tests/ResourceTests/RemoteServiceMock.cs
  • src/SIL.Harmony.Tests/ResourceTests/WordResourceTests.cs
  • src/SIL.Harmony/CrdtConfig.cs
  • src/SIL.Harmony/CrdtKernel.cs
  • src/SIL.Harmony/Resource/CreateRemoteResourceChange.cs
  • src/SIL.Harmony/Resource/CreateRemoteResourcePendingUpload.cs
  • src/SIL.Harmony/Resource/DeleteRemoteResourceChange.cs
  • src/SIL.Harmony/Resource/HarmonyResource.cs
  • src/SIL.Harmony/Resource/RemoteResource.cs
  • src/SIL.Harmony/Resource/RemoteResourceNotEnabledException.cs
  • src/SIL.Harmony/Resource/RemoteResourceUploadedChange.cs
  • src/SIL.Harmony/Resource/SetRemoteResourceMetadataChange.cs
  • src/SIL.Harmony/ResourceService.cs
  • src/SIL.Harmony/SyncHelper.cs

Comment on lines +47 to +52
_metadata.TryGetValue(localPath, out var metadata);
return new UploadResult<MediaMetadata>(remoteId, metadata);
}

public void SetUploadMetadata(string localPath, MediaMetadata metadata) =>
_metadata[Path.GetFullPath(localPath)] = metadata;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize metadata lookup key in upload path handling.

Line 47 reads _metadata using raw localPath, but Line 52 writes using Path.GetFullPath(localPath). Equivalent relative/absolute inputs can miss metadata and skip expected override behavior.

Proposed fix
-        _metadata.TryGetValue(localPath, out var metadata);
+        _metadata.TryGetValue(Path.GetFullPath(localPath), out var metadata);
         return new UploadResult<MediaMetadata>(remoteId, metadata);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/SIL.Harmony.Tests/ResourceTests/RemoteServiceMock.cs` around lines 47 -
52, The _metadata dictionary is being written with normalized keys using
Path.GetFullPath(localPath) in the SetUploadMetadata method on line 52, but read
with raw localPath in the TryGetValue call on line 47. This causes relative and
absolute path variants to have mismatched keys and fail lookups. Fix this by
normalizing the key in the TryGetValue call on line 47 to use
Path.GetFullPath(localPath) instead of raw localPath, ensuring consistency
between how metadata is stored in SetUploadMetadata and retrieved in the upload
result method.

Comment on lines +47 to +50
private static TMetadata? CloneMetadata(TMetadata? metadata)
{
if (metadata is null) return null;
return JsonSerializer.Deserialize<TMetadata>(JsonSerializer.Serialize(metadata));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Configured metadata types via AddCrdtRemoteResources<TMetadata>:"
rg -nPo 'AddCrdtRemoteResources<\s*\K([A-Za-z_][A-Za-z0-9_\.]*)' --type=cs \
  | sed -E 's/^[^:]+:[0-9]+://' | sort -u

echo
echo "Declaration scan (to detect interface/abstract metadata registrations):"
types=$(rg -nPo 'AddCrdtRemoteResources<\s*\K([A-Za-z_][A-Za-z0-9_\.]*)' --type=cs \
  | sed -E 's/^[^:]+:[0-9]+://' | sed 's/.*\.//' | sort -u)
for t in $types; do
  echo "== $t =="
  rg -nP "^\s*(public|internal|protected|private)?\s*(abstract\s+class|interface|class|record)\s+${t}\b" --type=cs || true
done

echo
echo "Locate metadata clone call sites:"
rg -nP 'CloneMetadata|JsonSerializer\.Deserialize<\s*TMetadata\s*>\(' --type=cs

Repository: sillsdev/harmony

Length of output: 887


🏁 Script executed:

# Check RemoteResource<TMetadata> class definition and TMetadata constraints
rg -nA 20 "class RemoteResource<TMetadata>" --type=cs | head -60

# Also check if TMetadata has any where clauses
rg -nB 5 -A 20 "RemoteResource<TMetadata>" --type=cs | grep -E "(where|class|record)" | head -20

Repository: sillsdev/harmony

Length of output: 3207


🏁 Script executed:

# Examine CrdtConfig.cs to understand how AddCrdtRemoteResources enforces constraints
rg -nB 5 -A 20 "AddCrdtRemoteResources" --type=cs

Repository: sillsdev/harmony

Length of output: 3227


🏁 Script executed:

# Check if there are any tests or comments documenting metadata type requirements
rg -ni "(metadata.*constraint|seriali.*metadata|json.*metadata)" --type=cs | head -20

Repository: sillsdev/harmony

Length of output: 1016


CloneMetadata lacks validation for JSON-serializable metadata types, risking runtime crashes during copy operations.

RemoteResource<TMetadata> is constrained only to where TMetadata : class, which permits any reference type—including interfaces, abstract classes, and non-serializable concrete types. The Copy() method calls CloneMetadata() which silently assumes TMetadata can survive JSON round-trip serialization. Without validation, sync flows fail with unclear exceptions.

Add fail-fast validation and deterministic error handling:

Proposed hardening diff
 private static TMetadata? CloneMetadata(TMetadata? metadata)
 {
     if (metadata is null) return null;
-    return JsonSerializer.Deserialize<TMetadata>(JsonSerializer.Serialize(metadata));
+    if (typeof(TMetadata).IsInterface || typeof(TMetadata).IsAbstract)
+    {
+        throw new InvalidOperationException(
+            $"Metadata type '{typeof(TMetadata).FullName}' must be a concrete JSON-serializable type.");
+    }
+
+    try
+    {
+        return JsonSerializer.Deserialize<TMetadata>(JsonSerializer.Serialize(metadata))
+            ?? throw new InvalidOperationException(
+                $"Failed to clone metadata of type '{typeof(TMetadata).FullName}'.");
+    }
+    catch (Exception ex) when (ex is JsonException or NotSupportedException)
+    {
+        throw new InvalidOperationException(
+            $"Metadata type '{typeof(TMetadata).FullName}' is not JSON-serializable for copy operations.", ex);
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private static TMetadata? CloneMetadata(TMetadata? metadata)
{
if (metadata is null) return null;
return JsonSerializer.Deserialize<TMetadata>(JsonSerializer.Serialize(metadata));
private static TMetadata? CloneMetadata(TMetadata? metadata)
{
if (metadata is null) return null;
if (typeof(TMetadata).IsInterface || typeof(TMetadata).IsAbstract)
{
throw new InvalidOperationException(
$"Metadata type '{typeof(TMetadata).FullName}' must be a concrete JSON-serializable type.");
}
try
{
return JsonSerializer.Deserialize<TMetadata>(JsonSerializer.Serialize(metadata))
?? throw new InvalidOperationException(
$"Failed to clone metadata of type '{typeof(TMetadata).FullName}'.");
}
catch (Exception ex) when (ex is JsonException or NotSupportedException)
{
throw new InvalidOperationException(
$"Metadata type '{typeof(TMetadata).FullName}' is not JSON-serializable for copy operations.", ex);
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/SIL.Harmony/Resource/RemoteResource.cs` around lines 47 - 50, The
CloneMetadata method lacks validation to ensure TMetadata can be safely
serialized and deserialized to JSON, which can cause runtime crashes during copy
operations when non-serializable types are used. Add fail-fast validation in the
CloneMetadata method to check whether the metadata instance can be successfully
serialized before attempting the JSON round-trip serialization, and throw a
clear, descriptive exception if serialization fails. This ensures deterministic
error handling and prevents silent failures during sync flows.

Comment on lines +157 to 164
public async Task UploadPendingResource(LocalResource localResource, Guid clientId,
IRemoteResourceService<TMetadata> remoteResourceService)
{
ValidateResourcesSetup();
var uploadResult = await remoteResourceService.UploadResource(localResource.Id, localResource.LocalPath);
await _dataModel.AddChange(clientId, new RemoteResourceUploadedChange(localResource.Id, uploadResult.RemoteId));
await _dataModel.AddChange(clientId,
new RemoteResourceUploadedChange<TMetadata>(localResource.Id, uploadResult.RemoteId));
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Same issue: server metadata not persisted in single-resource upload.

UploadPendingResource also doesn't persist metadata from uploadResult when it's provided by the server.

Proposed fix
 var uploadResult = await remoteResourceService.UploadResource(localResource.Id, localResource.LocalPath);
-await _dataModel.AddChange(clientId,
-    new RemoteResourceUploadedChange<TMetadata>(localResource.Id, uploadResult.RemoteId));
+var changes = new List<IChange> { new RemoteResourceUploadedChange<TMetadata>(localResource.Id, uploadResult.RemoteId) };
+if (uploadResult.Metadata is not null)
+{
+    changes.Add(new SetRemoteResourceMetadataChange<TMetadata>(localResource.Id, uploadResult.Metadata));
+}
+await _dataModel.AddChanges(clientId, changes);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/SIL.Harmony/ResourceService.cs` around lines 157 - 164, The
UploadPendingResource method does not persist server metadata from the
uploadResult when available. When creating the
RemoteResourceUploadedChange<TMetadata> object, you need to extract and include
the metadata from uploadResult (in addition to the localResource.Id and
uploadResult.RemoteId that are already being passed). Check the
RemoteResourceUploadedChange<TMetadata> constructor to understand what metadata
parameter it expects, then update the method to pass the appropriate metadata
from uploadResult to ensure server-provided metadata is properly persisted in
the change log.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant