[AIT-311] Add Claude skill for translating examples to Swift, and an example translation#3283
lawrence-forooghian wants to merge 3 commits into main
Conversation
Tweak the change made in 1c4e6fe: instead of allow-listing individual subdirectories (e.g. .claude/skills) ad hoc as needed, make use of Claude's scope system (i.e. ignore only local scope, under the definitions given in [1]). [1] https://code.claude.com/docs/en/settings#configuration-scopes
When invoked as, for example:
> /translate-examples-to-swift translate all the example code in @src/pages/docs/ai-transport
it will translate all of the referenced examples to Swift, making sure
to produce code which has been verified to compile. It also runs an
independent verification subagent which reviews the correctness of the
translation and performs a second compilation attempt.
As part of the verification, it also generates a single-page app with a
UI that makes it easy for a human to review the translations; you can then
export a Markdown or JSON file with the feedback (for passing back to
Claude so it can iterate on that feedback).
I don't yet have a great process for getting it to apply review feedback
when starting from a fresh context; I've just been telling it something
like "translations x were generated using this skill; now apply feedback
y", but it doesn't do a great job of updating the translation JSON files
(and thus the data displayed in the review app) to reflect the changes
it's made without wiping out any unrelated notes from the original
translation.
The skill also gives Claude the ability to review Swift translations in
isolation (i.e. not as part of a translation run and thus without the
supporting artifacts). For this to work properly, we need to keep the
context comments (added by the translation process) in the MDX files. I
think that we should keep these _anyway_, because I think we should at
some point consider setting up tooling to ensure that _all_ of our code
examples in the docs repo actually are valid and compile. And this would
be a stepping stone to that. Note that these harness comments contain
random IDs, which look a bit useless in isolation; however, they are
definitely useful during the translation-and-verification processes
(which are independent and thus need some sort of ID for correlation; I
previously tried using a sequential counter, but this easily gets out of
sync when new examples are merged in or reordered, or when the two
subagents count examples differently). I believe they will also continue
to be useful as a simple way of referring to an example when working
with Claude ("fix example Kx9mQ3").
I wrote the original version of this skill and then got Claude to do
some heavy iteration of it based on my feedback when testing. I haven't
reviewed any of the skill's supporting files — i.e. the scripts or HTML
or schemas — in any detail.
As part of this change — the first shared addition to the .claude
directory — I've changed the gitignore rules to only ignore local scope
(definitions given in [1]).
A few things that could be improved in the future (I had to draw a line
under this task at some point):
- the review app for some reason requires that you click twice on the
"Flag" or "Approve" button before it collapses the element
- the review app's exported Markdown file's references are done by line
number, which is a slightly meaningless value given that we're
inserting new code into the file as part of translation; switch it to
use IDs like the JSON example
- make the review app accept multi-line comments
- we may be able to simplify the test harness by instead using Swift's
"MainActor isolation by default" mode
- think about how to restructure the skill so that it can be extended
  to translate other languages (see PR comment [2])
Note that I've chosen to favour an `async` / `await` approach (bridged
with continuations) instead of nested callbacks. Having experimented
with both approaches, I concluded that this is the better of the two.
The bridging boilerplate is repetitive but local — each continuation is
a self-contained block that's easy to skim past. The structural benefit
is that the overall control flow becomes linear and readable, matching
the JS; this makes things easier for users to understand and for us to
review, in particular in more complicated examples that do things like
loading multiple history pages in a loop.
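As a minimal sketch of the bridging pattern described above — note that `publishWithCallback` here is a hypothetical stand-in, not the real ably-cocoa API:

```swift
// Hypothetical callback-based SDK call, standing in for an ably-cocoa
// publish; the real ably-cocoa signatures differ.
func publishWithCallback(_ data: String, completion: @escaping (String?, Error?) -> Void) {
    completion("serial-001", nil) // pretend the publish succeeded
}

// The bridging boilerplate: one self-contained continuation block per call.
func publish(_ data: String) async throws -> String {
    try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<String, Error>) in
        publishWithCallback(data) { serial, error in
            if let error {
                continuation.resume(throwing: error)
            } else {
                continuation.resume(returning: serial!)
            }
        }
    }
}

// Control flow at the call site stays linear, matching the JS original.
let serial = try await publish("")
print(serial) // prints "serial-001"
```

Each continuation block is noisy but self-contained; the call site reads top-to-bottom like the JS it was translated from.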
(Note: An earlier version of this skill was already used in b49924d —
that commit is a bit messed up by a botched rebase, it seems — before
being properly introduced here. I've updated the harness comments
introduced there to be in line with the format used here.)
[1] https://code.claude.com/docs/en/settings#configuration-scopes
[2] #3192 (comment)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This demonstrates the translation skill added in 2bc254d. I've reviewed the translations.
dcf84c4 to 2366368
.claude/skills/translate-examples-to-swift/prompts/translation-subagent.md
.claude/skills/translate-examples-to-swift/prompts/verification-subagent.md
.claude/skills/translate-examples-to-swift/prompts/verification-subagent.md
Thanks for the feedback on the skill, Marat. I'll take a look, but as mentioned the skill is very much a "Claude wrote this and I've not reviewed the individual words in detail" thing. Did you have any thoughts on the translations themselves?
```swift
// Publish initial message and capture the serial for appending tokens
let publishResult = try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<ARTPublishResult, Error>) in
    channel.publish([.init(name: "response", data: "")]) { result, error in
```
Inconsistent publish API call style. This uses channel.publish([.init(name: "response", data: "")]) (array-of-messages form), but the skill's translation prompt examples at prompts/translation-subagent.md:236 use channel.publish("response", data: "") (name+data form). Both are valid ably-cocoa APIs but the inconsistency between what the skill teaches and what was actually produced could cause confusion for future translation runs. Should pick one and be consistent. The array form is arguably more correct since the JS passes an object { name: 'response', data: '' }, but then the skill prompt examples should be updated to match.
Same issue at lines 403 and 1154.
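To make the distinction between the two call shapes concrete, here is a self-contained mock — the types and overloads below only mirror ably-cocoa's shapes for illustration and are not the real API:

```swift
// Mock stand-ins for the two publish overload shapes discussed above.
struct Message {
    let name: String?
    let data: String?
}

final class Channel {
    private(set) var published: [Message] = []

    // name + data form, e.g. channel.publish("response", data: "")
    func publish(_ name: String?, data: String?) {
        published.append(Message(name: name, data: data))
    }

    // array-of-messages form, e.g. channel.publish([.init(name: "response", data: "")])
    func publish(_ messages: [Message]) {
        published.append(contentsOf: messages)
    }
}

let channel = Channel()
channel.publish("response", data: "")                  // name + data form
channel.publish([.init(name: "response", data: "")])   // array form
print(channel.published.count) // prints "2"
```

Both forms end up equivalent for a single message; the point of the review comment is only that the skill prompt and its output should agree on one.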
```swift
let options = ARTClientOptions(key: "your-api-key")
options.transportParams = ["appendRollupWindow": .withString("100")] // 10 messages/s
```
transportParams value .withString("100") -- is this the right API? The JS/Python/Java examples all pass a plain string "100", but the Swift uses .withString("100") which implies ARTStringifiable or similar. Worth verifying this compiles against ably-cocoa and that .withString is the correct way to set string values in transportParams. If transportParams is just [String: String], this would be wrong.
```swift
@MainActor
func example_anthropic_message_per_response_1() async throws {
    // --- example code starts here ---
    func example() async throws {
```
CheckedContinuation<Void, any Error> vs CheckedContinuation<Void, Error>. The guide examples (lines 657 and 958) use any Error (the explicit existential spelling) while every example in message-per-response.mdx uses plain Error. Both compile, but Error (without any) is the standard pattern and what the skill prompt teaches. These guide files should use the same form for consistency.
```swift
// Example: stream returns events like { type: 'token', text: 'Hello' }
for try await event in stream {
```
for try await vs for await -- inconsistent with skill prompt and possibly unnecessary. This line and lines 244, 423, 1169 use for try await event in stream, but the skill's translation prompt examples (lines 250, 534) use for await event in stream (no try). The difference matters: for try await is needed when the AsyncSequence's Failure type is non-Never, but all harness signatures declare Never as the failure type (e.g., any AsyncSequence<..., Never>). With a Never failure type, for await (without try) should suffice. The try is harmless but misleading to readers.
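A minimal illustration of the distinction, using `AsyncStream` (whose failure type is `Never`) as a stand-in for the harness's stream parameter:

```swift
// AsyncStream's failure type is Never, so plain `for await` suffices;
// `for try await` is only required when the sequence can actually throw.
let stream = AsyncStream<(type: String, text: String)> { continuation in
    continuation.yield((type: "token", text: "Hello"))
    continuation.yield((type: "token", text: " world"))
    continuation.finish()
}

var output = ""
for await event in stream { // no `try`: iteration cannot throw
    output += event.text
}
print(output) // prints "Hello world"
```

With a `Never` failure type the compiler accepts a redundant `try`, which is why the inconsistency compiles in both forms.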
```yaml
---
name: translate-examples-to-swift
description: Translates inline JavaScript example code to Swift
```
Description is too terse for reliable skill triggering. "Translates inline JavaScript example code to Swift" is generic enough that Claude may not invoke this skill when the user says things like "add Swift examples" or "translate to ably-cocoa". Consider expanding to something like: "Translates inline JavaScript example code to Swift in Ably documentation MDX files. Use this skill whenever adding Swift code blocks to docs, translating JS examples to Swift/ably-cocoa, or when the user mentions Swift translations, even if they don't explicitly say 'translate'."
```markdown
- `consolidate.sh` - Merges translation and verification JSONs, validates, generates review HTML
- `generate-translation-stubs.sh` - Generates stub translation JSONs from verification data (for verify-only mode)

Scripts are in `.claude/skills/translate-examples-to-swift/review-app/`:
```
Minor copy-paste error. This repeats "Scripts are in" from line 294 above. Should be "Review app scripts are in" or similar to distinguish the two locations.
```markdown
- `nonisolated(unsafe)` (forbidden by C9)
- Force-unwraps beyond the `(result, error)` callback convention (i.e. force-unwrapping a result that isn't being used)
- Logic inside continuation callbacks beyond resuming the continuation (values should be extracted after the `await`, not inside the callback)
- Fire-and-forget SDK calls using bare callbacks instead of `Task { }` with a continuation inside
```
Verification checklist contradicts translation guidance. This says to check for "Fire-and-forget SDK calls using bare callbacks instead of Task { } with a continuation inside", implying fire-and-forget calls should use Task{} with continuations. But the translation prompt explicitly says fire-and-forget calls should be called directly without a callback or Task wrapper (translation-subagent.md lines 217-219). This checklist item reads as the opposite of the intended rule. Should be reworded to check for fire-and-forget calls that are unnecessarily wrapped in Task{} or continuations when they should just be called directly.
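A sketch of the intended rule, using a hypothetical callback-based SDK call (`detach` below is a stand-in, not the real ably-cocoa signature):

```swift
var detachCalls = 0

// Hypothetical SDK call with an optional completion callback.
func detach(completion: ((Error?) -> Void)? = nil) {
    detachCalls += 1
    completion?(nil)
}

// Intended rule: a fire-and-forget call is made directly, with no callback,
// no continuation, and no `Task { }` wrapper.
detach()

// Anti-pattern the checklist should flag: wrapping the same call in a
// continuation when its result is never used.
func detachUnnecessarilyWrapped() async {
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        detach { _ in continuation.resume() }
    }
}
print(detachCalls) // prints "1"
```

The reworded checklist item would then flag calls shaped like `detachUnnecessarilyWrapped`, not the direct call.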
```swift
// The body of this function is the translation of the example.
// Function name includes the example ID
func example_Kx9mQ3(channel: ARTRealtimeChannel, stream: any AsyncSequence<(type: String, text: String), Never>) async throws {
```
Harness function signature is missing & Sendable. This example shows any AsyncSequence<(type: String, text: String), Never> without & Sendable, but every actual harness comment in the MDX files includes & Sendable (e.g., message-per-response.mdx:141). Under Swift 6 strict concurrency this matters for passing across isolation boundaries. The prompt examples should include & Sendable to match what is actually produced.
The JS example on this page uses
For full transparency, I've reviewed this by pulling the branch down and probing Claude with questions about it. I posted comments using Claude which are points that I think should be looked at, but generally speaking my approach with skills is that you'll know if these are actual issues once you use the skill a few times - still worth having a look at points that might look obviously wrong and fixing them as needed. More general feedback would be to run /skill-creator, which is a skill auditor/creator/validator made by Anthropic themselves. My own personal opinion was that the skill.md seems long? maybe we can break it up into more resources? but again, it doesn't concern me if you think the skill works well as is and we can evolve this over time. Specifically to this skill and where it should live... who owns this skill? how do we keep it up to date? do you think it is better off in Ably OS?
also, not in scope for this PR, but how do you go about validating the outputted translation? how do you ascertain confidence that it followed the skill correctly? Can we have another skill to validate? or are we okay with shifting the burden from writing the translation to reviewing it in PRs instead?
Thanks for the comments Umair, will take a look when I have some time. But re your overall comments: The skill was created by Claude, and has gone through many iterations as I've identified various things it was doing wrong. I did not, however, use the
Unless I've misunderstood what it is that you're suggesting, I think that it very much is in scope for this PR:
Right now this skill requires you to run it on a Mac and have Xcode installed, so realistically the only people who are going to be running it are you, Marat, or me. I'd be tempted to say "I own it". And it's a skill that is only useful within the context of the docs repo so I think here is fine? |
```
Tool: Task
subagent_type: "general-purpose"
```
Is this how subagents are being spawned, or why is this snippet here? wdyt
Description
Replaces #3192.
Adds a `/translate-examples-to-swift` Claude skill. When invoked as, for example:

> /translate-examples-to-swift translate all the example code in @src/pages/docs/ai-transport

it will translate all of the referenced examples to Swift, making sure to produce code which has been verified to compile. It also runs an independent verification subagent which reviews the correctness of the translation and performs a second compilation attempt. It then produces a webpage with a UI for a human to review the translations.
There's probably plenty of improvement that can be done to this skill still, but I need to draw a line under it and move on to other stuff; we can iterate in the future.
I've then used this skill to translate one of the AI Transport example files. I decided not to translate them all in one go in order to reduce review burden and make sure everyone is happy with the approach.
See commit messages for more (many, many more 😅) details, if you're so inclined. I'm not expecting — or, to be honest, hoping for — a large amount of feedback on the skill itself; I think it's largely a "something is better than nothing" thing, and it's been through quite a lot of iteration already.