feat(llm): add Cursor Agent SDK as opt-in review backend#140
Conversation
Introduce internal/reviewbackend so main review work runs through a shared Backend interface. ChatCompletionsBackend preserves the existing OpenAI and Anthropic tool-call loop; CursorAgentBackend runs local Cursor agents with OCR custom tools. Wire both paths through review and llm commands.
Register cursor as a built-in provider and parse usage maps from Cursor stream events so review summaries report token counts.
Show Cursor in the official preset list with SDK bridge hints during API key setup and after saving provider configuration.
Update provider list test to include cursor in the sorted registry.
9bcee09 to
579782c
Compare
| model := resolved.Endpoint.Model | ||
| source := resolved.Endpoint.Source | ||
| if resolved.Kind == reviewbackend.KindCursorAgent { | ||
| model = resolved.Cursor.Model | ||
| source = resolved.Cursor.Source | ||
| } |
There was a problem hiding this comment.
When resolved.Kind is KindCursorAgent, resolved.Endpoint is a zero-value struct (empty strings), so lines 59-60 assign empty strings that are immediately overwritten on lines 62-63. While functionally correct today, this pattern is fragile: if a new Kind is added in the future and this if block isn't updated, model and source would silently be empty strings.
Consider using a switch statement or deriving these values from the Backend interface methods (backend.Model() / backend.Source()) which already handle this dispatch correctly, making this code more maintainable and less error-prone.
Suggestion:
| model := resolved.Endpoint.Model | |
| source := resolved.Endpoint.Source | |
| if resolved.Kind == reviewbackend.KindCursorAgent { | |
| model = resolved.Cursor.Model | |
| source = resolved.Cursor.Source | |
| } | |
| model := backend.Model() | |
| source := backend.Source() |
| llmClient := reviewbackend.TextClient(backend) | ||
| model := resolved.Endpoint.Model | ||
| if resolved.Kind == reviewbackend.KindCursorAgent { | ||
| model = resolved.Cursor.Model | ||
| } |
There was a problem hiding this comment.
The Backend interface already exposes a Model() method that returns the correct model for each backend kind (see ChatCompletionsBackend.Model() and CursorAgentBackend.Model()). Consider using backend.Model() instead of manually branching on resolved.Kind. This would be simpler, avoid coupling to the internal structure of ResolvedBackend, and automatically handle any new backend kinds added in the future without requiring changes here.
Suggestion:
| llmClient := reviewbackend.TextClient(backend) | |
| model := resolved.Endpoint.Model | |
| if resolved.Kind == reviewbackend.KindCursorAgent { | |
| model = resolved.Cursor.Model | |
| } | |
| llmClient := reviewbackend.TextClient(backend) | |
| model := backend.Model() |
| a.mu.Lock() | ||
| if !a.hasTurnUsage { | ||
| a.prompt = 0 | ||
| a.completion = 0 | ||
| a.cacheRead = 0 | ||
| a.cacheWrite = 0 | ||
| a.hasTurnUsage = true | ||
| } | ||
| a.mu.Unlock() | ||
| a.merge(ui) |
There was a problem hiding this comment.
Potential race condition: Between releasing the lock (line 33) and re-acquiring it inside merge (line 48), a concurrent token-delta event could observe hasTurnUsage == true and discard its delta tokens. While the Cursor SDK may serialize callback invocations in practice, this unlock-then-lock pattern is fragile and unnecessarily complex.
Consider performing the reset and merge atomically under a single lock acquisition to eliminate the gap entirely.
Suggestion:
| a.mu.Lock() | |
| if !a.hasTurnUsage { | |
| a.prompt = 0 | |
| a.completion = 0 | |
| a.cacheRead = 0 | |
| a.cacheWrite = 0 | |
| a.hasTurnUsage = true | |
| } | |
| a.mu.Unlock() | |
| a.merge(ui) | |
| a.mu.Lock() | |
| if !a.hasTurnUsage { | |
| a.prompt = 0 | |
| a.completion = 0 | |
| a.cacheRead = 0 | |
| a.cacheWrite = 0 | |
| a.hasTurnUsage = true | |
| } | |
| a.prompt += ui.PromptTokens | |
| a.completion += ui.CompletionTokens | |
| a.cacheRead += ui.CacheReadTokens | |
| a.cacheWrite += ui.CacheWriteTokens | |
| a.mu.Unlock() |
| if err != nil { | ||
| if hooks.SetError != nil && rec != nil { | ||
| hooks.SetError(rec, err, duration.Milliseconds()) | ||
| } | ||
| return wrapCursorError(err) | ||
| } |
There was a problem hiding this comment.
Missing telemetry on error paths: When run.Wait(ctx) returns an error, this code records the error via hooks.SetError but does not call hooks.RecordUsage or hooks.RecordLLMRequest. Compare with ChatCompletionsBackend.ReviewFile, which calls hooks.RecordLLMRequest(duration.Milliseconds(), 0, "error") on its error path.
This means that failed Cursor agent runs will be invisible to usage tracking, billing/cost analysis, and observability dashboards. Partial usage data accumulated via usageAcc before the failure is also silently discarded.
Similarly, the agent.Send error path above (around line 110) also lacks RecordLLMRequest.
Suggestion:
| if err != nil { | |
| if hooks.SetError != nil && rec != nil { | |
| hooks.SetError(rec, err, duration.Milliseconds()) | |
| } | |
| return wrapCursorError(err) | |
| } | |
| if err != nil { | |
| if hooks.SetError != nil && rec != nil { | |
| hooks.SetError(rec, err, duration.Milliseconds()) | |
| } | |
| usage := usageAcc.usage() | |
| if usage != nil && hooks.RecordUsage != nil { | |
| hooks.RecordUsage(usage) | |
| } | |
| totalTokens := int64(0) | |
| if usage != nil { | |
| totalTokens = usage.TotalTokens | |
| } | |
| if hooks.RecordLLMRequest != nil { | |
| hooks.RecordLLMRequest(duration.Milliseconds(), totalTokens, "error") | |
| } | |
| return wrapCursorError(err) | |
| } |
| Messages: req.Messages, | ||
| MaxTokens: req.MaxTokens, | ||
| }) | ||
| _ = start |
There was a problem hiding this comment.
Dead code: start is assigned but immediately discarded with _ = start. This suggests timing/duration tracking was intended but never completed. Unlike ReviewFile, the Complete method does not track or report latency. Either remove the unused start variable entirely, or add duration tracking to CompleteResponse for observability consistency.
| messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished.")) | ||
| if content != "" { | ||
| messages = append(messages[:len(messages)-1], llm.NewTextMessage("assistant", content), messages[len(messages)-1]) | ||
| } |
There was a problem hiding this comment.
This slice manipulation is correct but fragile and hard to read. It inserts an assistant message before the previously-appended user retry message by slicing off the last element and re-appending both. A clearer approach would be to construct the messages in the desired order directly, avoiding the confusing self-referential append pattern:
if content != "" {
messages = append(messages,
llm.NewTextMessage("assistant", content),
llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."),
)
} else {
messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."))
}This avoids the subtle dependency on Go's argument evaluation order and makes the intent (assistant message before user message) explicit.
Suggestion:
| messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished.")) | |
| if content != "" { | |
| messages = append(messages[:len(messages)-1], llm.NewTextMessage("assistant", content), messages[len(messages)-1]) | |
| } | |
| if content != "" { | |
| messages = append(messages, | |
| llm.NewTextMessage("assistant", content), | |
| llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."), | |
| ) | |
| } else { | |
| messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished.")) | |
| } |
| ep, err := llm.ResolveEndpoint(configPath) | ||
| if err != nil { | ||
| return ResolvedBackend{}, err | ||
| } | ||
| return ResolvedBackend{Kind: KindChatCompletions, Endpoint: ep}, nil |
There was a problem hiding this comment.
Redundant file I/O and JSON parsing: When the provider is not "cursor" (or when the config file doesn't exist), this function reads and parses the config file via readConfigFile, then calls llm.ResolveEndpoint(configPath) which internally reads and parses the same file again via tryOCRConfig. This results in double disk I/O and double JSON unmarshalling for every non-cursor invocation.
Consider either:
- Passing the already-read config data into
llm.ResolveEndpoint(e.g., via an alternative entry point that accepts pre-parsed config), or - Reading the file once here and using
llm.ResolveEndpointFromBytes(data)or similar, or - At minimum, skipping the initial
readConfigFilecall when you know you'll fall through tollm.ResolveEndpoint— e.g., only reading the file to check the provider field, then reusing the raw bytes.
| var mainRec *session.TaskRecord | ||
| executor := func(execCtx context.Context, call reviewbackend.ToolCallInput) reviewbackend.ToolCallOutput { | ||
| if execCtx == nil { | ||
| execCtx = ctx | ||
| } | ||
|
|
||
| var results []tool.ToolCallResult | ||
| taskCompleted := false | ||
| hasValidResult := false | ||
|
|
||
| for _, call := range calls { | ||
| cp := a.executeToolCall(ctx, newPath, call, rec) | ||
| if cp.Completed { | ||
| results = append(results, tool.ToolCallResult{ | ||
| ToolCallID: call.ID, | ||
| Name: call.Function.Name, | ||
| Result: "Task completed successfully.", | ||
| }) | ||
| taskCompleted = true | ||
| } else if cp.Data != "" { | ||
| results = append(results, tool.ToolCallResult{ | ||
| ToolCallID: call.ID, | ||
| Name: call.Function.Name, | ||
| Result: cp.Data, | ||
| }) | ||
| hasValidResult = true | ||
| } else { | ||
| results = append(results, tool.ToolCallResult{ | ||
| ToolCallID: call.ID, | ||
| Name: call.Function.Name, | ||
| Result: "Error: Tool execution returned no result.", | ||
| }) | ||
| cp := a.executeToolCall(execCtx, newPath, llm.ToolCall{ | ||
| ID: call.ID, | ||
| Type: "function", | ||
| Function: llm.FunctionCall{ | ||
| Name: call.Name, | ||
| Arguments: call.Arguments, | ||
| }, | ||
| }, mainRec) | ||
| return reviewbackend.ToolCallOutput{Result: cp.Data, Completed: cp.Completed} | ||
| } |
There was a problem hiding this comment.
Potential issue: mainRec is captured by reference in the executor closure and mutated by the AppendTaskRecord hook. This creates an implicit ordering dependency — the backend must call AppendTaskRecord before invoking the executor, otherwise mainRec will be nil (tool results won't be recorded). While both current backends (ChatCompletionsBackend and CursorAgentBackend) satisfy this ordering, the contract is undocumented and fragile.
Additionally, if a future backend were to invoke tool calls concurrently (e.g., parallel tool execution), concurrent reads/writes to mainRec would be a data race.
Consider either:
- Documenting this ordering requirement explicitly in the
ReviewHooks/Backendinterface comments, or - Passing the
*session.TaskRecordas part ofToolCallInputso the executor receives it explicitly rather than via closure capture.
| resp, err := a.backend.Complete(ctx, CompleteRequest{ | ||
| Model: model, | ||
| Messages: req.Messages, | ||
| MaxTokens: req.MaxTokens, | ||
| }) |
There was a problem hiding this comment.
The completeAdapter silently drops Tools and Temperature fields from llm.ChatRequest because CompleteRequest doesn't carry them. While current callers (executePlanPhase and runCompression in agent.go) only use Model/Messages/MaxTokens through this path, any future caller that passes Tools or Temperature will have those fields silently ignored, potentially causing incorrect model behavior.
Consider either:
- Adding a guard that returns an error if unsupported fields are set:
if len(req.Tools) > 0 { return nil, fmt.Errorf("completeAdapter does not support tools; use Backend.ReviewFile instead") }
- Or documenting this limitation prominently so callers are aware.
Description
This PR adds Cursor as an optional review backend for Open Code Review, powered by
cursor-go-sdkv0.0.2. Existing OpenAI and Anthropic chat-completions paths are unchanged.Cursor Agent SDK mirrors the Agent runtime (
CreateAgent,Send, custom tools), not OpenAI/Anthropic HTTP APIs. To keep the integration clean, this PR introducesinternal/reviewbackend:ChatCompletionsBackend— preserves the existing OpenAI/Anthropic tool-call loop (moved out ofagent.go).CursorAgentBackend— runs review via the Cursor local agent runtime, mapping OCR tools toCustomToolhandlers and reusing the same tool executor andCommentCollector.ocr reviewandocr llm testresolve configuration throughreviewbackend.ResolveBackend()and wire a sharedBackendintoagent.Args.Configuration
CURSOR_API_KEYis used as an environment fallback whenproviders.cursor.api_keyis not set.Cursor runtime behavior
repoDir(no workspace copy).SandboxOptions.Enabled = true, emptySettingSources, OCR custom tools only.autoReviewis not enabled by default.turn-ended, fallbacktoken-delta) and reported in review summaries.Architecture
Type of Change
How Has This Been Tested?
make checkpasses locally (go mod tidy,go fmt,go vet)go test ./...passes locallyreviewbackendresolver, chat-completions loop, Cursor usage aggregation, agent backend wiring, provider config, and provider TUIocr llm testwithprovider: cursorocr review --audience agent --concurrency 1on a small diffManual test notes (optional)
Checklist
go fmt,go vet)Related Issues
N/A
Notes for reviewers
github.com/remdev/cursor-go-sdk v0.0.2(requires Go 1.23+; OCR stays ongo 1.25.0).cursor-go-sdk/cmd/setup); OCR does not auto-install npm packages duringocr review.ocr config providerwith SDK bridge guidance.provideris notcursor.