Skip to content

feat(llm): add Cursor Agent SDK as opt-in review backend#140

Draft
remdev wants to merge 4 commits into
alibaba:mainfrom
remdev:feat/cursor-review-backend
Draft

feat(llm): add Cursor Agent SDK as opt-in review backend#140
remdev wants to merge 4 commits into
alibaba:mainfrom
remdev:feat/cursor-review-backend

Conversation

@remdev

@remdev remdev commented Jun 15, 2026

Copy link
Copy Markdown

Description

This PR adds Cursor as an optional review backend for Open Code Review, powered by cursor-go-sdk v0.0.2. Existing OpenAI and Anthropic chat-completions paths are unchanged.

Cursor Agent SDK mirrors the Agent runtime (CreateAgent, Send, custom tools), not OpenAI/Anthropic HTTP APIs. To keep the integration clean, this PR introduces internal/reviewbackend:

  • ChatCompletionsBackend — preserves the existing OpenAI/Anthropic tool-call loop (moved out of agent.go).
  • CursorAgentBackend — runs review via the Cursor local agent runtime, mapping OCR tools to CustomTool handlers and reusing the same tool executor and CommentCollector.

ocr review and ocr llm test resolve configuration through reviewbackend.ResolveBackend() and wire a shared Backend into agent.Args.

Configuration

# One-time bridge setup (Node.js >= 18, npm)
go run github.com/remdev/cursor-go-sdk/cmd/setup@latest

ocr config set provider cursor
ocr config set providers.cursor.api_key "$CURSOR_API_KEY"   # optional if env is set
ocr config set providers.cursor.model composer-2.5

ocr llm test
ocr review --audience agent

CURSOR_API_KEY is used as an environment fallback when providers.cursor.api_key is not set.

Cursor runtime behavior

  • Local agent runs in repoDir (no workspace copy).
  • SandboxOptions.Enabled = true, empty SettingSources, OCR custom tools only.
  • autoReview is not enabled by default.
  • Fail-closed on bridge/sandbox/configuration errors.
  • Token usage is aggregated from Cursor stream deltas (turn-ended, fallback token-delta) and reported in review summaries.

Architecture

review_cmd / llm_cmd
    → reviewbackend.ResolveBackend()
    → reviewbackend.New()
    → agent.Args{ Backend, LLMClient: TextClient(backend) }
    → Backend.ReviewFile()          (main per-file review)
    → TextClient.CompletionsWithCtx() (plan / filter / compression / llm test)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Documentation update
  • CI / Build / Tooling

How Has This Been Tested?

  • make check passes locally (go mod tidy, go fmt, go vet)
  • go test ./... passes locally
  • Unit tests for reviewbackend resolver, chat-completions loop, Cursor usage aggregation, agent backend wiring, provider config, and provider TUI
  • Manual: ocr llm test with provider: cursor
  • Manual: ocr review --audience agent --concurrency 1 on a small diff

Manual test notes (optional)

export CURSOR_API_KEY=...
go run ./cmd/opencodereview config set provider cursor
go run ./cmd/opencodereview config set providers.cursor.model composer-2.5
go run ./cmd/opencodereview llm test
go run ./cmd/opencodereview review --preview
go run ./cmd/opencodereview review --audience agent --concurrency 1

Checklist

  • My code follows the project's coding style (go fmt, go vet)
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly (if applicable)
  • I have signed the CLA

Related Issues

N/A

Notes for reviewers

  • Adds github.com/remdev/cursor-go-sdk v0.0.2 (requires Go 1.23+; OCR stays on go 1.25.0).
  • Bridge installation is explicit (cursor-go-sdk/cmd/setup); OCR does not auto-install npm packages during ocr review.
  • Cloud agents, PR creation, and auto-fix flows are out of scope.
  • Cursor appears under Official presets in ocr config provider with SDK bridge guidance.
  • OpenAI/Anthropic/custom HTTP provider behavior is unchanged when provider is not cursor.

@CLAassistant

CLAassistant commented Jun 15, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

remdev added 4 commits June 15, 2026 16:24
Introduce internal/reviewbackend so main review work runs through a shared
Backend interface. ChatCompletionsBackend preserves the existing OpenAI and
Anthropic tool-call loop; CursorAgentBackend runs local Cursor agents with
OCR custom tools. Wire both paths through review and llm commands.
Register cursor as a built-in provider and parse usage maps from Cursor
stream events so review summaries report token counts.
Show Cursor in the official preset list with SDK bridge hints during API
key setup and after saving provider configuration.
Update provider list test to include cursor in the sorted registry.
@remdev remdev force-pushed the feat/cursor-review-backend branch from 9bcee09 to 579782c Compare June 15, 2026 13:25

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 OpenCodeReview found 9 issue(s) in this PR.

  • ✅ 9 posted as inline comment(s)
  • 📝 0 posted as summary

Comment on lines +59 to 64
model := resolved.Endpoint.Model
source := resolved.Endpoint.Source
if resolved.Kind == reviewbackend.KindCursorAgent {
model = resolved.Cursor.Model
source = resolved.Cursor.Source
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When resolved.Kind is KindCursorAgent, resolved.Endpoint is a zero-value struct (empty strings), so lines 59-60 assign empty strings that are immediately overwritten on lines 62-63. While functionally correct today, this pattern is fragile: if a new Kind is added in the future and this if block isn't updated, model and source would silently be empty strings.

Consider using a switch statement or deriving these values from the Backend interface methods (backend.Model() / backend.Source()) which already handle this dispatch correctly, making this code more maintainable and less error-prone.

Suggestion:

Suggested change
model := resolved.Endpoint.Model
source := resolved.Endpoint.Source
if resolved.Kind == reviewbackend.KindCursorAgent {
model = resolved.Cursor.Model
source = resolved.Cursor.Source
}
model := backend.Model()
source := backend.Source()

Comment on lines +101 to +105
llmClient := reviewbackend.TextClient(backend)
model := resolved.Endpoint.Model
if resolved.Kind == reviewbackend.KindCursorAgent {
model = resolved.Cursor.Model
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Backend interface already exposes a Model() method that returns the correct model for each backend kind (see ChatCompletionsBackend.Model() and CursorAgentBackend.Model()). Consider using backend.Model() instead of manually branching on resolved.Kind. This would be simpler, avoid coupling to the internal structure of ResolvedBackend, and automatically handle any new backend kinds added in the future without requiring changes here.

Suggestion:

Suggested change
llmClient := reviewbackend.TextClient(backend)
model := resolved.Endpoint.Model
if resolved.Kind == reviewbackend.KindCursorAgent {
model = resolved.Cursor.Model
}
llmClient := reviewbackend.TextClient(backend)
model := backend.Model()

Comment on lines +25 to +34
a.mu.Lock()
if !a.hasTurnUsage {
a.prompt = 0
a.completion = 0
a.cacheRead = 0
a.cacheWrite = 0
a.hasTurnUsage = true
}
a.mu.Unlock()
a.merge(ui)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential race condition: Between releasing the lock (line 33) and re-acquiring it inside merge (line 48), a concurrent token-delta event could observe hasTurnUsage == true and discard its delta tokens. While the Cursor SDK may serialize callback invocations in practice, this unlock-then-lock pattern is fragile and unnecessarily complex.

Consider performing the reset and merge atomically under a single lock acquisition to eliminate the gap entirely.

Suggestion:

Suggested change
a.mu.Lock()
if !a.hasTurnUsage {
a.prompt = 0
a.completion = 0
a.cacheRead = 0
a.cacheWrite = 0
a.hasTurnUsage = true
}
a.mu.Unlock()
a.merge(ui)
a.mu.Lock()
if !a.hasTurnUsage {
a.prompt = 0
a.completion = 0
a.cacheRead = 0
a.cacheWrite = 0
a.hasTurnUsage = true
}
a.prompt += ui.PromptTokens
a.completion += ui.CompletionTokens
a.cacheRead += ui.CacheReadTokens
a.cacheWrite += ui.CacheWriteTokens
a.mu.Unlock()

Comment on lines +131 to +136
if err != nil {
if hooks.SetError != nil && rec != nil {
hooks.SetError(rec, err, duration.Milliseconds())
}
return wrapCursorError(err)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing telemetry on error paths: When run.Wait(ctx) returns an error, this code records the error via hooks.SetError but does not call hooks.RecordUsage or hooks.RecordLLMRequest. Compare with ChatCompletionsBackend.ReviewFile, which calls hooks.RecordLLMRequest(duration.Milliseconds(), 0, "error") on its error path.

This means that failed Cursor agent runs will be invisible to usage tracking, billing/cost analysis, and observability dashboards. Partial usage data accumulated via usageAcc before the failure is also silently discarded.

Similarly, the agent.Send error path above (around line 110) also lacks RecordLLMRequest.

Suggestion:

Suggested change
if err != nil {
if hooks.SetError != nil && rec != nil {
hooks.SetError(rec, err, duration.Milliseconds())
}
return wrapCursorError(err)
}
if err != nil {
if hooks.SetError != nil && rec != nil {
hooks.SetError(rec, err, duration.Milliseconds())
}
usage := usageAcc.usage()
if usage != nil && hooks.RecordUsage != nil {
hooks.RecordUsage(usage)
}
totalTokens := int64(0)
if usage != nil {
totalTokens = usage.TotalTokens
}
if hooks.RecordLLMRequest != nil {
hooks.RecordLLMRequest(duration.Milliseconds(), totalTokens, "error")
}
return wrapCursorError(err)
}

Messages: req.Messages,
MaxTokens: req.MaxTokens,
})
_ = start

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code: start is assigned but immediately discarded with _ = start. This suggests timing/duration tracking was intended but never completed. Unlike ReviewFile, the Complete method does not track or report latency. Either remove the unused start variable entirely, or add duration tracking to CompleteResponse for observability consistency.

Comment on lines +135 to +138
messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."))
if content != "" {
messages = append(messages[:len(messages)-1], llm.NewTextMessage("assistant", content), messages[len(messages)-1])
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slice manipulation is correct but fragile and hard to read. It inserts an assistant message before the previously-appended user retry message by slicing off the last element and re-appending both. A clearer approach would be to construct the messages in the desired order directly, avoiding the confusing self-referential append pattern:

if content != "" {
    messages = append(messages,
        llm.NewTextMessage("assistant", content),
        llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."),
    )
} else {
    messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."))
}

This avoids the subtle dependency on Go's argument evaluation order and makes the intent (assistant message before user message) explicit.

Suggestion:

Suggested change
messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."))
if content != "" {
messages = append(messages[:len(messages)-1], llm.NewTextMessage("assistant", content), messages[len(messages)-1])
}
if content != "" {
messages = append(messages,
llm.NewTextMessage("assistant", content),
llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."),
)
} else {
messages = append(messages, llm.NewTextMessage("user", "You did not successfully call any tools. Please try again or use task_done if finished."))
}

Comment on lines +36 to +40
ep, err := llm.ResolveEndpoint(configPath)
if err != nil {
return ResolvedBackend{}, err
}
return ResolvedBackend{Kind: KindChatCompletions, Endpoint: ep}, nil

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant file I/O and JSON parsing: When the provider is not "cursor" (or when the config file doesn't exist), this function reads and parses the config file via readConfigFile, then calls llm.ResolveEndpoint(configPath) which internally reads and parses the same file again via tryOCRConfig. This results in double disk I/O and double JSON unmarshalling for every non-cursor invocation.

Consider either:

  1. Passing the already-read config data into llm.ResolveEndpoint (e.g., via an alternative entry point that accepts pre-parsed config), or
  2. Reading the file once here and using llm.ResolveEndpointFromBytes(data) or similar, or
  3. At minimum, skipping the initial readConfigFile call when you know you'll fall through to llm.ResolveEndpoint — e.g., only reading the file to check the provider field, then reusing the raw bytes.

Comment thread internal/agent/agent.go
Comment on lines +907 to +921
var mainRec *session.TaskRecord
executor := func(execCtx context.Context, call reviewbackend.ToolCallInput) reviewbackend.ToolCallOutput {
if execCtx == nil {
execCtx = ctx
}

var results []tool.ToolCallResult
taskCompleted := false
hasValidResult := false

for _, call := range calls {
cp := a.executeToolCall(ctx, newPath, call, rec)
if cp.Completed {
results = append(results, tool.ToolCallResult{
ToolCallID: call.ID,
Name: call.Function.Name,
Result: "Task completed successfully.",
})
taskCompleted = true
} else if cp.Data != "" {
results = append(results, tool.ToolCallResult{
ToolCallID: call.ID,
Name: call.Function.Name,
Result: cp.Data,
})
hasValidResult = true
} else {
results = append(results, tool.ToolCallResult{
ToolCallID: call.ID,
Name: call.Function.Name,
Result: "Error: Tool execution returned no result.",
})
cp := a.executeToolCall(execCtx, newPath, llm.ToolCall{
ID: call.ID,
Type: "function",
Function: llm.FunctionCall{
Name: call.Name,
Arguments: call.Arguments,
},
}, mainRec)
return reviewbackend.ToolCallOutput{Result: cp.Data, Completed: cp.Completed}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential issue: mainRec is captured by reference in the executor closure and mutated by the AppendTaskRecord hook. This creates an implicit ordering dependency — the backend must call AppendTaskRecord before invoking the executor, otherwise mainRec will be nil (tool results won't be recorded). While both current backends (ChatCompletionsBackend and CursorAgentBackend) satisfy this ordering, the contract is undocumented and fragile.

Additionally, if a future backend were to invoke tool calls concurrently (e.g., parallel tool execution), concurrent reads/writes to mainRec would be a data race.

Consider either:

  1. Documenting this ordering requirement explicitly in the ReviewHooks / Backend interface comments, or
  2. Passing the *session.TaskRecord as part of ToolCallInput so the executor receives it explicitly rather than via closure capture.

Comment on lines +39 to +43
resp, err := a.backend.Complete(ctx, CompleteRequest{
Model: model,
Messages: req.Messages,
MaxTokens: req.MaxTokens,
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The completeAdapter silently drops Tools and Temperature fields from llm.ChatRequest because CompleteRequest doesn't carry them. While current callers (executePlanPhase and runCompression in agent.go) only use Model/Messages/MaxTokens through this path, any future caller that passes Tools or Temperature will have those fields silently ignored, potentially causing incorrect model behavior.

Consider either:

  1. Adding a guard that returns an error if unsupported fields are set:
    if len(req.Tools) > 0 {
        return nil, fmt.Errorf("completeAdapter does not support tools; use Backend.ReviewFile instead")
    }
  2. Or documenting this limitation prominently so callers are aware.

@remdev remdev marked this pull request as draft June 15, 2026 15:11
@remdev remdev changed the title feat: add Cursor Agent SDK as opt-in review backend feat(llm): add Cursor Agent SDK as opt-in review backend Jun 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants