[feat][evaluation] Coze Coding Evaluation Target Support by HearyShen · Pull Request #461 · coze-dev/coze-loop

HearyShen · 2026-03-17T06:59:11Z

What type of PR is this?

Check the PR title

This PR title matches the format: [<type>][<scope>] <description>. For example: [fix][backend] flaky fix
The description of this PR title is user-oriented and clear enough for others to understand.
Add documentation if the current PR requires user awareness at the usage level.
This PR is written in English. PRs not in English will not be reviewed.

(Optional) Translate the PR title into Chinese

(Optional) More detailed description for this PR(en: English/zh: Chinese)

en:
zh(optional):

(Optional) Which issue(s) this PR fixes

codecov · 2026-03-18T13:36:52Z

Codecov Report

❌ Patch coverage is 92.06349% with 10 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...modules/evaluation/application/eval_openapi_app.go | 90.47% | 5 Missing and 1 partial ⚠️ |
| ...api/handler/coze/loop/apis/eval_open_apiservice.go | 0.00% | 2 Missing ⚠️ |
| ...odules/evaluation/domain/service/evaluator_impl.go | 95.83% | 1 Missing and 1 partial ⚠️ |

@@            Coverage Diff             @@
##             main     #461      +/-   ##
==========================================
+ Coverage   74.45%   74.52%   +0.07%     
==========================================
  Files         629      629              
  Lines       66337    66438     +101     
==========================================
+ Hits        49389    49511     +122     
+ Misses      13663    13644      -19     
+ Partials     3285     3283       -2
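The percentages in the diff above can be reproduced from the raw counts. The following is a minimal sketch, assuming project coverage is simply hits divided by total lines (where lines = hits + misses + partials); `coveragePercent` is a hypothetical helper, not part of Codecov or this repository.

```go
package main

import "fmt"

// coveragePercent returns hits/lines as a percentage.
// Hypothetical helper for illustration only.
func coveragePercent(hits, lines int) float64 {
	if lines == 0 {
		return 0
	}
	return 100 * float64(hits) / float64(lines)
}

func main() {
	// Totals copied from the coverage diff: main (base) vs. #461 (head).
	base := coveragePercent(49389, 66337)
	head := coveragePercent(49511, 66438)

	// Sanity check: hits + misses + partials equals the line total on head.
	fmt.Println(49511+13644+3283 == 66438) // true

	// Prints: base 74.45%, head 74.52%, delta +0.07%
	fmt.Printf("base %.2f%%, head %.2f%%, delta %+.2f%%\n", base, head, head-base)
}
```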

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `74.52% <92.06%> (+0.07%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...luation/application/convertor/evaluator/openapi.go | `90.58% <100.00%> (+0.04%)` | ⬆️ |
| ...uation/application/convertor/experiment/openapi.go | `84.22% <100.00%> (+0.04%)` | ⬆️ |
| ...ules/evaluation/domain/service/expt_export_impl.go | `75.34% <100.00%> (ø)` | |
| ...aluation/domain/service/expt_run_item_turn_impl.go | `87.83% <100.00%> (ø)` | |
| .../evaluation/infra/repo/evaluator/evaluator_impl.go | `81.51% <100.00%> (+0.18%)` | ⬆️ |
| ...api/handler/coze/loop/apis/eval_open_apiservice.go | `0.00% <0.00%> (ø)` | |
| ...odules/evaluation/domain/service/evaluator_impl.go | `83.90% <95.83%> (+4.65%)` | ⬆️ |
| ...modules/evaluation/application/eval_openapi_app.go | `92.43% <90.47%> (+0.21%)` | ⬆️ |

... and 3 files with indirect coverage changes

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 77fb395...69cb5de.

Add a case for `EvaluatorTypeCustomRPC` in the `convertEntityEvaluatorTypeToOpenAPI` function and refactor evaluator version ID retrieval to use the `GetEvaluatorVersionID` method. Also add a test case for the agent evaluator in the `SubmitExperimentOApi` test.
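To illustrate why routing version ID lookups through a single getter helps, here is a minimal sketch. Only the names `GetEvaluatorVersionID` and `EvaluatorTypeCustomRPC` come from the commit message; the struct layout, the other constants, and the `versionInfo` type are assumptions made for illustration and may not match the repository.

```go
package main

import "fmt"

// EvaluatorType is a hypothetical stand-in for the entity enum.
type EvaluatorType int

const (
	EvaluatorTypePrompt EvaluatorType = iota + 1 // assumed value
	EvaluatorTypeCode                            // assumed value
	EvaluatorTypeCustomRPC                       // name from the commit message
)

// versionInfo is a hypothetical per-type version record.
type versionInfo struct{ ID int64 }

// Evaluator holds one version record per type (assumed layout).
type Evaluator struct {
	Type      EvaluatorType
	Prompt    *versionInfo
	Code      *versionInfo
	CustomRPC *versionInfo
}

// GetEvaluatorVersionID returns the version ID for the evaluator's own
// type, or 0 when the receiver or the matching record is nil.
func (e *Evaluator) GetEvaluatorVersionID() int64 {
	if e == nil {
		return 0
	}
	var v *versionInfo
	switch e.Type {
	case EvaluatorTypePrompt:
		v = e.Prompt
	case EvaluatorTypeCode:
		v = e.Code
	case EvaluatorTypeCustomRPC:
		v = e.CustomRPC
	}
	if v == nil {
		return 0
	}
	return v.ID
}

func main() {
	e := &Evaluator{Type: EvaluatorTypeCustomRPC, CustomRPC: &versionInfo{ID: 42}}
	fmt.Println(e.GetEvaluatorVersionID()) // 42

	var missing *Evaluator
	fmt.Println(missing.GetEvaluatorVersionID()) // 0
}
```

With a getter like this, callers no longer repeat the type switch, so adding a new evaluator type touches one place.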

implement API to run builtin evaluators by ID or name, including:
- add new endpoint `/v1/loop/evaluation/builtin_evaluators/run`
- add service method to resolve visible version ID
- add repo method to get evaluator by space ID and name
- update thrift IDL and generate code
- add tests for new functionality

- Move builtin evaluator endpoint from `/builtin_evaluators/run` to `/evaluators/builtin/run`
- Add new middleware `_builtinMw` for builtin evaluator routes
- Implement `GetEvaluatorMetaBySpaceIDAndName` repo method and tests
- Add `ResolveBuiltinEvaluatorVisibleVersionID` service method and tests

get all target fields for AgentEvaluator EvaluateTargetOutputFields

a751c91

HearyShen changed the title ~~get all target fields for AgentEvaluator EvaluateTargetOutputFields~~ [feat][evaluation] Coze Coding Evaluation Support Mar 17, 2026

test(evaluation): simplify test data setup for agent evaluation

6494523

style: fix indentation in test data structure

e708b0d

HearyShen changed the title ~~[feat][evaluation] Coze Coding Evaluation Support~~ [feat][evaluation] Coze Coding Evaluation Target Support Mar 19, 2026

dsf86 previously approved these changes Mar 19, 2026

HearyShen dismissed dsf86’s stale review via 858b3fa March 19, 2026 09:14

HearyShen added 11 commits March 19, 2026 17:48

test(evaluation): update test cases for OpenAPIColumnEvaluatorsDO2DTOs

5c64e17

Add more comprehensive test cases to verify conversion of different evaluator types

Merge branch 'main' into feat/coze_coding

eb69f91

feat(evaluator): add agent type support for evaluator

7096243

Add EvaluatorTypeAgent constant and handle conversion between entity and openapi types. Also add validation to reject agent type in evaluator openapi conversion.
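A minimal sketch of the two behaviours described in this commit: the type-level mapping knows about the agent type, while the evaluator-level OpenAPI conversion rejects it. Only `EvaluatorTypeAgent` and `EvaluatorTypeCustomRPC` are names from this PR; the enum values, the OpenAPI string values, and the struct and function names are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// EvaluatorType is a hypothetical stand-in for the entity enum.
type EvaluatorType int

const (
	EvaluatorTypePrompt EvaluatorType = iota + 1 // assumed
	EvaluatorTypeCode                            // assumed
	EvaluatorTypeCustomRPC
	EvaluatorTypeAgent
)

// ErrAgentNotSupported is a hypothetical sentinel error.
var ErrAgentNotSupported = errors.New("agent evaluators cannot be converted to the evaluator OpenAPI type")

// typeToOpenAPI maps every known entity type, including agent.
// The string values are assumed, not taken from the thrift IDL.
func typeToOpenAPI(t EvaluatorType) (string, error) {
	switch t {
	case EvaluatorTypePrompt:
		return "prompt", nil
	case EvaluatorTypeCode:
		return "code", nil
	case EvaluatorTypeCustomRPC:
		return "custom_rpc", nil
	case EvaluatorTypeAgent:
		return "agent", nil
	default:
		return "", fmt.Errorf("unknown evaluator type %d", t)
	}
}

// Evaluator and OpenAPIEvaluator are trimmed-down hypothetical structs.
type Evaluator struct {
	Name string
	Type EvaluatorType
}

type OpenAPIEvaluator struct {
	Name string
	Type string
}

// evaluatorToOpenAPI rejects agent evaluators before converting.
func evaluatorToOpenAPI(e Evaluator) (*OpenAPIEvaluator, error) {
	if e.Type == EvaluatorTypeAgent {
		return nil, ErrAgentNotSupported
	}
	t, err := typeToOpenAPI(e.Type)
	if err != nil {
		return nil, err
	}
	return &OpenAPIEvaluator{Name: e.Name, Type: t}, nil
}

func main() {
	_, err := evaluatorToOpenAPI(Evaluator{Name: "a", Type: EvaluatorTypeAgent})
	fmt.Println(errors.Is(err, ErrAgentNotSupported)) // true

	out, _ := evaluatorToOpenAPI(Evaluator{Name: "r", Type: EvaluatorTypeCustomRPC})
	fmt.Println(out.Type) // custom_rpc
}
```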

test(evaluator): add test case for agent evaluator rejection

0d48339

fix(evaluation): handle builtin evaluator version check properly

4d51a1f

Skip workspace validation for builtin evaluators to allow cross-workspace execution. Add test cases for evaluator version not found and builtin success scenarios.
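The check described here might look roughly like the following sketch, which covers the three cases the commit mentions: version not found, builtin success across workspaces, and an ordinary workspace mismatch. The struct fields, error values, and function name are hypothetical and are not taken from the repository.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for illustration.
var (
	ErrVersionNotFound   = errors.New("evaluator version not found")
	ErrWorkspaceMismatch = errors.New("evaluator version belongs to another workspace")
)

// EvaluatorVersion is a trimmed-down, hypothetical record.
type EvaluatorVersion struct {
	ID      int64
	SpaceID int64
	Builtin bool
}

// checkVersionAccess validates that the caller may run the version.
// Builtin evaluators skip the workspace comparison entirely.
func checkVersionAccess(v *EvaluatorVersion, callerSpaceID int64) error {
	switch {
	case v == nil:
		return ErrVersionNotFound
	case v.Builtin:
		return nil // cross-workspace execution is allowed
	case v.SpaceID != callerSpaceID:
		return ErrWorkspaceMismatch
	default:
		return nil
	}
}

func main() {
	builtin := &EvaluatorVersion{ID: 1, SpaceID: 100, Builtin: true}
	custom := &EvaluatorVersion{ID: 2, SpaceID: 100}

	fmt.Println(checkVersionAccess(builtin, 200)) // <nil>
	fmt.Println(checkVersionAccess(custom, 200))  // workspace mismatch error
	fmt.Println(checkVersionAccess(nil, 200))     // not found error
}
```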

Merge branch 'main' into feat/coze_coding

ffcd745

feat(evaluation): add extra field to thrift structs and implement val…

16dca90

…idation

Add optional Extra field to `ImportEvaluationSetOApiRequest` and `GetEvaluationSetIOJobOApiRequest` thrift structs. Implement validation, serialization and deserialization for the new field in generated code.

docs(thrift): update comment for builtin evaluator requirements

7751df6

Clarify that either builtin_evaluator_id or builtin_evaluator_name must be provided, and if both are provided, they must match
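The rule in this comment can be sketched as a small resolver. This is an illustration under assumptions: optional request fields are modelled as pointers, "match" is interpreted as "the ID and the name refer to the same evaluator", and the `catalog` type, error values, and evaluator names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for illustration.
var (
	ErrMissingIdentifier = errors.New("builtin_evaluator_id or builtin_evaluator_name is required")
	ErrNotFound          = errors.New("builtin evaluator not found")
	ErrMismatch          = errors.New("builtin_evaluator_id and builtin_evaluator_name do not match")
)

// BuiltinEvaluator is a trimmed-down, hypothetical record.
type BuiltinEvaluator struct {
	ID   int64
	Name string
}

// catalog is a hypothetical in-memory list standing in for the repo layer.
type catalog []BuiltinEvaluator

// resolve looks up an evaluator by ID, by name, or by both.
// When both are given, the ID wins the lookup and the name must agree.
func (c catalog) resolve(id *int64, name *string) (BuiltinEvaluator, error) {
	if id == nil && name == nil {
		return BuiltinEvaluator{}, ErrMissingIdentifier
	}
	for _, e := range c {
		if id != nil && e.ID == *id {
			if name != nil && e.Name != *name {
				return BuiltinEvaluator{}, ErrMismatch
			}
			return e, nil
		}
		// id is nil here, so name is guaranteed to be non-nil.
		if id == nil && e.Name == *name {
			return e, nil
		}
	}
	return BuiltinEvaluator{}, ErrNotFound
}

func main() {
	// Evaluator names below are made up for the example.
	c := catalog{{ID: 1, Name: "correctness"}, {ID: 2, Name: "helpfulness"}}
	id, name := int64(1), "helpfulness"

	_, err := c.resolve(&id, &name)
	fmt.Println(errors.Is(err, ErrMismatch)) // true

	e, _ := c.resolve(nil, &name)
	fmt.Println(e.ID) // 2
}
```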

test: improve coverage for builtin evaluator domain service methods

69cb5de

HearyShen wants to merge 15 commits into main from feat/coze_coding

2 participants
