Show aggregated metrics in UI (Part 1)#43

Merged
dphuang2 merged 15 commits into main from show-aggregated-metrics-in-ui
Aug 10, 2025

@dphuang2 (Collaborator) commented Aug 9, 2025

No description provided.

Dylan Huang added 15 commits August 6, 2025 23:05
use model_dump(mode="json")

Add run_id to EvalMetadataSchema for unique run identification

- Introduced run_id as an optional string to the EvalMetadataSchema to uniquely identify evaluation runs.
- Updated description to clarify the purpose of the run_id field.

Add run_id field to EvalMetadata for unique run identification

- Added run_id as an optional string to the EvalMetadata class to uniquely identify groups of evaluation rows.
- Updated the field description to clarify its purpose in relation to evaluation tests.

Fix evaluation result assignment in markdown highlighting test

- Updated the test_markdown_highlighting_evaluation function to assign the evaluation result directly to the row when no assistant response is found, ensuring proper handling of evaluation results.

Add run_id generation in evaluation_test for unique identification

- Integrated the generate_id function to create a run_id within the evaluation_test function.
- Passed the generated run_id to the evaluation function, ensuring unique identification of evaluation runs.
- Simplified the construction of the log initialization message by creating a data dictionary before sending it over the WebSocket, improving code readability.
- Introduced a new test case to validate the handling of multiple column fields in the computePivot function.
- Verified correct computation of cell values, row totals, column totals, and grand total for the pivot table with composite columns.
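The pivot behaviour that the new test exercises can be illustrated with a small sketch. The real computePivot lives in the UI code and is not shown in this conversation, so the Python below is only an analogue of the arithmetic: the function name `compute_pivot_sketch`, its parameters, the sample field names, and the choice of summation as the aggregate are all assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

Key = Tuple[str, ...]


def compute_pivot_sketch(
    records: List[Dict[str, Any]],
    row_fields: Sequence[str],
    column_fields: Sequence[str],
    value_field: str,
) -> Dict[str, Any]:
    """Sum value_field into cells keyed by (row key, composite column key)."""
    cells: Dict[Tuple[Key, Key], float] = defaultdict(float)
    row_totals: Dict[Key, float] = defaultdict(float)
    column_totals: Dict[Key, float] = defaultdict(float)
    grand_total = 0.0
    for record in records:
        # A composite key is the tuple of the values of every listed field.
        row_key = tuple(str(record[f]) for f in row_fields)
        col_key = tuple(str(record[f]) for f in column_fields)
        value = float(record[value_field])
        cells[(row_key, col_key)] += value
        row_totals[row_key] += value
        column_totals[col_key] += value
        grand_total += value
    return {
        "cells": dict(cells),
        "row_totals": dict(row_totals),
        "column_totals": dict(column_totals),
        "grand_total": grand_total,
    }


# Hypothetical sample data: one row field, two column fields.
records = [
    {"model": "a", "dataset": "d1", "split": "train", "score": 1.0},
    {"model": "a", "dataset": "d1", "split": "test", "score": 2.0},
    {"model": "b", "dataset": "d1", "split": "train", "score": 3.0},
    {"model": "b", "dataset": "d2", "split": "train", "score": 4.0},
]
pivot = compute_pivot_sketch(records, ["model"], ["dataset", "split"], "score")
```

With two column fields, each distinct (dataset, split) pair becomes its own column, which is why the column totals are keyed by tuples and not by single values.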
# Conflicts:
#	eval_protocol/pytest/evaluation_test.py
…t_id, and run_id fields. Update evaluation_test to handle new identifiers and improve documentation on evaluation concepts.
…ng schema in eval-protocol types. This enhances tracking of invocation context for evaluation rows.
@dphuang2 changed the title from "Show aggregated metrics in UI" to "Show aggregated metrics in UI (Part 1)" on Aug 10, 2025
@dphuang2 merged commit e355931 into main on Aug 10, 2025
7 checks passed
@dphuang2 deleted the show-aggregated-metrics-in-ui branch on August 10, 2025 22:11