Merged
added 15 commits
August 6, 2025 23:05
use model_dump(mode="json")
Add run_id to EvalMetadataSchema for unique run identification
- Introduced run_id as an optional string to the EvalMetadataSchema to uniquely identify evaluation runs.
- Updated description to clarify the purpose of the run_id field.
Add run_id field to EvalMetadata for unique run identification
- Added run_id as an optional string to the EvalMetadata class to uniquely identify groups of evaluation rows.
- Updated the field description to clarify its purpose in relation to evaluation tests.
Fix evaluation result assignment in markdown highlighting test
- Updated the test_markdown_highlighting_evaluation function to assign the evaluation result directly to the row when no assistant response is found, ensuring proper handling of evaluation results.
Add run_id generation in evaluation_test for unique identification
- Integrated the generate_id function to create a run_id within the evaluation_test function.
- Passed the generated run_id to the evaluation function, ensuring unique identification of evaluation runs.
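The run_id commits above can be pictured with a minimal Python sketch. It is not the project's code: the dataclass stands in for the real EvalMetadata class (the `model_dump` call suggests a Pydantic model, which is an assumption here), the `name` field is hypothetical, and this `generate_id` is a UUID-based placeholder for whatever scheme the real helper uses.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalMetadata:
    """Stand-in for the real class; only the fields needed for the sketch."""
    name: str  # hypothetical field, for illustration only
    # Optional, so rows created before this change remain valid.
    run_id: Optional[str] = None


def generate_id() -> str:
    """Placeholder ID generator; the real generate_id may work differently."""
    return uuid.uuid4().hex


def evaluation_test(row_names: list[str]) -> list[EvalMetadata]:
    # One run_id per invocation, shared by every row produced in that run.
    run_id = generate_id()
    return [EvalMetadata(name=n, run_id=run_id) for n in row_names]


rows = evaluation_test(["row-a", "row-b"])
other = evaluation_test(["row-c"])
```

Rows from the same call share a run_id, while a second call gets a fresh one, which is what lets rows be grouped by run afterwards.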
- Simplified the construction of the log initialization message by creating a data dictionary before sending it over the WebSocket, improving code readability.
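A rough illustration of the refactor described in this commit, assuming a JSON message: the payload is built as a plain dictionary first and serialized in one place. The function name and the `type` and `logs` keys are hypothetical, and the actual WebSocket send call is left out.

```python
import json


def build_log_init_message(logs: list[dict]) -> str:
    # Build the payload as a dictionary first, then serialize it once.
    data = {"type": "initialize_logs", "logs": logs}  # hypothetical keys
    return json.dumps(data)


message = build_log_init_message([{"id": "row-1"}])
```

A real handler would pass the resulting string to its WebSocket send call.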
- Introduced a new test case to validate the handling of multiple column fields in the computePivot function.
- Verified correct computation of cell values, row totals, column totals, and grand total for the pivot table with composite columns.
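To show what a pivot with composite columns involves, here is a small Python sketch. It is not the project's computePivot: the signature, the sum aggregation, and the sample field names are all assumptions made for illustration.

```python
from collections import defaultdict


def compute_pivot(records, row_fields, column_fields, value_field):
    """Sum value_field into cells keyed by (row key, column key) tuples."""
    cells = defaultdict(float)
    row_totals = defaultdict(float)
    col_totals = defaultdict(float)
    grand_total = 0.0
    for rec in records:
        row_key = tuple(rec[f] for f in row_fields)
        # Several column fields combine into one composite column key.
        col_key = tuple(rec[f] for f in column_fields)
        value = rec[value_field]
        cells[(row_key, col_key)] += value
        row_totals[row_key] += value
        col_totals[col_key] += value
        grand_total += value
    return dict(cells), dict(row_totals), dict(col_totals), grand_total


records = [
    {"model": "a", "dataset": "x", "split": "dev", "score": 1.0},
    {"model": "a", "dataset": "x", "split": "test", "score": 2.0},
    {"model": "b", "dataset": "x", "split": "dev", "score": 3.0},
    {"model": "b", "dataset": "y", "split": "dev", "score": 4.0},
]
cells, row_totals, col_totals, grand_total = compute_pivot(
    records, ["model"], ["dataset", "split"], "score"
)
```

With these sample records, the column key `("x", "dev")` collects values from both models, so its column total is 4.0, the row total for model `a` is 3.0, and the grand total is 10.0.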
# Conflicts:
#	eval_protocol/pytest/evaluation_test.py
…t_id, and run_id fields. Update evaluation_test to handle new identifiers and improve documentation on evaluation concepts.
…ng schema in eval-protocol types. This enhances tracking of invocation context for evaluation rows.
No description provided.