in-progress eval log viewer (part 1)#22

Merged
dphuang2 merged 55 commits into main from in-progress-eval-viewer on Aug 6, 2025

dphuang2 (Collaborator) commented on Aug 5, 2025

progress so far:

[Image: Screenshot 2025-08-05 at 7 03 00 PM]

Dylan Huang added 25 commits August 5, 2025 17:23
…on Result and Ground Truth, improving code readability and maintainability.
…ty and positioning, enhancing user experience for chat window adjustments.
…ermine if the evaluation passed based on the threshold of success.
…to gray-600 for improved visual consistency.
…lexibility for displaying connection and evaluation statuses. Update App and Row components to utilize the new StatusIndicator implementation.
…tor, simplifying initialization. Update _call_model method to return a Message object with structured response data. Modify default_agent_rollout_processor to utilize the new constructor and streamline dataset population. Enhance evaluation_test to initialize eval_metadata for each row before running rollouts, ensuring consistent metadata handling.
…alization. Ensure eval_metadata is set for each row before rollouts, and enhance exception management to log errors appropriately while maintaining pytest behavior.
…le. This update ensures that unnecessary files are ignored during version control, streamlining project management.
…r chat window resizing from 80% to 66% of the available width, improving layout responsiveness.
dphuang2 changed the title from "in-progress eval log viewer" to "in-progress eval log viewer (part 1)" on Aug 6, 2025
dphuang2 merged commit 0fb7071 into main on Aug 6, 2025
6 checks passed
dphuang2 deleted the in-progress-eval-viewer branch on August 6, 2025 02:05