
@hannesrudolph hannesrudolph (Collaborator) commented Dec 16, 2025

Summary

This PR improves the evals UI with two changes:

Changes

  1. Add tool groups for aggregating tool usage stats - Groups related tools together for better visualization and analysis of tool usage statistics in the evals UI.
  2. Fix duration not reported in evals UI - Resolves an issue where task duration was not being displayed correctly in the evaluations UI.

Commits

  • feat(evals): add tool groups for aggregating tool usage stats
  • fix: duration not reported in evals UI

Important

Enhance evals UI with tool groups for better visualization and fix duration reporting by calculating from timestamps when streaming data is unavailable.

  • UI Enhancements:
    • Add tool groups for aggregating tool usage stats in run.tsx and runs.tsx.
    • Fix duration reporting in run.tsx by calculating from timestamps if streaming data is unavailable.
  • Backend Changes:
    • In runTask.ts, ensure task metrics are updated correctly by waiting for taskMetricsId to be set before processing certain events.
    • Handle race conditions in runTask.ts by resolving taskMetricsReady on disconnect.

This description was created by Ellipsis for b9aa6b5.

- Fix backend race condition in runTask.ts where TaskTokenUsageUpdated
  could arrive before TaskStarted handler set taskMetricsId
- Add Promise-based synchronization (taskMetricsReady) for event handlers
- Fix UI to fall back to database timestamps (startedAt/finishedAt) when
  streaming duration is unavailable (e.g., page loaded after TaskStarted)
- Add tool groups feature with customizable name and icon
- Groups aggregate tool usage stats in table columns
- Persist groups to localStorage
- Tools can only belong to one group
- Each group displays only icon in header with tooltip showing name and tools
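
The tool group rules listed above (one group per tool, per-group aggregation, persistence to localStorage) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ToolGroup` and `ToolUsage` shapes, the `STORAGE_KEY` value, the `KeyValueStore` interface, and all function names are assumptions made for the example. In a browser, `window.localStorage` satisfies `KeyValueStore`.

```typescript
// Hypothetical shapes; the PR's actual types are not shown in this conversation.
interface ToolGroup {
  name: string
  icon: string
  tools: string[]
}

interface ToolStats {
  attempts: number
  failures: number
}

type ToolUsage = Record<string, ToolStats>

// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

const STORAGE_KEY = "evals-tool-groups" // assumed key name

// Puts a tool into one group and removes it from every other group,
// so a tool never belongs to more than one group.
function assignTool(groups: ToolGroup[], groupName: string, tool: string): ToolGroup[] {
  return groups.map((group) => {
    const others = group.tools.filter((t) => t !== tool)
    return { ...group, tools: group.name === groupName ? [...others, tool] : others }
  })
}

// Sums attempts and failures per group; tools outside every group are ignored here.
function aggregateByGroup(usage: ToolUsage, groups: ToolGroup[]): Record<string, ToolStats> {
  const totals: Record<string, ToolStats> = {}
  for (const group of groups) {
    const total: ToolStats = { attempts: 0, failures: 0 }
    for (const tool of group.tools) {
      const stats = usage[tool]
      if (stats) {
        total.attempts += stats.attempts
        total.failures += stats.failures
      }
    }
    totals[group.name] = total
  }
  return totals
}

function saveGroups(store: KeyValueStore, groups: ToolGroup[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(groups))
}

// Falls back to an empty list when nothing is stored or the stored value is unreadable.
function loadGroups(store: KeyValueStore): ToolGroup[] {
  const raw = store.getItem(STORAGE_KEY)
  if (raw === null) return []
  try {
    const parsed: unknown = JSON.parse(raw)
    return Array.isArray(parsed) ? (parsed as ToolGroup[]) : []
  } catch {
    return []
  }
}
```

Removing the tool from all other groups inside `assignTool` keeps the one-group-per-tool rule in a single place, so the aggregation never counts a tool twice.
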
@dosubot dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.), bug (Something isn't working), and UI/UX (UI/UX related or focused) labels Dec 16, 2025
@roomote roomote bot (Contributor) commented Dec 16, 2025


Re-review complete. No blocking items remain.

  • Ensure finished tasks still get duration populated when DB metrics are empty and streaming usage is unavailable (fallback to startedAt/finishedAt timestamps).
  • Prevent taskMetricsReady deadlock in runTask if TaskStarted never fires or throws before resolving the promise (add guard/timeout/fallback).

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Dec 16, 2025
@roomote roomote bot (Contributor) commented Dec 16, 2025


Fixed the taskMetricsReady deadlock by resolving the promise on disconnect and adding a guard to skip metrics updates when taskMetricsId is not set. All local checks passed.
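
The synchronization described here can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the code in runTask.ts: the `MetricsTracker` class, its method names, the `createDeferred` helper, and the in-memory `updates` array (standing in for a database write) are all hypothetical. Only the idea comes from the conversation: handlers wait on a promise that resolves once `taskMetricsId` is set or the connection drops, and a guard skips the update when the id was never set.

```typescript
// Creates a promise together with its resolve function (hypothetical helper).
function createDeferred<T>(): { promise: Promise<T>; resolve: (value: T) => void } {
  let resolve!: (value: T) => void
  const promise = new Promise<T>((res) => {
    resolve = res
  })
  return { promise, resolve }
}

class MetricsTracker {
  private taskMetricsId: number | null = null
  private readonly taskMetricsReady = createDeferred<void>()

  // Stand-in for database writes so the sketch is observable.
  readonly updates: Array<{ taskMetricsId: number; tokens: number }> = []

  // TaskStarted: record the id, then release any waiting handlers.
  onTaskStarted(taskMetricsId: number): void {
    this.taskMetricsId = taskMetricsId
    this.taskMetricsReady.resolve()
  }

  // Disconnect: release waiting handlers even if TaskStarted never fired,
  // so they cannot wait forever.
  onDisconnect(): void {
    this.taskMetricsReady.resolve()
  }

  // TaskTokenUsageUpdated: wait until the id is known (or the client is gone).
  async onTokenUsageUpdated(tokens: number): Promise<void> {
    await this.taskMetricsReady.promise
    // Guard: skip the update when no metrics row was ever created.
    if (this.taskMetricsId === null) return
    this.updates.push({ taskMetricsId: this.taskMetricsId, tokens })
  }
}
```

Resolving a promise a second time has no effect, so calling `onDisconnect` after `onTaskStarted` is harmless in this sketch.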


roomote and others added 2 commits December 16, 2025 18:18
… data

Add fallback case for finished tasks where DB metrics are empty and
streaming usage is unavailable. Duration is now calculated from
startedAt/finishedAt timestamps in all cases.
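
A minimal sketch of this fallback order, assuming hypothetical names throughout (`TaskTimes`, `resolveDurationMs`, and the `reportedDurationMs` parameter, which stands in for whatever duration the DB metrics or streaming usage supply). Using the current time for a task that has started but not finished is also an assumption of this example, not something stated in the conversation.

```typescript
// Hypothetical subset of a task row; only the two timestamps matter here.
interface TaskTimes {
  startedAt: Date | null
  finishedAt: Date | null
}

// Returns a duration in milliseconds, or undefined when nothing is known.
// Prefers the reported duration and falls back to the stored timestamps.
function resolveDurationMs(
  reportedDurationMs: number | undefined,
  task: TaskTimes,
  now: Date = new Date(),
): number | undefined {
  if (reportedDurationMs !== undefined) return reportedDurationMs
  if (task.startedAt === null) return undefined
  // Assumption: a task that is still running is measured up to "now".
  const end = task.finishedAt ?? now
  // Clamp at zero in case of clock skew between the two timestamps.
  return Math.max(0, end.getTime() - task.startedAt.getTime())
}
```

Comparing against `undefined` rather than testing truthiness keeps a legitimately reported duration of 0 from triggering the fallback.
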
@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Dec 16, 2025
@cte cte merged commit 84c5d2f into main Dec 16, 2025
10 checks passed
@cte cte deleted the eval-ui-update branch December 16, 2025 20:43
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Dec 16, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Dec 16, 2025
@cte cte mentioned this pull request Dec 16, 2025



4 participants