
feat: eval-evolution work in progress - recommendations and workers UI #11444

Draft
mp-roocode wants to merge 25 commits into main from feat/eval-recommendations

Conversation

mp-roocode (Contributor) commented Feb 12, 2026

Work in progress on eval-evolution features including:

  • Updated methodology content
  • Workers UI improvements
  • Comparison chart enhancements
  • New workers-v2 pages
  • Eval outcomes utilities


roomote and others added 10 commits February 11, 2026 15:46
Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>
- Add 5 engineer roles: Junior, Senior, Staff, Architecture Reviewer, Autonomous Agent
- Build role selection landing page with hiring metaphor at /evals/workers
- Build candidate rankings page with tiered recommendations at /evals/workers/[roleId]
- Build candidate comparison page with Recharts charts at /evals/workers/[roleId]/compare
- Build "How We Interview" methodology page at /evals/methodology
- Add mock data with real eval scores from 27 model runs
- Implement "Hire This Engineer" CTA linking to Roo Code Cloud
- Implement "Configure Extension" CTA with clipboard copy
- Per-language score breakdowns (Go, Java, JS, Python, Rust)
- Daily salary pricing (80 tasks/agent/day estimate)
- framer-motion animations, glass-morphism design, role color themes
- Tone-of-voice compliance (no em dashes, no hype, workflow-first copy)
- vscode:// deep link design doc at plans/vscode-deep-link-design.md
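
The "daily salary" figure above can be read as a per-task cost scaled by the 80 tasks/agent/day estimate. The sketch below assumes exactly that (daily cost = average cost per task × 80); the PR does not show the actual formula, and the names `dailySalaryUsd` and `TASKS_PER_AGENT_PER_DAY` are hypothetical.

```typescript
// Estimate taken from the commit message: one agent handles about 80 tasks per day.
const TASKS_PER_AGENT_PER_DAY = 80

// Hypothetical helper (not from the PR): converts an average per-task cost in USD
// into a daily figure, assuming daily cost = cost per task * tasks per day.
function dailySalaryUsd(costPerTaskUsd: number, tasksPerDay: number = TASKS_PER_AGENT_PER_DAY): number {
  if (!Number.isFinite(costPerTaskUsd) || costPerTaskUsd < 0) {
    throw new RangeError("costPerTaskUsd must be a non-negative finite number")
  }
  // Round to whole cents for display.
  return Math.round(costPerTaskUsd * tasksPerDay * 100) / 100
}
```

Under this assumption, a model that averages $0.25 per task would be shown with a daily salary of $20.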
…e themes

- Atmospheric header with role-colored blur gradients
- Glass-morphism containers for chart, filters, and export
- Styled language toggle pills with role color accents
- Themed provider checkboxes and success rate slider
- Custom chart tooltip with backdrop blur
- Export buttons with press feedback
- framer-motion scroll-triggered animations
- Bottom navigation with pill-style links
- Role themes: reviewer (violet) and autonomous (cyan) added to candidates page
…line)

- Add "Value Map: Salary vs Interview Score" scatter to comparison page
  - Dots colored by tier, sized by success rate
  - Sweet Spot quadrant highlight (upper-left)
  - Respects existing provider/success-rate filters
- Add "AI Coding Capability Over Time" scatter to landing page
  - 10 models from Jun 2025 to Feb 2026
  - Dots colored by provider, sized by cost efficiency
  - Dashed trend line showing upward trajectory
- Add MODEL_TIMELINE data to mock-recommendations.ts
roomote bot (Contributor) commented Feb 12, 2026

Rooviewer

Reviewed 6c5b7ad (refine recommendation objective and profile snapshot). All previously flagged issues remain resolved. No new issues found.

  • --font-display CSS variable is undefined on the baseline /evals/workers path -- fixed by lifting font setup to shared evals/layout.tsx
  • --font-display CSS variable is undefined on the /evals/methodology path -- fixed by lifting font setup to shared evals/layout.tsx
  • recommendations/layout.tsx redundantly imports Fraunces/IBM Plex Sans and wraps content with font variables already provided by the shared evals/layout.tsx -- fixed by simplifying to pass-through fragment
  • copyPrompt in objective-content.tsx lacks visual feedback -- fixed with promptCopied state, Check icon, and 2s reset timer
  • Aggregate stats totalEvalRuns/totalExercises inflated in workers/page.tsx -- fixed by using recommendations[0] instead of summing across roles
  • Aggregate stats totalEvalRuns/totalExercises inflated in recommendations/page.tsx -- fixed by using recommendations[0] instead of summing across roles

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

mp-roocode self-assigned this Feb 12, 2026
roomote and others added 3 commits February 13, 2026 03:39
…s layout

Move --font-display and --font-body CSS variable declarations from
workers/page.tsx into a new evals/layout.tsx so all evals sub-pages
(methodology, workers, workers/[roleId], etc.) inherit the font
variables without duplicating the setup.
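
A minimal sketch of what such a shared evals/layout.tsx could look like with `next/font/google`. The PR names the fonts (Fraunces, IBM Plex Sans) and the variables (`--font-display`, `--font-body`) but not which font backs which variable, so that pairing, the chosen weights, and the wrapper `div` are assumptions.

```tsx
import type { ReactNode } from "react"
import { Fraunces, IBM_Plex_Sans } from "next/font/google"

// Assumption: Fraunces backs --font-display and IBM Plex Sans backs --font-body.
const display = Fraunces({ subsets: ["latin"], variable: "--font-display" })

// Assumption: these weights are illustrative; the PR does not list them.
const body = IBM_Plex_Sans({
  subsets: ["latin"],
  weight: ["400", "500", "600"],
  variable: "--font-body",
})

export default function EvalsLayout({ children }: { children: ReactNode }) {
  // The generated class names define the CSS variables for every nested evals route.
  return <div className={`${display.variable} ${body.variable}`}>{children}</div>
}
```

With the variables declared once at this level, a nested layout only needs to return its children.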
roomote bot (Contributor) commented Feb 13, 2026

Rooviewer

Reviewed 268b183. The two --font-display fixes hold via the shared evals/layout.tsx. Three open items remain from earlier rounds -- no new issues found.

  • --font-display CSS variable undefined on the baseline /evals/workers path -- fixed by shared evals/layout.tsx
  • --font-display CSS variable undefined on the /evals/methodology path -- fixed by shared evals/layout.tsx
  • totalEvalRuns and totalExercises are still inflated in workers/page.tsx (line 65) and recommendations/page.tsx (line 85) -- each role reports the same 27 runs / 120 exercises, so the reduce across the 5 roles produces 135 / 600 instead of the actual 27 / 120
  • recommendations/layout.tsx redundantly imports Fraunces and IBM Plex Sans (lines 2-5) and wraps children with font variables already provided by the parent evals/layout.tsx, causing double font optimization entries
  • copyPrompt in objective-content.tsx (line 186) fires navigator.clipboard.writeText without visual feedback -- the existing pattern in comparison-chart.tsx and copy-settings-button.tsx tracks a copied state and shows a confirmation
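
The inflated totals flagged in the list above can be reproduced in isolation. This sketch is not the PR's code: the `RoleRecommendation` shape, the role ids, and both helper names are hypothetical. Only the 27 / 120 values, the five roles, and the `recommendations[0]` fix come from the reviews.

```typescript
// Hypothetical shape: only the two fields named in the review are modeled.
interface RoleRecommendation {
  roleId: string
  totalEvalRuns: number
  totalExercises: number
}

// Five roles that all report the same underlying 27 runs / 120 exercises.
// The role ids are illustrative.
const recommendations: RoleRecommendation[] = ["junior", "senior", "staff", "reviewer", "autonomous"].map(
  (roleId) => ({ roleId, totalEvalRuns: 27, totalExercises: 120 }),
)

// Inflated variant: summing counts the shared runs once per role.
function summedTotals(recs: RoleRecommendation[]) {
  return recs.reduce(
    (acc, r) => ({
      totalEvalRuns: acc.totalEvalRuns + r.totalEvalRuns,
      totalExercises: acc.totalExercises + r.totalExercises,
    }),
    { totalEvalRuns: 0, totalExercises: 0 },
  )
}

// Fix described in the reviews: read the shared values from the first role.
function sharedTotals(recs: RoleRecommendation[]) {
  const first = recs[0]
  return {
    totalEvalRuns: first?.totalEvalRuns ?? 0,
    totalExercises: first?.totalExercises ?? 0,
  }
}
```

Reading `recommendations[0]` is only correct while every role reports identical totals. If roles ever diverge, the count would need to come from the underlying run data instead.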

Track copied state with a checkmark icon and "Copied!" text that
resets after 2 seconds, matching the existing pattern used in
copy-settings-button.tsx and comparison-chart.tsx.
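
The copied-state pattern described above can be sketched without React as a small helper that flips a flag on and schedules its reset. `createCopiedFlag`, its parameters, and the injectable scheduler are hypothetical. In the component, the `onChange` callback would presumably be the setter for the `promptCopied` state.

```typescript
// The scheduler is injectable so the reset can be exercised without real timers.
type Schedule = (fn: () => void, ms: number) => unknown

// Hypothetical helper (not from the PR): reports copied=true immediately,
// then copied=false after resetMs (2 seconds by default, as in the commit message).
function createCopiedFlag(
  onChange: (copied: boolean) => void,
  schedule: Schedule = (fn, ms) => setTimeout(fn, ms),
  resetMs: number = 2000,
) {
  return {
    // Intended to be called once navigator.clipboard.writeText(...) has resolved.
    markCopied(): void {
      onChange(true)
      schedule(() => onChange(false), resetMs)
    },
  }
}
```

A real component would typically also clear the pending timer on unmount and on repeated clicks, which this sketch leaves out.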
…back

- Remove duplicate Fraunces/IBM Plex Sans imports from recommendations/layout.tsx
  (already provided by shared evals/layout.tsx)
- Add visual "Copied!" feedback to copyPrompt button in objective-content.tsx
  (follows existing copy-settings-button.tsx pattern)
roomote bot (Contributor) commented Feb 13, 2026

Rooviewer

Reviewed 8ba3bd4. All previously flagged issues resolved except the inflated aggregate stats.

  • --font-display CSS variable undefined on /evals/workers path -- fixed by shared evals/layout.tsx
  • --font-display CSS variable undefined on /evals/methodology path -- fixed by shared evals/layout.tsx
  • recommendations/layout.tsx redundant Fraunces/IBM Plex Sans imports -- fixed, now a pass-through
  • copyPrompt in objective-content.tsx missing visual feedback -- fixed with promptCopied state and 2s reset
  • totalEvalRuns/totalExercises still inflated in workers/page.tsx (line 65) and recommendations/page.tsx (line 85) -- each role reports identical 27/120 values, so the reduce across the 5 roles produces 135/600

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>
mp-roocode (Contributor, Author) commented

@roomote fix the 2 failing tests

roomote bot (Contributor) commented Feb 13, 2026

Fixaroo

Wrapped WorkersContent in a <Suspense> boundary in both workers/page.tsx and recommendations/page.tsx to fix the useSearchParams() build error. Also fixed inflated totalEvalRuns/totalExercises in recommendations/page.tsx (uses recommendations[0] instead of reduce). All lint and type checks pass.
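
For context, Next.js fails the production build when a client component that calls `useSearchParams()` is rendered on a statically generated page without a Suspense boundary above it. A minimal sketch of the wrapping described above, assuming `WorkersContent` lives in a sibling client module (the import path and the `null` fallback are hypothetical):

```tsx
import { Suspense } from "react"

// Hypothetical path: the PR does not show where WorkersContent is defined.
import { WorkersContent } from "./workers-content"

export default function WorkersPage() {
  // WorkersContent calls useSearchParams(), so it renders inside a Suspense boundary.
  // The fallback is an assumption; a skeleton could be shown instead of null.
  return (
    <Suspense fallback={null}>
      <WorkersContent />
    </Suspense>
  )
}
```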


github-actions bot (Contributor) commented Feb 13, 2026

🚀 Preview deployed!

Your changes have been deployed to Vercel:

Preview URL: https://roo-code-website-f6xx4dox2-roo-code.vercel.app

This preview will be updated automatically when you push new commits to this PR.

roomote bot force-pushed the feat/eval-recommendations branch from 7bac237 to a91054f on February 18, 2026 00:35

3 participants
