
docs + UI/UX audit fixes + jobs reliability pass #117

Merged
RaghavChamadiya merged 7 commits into main from docs/readmes-and-ui-audit
May 1, 2026

@RaghavChamadiya

Summary

Bundle of doc refreshes, UI/UX audit fixes, and jobs/persistence reliability improvements that accumulated on this branch.

Changes

  • Docs: refreshed packages/web and packages/server READMEs; shipped UI/UX audit notes and first fix pass.
  • Web UI: closed remaining audit items — confirmation dialogs, mobile breakpoints, a11y fixes, loading/empty/error states.
  • Jobs:
    • new cancel endpoint, hydrated progress reporting, and stuck-job detection
    • reset stale pending jobs on server startup (not just running)
    • per-repo DB in workspace mode to avoid cross-repo contention
    • reduce SQLite write contention during sync
  • Repo hygiene: .gitignore excludes local scratch dirs (GitNexus/, graphify/, test-repos/), Office temp files, and local-only planning docs.

Test plan

  • pytest green on server tests covering jobs reset/cancel/stuck-detection
  • Manual smoke: index a repo, cancel mid-run, confirm cleanup
  • Manual smoke: kill server during indexing, restart, confirm pending jobs reset
  • Manual smoke: workspace mode indexes two repos without DB contention errors
  • Web UI: spot-check confirmation dialogs, mobile layout at 375px, a11y with keyboard nav

READMEs:
- packages/web: document Overview standalone, Blast Radius, Costs, Workspace,
  Workspace Co-Changes and Contracts pages; expand component organization
  tree with chat/, workspace/, settings/.
- packages/server: document blast-radius, costs, knowledge-map, security,
  chat, claude-md, providers, and workspace routers.

Audit:
- Add packages/web/AUDIT_UI_UX.md - ~150 prioritized findings (P0/P1/P2)
  spanning tables, mobile responsiveness, accessibility, loading/error
  states, forms, navigation, polish, performance and design tokens.

First fix pass (highlights):
- globals.css: add color-fresh / -stale / -outdated / -accent /
  -accent-blue / -border-accent aliases.
- Replace style={{ maxWidth: 0 }} pattern with min/max-width floors
  across symbol, finding, hotspot, ownership, freshness and decisions
  tables (root cause of unreadable file-path cells).
- Sticky table headers + scope/aria-sort across the same six tables.
- Truncated cells gain title attributes for full text on hover.
- Skip-to-content link, id=main-content, graph-toolbar aria-labels with
  aria-pressed, chat textarea + send/stop labels, blast-radius form
  labels, Mermaid figure role/aria-label.
- Mobile: 44x44 hamburger, sheet width, command-palette mobile trigger,
  hotspot filter row wrapping, command-palette top padding on phones,
  Mermaid SVG forced fluid (max-width:100%, height:auto).
- Errors: notFound() only fires on real 404; finding-row + decision-detail
  surface toast.error on failure.
- Numbers: blast-radius risk 0-10 + centrality %, formatCost <$0.01,
  cost X-axis M/D ticks with min tick gap.
- Misc: AddRepoDialog no longer renders button inside button; decisions
  tags get +N overflow chip; decisions staleness shown as percentage.

Type-check clean.

…a11y, states

- Add ConfirmDialog component; wire into dead-code Resolve/Ack/FP, decision Confirm/Deprecate, and bulk Resolve. Successful actions emit an Undo toast where reversible.
- Mobile: docs explorer collapses tree below md and overlays as full width; GraphDocPanel/PathFinder/ExecutionFlows/EgoSidebar widths clamped to viewport; React Flow gets touchAction:none and aria-label; MiniMap hidden below sm; AddRepoDialog mobile width.
- Tables: workspace co-change + contract-links rows use min/max-w with tooltips and composite keys; hotspot Commits/Trend cells right-aligned with aria-sort + scope; coverage filter buttons get role=tablist/tab; decisions table now surfaces SWR errors with retry; symbol-table rows are keyboard-reachable (tabIndex/role/aria-label/Enter handler) and Complexity right-aligned.
- A11y: sidebar/mobile-nav derive active repo from URL and auto-expand it; sidebar collapse + repo toggles + operations panel + mobile-nav repo toggles get aria-expanded/aria-controls; Badge renders as <span> with role=status for status variants; RunConfigForm provider/concurrency labels associated; repo-settings excluded-pattern input labelled; dead-code confidence range gets aria-valuetext; code-block copy button aria-label flips on copy; graph context menu now closes on Escape.
- States: dead-code summary shows retry on error and Analyze uses toasts (no inline green text); hotspots and search pages surface load/search errors inline; Recent Jobs items on dashboard now link to the repo overview.
- Polish: command palette list uses max-h-[60dvh]; sidebar non-standard h-4.5/w-4.5 normalized to h-[18px]; wiki article-body H1 downgraded to H2 in renderer; settings save fires a toast; new not-found.tsx for repo sub-routes; DecisionDetail gets a back link.

Verified: tsc --noEmit clean, next build succeeds, next lint clean.

The startup cleanup previously reset only jobs in the 'running' state. A job is created
with status='pending' first, then transitions to 'running' when the
background task picks it up. If the server crashed (or the asyncio task
was cancelled) between row-insert and pickup, the row stayed 'pending'
forever — and the active-job guard in /repos/{id}/sync rejects new
syncs whenever any 'pending' or 'running' row exists, blocking the user
indefinitely with "A sync job is already in progress".

Reset both 'running' and 'pending' on startup so a server restart
guarantees a clean slate.
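A minimal sketch of the startup reset described above, using the standard-library sqlite3 module rather than the project's actual persistence layer. The table name `generation_jobs`, the `error` column, and the choice of 'failed' as the reset target are assumptions made for illustration; the commit only states that both 'pending' and 'running' rows are reset.

```python
import sqlite3


def reset_stale_jobs(conn: sqlite3.Connection) -> int:
    """Mark every job left 'pending' or 'running' as failed.

    Returns the number of rows that were reset. Table and column names
    are hypothetical, as is 'failed' as the target status.
    """
    cur = conn.execute(
        "UPDATE generation_jobs "
        "SET status = 'failed', error = ? "
        "WHERE status IN ('pending', 'running')",
        ("interrupted by server restart",),
    )
    conn.commit()
    return cur.rowcount
```

Calling this once during server startup, before any sync request is accepted, would leave no row for the active-job guard to trip over.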
The previous flow had several gaps that combined to produce the user-visible
symptom "I pressed Sync and nothing happened, now it won't let me sync again":

- If `execute_job` raised before its inner try block (e.g., resolving
  app_state attributes), or if `asyncio.create_task` failed, the row stayed
  pending forever and no one ever marked it failed.
- The UI cleared `activeJobId` on reload, so a refresh erased all progress
  state even though the server still had a job in flight (and was rejecting
  new syncs because of it).
- The progress component rendered "Generating level ?…" for both pending
  and running jobs, hiding the fact that a queued job had never started.
- There was no way to recover from a stuck job from the UI; users had to
  edit the DB.

Server:
- Add POST /api/jobs/{id}/cancel — marks pending/running jobs as failed,
  releasing the active-job guard.
- Move app_state attribute resolution inside execute_job's try block and
  fall back to app_state.session_factory in the failure-recording path so
  any setup failure gets recorded instead of leaving the row pending.
- Wrap _launch_job_task with a fallback _mark_failed coroutine triggered
  on create_task failure, task cancellation, or unhandled task exceptions.
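The fallback path could look roughly like the sketch below. It is not the project's `_launch_job_task`: `JOBS` is a hypothetical in-memory stand-in for the jobs table, `mark_failed` is synchronous for brevity (the commit describes a coroutine), and a done-callback is only one of several ways to catch cancellation and unhandled exceptions.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict, Optional

# Hypothetical stand-in for the jobs table: job_id -> status string.
JOBS: Dict[str, str] = {}


def mark_failed(job_id: str, reason: str) -> None:
    """Record a failure unless the job already reached a terminal state."""
    if JOBS.get(job_id) in ("pending", "running"):
        JOBS[job_id] = f"failed: {reason}"


def launch_job_task(
    job_id: str,
    job: Callable[[str], Coroutine[Any, Any, None]],
) -> Optional["asyncio.Task[None]"]:
    """Schedule job(job_id) and make sure every failure path is recorded."""

    def on_done(task: "asyncio.Task[None]") -> None:
        if task.cancelled():
            mark_failed(job_id, "task cancelled")
        elif task.exception() is not None:
            mark_failed(job_id, f"unhandled: {task.exception()!r}")

    coro = job(job_id)
    try:
        task = asyncio.create_task(coro)
    except RuntimeError as exc:
        # create_task raises RuntimeError when no event loop is running.
        coro.close()  # avoid a "coroutine was never awaited" warning
        mark_failed(job_id, f"could not schedule: {exc}")
        return None
    task.add_done_callback(on_done)
    return task
```

The callback also runs when a task is cancelled before its first step, a case that a try/except inside the coroutine would not see.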

Web:
- New cancelJob() in lib/api/jobs.ts.
- QuickActions now hydrates activeJobId on mount from any in-flight job
  for this repo, so a page reload picks up live progress instead of
  forgetting it.
- If sync fails with "already in progress", QuickActions transparently
  switches to the in-flight job's progress panel and surfaces a hint.
- GenerationProgress distinguishes "Queued — waiting for worker…" from
  "Generating level N…", shows a Cancel button while a job is in flight,
  and surfaces a stuck-job warning once a job has sat in pending for 30s
  without starting.

Root cause of "job_not_found" right after a successful POST /sync:

- In workspace mode, app.state.workspace_sessions[repo_id] is a per-repo
  session_factory pointing at <repo>/.repowise/wiki.db.
- /api/repos/{id}/sync via Depends(get_db_session) correctly routes to
  that per-repo DB and commits the new GenerationJob row there.
- But _launch_job_task previously called execute_job(job_id, app_state),
  and execute_job read app_state.session_factory — the workspace primary
  DB, NOT the per-repo one. The job was committed in DB A and looked up
  in DB B, so get_generation_job returned None and the row stayed
  pending forever.
- Same issue affected /api/jobs/{id}, /api/jobs/{id}/cancel, and the
  SSE stream — all of them queried the primary DB only, so any UI that
  tried to observe or cancel a workspace-repo job would see "Job not
  found" instead of progress.

Fix:
- execute_job accepts session_factory_override; _launch_job_task takes
  repo_id and resolves the right factory via workspace_sessions[repo_id]
  (with primary as fallback) before passing it down. Both sync_repo and
  full_resync now provide repo_id explicitly.
- jobs.py adds _find_job_factory helper that scans primary + all
  workspace factories for the job_id; get_job, cancel_job, and
  stream_job all use it instead of the bound dep.
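To illustrate the lookup, here is a small sketch of the two resolution steps under simplifying assumptions: factories are plain callables returning a sqlite3 connection (the real code uses per-repo session factories), the table name `generation_jobs` is hypothetical, and the function names only mirror the helpers named in the commit.

```python
import sqlite3
from typing import Callable, Dict, Optional

# In this sketch a "factory" is any callable that returns a connection.
ConnFactory = Callable[[], sqlite3.Connection]


def resolve_factory(
    primary: ConnFactory,
    workspace: Dict[str, ConnFactory],
    repo_id: str,
) -> ConnFactory:
    """Pick the per-repo factory when one exists, else the primary."""
    return workspace.get(repo_id, primary)


def find_job_factory(
    primary: ConnFactory,
    workspace: Dict[str, ConnFactory],
    job_id: str,
) -> Optional[ConnFactory]:
    """Scan the primary DB, then each workspace DB, for job_id."""
    for factory in (primary, *workspace.values()):
        row = factory().execute(
            "SELECT 1 FROM generation_jobs WHERE id = ?", (job_id,)
        ).fetchone()
        if row is not None:
            return factory
    return None
```

The first helper covers the launch path, where the repo is known; the second covers endpoints that receive only a job id and so have to search.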
After the workspace-mode fix in 23e212b, syncs actually run — but on a
moderately large repo the run hits "database is locked" during the bulk
persist phase. Three changes to keep writes from stepping on each other:

1. SQLite busy_timeout 5s → 30s. The pragma is a polite block (SQLite
   blocks the second writer until the first releases or the timeout
   elapses), so a longer ceiling is essentially free. 5s wasn't enough
   for a single bulk persist of tens of thousands of rows on a slow disk.

2. Throttle JobProgressCallback writes to at most one per second AND
   never more than one in flight. Previously a fast phase could fire
   N concurrent UPDATEs while persist_pipeline_result held the write
   lock, generating both a flood of contention and a flood of stack
   traces. Phase boundaries still force-write so the "current level"
   label updates promptly.

3. When the throttled progress write does still hit a lock (rare now),
   log a one-liner instead of dumping the full SQLAlchemy/aiosqlite
   traceback. The state is recoverable — the next write picks up the
   latest counts — so the noise is just confusing.
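Changes 2 and 3 can be sketched together as a small throttle wrapper. This is an illustration, not the project's JobProgressCallback: the class name, the `write` callable, and the injectable `clock` are hypothetical, sqlite3.OperationalError stands in for the SQLAlchemy/aiosqlite error, and it assumes a forced phase-boundary write bypasses the one-second interval but still respects the in-flight guard.

```python
import logging
import sqlite3
import time
from typing import Awaitable, Callable

logger = logging.getLogger("jobs.progress")


class ThrottledProgress:
    """Write progress at most once per interval, never concurrently."""

    def __init__(
        self,
        write: Callable[[int, int], Awaitable[None]],
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write = write
        self._min_interval = min_interval
        self._clock = clock
        self._last_write = float("-inf")  # so the first report always writes
        self._in_flight = False

    async def report(self, done: int, total: int, *, force: bool = False) -> bool:
        """Return True if the update was written, False if it was skipped."""
        if self._in_flight:
            return False  # a write is already pending; drop this one
        now = self._clock()
        if not force and now - self._last_write < self._min_interval:
            return False  # too soon since the last successful write
        self._in_flight = True
        try:
            await self._write(done, total)
        except sqlite3.OperationalError as exc:
            # Recoverable: the next write carries the latest counts.
            logger.warning("progress write skipped: %s", exc)
            return False
        finally:
            self._in_flight = False
        self._last_write = now
        return True
```

A caller would pass `force=True` at phase boundaries so the "current level" label is not held back by the interval.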
RaghavChamadiya requested a review from swati510 as a code owner May 1, 2026 07:28
RaghavChamadiya merged commit 57bd984 into main May 1, 2026
5 checks passed
RaghavChamadiya deleted the docs/readmes-and-ui-audit branch May 1, 2026 07:30