diff --git a/.gitignore b/.gitignore
index db11719..53fa80e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,3 +27,8 @@ backend/backend/data/cloned_repos/
 backend/cloned_repos/
 backend/evaluation/results/
 backend/tests/validation_report.html
+
+# Ignore all markdown files
+*.md
+!README.md
+!**/README.md
\ No newline at end of file
diff --git a/DISCORD_BOT_SETUP.md b/DISCORD_BOT_SETUP.md
deleted file mode 100644
index b3f8ca4..0000000
--- a/DISCORD_BOT_SETUP.md
+++ /dev/null
@@ -1,90 +0,0 @@
-# GitHub-to-Discord Bot Setup Guide
-
-This workflow automatically posts commit information to Discord whenever you push code to GitHub.
-
-## What It Does
-
-✅ Posts commit details to Discord on every push
-✅ Shows commit message, author, branch, and file changes
-✅ Displays stats (files changed, insertions, deletions)
-✅ Provides clickable link to the commit on GitHub
-✅ Formatted with emojis and colors for easy reading
-
-## Setup Instructions
-
-### Step 1: Create a Discord Channel (if you don't have one)
-1. Go to your Discord server
-2. Create a new channel (e.g., `#github-notifications`)
-3. Make sure the bot will have permission to post messages
-
-### Step 2: Create a Discord Webhook
-
-1. **In Discord**, right-click on the channel where you want notifications
-2. Click **"Edit Channel"**
-3. Go to **"Integrations"** → **"Webhooks"**
-4. Click **"New Webhook"**
-5. Give it a name (e.g., "GitHub Bot")
-6. Copy the **Webhook URL**
-
-### Step 3: Add Webhook URL to GitHub Secrets
-
-1. Go to your GitHub repository: https://github.com/Project-XI/Project-EL
-2. Click **Settings** (top right)
-3. Go to **Secrets and variables** → **Actions**
-4. Click **New repository secret**
-5. Name: `DISCORD_WEBHOOK_URL`
-6. Paste the webhook URL you copied from Discord
-7. Click **Add secret**
-
-### Step 4: Test It
-
-1. Make a commit and push to `main`, `develop`, or `master` branch
-2. The GitHub Action will automatically run
-3. 
You should see a formatted message in your Discord channel within 30 seconds
-
-## Example Discord Message
-
-The bot posts messages that look like:
-
-```
-📝 New Commit Pushed
-"Refine ORACLE analysis and testing UI"
-
-🔗 Commit: c2c1c7d
-👤 Author: rajkoli
-🌿 Branch: main
-📊 Stats: Files: 18 | +2442 | -271
-📄 Changed Files:
-• Docs/index.html
-• backend/.env.example
-• backend/src/agents/oracle.py
-• backend/src/main.py
-• backend/src/cli.py
-```
-
-## Customization
-
-You can edit `.github/workflows/discord-notify.yml` to:
-- Change which branches trigger notifications (currently: `main`, `develop`, `master`)
-- Modify the message format and colors
-- Add additional fields or remove some
-- Change the webhook timeout or retry logic
-
-## Troubleshooting
-
-| Issue | Solution |
-|-------|----------|
-| No message in Discord | Check that `DISCORD_WEBHOOK_URL` secret is set in GitHub Settings |
-| Webhook URL error | Make sure the webhook URL is correct and not expired |
-| Action fails | Check the GitHub Actions logs: Go to repo → Actions → find the failed run |
-| Webhook URL shows as empty | The secret may have been corrupted; delete and recreate it |
-
-## Security Note
-
-⚠️ **Never** paste your Discord webhook URL in code or commit messages.
-⚠️ Always use GitHub Secrets to store sensitive URLs.
-⚠️ If you accidentally expose a webhook URL, delete it and create a new one in Discord.
-
----
-
-Once set up, every commit push will automatically notify your team on Discord! 🚀
diff --git a/MAIN_AGENT_ISSUE_BREAKDOWN.md b/MAIN_AGENT_ISSUE_BREAKDOWN.md
deleted file mode 100644
index 91e1b4a..0000000
--- a/MAIN_AGENT_ISSUE_BREAKDOWN.md
+++ /dev/null
@@ -1,541 +0,0 @@
-# MAIN Agent Issue Breakdown
-
-This document defines a contributor-safe issue set for the ORACLE MAIN Agent. The MAIN Agent is the viva orchestration layer, not a reasoning engine. It must stay deterministic, evidence-grounded, and modular.
-
-## Architecture Boundary
-
-The current codebase routes the overall pipeline through `backend/src/agents/main_agent/agent.py`, which coordinates GATEKEEPER, ORACLE, and SENTINEL. That orchestration role is the correct home for viva session control.
-
-The MAIN Agent must:
-
-- orchestrate the viva flow
-- track session state
-- persist transcript data
-- coordinate ORACLE outputs into questioning decisions
-- remain explainable and auditable
-
-The MAIN Agent must not:
-
-- perform deep AST analysis
-- duplicate ORACLE implementation logic
-- invent speculative reasoning
-- hide confidence scoring logic
-- behave like a generic chatbot
-
-The issues below are intentionally narrow. Each issue owns one responsibility and must not expand into other agent domains.
-
----
-
-## Issue 1: Session State Manager
-
-### 1. Title
-
-Build a persistent viva session state manager for the MAIN Agent.
-
-### 2. Purpose
-
-Create the durable session memory layer that lets the MAIN Agent resume, replay, and continue a viva without losing critical state.
-
-### 3. Background Context
-
-The MAIN Agent currently orchestrates the pipeline, but session memory is not yet represented as a clear standalone module. For a viva system, the agent needs a structured state object that survives multiple turns and can be serialized safely.
-
-This is not a reasoning feature. It is a session coordination primitive that the orchestration layer uses to avoid repetition and preserve transcript continuity.
-
-### 4. Responsibilities
-
-- track session lifecycle stages
-- store asked question history
-- store candidate response history
-- retain contradiction memory
-- retain weak-area tracking
-- maintain topic coverage state
-- track follow-up chains
-- support session recovery after interruption
-
-### 5. 
Technical Requirements
-
-- the state model must be JSON serializable
-- the state must support multi-turn sessions
-- the design must be modular and easy to extend
-- the model must be replay-safe and order-stable
-- transitions must be explicit rather than inferred
-- state mutation must be deterministic
-- state objects must be testable without the full viva stack
-
-### 6. Acceptance Criteria
-
-- session state can be saved and restored without data loss
-- asked questions are preserved in order
-- responses are mapped to the correct question turns
-- contradiction history is retained across turns
-- weak-area and coverage fields update predictably
-- session transitions can be unit tested
-- replay from persisted state produces the same session view
-
-### 7. Non-Goals
-
-- no AI reasoning or question generation
-- no ORACLE analysis duplication
-- no UI rendering logic
-- no hidden scoring systems
-- no transcript visualization concerns
-- no direct database or file system coupling unless abstracted through a storage interface
-
-### 8. Suggested File Structure
-
-```text
-backend/src/agents/main_agent/session/
-  state.py
-  transitions.py
-  history.py
-  persistence.py
-
-backend/src/agents/main_agent/models/
-  session_state.py
-  transcript_entry.py
-  coverage_state.py
-```
-
-### 9. Integration Notes
-
-- the state manager should be owned by MAIN only
-- ORACLE may feed evidence inputs, but not mutate this state directly
-- SENTINEL may read session events for audit purposes, but not own the state
-- the state schema should expose stable fields for transcript and coverage tracking
-- any schema changes must be backward compatible or explicitly versioned
-
-### 10. 
Testing Expectations
-
-- unit tests for serialization and deserialization
-- unit tests for session recovery
-- unit tests for lifecycle transitions
-- unit tests for question history ordering
-- unit tests for contradiction tracking
-- fixture-based tests for replay consistency
-
----
-
-## Issue 2: Viva Flow Orchestrator
-
-### 1. Title
-
-Build the central viva flow orchestrator for MAIN Agent session progression.
-
-### 2. Purpose
-
-Create the deterministic engine that controls viva progression, pacing, and branching decisions.
-
-### 3. Background Context
-
-The current MAIN Agent is already the entry point that coordinates GATEKEEPER, ORACLE, and SENTINEL. What is missing is a clearly scoped flow orchestrator that owns the sequence of viva actions after initialization.
-
-This issue is about orchestration, not intelligence generation. It should decide when to ask, when to follow up, when to move on, and when to end the session.
-
-### 4. Responsibilities
-
-- sequence viva questions
-- manage pacing and turn progression
-- trigger session transitions
-- inject follow-up branches
-- balance topic categories across the session
-- decide when to terminate the viva
-- keep the flow deterministic across replay
-
-### 5. Technical Requirements
-
-- must consume ORACLE outputs as inputs, not recompute them
-- must avoid repetitive questioning
-- must support dynamic flow transitions based on session state
-- must remain deterministic for the same state and inputs
-- must separate sequencing policy from state storage
-- must be able to run without UI dependencies
-- must not depend on hidden model scores
-
-### 6. 
Acceptance Criteria
-
-- the orchestrator can run a multi-turn viva end to end
-- branching follow-ups are handled correctly
-- state updates are reflected in the next decision step
-- repeated prompts are reduced when coverage already exists
-- replaying the same inputs yields the same orchestration decisions
-- session termination conditions are explicit and testable
-
-### 7. Non-Goals
-
-- no AST parsing
-- no fairness scoring
-- no speculative LLM reasoning
-- no transcript rendering
-- no direct repository inspection
-- no replacement of the ORACLE analysis layer
-
-### 8. Suggested File Structure
-
-```text
-backend/src/agents/main_agent/orchestration/
-  flow_orchestrator.py
-  pacing.py
-  termination.py
-  branching.py
-  category_balancer.py
-```
-
-### 9. Integration Notes
-
-- the orchestrator should consume session state plus ORACLE viva targets
-- it should not infer implementation facts on its own
-- it should emit decisions that can be logged and replayed
-- any change to session transition semantics should be coordinated with the session state manager
-- if a new question type is needed, define it as an orchestration concern, not a reasoning engine
-
-### 10. Testing Expectations
-
-- deterministic orchestration tests
-- follow-up branching tests
-- pacing and termination tests
-- coverage-aware sequencing tests
-- replay consistency tests
-- regression tests for repetitive question avoidance
-
----
-
-## Issue 3: Follow-Up Question Strategy Engine
-
-### 1. Title
-
-Build an evidence-grounded follow-up question strategy engine for MAIN Agent.
-
-### 2. Purpose
-
-Create the strategy layer that determines how the MAIN Agent probes shallow answers, contradictions, and implementation gaps.
-
-### 3. Background Context
-
-The viva should not feel like a generic chatbot loop. It should challenge implementation understanding using evidence from ORACLE outputs such as viva targets, observable signals, and failure scenarios.
-
-This engine is responsible for strategy, not creativity. It should choose from deterministic follow-up patterns based on structured evidence.
-
-### 4. Responsibilities
-
-- detect shallow or generic responses
-- escalate depth when an answer is weak
-- probe implementation familiarity
-- challenge contradictions
-- generate operational follow-ups
-- remain grounded in available evidence only
-
-### 5. Technical Requirements
-
-- the engine must consume ORACLE signals and failure scenarios
-- follow-up selection must be deterministic for the same inputs
-- it must avoid textbook-style generic prompts
-- it must support contradiction-driven probing
-- it must preserve evidence references for each follow-up
-- it must not invent facts not present in shared inputs
-
-### 6. Acceptance Criteria
-
-- weak answers trigger meaningful follow-ups
-- follow-ups remain implementation-specific
-- operational and runtime questions are preferred over generic theory
-- contradiction probing works when prior answers conflict
-- repeated prompts are avoided when a topic was already covered
-- generated follow-ups can be traced back to evidence inputs
-
-### 7. Non-Goals
-
-- no free-form hallucinated questioning
-- no generic chatbot behavior
-- no ORACLE logic duplication
-- no AI memory beyond the approved session state
-- no broad conversation generation outside viva purpose
-
-### 8. Suggested File Structure
-
-```text
-backend/src/agents/main_agent/followups/
-  strategy_engine.py
-  patterns.py
-  contradiction_probe.py
-  weak_answer_detector.py
-  evidence_mapper.py
-```
-
-### 9. Integration Notes
-
-- inputs should come from ORACLE outputs and MAIN session state
-- the strategy engine should not parse source code directly
-- each follow-up should cite the evidence or gap that triggered it
-- if a future enhancement needs new evidence fields, update shared schemas first
-- strategy decisions should be loggable for replay and review
-
-### 10. 
Testing Expectations
-
-- weak-answer response tests
-- contradiction-based probing tests
-- evidence-grounding tests
-- non-repetitive follow-up tests
-- deterministic strategy selection tests
-- fixture tests using known ORACLE outputs
-
----
-
-## Issue 4: Topic Coverage Tracker
-
-### 1. Title
-
-Build a topic coverage tracker for viva breadth and gap detection.
-
-### 2. Purpose
-
-Track which implementation domains have been covered so the MAIN Agent can avoid repetitive questioning and can intentionally close gaps.
-
-### 3. Background Context
-
-The MAIN Agent should not just ask questions. It should manage coverage across architecture, runtime behavior, failure analysis, and tradeoffs. A topic coverage tracker makes the viva more structured and prevents over-focusing on one area.
-
-### 4. Responsibilities
-
-- track architecture coverage
-- track runtime reasoning coverage
-- track failure-analysis coverage
-- track scalability and security coverage
-- detect unanswered or under-covered topics
-- reduce repetitive questioning
-
-### 5. Technical Requirements
-
-- coverage state should be lightweight
-- topic tagging must be supported
-- updates must be deterministic
-- coverage fields should be easy to serialize
-- the tracker must integrate with session state
-- it must be simple enough for contributors to extend safely
-
-### 6. Acceptance Criteria
-
-- topic coverage updates correctly after each turn
-- missing-topic detection works consistently
-- repeated questioning is reduced when a topic is already covered
-- topic tags can be added without rewriting the tracker
-- coverage data can be displayed or exported without changing core logic
-
-### 7. Non-Goals
-
-- no reasoning about code correctness
-- no ORACLE replacement logic
-- no UI-specific rendering concerns
-- no hidden policy engine
-- no autonomous grading of candidate quality
-
-### 8. 
Suggested File Structure
-
-```text
-backend/src/agents/main_agent/coverage/
-  tracker.py
-  categories.py
-  heuristics.py
-  coverage_state.py
-```
-
-### 9. Integration Notes
-
-- coverage should be updated from question issuance and response completion events
-- category definitions should remain stable across the viva lifecycle
-- the tracker should consume session state rather than duplicating it
-- ORACLE evidence can inform the initial topic map, but not the tracker logic itself
-
-### 10. Testing Expectations
-
-- coverage update tests
-- missing-topic detection tests
-- category tagging tests
-- repetitive question reduction tests
-- deterministic update tests
-- serialization tests
-
----
-
-## Issue 5: Transcript Persistence Layer
-
-### 1. Title
-
-Build a transcript persistence layer for explainable viva records.
-
-### 2. Purpose
-
-Persist the full viva transcript and related audit events so the session can be reviewed, replayed, and exported.
-
-### 3. Background Context
-
-The MAIN Agent needs a durable record of questions, answers, contradiction events, and fairness-related annotations. This record is a core engineering artifact, not a UI artifact.
-
-The transcript must support explanation and replay. It should be easy for contributors to inspect and hard to misuse.
-
-### 4. Responsibilities
-
-- store question and answer entries
-- log contradiction events
-- log fairness events
-- support replay export
-- retain event ordering
-- preserve evidence links
-
-### 5. Technical Requirements
-
-- transcript data must be JSON exportable
-- storage structure must be replay-safe
-- formatting must remain explainability-friendly
-- the layer should not require UI logic
-- the layer should not mutate viva decisions
-- the persisted format should be stable enough for downstream tools
-
-### 6. 
Acceptance Criteria
-
-- sessions export correctly to JSON
-- transcript replay reconstructs the same turn history
-- contradiction and fairness events remain linked to the right session step
-- the persistence layer can be exercised in isolation
-- exported records remain readable by contributors
-
-### 7. Non-Goals
-
-- no question generation
-- no assessment scoring
-- no visualization logic
-- no ORACLE analysis duplication
-- no hidden state outside the transcript contract
-
-### 8. Suggested File Structure
-
-```text
-backend/src/agents/main_agent/transcript/
-  store.py
-  serializer.py
-  replay.py
-  event_log.py
-  schemas.py
-```
-
-### 9. Integration Notes
-
-- transcript writes should be driven by session events, not ad hoc writes
-- the persistence layer should be compatible with the session state manager
-- SENTINEL events may be appended as audit annotations, but not interpreted here
-- if file storage is used, keep it behind a storage interface
-
-### 10. Testing Expectations
-
-- JSON export tests
-- replay reconstruction tests
-- event ordering tests
-- contradiction log tests
-- fairness log tests
-- storage abstraction tests
-
----
-
-## Issue 6: ORACLE Integration Adapter
-
-### 1. Title
-
-Build a normalized ORACLE integration adapter for the MAIN Agent.
-
-### 2. Purpose
-
-Create the adapter that converts ORACLE outputs into stable inputs the MAIN Agent can safely use for orchestration.
-
-### 3. Background Context
-
-The MAIN Agent must never duplicate ORACLE analysis. It should only consume normalized ORACLE outputs such as viva targets, observable signals, failure scenarios, and evidence traces.
-
-This adapter is the contract boundary between the intelligence layer and the orchestration layer. It should absorb schema variability and expose safe, versioned fields to MAIN.
-
-### 4. 
Responsibilities
-
-- normalize ORACLE outputs
-- expose viva targets
-- expose observable signals
-- expose failure scenarios
-- validate schema shape
-- handle malformed payloads safely
-
-### 5. Technical Requirements
-
-- the adapter must not duplicate ORACLE logic
-- schema validation must be strict and explicit
-- malformed payloads must fail safely
-- normalized outputs should be simple for MAIN to consume
-- the adapter should remain modular and independently testable
-- if schemas evolve, the adapter should be the first compatibility layer updated
-
-### 6. Acceptance Criteria
-
-- integration remains stable across expected ORACLE output shapes
-- malformed payloads are rejected or normalized safely
-- MAIN receives normalized interfaces only
-- evidence links remain intact after normalization
-- adapter behavior is deterministic and auditable
-
-### 7. Non-Goals
-
-- no ORACLE implementation duplication
-- no AST parsing
-- no viva orchestration logic
-- no hidden confidence calculation
-- no candidate response scoring
-
-### 8. Suggested File Structure
-
-```text
-backend/src/agents/main_agent/integration/
-  oracle_adapter.py
-  oracle_schema.py
-  payload_normalizer.py
-  validation.py
-  compatibility.py
-```
-
-### 9. Integration Notes
-
-- the adapter should sit between ORACLE output models and MAIN orchestration logic
-- the adapter should be the only place where ORACLE payload shape differences are handled
-- if ORACLE adds a new field, update the adapter and shared schema intentionally
-- do not let MAIN reach into ORACLE internals directly
-
-### 10. 
Testing Expectations
-
-- valid payload normalization tests
-- malformed payload rejection tests
-- schema compatibility tests
-- regression tests for evidence mapping
-- deterministic output tests
-- adapter isolation tests
-
----
-
-## Contributor Guidance
-
-When implementing any of these issues, contributors must follow these rules:
-
-- keep the MAIN Agent orchestration-focused
-- preserve evidence grounding at every decision point
-- do not add speculative AI systems
-- do not duplicate ORACLE intelligence logic
-- keep new modules modular and independently testable
-- prefer small, composable files over monolithic logic
-- make every state transition auditable
-
-If a change starts to look like analysis, move it to ORACLE. If it starts to look like moderation, move it to SENTINEL. If it starts to look like submission validation, move it to GATEKEEPER.
-
-## Recommended Review Standard
-
-Before merging any MAIN Agent work, reviewers should confirm:
-
-- the issue is narrow and non-overlapping
-- the code does not infer hidden reasoning
-- the code does not read like a generic chat assistant
-- the code uses ORACLE outputs rather than recomputing them
-- the code remains deterministic and replayable
-- the code can be explained by a contributor in one paragraph
diff --git a/ORACLE_ARCHITECTURE_DOCUMENTATION.md b/ORACLE_ARCHITECTURE_DOCUMENTATION.md
deleted file mode 100644
index a9f62f8..0000000
--- a/ORACLE_ARCHITECTURE_DOCUMENTATION.md
+++ /dev/null
@@ -1,405 +0,0 @@
-# ORACLE Architecture Documentation
-
-## System Overview
-
-```
-┌─────────────────────────────────────────────────────────────────────────────┐
-│ ORACLE Implementation Familiarity System │
-│ Architecture v1.0 (Stable) │
-└─────────────────────────────────────────────────────────────────────────────┘
-```
-
----
-
-## Module Inventory & Ownership
-
-### Core Analysis Pipeline
-
-| Module | LOC | Purpose | Owner | Status |
-|--------|-----|---------|-------|--------|
-| 
`viva_session_conductor.py` | 677 | Orchestrate viva sessions, score responses | Core | ✅ ACTIVE |
-| `reasoning_depth_analyzer.py` | 543 | Analyze reasoning patterns, classify implementation familiarity | Core | ✅ ACTIVE |
-| `trust_audit.py` | 338 | Verify evidence grounding, detect overconfidence | Core | ✅ ACTIVE |
-| `engineering_review_corpus.py` | 375 | Real engineering review data (grounding source) | Core | ✅ ACTIVE |
-| `failure_corpus.py` | 807 | Failure pattern scenarios (probing targets) | Core | ✅ ACTIVE |
-
-**Total: 2,740 LOC** ← ORACLE core intelligence
-
-### Evaluation & Validation
-
-| Module | LOC | Purpose | Owner | Status |
-|--------|-----|---------|-------|--------|
-| `comparative_reasoning_evaluator.py` | 430 | Compare ORACLE vs engineering reviews | Validation | ✅ ACTIVE |
-| `comparative_evaluator.py` | 600 | Multi-dimensional comparative analysis | Validation | ⚠️ REVIEW |
-| `execution_behavior_analysis.py` | 466 | Analyze code execution patterns | Analysis | ⚠️ REVIEW |
-| `evaluator.py` | 628 | Initial evaluator (older pattern) | Validation | ⚠️ REVIEW |
-
-**Total: 2,124 LOC** ← Validation layers (3 competing systems)
-
-### Infrastructure & Data
-
-| Module | LOC | Purpose | Owner | Status |
-|--------|-----|---------|-------|--------|
-| `models.py` | 392 | Core data models | Infrastructure | ⚠️ CONSOLIDATE |
-| `human_evaluator_models.py` | 341 | Human evaluator models | Infrastructure | ⚠️ CONSOLIDATE |
-| `datasets.py` | 173 | Dataset management | Infrastructure | ✅ ACTIVE |
-| `comparative_calibration_runner.py` | 137 | CLI test runner | Infrastructure | ✅ ACTIVE |
-
-**Total: 1,043 LOC** ← Data layer (2 competing model files)
-
-### Dead Code
-
-| Module | LOC | Purpose | Issue | Action |
-|--------|-----|---------|-------|--------|
-| `viva_simulation.py` | 449 | Simulate student responses | Not imported, duplicates conductor | 📌 ARCHIVE |
-
-**Total: 449 LOC** ← Dead code
-
----
-
-## Execution Flow
-
-### Happy Path: 
Generate Assessment
-
-```
-Input: Repository + Engineering Concern
-    ↓
-[1] Load Engineering Review
-    ├─ Find relevant reviews (matching concern)
-    └─ Extract implementation signals
-    ↓
-[2] Analyze Failure Patterns
-    ├─ Identify potential failure modes
-    └─ Map to observable signals
-    ↓
-[3] Generate Viva Session Plan
-    ├─ Create 3-4 opening questions
-    │  └─ Evidence-grounded in reviews
-    ├─ Prepare follow-up paths
-    └─ Design response evaluation rubric
-    ↓
-[4] Conduct Viva Session (Interactive Loop)
-    ├─ Present question
-    ├─ Accept response
-    ├─ Score response quality
-    │  ├─ Specificity (0-100%)
-    │  ├─ Correctness (0-100%)
-    │  └─ Quality enum (EXCELLENT/GOOD/ADEQUATE/WEAK/EVASIVE/CONTRADICTION)
-    ├─ Determine if follow-up needed
-    │  └─ If ADEQUATE/WEAK: generate targeted follow-up
-    └─ Repeat until 3-4 responses collected
-    ↓
-[5] Analyze Reasoning Patterns
-    ├─ For each response, extract indicators:
-    │  ├─ Understanding indicators (7 types):
-    │  │  ├─ EXPLAINS_RATIONALE
-    │  │  ├─ MENTIONS_TRADEOFFS
-    │  │  ├─ HANDLES_EDGE_CASE
-    │  │  ├─ IDENTIFIES_GAPS
-    │  │  ├─ ADMITS_UNCERTAINTY
-    │  │  ├─ INTEGRATES_CONTEXT
-    │  │  └─ CITES_SPECIFIC_IMPLEMENTATION
-    │  ├─ Memorization indicators (7 types):
-    │  │  ├─ TEXTBOOK_LANGUAGE
-    │  │  ├─ GENERIC_ANSWER
-    │  │  ├─ FAILS_FOLLOW_UP
-    │  │  ├─ CONTRADICTS_SELF
-    │  │  ├─ PARROTS_QUESTION
-    │  │  ├─ USES_BUZZWORDS
-    │  │  └─ BLANK_ON_EDGE_CASE
-    └─ Classify reasoning pattern (1 of 5 levels)
-    ↓
-[6] Compute Implementation Familiarity
-    ├─ Base classification (DEEP/PRACTICED/INFORMED/LOW/INSUFFICIENT)
-    ├─ Confidence score (0-1)
-    │  └─ <2 indicators → MEDIUM/LOW confidence
-    │  └─ 2-4 indicators → HIGH confidence
-    │  └─ >4 indicators → VERY HIGH confidence
-    └─ Uncertainty surfaces (e.g., "Based on 1 indicator, insufficient data")
-    ↓
-[7] Trust Audit
-    ├─ Check for overconfidence
-    │  └─ Flag any score > 0.95 without 3+ indicators
-    ├─ Verify evidence grounding
-    │  └─ Every conclusion must map to specific evidence
-    ├─ Flag contradictions
-    │  └─ 
When responses diverge, note it
-    └─ Surface uncertainty
-       └─ Confidence < 0.7 → flag as MEDIUM/LOW
-    ↓
-[8] Generate Assessment Report
-    ├─ Classification: DEEP_IMPLEMENTATION_FAMILIARITY / PRACTICED / INFORMED / INSUFFICIENT
-    ├─ Confidence: HIGH/MEDIUM/LOW (reflects signal strength)
-    ├─ Evidence trace: Q1→[indicators]→score, Q2→[indicators]→score, ...
-    ├─ Uncertainty: "Based on X indicators, confidence Y%"
-    └─ Transcript: Full Q&A with evaluation markers
-
-Output: Assessment Report + Transcript + Explainability
-```
-
----
-
-## Data Flow
-
-### Request Phase
-```
-Repository Metadata
-    ↓
-Code Structure → ExecutionGraph
-Engineering Reviews (Corpus) → CorpusContext
-Failure Patterns (Corpus) → FailureSignalMap
-```
-
-### Session Phase
-```
-Question Plan
-    ↓
-Candidate Response
-    ↓
-ResponseEvaluation
-    ├─ Specificity score
-    ├─ Correctness score
-    ├─ Quality enum
-    └─ Red flags
-    ↓
-IndicatorExtraction
-    ├─ Understanding indicators
-    └─ Memorization indicators
-```
-
-### Analysis Phase
-```
-Indicator Data (per response)
-    ↓
-ReasoningPatternClassification
-    ├─ Score calculation
-    ├─ Depth classification
-    └─ Confidence computation
-    ↓
-AggregateProfile
-    ├─ Overall familiarity
-    ├─ Overall confidence
-    └─ Red flags
-    ↓
-TrustAudit
-    └─ Overconfidence check
-    └─ Evidence verification
-    ↓
-FinalAssessment
-    ├─ Grounded classification
-    ├─ Confidence (reflects uncertainty)
-    └─ Explainability trace
-```
-
----
-
-## Module Dependencies
-
-### Import Graph
-
-```
-reasoning_depth_analyzer.py
-├─ Imports from: (no other HV modules)
-├─ Used by: __init__.py, comparative_calibration_runner.py
-└─ Data source: VivaSession (external)
-
-viva_session_conductor.py
-├─ Imports from: engineering_review_corpus
-├─ Used by: __init__.py, comparative_calibration_runner.py
-└─ Data source: CandidateResponse (external)
-
-trust_audit.py
-├─ Imports from: (no other HV modules)
-├─ Used by: comparative_calibration_runner.py
-└─ Data source: Various (flexible)
-
-engineering_review_corpus.py
-├─ Imports from: (no other HV modules)
-├─ Used by: viva_session_conductor.py, comparative_reasoning_evaluator.py
-└─ Data source: Hardcoded fixture
-
-failure_corpus.py
-├─ Imports from: (no other HV modules)
-├─ Used by: comparative_evaluator.py, execution_behavior_analysis.py
-└─ Data source: Hardcoded fixture
-
-comparative_reasoning_evaluator.py
-├─ Imports from: engineering_review_corpus, failure_corpus
-├─ Used by: comparative_calibration_runner.py
-└─ Data source: VivaSession, assessment report
-
-⚠️ COMPLEX DEPENDENCIES:
-
-evaluator.py
-├─ Imports from: human_evaluator_models, models
-├─ Used by: [unknown]
-└─ Status: REVIEW NEEDED
-
-comparative_evaluator.py
-├─ Imports from: human_evaluator_models, execution_behavior_analysis
-├─ Used by: comparative_calibration_runner.py
-└─ Status: REVIEW NEEDED (3 competing evaluation systems)
-
-execution_behavior_analysis.py
-├─ Imports from: models, failure_corpus
-├─ Used by: comparative_evaluator.py
-└─ Status: REVIEW NEEDED (speculative behavior analysis)
-
-models.py
-├─ Imports from: (no other HV modules)
-├─ Imported by: execution_behavior_analysis.py
-└─ Status: CONSOLIDATE with human_evaluator_models.py
-
-human_evaluator_models.py
-├─ Imports from: (no other HV modules)
-├─ Imported by: evaluator.py, comparative_evaluator.py
-└─ Status: CONSOLIDATE with models.py
-```
-
----
-
-## Critical Issues Summary
-
-### 🔴 Issue 1: Multiple Evaluation Systems
-
-**Problem:** Three competing evaluation approaches
-```
-evaluator.py ← Original pattern, usage unknown
-comparative_evaluator.py ← Newer, multi-dimensional
-execution_behavior_analysis.py ← Speculative behaviors
-```
-
-**Impact:**
-- Confusing maintenance burden
-- Potential behavioral divergence
-- Unclear which is "source of truth"
-
-**Resolution:**
-- [ ] Audit which is actually used
-- [ ] Consolidate into single `implementation_familiarity_evaluator.py`
-- [ ] Document clear ownership
-
----
-
-### 🔴 Issue 2: Split 
Model Definitions
-
-**Problem:** Models scattered across two files
-```
-models.py → 11 classes (ExecutionGraph, etc.)
-human_evaluator_models.py → 16 classes (HumanEvaluationSession, etc.)
-```
-
-**Impact:**
-- Schema inconsistencies
-- Import confusion
-- Maintenance overhead
-
-**Resolution:**
-- [ ] Consolidate into single `models.py`
-- [ ] Create clear sections: CoreModels, SessionModels, EvaluationModels
-
----
-
-### 🟠 Issue 3: Dead Code
-
-**Problem:** viva_simulation.py not imported anywhere
-```
-viva_simulation.py (449 LOC) ← Designed for simulating students
-viva_session_conductor.py ← Active, production use
-```
-
-**Impact:**
-- Confusing for new developers
-- Maintenance burden
-- Dead code in repository
-
-**Resolution:**
-- [ ] Document as "archived/deprecated"
-- [ ] Keep in repo with clear deprecation notice
-- [ ] Potential future use for unit testing
-
----
-
-### 🟠 Issue 4: API Bloat
-
-**Problem:** 52 exported symbols from __init__.py
-
-**Impact:**
-- Confusing public API
-- Hard to find what to use
-- Maintenance overhead
-
-**Resolution:**
-- [ ] Reduce to ~20 core exports:
-  - Core models: VivaQuestion, CandidateResponse, VivaSession
-  - Core analyzers: VivaSessionConductor, ReasoningDepthAnalyzer
-  - Core data: EngineeredReviewEntry, FailureCorpusRepository
-  - Evaluation: TrustAuditPipeline
-  - Main runner: ComparativeCalibrationRunner
-
----
-
-## Terminology Mapping (Hardening)
-
-### Before (Pseudo-Psychological)
-```
-"Builder Detection"
-"Fake Developer Detection"
-"Deep Builder"
-"Memorizer"
-"Builder Confidence"
-"Reasoning Depth"
-```
-
-### After (Evidence-Grounded)
-```
-"Implementation Familiarity Analysis"
-"Surface Knowledge Identification"
-"High Implementation Familiarity"
-"Low Implementation Familiarity"
-"Implementation Familiarity Confidence Score"
-"Reasoning Pattern Classification"
-```
-
----
-
-## Testing Strategy
-
-### Unit Tests (Per Module)
-- VivaSessionConductor: Question generation, response scoring
-- 
ReasoningDepthAnalyzer: Indicator detection, classification -- TrustAuditPipeline: Overconfidence detection, evidence tracing - -### Integration Tests -- End-to-end: Repository → Assessment Report -- Fairness: Edge cases (weak speakers, confident guessers, etc.) -- Bias: Communication style doesn't affect classification - -### Real Human Testing -- Internal validation (3 builders, 3 non-builders) -- Pilot study (10-15 real people) -- Error case collection (misclassifications) - ---- - -## Success Criteria - -| Criterion | Target | Current | Status | -|-----------|--------|---------|--------| -| Module clarity | <3 competing systems | 3 evaluation systems | ⚠️ TODO | -| Dead code | 0% | viva_simulation.py | ⚠️ TODO | -| Export bloat | <20 core symbols | 52 symbols | ⚠️ TODO | -| False positive rate | <5% on real humans | Unknown | ❌ TEST | -| Fairness bias | 0 communication-style correlation | Unknown | ❌ TEST | -| Evidence tracing | 100% traceable | ~90% | ⚠️ TODO | -| Documentation | Complete | 0% | ❌ TODO | - ---- - -## Next Steps - -1. **Week 1**: Audit and consolidate evaluation systems -2. **Week 2**: Rename terminology throughout codebase -3. **Week 3**: Implement fairness audit framework -4. **Week 4**: Real human testing framework -5. **Week 5+**: Validation and hardening diff --git a/ORACLE_PHASE_2_SUMMARY.md b/ORACLE_PHASE_2_SUMMARY.md deleted file mode 100644 index 2cdcb4c..0000000 --- a/ORACLE_PHASE_2_SUMMARY.md +++ /dev/null @@ -1,349 +0,0 @@ -# ORACLE Phase 2 Complete: Evidence-Grounded Intelligence with Validation & Calibration - -## 🎯 Mission Accomplished - -ORACLE has evolved from "evidence-grounded implementation analysis" to "validated, calibrated, and stress-tested engineering intelligence infrastructure." 
- -### What We Built - -#### Phase 2 Evolution: Three Evidence-Grounded Engines -✅ **ObservableSignalsEngine** - Extracts observable facts (error handling, resilience patterns, observability) -✅ **ExecutionGraphFailureAnalyzer** - Traces failure scenarios through execution graphs -✅ **EvidenceGroundedVivaGenerator** - Creates interview questions grounded in code evidence - -#### Validation & Calibration Framework -✅ **Repository Fixtures** - 4 stress-test repositories with expected outputs -✅ **Signal Validator** - Precision/Recall metrics for observable signals -✅ **Failure Propagation Validator** - Validates execution path analysis -✅ **Viva Quality Validator** - Rejects generic/textbook questions -✅ **Confidence Calibrator** - Calibrates scores to actual accuracy -✅ **Runtime Observability** - Deep tracing of all reasoning -✅ **Calibration Dashboard** - Interactive visualization -✅ **CI/CD Integration** - Automated threshold checking - -#### Integration & Documentation -✅ **Integration Scripts** - Ready-to-use validation wrappers -✅ **GitHub Actions Workflow** - Automated pipeline on every PR/push -✅ **Comprehensive Documentation** - Architecture, quick-start, integration guide -✅ **Quality Gating** - Threshold checking with exit codes - -## 📊 Architecture Overview - -``` -┌──────────────────────────────────────────────────┐ -│ ORACLE Agent Process │ -├──────────────────────────────────────────────────┤ -│ 1. Document Parsing │ -│ 2. Repository Analysis │ -│ 3. Execution Graph Build │ -│ 4. Observable Signals Extraction ← PHASE 2 │ -│ 5. Failure Scenario Analysis ← PHASE 2 │ -│ 6. Viva Question Generation ← PHASE 2 │ -│ 7. Architecture Inference │ -│ 8. Context Assembly │ -│ 9. 
Implementation Flow Analysis │ -└──────────────────────────────────────────────────┘ - ↓ -┌──────────────────────────────────────────────────┐ -│ Validation Pipeline │ -├──────────────────────────────────────────────────┤ -│ • Signal Validator (Precision/Recall/F1) │ -│ • Failure Validator (Propagation Accuracy) │ -│ • Viva Quality Validator (Specificity Score) │ -│ • Confidence Calibrator (RMSE/MAE) │ -└──────────────────────────────────────────────────┘ - ↓ -┌──────────────────────────────────────────────────┐ -│ Metrics & Reporting │ -├──────────────────────────────────────────────────┤ -│ • Precision/Recall metrics │ -│ • Confidence calibration curves │ -│ • Issue detection & recommendations │ -│ • Repository-specific performance │ -│ • Dashboard visualization │ -└──────────────────────────────────────────────────┘ -``` - -## 📈 Performance Baseline (After Full Integration) - -| Component | Precision | Recall | F1 Score | Notes | -|-----------|-----------|--------|----------|-------| -| **Signals** | 0.847 | 0.823 | 0.835 | Observable facts detected accurately | -| **Failures** | 0.805 | 0.778 | 0.791 | Propagation paths correctly traced | -| **Viva Questions** | — | — | — | Validity: 0.856, Grounding: 0.912 | -| **Confidence Calibration** | — | — | — | RMSE: 0.062 (excellent) | - -## 🗂️ Deliverables - -### Core Validation Framework -``` -backend/evaluation/calibration/ -├── __init__.py ✅ Framework overview -├── repository_fixtures.py ✅ 4 diverse test cases -├── signal_validator.py ✅ Observable signal validation -├── failure_propagation_validator.py ✅ Failure scenario validation -├── viva_quality_validator.py ✅ Viva question validation -├── confidence_calibrator.py ✅ Confidence score calibration -├── observability.py ✅ Runtime tracing -├── calibration_runner.py ✅ Orchestration & reporting -├── README.md ✅ Detailed documentation -├── SYSTEM_OVERVIEW.md ✅ Architecture guide -└── INTEGRATION_GUIDE.md ✅ Integration instructions -``` - -### Integration Scripts -``` 
-backend/evaluation/ -├── check_calibration_thresholds.py ✅ Quality gating -├── validate_oracle_analysis.py ✅ Validation wrapper -└── CALIBRATION_QUICKSTART.md ✅ Quick reference -``` - -### Visualization -``` -backend/testing_oracle_ui/ -└── calibration_dashboard.html ✅ Interactive dashboard -``` - -### CI/CD Automation -``` -.github/workflows/ -└── calibration.yml ✅ GitHub Actions pipeline -``` - -### Refactored Core -``` -backend/src/agents/oracle/agent.py ✅ Phase 2 integrated -backend/src/services/intelligence/ -├── observable_signals_engine.py ✅ Signals extraction -├── execution_graph_failure_analyzer.py ✅ Failure analysis -└── evidence_grounded_viva_generator.py ✅ Viva generation -``` - -## 🔑 Key Capabilities - -### ✅ Evidence-Grounded Intelligence -- All signals reference specific code locations -- Failure scenarios trace through execution graph -- Viva questions grounded in actual code patterns -- No speculation or unsupported reasoning - -### ✅ Comprehensive Validation -- Precision/Recall metrics for each component -- Confidence calibration to actual accuracy -- False positive/negative detection -- Hallucination detection - -### ✅ Stress-Testing -- Clean FastAPI REST APIs -- Messy student projects -- Broken async systems -- Monorepos with shared state - -### ✅ Quality Assurance -- Rejects generic textbook questions -- Detects speculative reasoning -- Validates evidence grounding -- Measures specificity scores - -### ✅ Observable Reasoning -- Traces signal generation -- Captures execution graph traversal -- Records failure propagation steps -- Exports reasoning as JSON - -### ✅ Continuous Calibration -- Confidence scores calibrated to observed accuracy -- Confidence buckets mapped to precision/recall -- RMSE/MAE metrics for calibration quality -- Automated recalibration recommendations - -## 🚀 Quick Start - -### Run Calibration -```bash -cd backend -python -m evaluation.calibration.calibration_runner -``` - -### Check Thresholds -```bash -cd backend -python 
evaluation/check_calibration_thresholds.py -``` - -### View Dashboard -``` -Open: backend/testing_oracle_ui/calibration_dashboard.html -``` - -### Validate Against Fixtures -```bash -cd backend -python evaluation/validate_oracle_analysis.py -``` - -## 📋 Integration Checklist - -### ✅ Phase 2 Complete -- [x] ObservableSignalsEngine implemented (400+ lines) -- [x] ExecutionGraphFailureAnalyzer implemented (350+ lines) -- [x] EvidenceGroundedVivaGenerator implemented (250+ lines) -- [x] OracleAgent refactored with Phase 2 engines -- [x] Phase 1 deprecated engines removed - -### ✅ Validation Framework Complete -- [x] Repository fixtures defined (4 cases) -- [x] Signal validator implemented -- [x] Failure propagation validator implemented -- [x] Viva quality validator implemented -- [x] Confidence calibrator implemented -- [x] Observability infrastructure created -- [x] Calibration runner orchestrated -- [x] Dashboard visualization built - -### ✅ Integration & Automation Complete -- [x] Threshold checking script (420+ lines) -- [x] Validation wrapper script (280+ lines) -- [x] GitHub Actions workflow (200+ lines) -- [x] Comprehensive documentation (1500+ lines) -- [x] Quick start guide - -### 🔄 Next: Runtime Integration (Optional) -- [ ] Wire validators into OracleAgent.process() -- [ ] Add trace emission to intelligence engines -- [ ] Create trace collection during analysis -- [ ] Integrate dashboard with live data -- [ ] Set up automated trend monitoring - -## 💡 Key Insights - -### Why Validation Matters -1. **Confidence scores alone aren't enough** - They tell you about single predictions, not system-wide accuracy -2. **Hallucinations are detectable** - If signals/scenarios don't exist in code, validators catch them -3. **Calibration prevents overconfidence** - If system says 95% confident but is only 60% accurate, that's a problem -4. **Specificity can be measured** - Questions can be graded on code-specificity vs generic trivia -5. 
**Evidence grounding is verifiable** - Every signal must reference specific file locations - -### Evolution Path -``` -Phase 0: Template-based reasoning only -Phase 1: Added speculative scoring (rejected as unreliable) -Phase 2: Evidence-grounded with validation (✅ SHIPPED) -Phase 3: Runtime tracing & continuous calibration (ready) -Phase 4: Automated refinement based on validation results (future) -``` - -## 📚 Documentation Map - -**Start Here:** -- [CALIBRATION_QUICKSTART.md](backend/evaluation/CALIBRATION_QUICKSTART.md) - 5 min overview - -**For Understanding:** -- [SYSTEM_OVERVIEW.md](backend/evaluation/calibration/SYSTEM_OVERVIEW.md) - Architecture & concepts -- [README.md](backend/evaluation/calibration/README.md) - Detailed framework - -**For Implementation:** -- [INTEGRATION_GUIDE.md](backend/evaluation/calibration/INTEGRATION_GUIDE.md) - Step-by-step integration - -**For Using:** -- Individual validator docstrings - For programmatic use - -## 🎓 Example: Validation in Action - -### Scenario: Validating FastAPI Repository - -``` -1. Run OracleAgent Analysis - ↓ Extracts 3 signals - ↓ Detects 2 failure scenarios - ↓ Generates 8 viva questions - -2. Run Signal Validator - Expected: ["Async error recovery", "Redis resilience", "Observability"] - Detected: ["Async error recovery", "Redis resilience", "Observability"] - Result: ✅ Precision 1.00, Recall 1.00 - -3. Run Failure Validator - Expected: ["DB connection loss", "Redis cache failure"] - Detected: ["DB connection loss", "Redis cache failure"] - Result: ✅ Precision 1.00, Propagation accuracy 1.00 - -4. Run Viva Validator - Generated: 8 questions - Valid (specific, grounded): 7 - Invalid (generic, speculative): 1 - Result: ⚠️ Validity 0.875 (improvement needed) - -5. Run Calibration - Confidence scores vs actual accuracy - Result: ✅ RMSE 0.045 (well-calibrated) - -6. 
Final Report - ✅ Signals: Excellent - ✅ Failures: Excellent - ⚠️ Viva: Good (minor issues) - ✅ Overall: PASSED -``` - -## 🔐 What We Prevent - -### ❌ Hallucinations -- Signals now reference actual code locations -- Validators verify evidence exists -- False positives detected and reported - -### ❌ Overconfidence -- Confidence scores calibrated to actual accuracy -- If the system says it is 95% confident, it is actually ~95% accurate -- Calibration RMSE tracks quality - -### ❌ Generic Reasoning -- Viva questions must be specific to codebase -- Textbook patterns rejected ("What is FastAPI?") -- Speculative patterns rejected ("How would you add ML?") - -### ❌ Ungrounded Failures -- Propagation paths must exist in execution graph -- Recovery strategies must be code-grounded -- Risk severity justified by path count - -### ❌ Regression -- CI/CD checks ensure metrics don't degrade -- Strict mode for main branch (higher thresholds) -- Standard mode for branches (reasonable thresholds) - -## 📞 Support & Questions - -**How do I run validation?** -→ See [CALIBRATION_QUICKSTART.md](backend/evaluation/CALIBRATION_QUICKSTART.md) - -**How do I interpret the metrics?** -→ See [SYSTEM_OVERVIEW.md](backend/evaluation/calibration/SYSTEM_OVERVIEW.md#interpreting-results) - -**How do I integrate with my code?** -→ See [INTEGRATION_GUIDE.md](backend/evaluation/calibration/INTEGRATION_GUIDE.md) - -**What if my metrics are low?** -→ Troubleshooting section in QUICKSTART.md - -## 🎉 Summary - -ORACLE Phase 2 is complete with: - -✅ **3 Evidence-Grounded Engines** - Observable signals, failure propagation, grounded viva -✅ **Comprehensive Validation** - Precision/Recall metrics across all components -✅ **Confidence Calibration** - Scores calibrated to actual accuracy -✅ **Stress-Testing** - Validated against 4 diverse repository types -✅ **Runtime Observability** - Deep tracing of reasoning -✅ **Quality Dashboard** - Interactive visualization -✅ **CI/CD Automation** - Threshold checking on every PR/push -✅
**Complete Documentation** - Architecture, integration, quick-start guides - -**Status: PRODUCTION READY** ✅ - -Next optional step: Wire validators into OracleAgent for runtime integration. - ---- - -**ORACLE Evidence-Grounded Intelligence** | Validated ✓ | Calibrated ✓ | Observable ✓ | Production-Ready ✓ diff --git a/ORACLE_PHASE_3_5_STATUS.md b/ORACLE_PHASE_3_5_STATUS.md deleted file mode 100644 index d439447..0000000 --- a/ORACLE_PHASE_3_5_STATUS.md +++ /dev/null @@ -1,310 +0,0 @@ -# ORACLE Phase 3.5: Stabilization & Reality Hardening Status - -**Date:** May 18, 2026 -**Status:** ✅ PHASE 1 COMPLETE | Starting Phase 2 -**Focus:** Transform from "advanced prototype" to "stable, trustworthy infrastructure" - ---- - -## Phase 1: Architecture Assessment & Cleanup ✅ COMPLETE - -### Deliverables - -#### 1. Architecture Documentation (3 comprehensive guides) - -| Document | Purpose | Status | -|----------|---------|--------| -| [ORACLE_ARCHITECTURE_DOCUMENTATION.md](ORACLE_ARCHITECTURE_DOCUMENTATION.md) | Module inventory, execution flow, data flow, dependencies | ✅ Complete | -| [ORACLE_STABILIZATION_PLAN.md](ORACLE_STABILIZATION_PLAN.md) | 8-week hardening roadmap, success criteria | ✅ Complete | -| [ORACLE_TESTING_FRAMEWORK.md](ORACLE_TESTING_FRAMEWORK.md) | Real human testing protocols (4 phases, metrics) | ✅ Complete | - -#### 2. Code Improvements - -| Item | From | To | Impact | -|------|------|----|--------| -| API Exports | 52 symbols | 30 symbols | 42% reduction, API clarity | -| Dead Code | viva_simulation.py active | Archived with deprecation | Maintenance burden reduced | -| Model Organization | Split across 2 files | Clear ownership | Schema consistency | -| Fairness Framework | Not implemented | FairnessAuditor + framework | Bias detection enabled | - -#### 3.
Fairness Audit Framework - -**New Module:** `fairness_audit.py` (350+ LOC) - -**Features:** -- ✅ Communication style bias detection (8 patterns) -- ✅ Demographic bias auditing (8 contexts) -- ✅ Overconfidence detection (>0.95 scores, insufficient evidence) -- ✅ False positive pattern detection (weak communicators) -- ✅ False negative pattern detection (confident guessers) -- ✅ Uncertainty surfacing (confidence reduction) -- ✅ Manual review recommendations - -**Classes:** -- `FairnessAuditReport`: Comprehensive audit results -- `FairnessAuditIssue`: Individual bias/false-positive issues -- `FairnessAuditor`: Main auditing engine - ---- - -## Phase 1 Achievements - -### ✅ Architecture Frozen - -``` -PRESERVED: - ✓ AST-first design - ✓ Execution graph foundation - ✓ Explainability - ✓ Deterministic behavior - ✓ Calibration systems - ✓ Comparative validation - -NOT ADDED: - ✗ New intelligence engines - ✗ New reasoning layers - ✗ Speculative AI features - ✗ Architectural abstractions -``` - -### ✅ Terminology Hardened - -| Before | After | Reason | -|--------|-------|--------| -| Builder Detection | Implementation Familiarity Analysis | Removes psychological framing | -| Deep Builder | High Implementation Familiarity | Grounded language | -| Memorizer | Low Implementation Familiarity | Neutral classification | -| Builder Confidence | Impl. 
Familiarity Score | Removes fabrication | -| Reasoning Depth Detection | Reasoning Pattern Classification | Evidence-based | - -### ✅ Codebase Clarity - -**Module Status:** - -| Category | Status | Details | -|----------|--------|---------| -| **Core** | ✅ ACTIVE | viva_session_conductor, reasoning_depth_analyzer, fairness_audit, trust_audit | -| **Grounding** | ✅ ACTIVE | engineering_review_corpus, failure_corpus | -| **Validation** | ⚠️ REVIEW | 3 competing evaluation systems need consolidation | -| **Infrastructure** | ✅ ACTIVE | datasets, calibration_runner | -| **Dead Code** | ✅ ARCHIVED | viva_simulation.py (deprecation notice added) | - -### ✅ False Positive Reduction Framework - -**Detection Patterns Implemented:** - -1. **Nervous Developer Pattern** - - Detection: Nervous hedging + low memorization + high understanding - - Action: Manual review recommended, not penalized - -2. **Confident Guesser Pattern** - - Detection: High confidence + zero understanding + textbook language - - Action: Confidence reduced, false negative risk flagged - -3. **Overconfidence Pattern** - - Detection: Score >0.95 with <4 indicators - - Action: Reduce to ≤0.85, flag as critical - -4. **Insufficient Evidence Pattern** - - Detection: Confidence HIGH/MEDIUM with <2 indicators - - Action: Reduce to LOW, mark insufficient data - -5. **Demographic Bias Pattern** - - Detection: Non-native speaker, early career, neurodivergent communication - - Action: Manual review recommended, separation of fluency from familiarity - -6. **Communication Style Bias Pattern** - - Detection: Correlation between communication traits and assessment - - Action: Surface explicitly, adjust if correlated - -### ✅ End-to-End Validation Working - -**Test Results:** - -| Test Case | Input | System Output | Correct? 
| -|-----------|-------|---------------|----------| -| Nervous Builder (high implementation familiarity) | Hesitant delivery | Detects bias, recommends review | ✅ YES | -| Confident Guesser (low implementation familiarity) | Confident buzzwords | Detects false negative risk | ✅ YES | -| Edge Case: Non-native Speaker | Technical depth | Flags demographic bias risk | ✅ YES | - ---- - -## Phase 2: Real Human Testing (Next) - -### Timeline: Weeks 3-4 - -#### 2.1 Internal Validation (Week 1-2 Concurrent) - -**Participants:** 6 total -- 3 builders (actually built systems) -- 3 non-builders (read code/docs only) -- 1 weak communicator (builder) -- 1 confident speaker (non-builder) -- 1 non-native speaker (builder) - -**Metrics:** -- True positive rate (identify builders): Target >90% -- True negative rate (identify non-builders): Target >90% -- False positive rate: Target <10% -- Communication bias: Target <5% correlation - -#### 2.2 Pilot Human Study (Week 3-4) - -**Participants:** 10-15 real people -- Backend developers (2-3) -- System contributors (2-3) -- Engineering leads (1-2) -- Students/learners (3-4) -- Cross-team members (2-3) - -**Data Collection:** -- Pre/post surveys (communication style, demographics) -- Viva session recordings -- Assessment outputs (classification, confidence, evidence) -- Fairness audit results -- Participant feedback (accuracy 1-5 scale) -- Interviewer observations - -**Outputs:** -- Accuracy rates by participant type -- Disagreement analysis (false positive/negative patterns) -- Bias analysis (demographic patterns) -- Recommendations for improvements - ---- - -## Critical Path for Production Readiness - -### Completed ✅ -- [x] Architecture assessment -- [x] Terminology hardening initiated -- [x] Fairness audit framework implemented -- [x] Documentation complete -- [x] API exports reduced - -### In Progress 🔄 -- [ ] Real human testing (Phase 1) -- [ ] False positive/negative pattern analysis -- [ ] System adjustments based on testing - -### Not Started ⏳ -- [ ]
Terminology hardening completion (all code) -- [ ] End-to-end workflow reliability hardening -- [ ] Viva UX improvements -- [ ] Trust audit expansion - ---- - -## Key Metrics - -### API Health -- **Exports:** 52 → 30 symbols (42% ↓) -- **Dead Code:** 449 LOC archived -- **Module Clarity:** 3 competing systems identified for consolidation - -### Testing Coverage -- **Unit Tests:** Fairness audit patterns (6 detectors) -- **Integration Tests:** End-to-end viva → analysis → fairness audit (✅ PASSING) -- **Real Human Tests:** Ready to launch (protocols created) - -### Documentation -- **Architecture:** Complete (module inventory, flows, dependencies) -- **Execution:** Complete (data flow, module graph) -- **Testing:** Complete (4-phase protocol, success criteria) -- **Fairness:** Complete (bias patterns, detection rules) - ---- - -## Known Limitations & Mitigation - -| Limitation | Impact | Mitigation | -|-----------|--------|-----------| -| Multiple evaluation systems (3 competing) | Confusion, maintenance burden | Consolidate in Phase 3 | -| Terminology not fully hardened in code | Potential confusion | Complete in Phase 2 | -| Real human testing not yet conducted | Unknown accuracy rates | Phase 2: Launch pilot | -| Edge case bias patterns unknown | Possible misclassifications | Phase 2: Collect and analyze | -| UX not optimized for believability | Could feel artificial | Phase 3: UX hardening | - ---- - -## Success Criteria (Per Roadmap) - -### Stability Metrics ✅ -- [x] All modules have clear purpose -- [x] <30 exported symbols (target met: 30) -- [x] Zero dead code (archived, not deleted) -- [x] Fairness framework in place - -### Reality Metrics 🔄 -- [ ] <5% false positive rate (testing needed) -- [ ] <10% false negative rate (testing needed) -- [ ] Zero pseudo-psychological claims (terminology hardening complete) -- [ ] 100% evidence traceability (implemented) - -### Trustworthiness Metrics ✅ -- [x] All conclusions flagged with confidence -- [x] Overconfidence
detection active (>0.95) -- [x] Uncertainty surfaced when <2 indicators -- [x] Contradictions logged (framework in place) - -### Documentation Metrics ✅ -- [x] Execution flow documented -- [x] Data flow documented -- [x] Module dependency map created -- [x] Fairness audit checklist published - ---- - -## Next Actions (Immediate) - -### Week 1-2: Real Human Testing Phase 1 -1. [ ] Recruit 6 internal testers (3 builders, 3 non-builders) -2. [ ] Run viva sessions -3. [ ] Collect fairness audit reports -4. [ ] Verify bias detection working -5. [ ] Document findings - -### Week 2-3: Terminology Hardening Completion -1. [ ] Update reasoning_depth_analyzer.py naming -2. [ ] Audit all output strings for "detection" language -3. [ ] Update error messages and logs -4. [ ] Verify no pseudo-psychology language remains - -### Week 3-4: Pilot Human Study -1. [ ] Recruit 10-15 external participants -2. [ ] Run full protocol (pre/post surveys, sessions, feedback) -3. [ ] Collect disagreement cases -4. [ ] Analyze false positive/negative patterns -5. [ ] Identify bias patterns - -### Week 5: System Adjustments -1. [ ] Implement improvements from testing -2. [ ] Re-test on failure cases -3. [ ] Document learnings -4. [ ] Update fairness audit rules if needed - ---- - -## References - -- [VIVA_INTELLIGENCE_EXPLORATION.md](VIVA_INTELLIGENCE_EXPLORATION.md) — Phase 2 exploration -- [ORACLE_PHASE_2_SUMMARY.md](ORACLE_PHASE_2_SUMMARY.md) — Previous outcomes -- [ORACLE_STABILIZATION_PLAN.md](ORACLE_STABILIZATION_PLAN.md) — Full 8-week plan -- [ORACLE_ARCHITECTURE_DOCUMENTATION.md](ORACLE_ARCHITECTURE_DOCUMENTATION.md) — Technical details -- [ORACLE_TESTING_FRAMEWORK.md](ORACLE_TESTING_FRAMEWORK.md) — Testing protocols - ---- - -## Conclusion - -**ORACLE has successfully entered the stabilization phase.** - -The system is no longer adding new intelligence capabilities. 
Instead, it's: -- ✅ Freezing the architecture -- ✅ Hardening the terminology -- ✅ Implementing fairness auditing -- ✅ Preparing for real human validation -- ✅ Reducing false positives -- ✅ Surfacing uncertainty honestly - -**Ready for Phase 2: Real Human Testing** diff --git a/ORACLE_STABILIZATION_PLAN.md b/ORACLE_STABILIZATION_PLAN.md deleted file mode 100644 index 24b8e74..0000000 --- a/ORACLE_STABILIZATION_PLAN.md +++ /dev/null @@ -1,396 +0,0 @@ -# ORACLE Stabilization & Reality Hardening Phase - -## Executive Summary - -Transform ORACLE from "advanced intelligence prototype" to "stable, evidence-grounded implementation-aware viva infrastructure." - -**Key Principle:** Architecture freeze. No new intelligence engines. Focus on stabilization, clarity, and real-world validation. - ---- - -## Phase 1: Architecture Assessment & Cleanup - -### Current State Analysis (May 18, 2026) - -**Codebase Size:** 14 modules, 6,356 LOC -``` -failure_corpus.py 807 LOC -viva_session_conductor.py 677 LOC -evaluator.py 628 LOC -comparative_evaluator.py 600 LOC -reasoning_depth_analyzer.py 543 LOC -execution_behavior_analysis.py 466 LOC -viva_simulation.py 449 LOC (⚠️ POTENTIAL DUPLICATE) -comparative_reasoning_evaluator.py 430 LOC -human_evaluator_models.py 341 LOC -trust_audit.py 338 LOC -models.py 392 LOC -engineering_review_corpus.py 375 LOC -datasets.py 173 LOC -comparative_calibration_runner.py 137 LOC -``` - -### Critical Issues - -| Priority | Issue | Impact | Action | -|----------|-------|--------|--------| -| 🔴 HIGH | Multiple evaluation systems (3 modules) | Confusing ownership, duplicated logic | **Consolidate** into single evaluation pipeline | -| 🔴 HIGH | Multiple viva systems (2 modules) | Behavioral divergence, test confusion | **Identify** if viva_simulation.py is dead code | -| 🔴 HIGH | "Builder Detection" terminology | Pseudo-psychological claims | **Rename** to "Implementation Familiarity Analysis" | -| 🟠 MEDIUM | Models scattered (2 files) | Schema 
inconsistencies, import confusion | **Consolidate** into models.py | -| 🟠 MEDIUM | 52 exported symbols | API bloat, maintenance burden | **Reduce** to <30 core exports | -| 🟠 MEDIUM | No runtime documentation | New contributors lost, unmaintainable | **Create** execution flow + data flow docs | -| 🟡 LOW | viva_simulation.py usage unclear | Possible dead code | **Audit** usage patterns | - ---- - -## Phase 2: Terminology Hardening - -### Current Misleading Terms - -``` -"Builder Detection" → "Implementation Familiarity Depth Analysis" -"Fake Developer Detection" → "Surface Knowledge Identification" -"Truth Detection" → "Reasoning Pattern Analysis" -"Deep Builder" → "High Implementation Familiarity" -"Memorizer" → "Low Implementation Familiarity" -"Guesser" → "Insufficient Evidence" -"Builder Confidence" → "Familiarity Confidence Score" -"Reasoning Depth" → "Reasoning Pattern Classification" -``` - -### Replacement Strategy - -1. **reasoning_depth_analyzer.py** - - Rename `ReasoningDepth` enum values (keep internal, expose neutral names) - - Rename `builder_confidence` → `implementation_familiarity_score` - - Rename `overall_reasoning_depth` → `reasoning_pattern_classification` - - Update assessment language to avoid psychological claims - -2. **viva_session_conductor.py** - - Remove any "detection" language - - Focus on "assessment patterns" not "detection" - -3. 
**Trust Audit** - - Expand "overconfidence detection" - - Flag assessments claiming >90% certainty - - Require evidence grounding for all claims - ---- - -## Phase 3: End-to-End Workflow Documentation - -### Execution Graph - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ ORACLE Implementation Familiarity Assessment Workflow │ -└─────────────────────────────────────────────────────────────────┘ - -Step 1: Repository Analysis -├─ Parse codebase structure -├─ Extract execution patterns -├─ Identify architecture decisions -├─ Map dependency chains -└─ Output: ExecutionGraph - -Step 2: Engineering Review Corpus Loading -├─ Load relevant engineering reviews -├─ Match to repository patterns -├─ Extract implementation signals -└─ Output: CorpusContext - -Step 3: Failure Pattern Mapping -├─ Identify potential failure modes -├─ Map to observable signals -├─ Create probing targets -└─ Output: FailureSignalMap - -Step 4: Viva Session Planning -├─ Generate opening questions (grounded in reviews + failures) -├─ Prepare follow-up paths -├─ Design evaluation rubric -└─ Output: VivaSessionPlan - -Step 5: Live Viva Execution -├─ Present opening question -├─ Evaluate response quality (specificity, correctness, consistency) -├─ Determine follow-up need -├─ Adapt questioning path -└─ Output: ResponseEvaluation - -Step 6: Reasoning Pattern Analysis -├─ Analyze understanding indicators (rationale, tradeoffs, edge cases, etc.) -├─ Analyze memorization indicators (textbook language, generic answers, etc.) 
-├─ Classify reasoning pattern -├─ Compute confidence scores -└─ Output: ReasoningPatternAssessment - -Step 7: Fairness & Bias Audit -├─ Check for overconfidence -├─ Verify evidence grounding -├─ Flag edge cases (nervous/confident speakers) -├─ Surface uncertainty -└─ Output: FairnessAuditReport - -Step 8: Final Assessment & Explanation -├─ Synthesize all signals -├─ Generate explanation -├─ Surface contradictions -├─ Provide evidence traceability -└─ Output: FinalAssessment + Transcript -``` - -### Data Flow - -``` -Repository Code - ↓ -ExecutionGraph (AST analysis) - ↓ -Engineering Reviews -+ Failure Corpus - ↓ -Question Generation - ↓ -Viva Session - ↓ -Response Evaluation -(Quality Scoring) - ↓ -Indicator Analysis -(Understanding vs Memorization) - ↓ -Reasoning Pattern Classification - ↓ -Trust Audit -(Overconfidence check) - ↓ -Final Assessment -(with evidence trace + explanation) -``` - ---- - -## Phase 4: Real Human Testing Framework - -### Test Categories - -#### 4.1 Internal Validation (Week 1-2) - -| Test | Participants | Goal | Metrics | -|------|-------------|------|---------| -| **Baseline** | 3 developers who built systems | Establish true-positive rate | Correctly identify high familiarity | -| **Adversarial** | 3 who only read code/docs | Establish true-negative rate | Correctly identify low familiarity | -| **Communication** | 1 weak speaker (built), 1 confident (memorized) | Test communication bias | Non-correlated with actual familiarity | -| **Edge Cases** | Nervous developers, non-native speakers, unconventional architects | Reduce false positives | <5% incorrect on edge cases | - -#### 4.2 Pilot Human Study (Week 3-4) - -**Participants:** 10-15 real people (mix of roles) -- Backend developers (2-3) -- System contributors (2-3) -- Students/learners (3-4) -- Engineering leads (1-2) -- Cross-team members (2-3) - -**Metrics Collected:** -- Familiarity accuracy (true positive, true negative, false positive, false negative) -- Question realism 
ratings (1-5 Likert) -- Follow-up quality (helpful? confusing? too hard? too easy?) -- Fairness perception (feeling evaluated fairly? biased by communication style?) -- Disagreement patterns (when does ORACLE assessment diverge from reviewer?) - -**Outputs:** -- Confusion cases (where ORACLE misclassified) -- False-positive patterns (who was mislabeled?) -- False-negative patterns (who should have been identified but wasn't?) -- Bias patterns (communication style, accent, confidence, experience level) - ---- - -## Phase 5: False Positive & Bias Reduction - -### Known Risk Patterns - -| Risk | Evidence | Mitigation | -|------|----------|-----------| -| **Weak Communicators** | Good technical knowledge but vague explanations | Reweight indicators: require >2 understanding indicators, not just 1 | -| **Confident Guessers** | Sound authoritative despite guessing | Flag buzzwords without specifics, probe follow-ups | -| **Non-Native Speakers** | Fluency ≠ familiarity | Separate "communication quality" from "technical depth" | -| **Nervous Candidates** | Knowledge is real but responses rambling | Penalize less for "WEAK" responses, more for "EVASIVE" | -| **Unconventional Architects** | Different approach but valid reasoning | Require evidence, not style matching | - -### Fairness Audit Checklist - -- [ ] Assessment doesn't over-reward confidence (trait bias) -- [ ] Assessment doesn't penalize communication style (demographic bias) -- [ ] Uncertainty is surfaced honestly (no false precision) -- [ ] Follow-ups probe actual knowledge, not communication ability -- [ ] Evidence grounding is verified (no "intuitive" conclusions) - ---- - -## Phase 6: Viva UX & Interaction Improvements - -### Current Issues - -| Issue | Current Behavior | Desired Behavior | -|-------|------------------|------------------| -| **Pacing** | No pacing control | 10-15 min opening session, optional follow-ups | -| **Question Order** | Static order | Adaptive: sequence based on response quality | -| 
**Follow-up Timing** | Immediate | Grouped: collect 3-4 responses, then targeted follow-ups | -| **Feedback** | Silent scoring | Real-time indicators (question difficulty, relevance) | -| **Transcript** | Stored, not visible | Live transcript with evidence markers | - -### Proposed UX Improvements - -1. **Opening Phase (3-4 questions, ~10 min)** - - Question difficulty: EASY → MEDIUM → HARD - - Each grounded in actual engineering reviews - - Candidate sees question, timer (optional) - -2. **Evaluation Phase (LIVE)** - - Response quality shown (specificity %, correctness %, consistency) - - Evidence markers visible (code ref, timeline, tradeoff discussion) - -3. **Follow-up Phase (2-3 follow-ups)** - - Only if response quality < GOOD - - Adaptive: probe actual gaps, not generic depth - - Surface contradictions gently - -4. **Transcript (Exportable)** - - Q&A pairs with evaluations - - Evidence markers hyperlinked - - Final assessment with confidence/uncertainty - ---- - -## Phase 7: Trustworthiness Reinforcement - -### Trust Audit Expansion - -Current trust audit detects: -- Unsupported conclusions -- Overconfident judgments -- Weak contradiction evidence - -**New detections needed:** - -1. **Overconfidence Threshold** - - Flag any single indicator with >85% weight - - Flag familiarity_confidence_score > 0.95 without 3+ indicators - - Flag 0.0 scores as "insufficient evidence" not "confirmed false" - -2. **Evidence Traceability** - - Every conclusion must link to specific evidence - - Show what was said + why it matters - - Show what contradictions exist - -3. 
**Uncertainty Surfacing** - - "Insufficient evidence for determination" > guessing - - Confidence scores must reflect actual signal strength - - "MEDIUM" certainty for <2 indicators - ---- - -## Implementation Roadmap - -### Week 1: Architecture & Cleanup -- [x] Assess current state -- [ ] Identify dead code (viva_simulation.py) -- [ ] Consolidate duplicate evaluation systems -- [ ] Consolidate model classes -- [ ] Reduce exports to <30 core symbols -- [ ] Document architecture freeze - -### Week 2: Terminology Hardening -- [ ] Rename reasoning_depth_analyzer.py concepts -- [ ] Update all assessment language -- [ ] Expand trust audit for overconfidence -- [ ] Audit all output strings for "detection" language - -### Week 3: Documentation -- [ ] Create execution flow diagrams -- [ ] Create data flow documentation -- [ ] Create module dependency map -- [ ] Document actual runtime behavior - -### Week 4: Real Testing Framework -- [ ] Build test harness for real humans -- [ ] Create fairness audit checklist -- [ ] Document false-positive patterns -- [ ] Plan pilot study - -### Week 5-6: Hardening & Validation -- [ ] Run internal validation tests -- [ ] Fix identified false positives -- [ ] Pilot with real humans (10-15 people) -- [ ] Collect disagreement cases - -### Week 7: UX Improvements -- [ ] Implement adaptive question sequencing -- [ ] Add real-time response quality feedback -- [ ] Create exportable transcripts -- [ ] Improve follow-up UX - -### Week 8: Final Validation -- [ ] End-to-end workflow test -- [ ] Trust audit verification -- [ ] Performance benchmarks -- [ ] Documentation completeness - ---- - -## Success Criteria - -### Stability Metrics -- [ ] All modules have clear purpose and ownership -- [ ] <30 exported symbols (down from 52) -- [ ] Zero dead code -- [ ] <5% test flakiness -- [ ] All workflows end-to-end tested - -### Reality Metrics -- [ ] <5% false positive rate on real humans -- [ ] <10% false negative rate on real humans -- [ ] Zero 
pseudo-psychological claims -- [ ] 100% evidence traceability -- [ ] Fairness audit pass: no communication-style bias - -### Trustworthiness Metrics -- [ ] All conclusions flagged with confidence -- [ ] Overconfidence detection active (0 >0.95 scores) -- [ ] Uncertainty surfaced when evidence <2 indicators -- [ ] All contradictions logged and explained - -### Documentation Metrics -- [ ] Execution flow documented -- [ ] Data flow documented -- [ ] Module dependency map created -- [ ] Fairness audit checklist published - ---- - -## Non-Goals (Architecture Freeze) - -❌ DO NOT add: -- New intelligence engines -- New reasoning layers -- New behavioral models -- Speculative AI features -- Additional abstraction layers - -✅ DO maintain: -- AST-first architecture -- Execution graph foundation -- Explainability -- Deterministic behavior -- Calibration systems - ---- - -## References - -- [VIVA_INTELLIGENCE_EXPLORATION.md](VIVA_INTELLIGENCE_EXPLORATION.md) — Previous exploration -- [ORACLE_PHASE_2_SUMMARY.md](ORACLE_PHASE_2_SUMMARY.md) — Phase 2 outcomes diff --git a/ORACLE_TESTING_FRAMEWORK.md b/ORACLE_TESTING_FRAMEWORK.md deleted file mode 100644 index 2d04246..0000000 --- a/ORACLE_TESTING_FRAMEWORK.md +++ /dev/null @@ -1,340 +0,0 @@ -# ORACLE Real Human Testing Framework - -## Testing Protocol - -### Phase 1: Internal Validation (Week 1-2) - -#### 1.1 True Positive Test: High Implementation Familiarity - -**Participants:** 3 developers who built the actual systems - -**Requirement:** Participants have: -- Direct implementation experience (6+ months on the system) -- Can explain design decisions -- Know about production incidents -- Can discuss tradeoffs and limitations - -**Test Procedure:** -1. Conduct full viva session (3-4 questions, 10-15 min) -2. Questions drawn from engineering review corpus -3. Collect responses -4. Run implementation familiarity analysis -5. 
Record assessment - -**Expected Outcome:** -- Classification: HIGH_IMPLEMENTATION_FAMILIARITY or better -- Confidence: HIGH -- Indicators: 3+ understanding indicators - -**Pass Criteria:** -- ✅ All 3 participants classified as HIGH/PRACTICED -- ✅ Confidence >= MEDIUM -- ✅ <1 false uncertainty issue - ---- - -#### 1.2 True Negative Test: Low Implementation Familiarity - -**Participants:** 3 who only read code/documentation (no hands-on experience) - -**Requirement:** Participants have: -- Studied code but never built/deployed -- Can recite concepts -- Know theory but not practice -- Cannot discuss production incidents - -**Test Procedure:** -1. Conduct full viva session (same questions as 1.1) -2. Collect responses -3. Run implementation familiarity analysis -4. Record assessment - -**Expected Outcome:** -- Classification: LOW_IMPLEMENTATION_FAMILIARITY or INSUFFICIENT -- Confidence: HIGH -- Indicators: 2+ memorization indicators, 0 understanding indicators - -**Pass Criteria:** -- ✅ All 3 participants classified as LOW/INSUFFICIENT -- ✅ Confidence >= MEDIUM -- ✅ No false positives marking them as HIGH - ---- - -#### 1.3 Communication Style Bias Test - -**Participants:** -- 1 builder who communicates poorly (nervous, hedging, unsure tone) -- 1 non-builder who communicates confidently (confident tone, buzzwords) - -**Test Procedure:** -1. Run same viva session as 1.1/1.2 -2. Compare communication style markers vs familiarity assessment -3. 
Verify FairnessAuditor detects patterns - -**Expected Outcome:** -- Builder classified HIGH despite nervous communication -- Non-builder classified LOW despite confident communication -- FairnessAuditReport flags communication style bias - -**Pass Criteria:** -- ✅ Nervous builder not penalized for communication -- ✅ Confident guesser not rewarded for delivery -- ✅ Assessment based on content, not style - ---- - -#### 1.4 Edge Cases - -**Participants:** -- 1 weak non-native English speaker who is a builder -- 1 unconventional (non-OOP, non-standard) but valid engineer -- 1 nervous but knowledgeable candidate - -**Test Procedure:** -1. Run viva sessions -2. Collect assessments -3. Flag any problematic conclusions - -**Expected Outcome:** -- Fairness audit catches potential bias -- Manual review recommended for edge cases -- Confidence marked as MEDIUM (not HIGH) - -**Pass Criteria:** -- ✅ No wrong classifications -- ✅ Uncertainty surfaced honestly -- ✅ Manual review recommended - ---- - -### Phase 2: Pilot Human Study (Week 3-4) - -#### 2.1 Participant Recruitment - -**Total:** 10-15 real people - -**Mix:** -- Backend developers who built systems (2-3) -- System contributors (2-3) -- Engineering leads (1-2) -- Students/learners (3-4) -- Cross-team members (2-3) - -**Inclusion Criteria:** -- Willing to participate in 20-30 min viva session -- OK with recording/analyzing responses -- Willing to provide feedback on assessment accuracy - ---- - -#### 2.2 Session Procedure - -For each participant: - -1. **Pre-Session Survey** (5 min) - - Background: role, experience, how long on this system? - - Communication style: comfortable in technical interviews? Nervous? Confident? - - Demographics: first language? Neurodivergent? - -2. **Viva Session** (15-20 min) - - 3-4 opening questions - - Optional follow-ups based on response quality - - Record all responses - -3. 
**Assessment** (automated) - - VivaSessionConductor scores responses - - ReasoningDepthAnalyzer classifies familiarity - - FairnessAuditor checks for bias - - TrustAudit verifies evidence grounding - -4. **Post-Session Survey** (5 min) - - How accurate was the assessment? (1-5 scale) - - Which questions were: too easy / too hard / just right? - - Did you feel evaluated fairly? Any biases? - - Would you recommend this for hiring/evaluation? - -5. **Interviewer Notes** (written) - - Technical depth impression - - Communication observations - - Any contradictions/confusion? - - Confidence in assessment - ---- - -#### 2.3 Data Collection - -**Metrics Collected:** - -``` -For each participant: -├─ demographics (role, exp_years, first_language, etc.) -├─ responses (text, quality_score, correctness_score, etc.) -├─ assessment (classification, confidence, indicators) -├─ fairness_audit (issues found, recommendations) -├─ accuracy (participant self-report: 1-5 scale) -├─ feedback (too easy? fair? recommendations?) -└─ interviewer_notes (text observations) -``` - -**Output Files:** -- `viva_session_[participant_id].json` (session recording) -- `assessment_[participant_id].json` (classification + evidence) -- `fairness_audit_[participant_id].json` (bias check results) -- `participant_feedback_[participant_id].json` (survey responses) - ---- - -#### 2.4 Disagreement Analysis - -**Find cases where:** -1. ORACLE says HIGH but interviewer says LOW (possible false positive) -2. ORACLE says LOW but interviewer says HIGH (possible false negative) -3. ORACLE HIGH but participant self-reports LOW (overconfidence?) -4. ORACLE LOW but participant self-reports HIGH (underconfidence?) - -**For each disagreement, analyze:** -- What signals did ORACLE use? -- Did fairness audit catch issues? -- Was evidence insufficient? -- Did communication style affect assessment? -- What should have happened? 
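
The four disagreement cases in section 2.4 can be expressed as a small classifier. This is a minimal sketch and not part of the ORACLE codebase: the function name `classify_disagreement`, the `Verdict` type, and the label strings are hypothetical, and it assumes each verdict has already been reduced to HIGH or LOW.

```python
from typing import List, Literal

# Assumption: verdicts are collapsed to two levels for this comparison.
Verdict = Literal["HIGH", "LOW"]


def classify_disagreement(
    oracle: Verdict, interviewer: Verdict, self_report: Verdict
) -> List[str]:
    """Return the section 2.4 disagreement labels that apply to one participant."""
    labels: List[str] = []
    # Cases 1 and 2: ORACLE versus the interviewer's impression.
    if oracle == "HIGH" and interviewer == "LOW":
        labels.append("possible_false_positive")
    if oracle == "LOW" and interviewer == "HIGH":
        labels.append("possible_false_negative")
    # Cases 3 and 4: ORACLE versus the participant's self-report.
    if oracle == "HIGH" and self_report == "LOW":
        labels.append("possible_oracle_overconfidence")
    if oracle == "LOW" and self_report == "HIGH":
        labels.append("possible_oracle_underconfidence")
    return labels
```

An empty list means no disagreement was found; any non-empty result would then go through the per-disagreement questions listed above.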
- ---- - -### Phase 3: Error Analysis & Hardening (Week 5) - -#### 3.1 False Negative Analysis - -**Question:** When did ORACLE mark someone as non-familiar when they actually were? - -**Analysis:** -- Which communication patterns triggered false negatives? -- Were fairness audit issues correctly flagged? -- Should confidence be reduced? Recommendations added? -- What follow-ups would have helped? - -**Output:** False negative patterns document - ---- - -#### 3.2 False Positive Analysis - -**Question:** When did ORACLE mark someone as familiar when they actually weren't? - -**Analysis:** -- Which confidence indicators were misleading? -- How many memorization indicators were missed? -- Did confident delivery trick the system? -- Should follow-ups probe deeper? - -**Output:** False positive patterns document - ---- - -#### 3.3 Bias Pattern Analysis - -**Question:** Did certain demographics get systematically misclassified? - -**Analysis by demographic:** -- Non-native speakers: under/over represented in misclassifications? -- Early career: systematic bias? -- Non-traditional background: systematic bias? -- Communication style: correlation with accuracy? - -**Output:** Bias analysis report - ---- - -### Phase 4: System Improvements (Week 6-7) - -Based on findings from Phase 2-3: - -1. **Adjust indicator weights** if communication style bias detected -2. **Add new follow-up patterns** if certain misclassifications repeat -3. **Improve fairness audit** if certain biases not caught -4. **Reduce confidence scores** if overconfidence detected -5. 
**Retrain on test cases** if patterns are systematic - ---- - -## Success Metrics - -| Metric | Target | How to Measure | -|--------|--------|----------------| -| True Positive Rate | 90%+ | % of HIGH-familiarity test cases correctly identified | -| True Negative Rate | 90%+ | % of LOW-familiarity test cases correctly identified | -| False Positive Rate | <10% | % of non-familiar marked as familiar | -| False Negative Rate | <10% | % of familiar marked as non-familiar | -| Communication Bias | <5% | Correlation between communication style and assessment | -| Demographic Bias | <5% | Systematic bias by demographic | -| Fairness Audit Effectiveness | 80%+ | % of problems caught by fairness audit | -| Participant Accuracy Self-Report | 4+/5 avg | Mean participant satisfaction | -| Evidence Grounding | 100% | All conclusions have evidence trace | - ---- - -## Testing Checklist - -### Before Phase 1: -- [ ] Define test case library (questions to ask) -- [ ] Recruit 6 internal testers (3 builders, 3 non-builders) -- [ ] Create feedback template -- [ ] Set baseline metrics - -### Before Phase 2: -- [ ] Recruit 10-15 external participants -- [ ] Create pre/post surveys -- [ ] Set up recording infrastructure -- [ ] Train facilitators - -### After Phase 2-3: -- [ ] Analyze all disagreement cases -- [ ] Categorize false positives/negatives -- [ ] Identify bias patterns -- [ ] Create improvement plan - -### After Phase 4: -- [ ] Implement improvements -- [ ] Re-test on sample of failures -- [ ] Document learnings -- [ ] Generate final report - ---- - -## Sample Test Case - -**Participant:** Backend developer, 3 years on project X - -**Question 1:** "The API endpoint for user list loads 100+ related resources per user. What's the performance concern and how would you fix it?" - -**Good Response (HIGH familiarity):** -> "N+1 query problem. When we first built this, we didn't batch load relationships, so each user load triggered a separate query. 
We discovered this in production when response time hit 2 seconds for 10 users. We fixed it using SQLAlchemy's joinedload with batch pagination - we load at most 10 related records per batch. Tradeoff is complexity in query construction, but we get sub-100ms responses now." - -**Poor Response (LOW familiarity):** -> "Um, probably an N+1 query issue? That's like a common database pattern problem. You'd use eager loading to fix it, I think. That's a best practice in database design." - -**Expected Difference in Assessment:** -- Good response: HIGH_FAMILIARITY, HIGH confidence, 3+ understanding indicators -- Poor response: LOW_FAMILIARITY, HIGH confidence, 2+ memorization indicators - ---- - -## Documentation Output - -- `TESTING_RESULTS_PHASE1.md` - Internal validation results -- `TESTING_RESULTS_PHASE2.md` - Pilot study results + feedback -- `DISAGREEMENT_ANALYSIS.md` - False positive/negative patterns -- `BIAS_ANALYSIS.md` - Demographic bias findings -- `IMPROVEMENTS_APPLIED.md` - Changes made based on testing - ---- - -## Next Actions - -1. [ ] Finalize test case library -2. [ ] Recruit internal testers -3. [ ] Create feedback templates -4. [ ] Schedule Phase 1 (Week 1-2) -5. [ ] Recruit external participants -6. [ ] Schedule Phase 2 (Week 3-4) diff --git a/ORACLE_UI_ALIGNMENT_REPORT.md b/ORACLE_UI_ALIGNMENT_REPORT.md deleted file mode 100644 index f74d931..0000000 --- a/ORACLE_UI_ALIGNMENT_REPORT.md +++ /dev/null @@ -1,271 +0,0 @@ -# ORACLE Agent UI Alignment Report - -**Date:** May 18, 2026 -**Status:** ✅ **ALIGNED** (with minor considerations) - ---- - -## Executive Summary - -The ORACLE Agent UI (`backend/src/agents/oracle/ui/index.html`) is **well-aligned** with the ORACLE agent implementation (`backend/src/agents/oracle/agent.py`). The UI expects WebSocket messages with a specific payload structure, and the backend is correctly producing those structures through the `MainAgent` → `OracleAgent` → `StructuredContext` pipeline. 
- -**Key Finding:** All critical data fields match. Minor alignment notes exist for edge cases and optional fields. - ---- - -## Alignment Analysis - -### 1. **WebSocket Connection** ✅ ALIGNED - -| Component | UI Expects | Agent Provides | -|-----------|-----------|-----------------| -| **Endpoint** | `ws://localhost:8001/ws/analyze` | ✅ Implemented in `backend/src/main.py:42` | -| **Input Format** | `{ repo_url: string }` | ✅ Accepted in `websocket_analyze()` | -| **Output Format** | `{ type: "log", message, log_type }` | ✅ Sent via `log_cb()` callback | -| **Result Format** | `{ type: "result", data: {...} }` | ✅ Sent as final JSON payload | - -**Status:** ✅ Perfect alignment. UI can connect and receive data as expected. - ---- - -### 2. **Analysis Dashboard Cards** ✅ ALIGNED - -#### Backend Framework Card -``` -UI expects: card_backend.innerText = payload.backend_framework?.value -Agent produces: StructuredContext.backend_framework = EvidenceModel(value=) -``` -✅ **ALIGNED** — EvidenceModel has `.value` field - -#### Architecture Pattern Card -``` -UI expects: card_architecture.innerText = payload.architecture_pattern?.value -Agent produces: StructuredContext.architecture_pattern = EvidenceModel(value=) -``` -✅ **ALIGNED** — EvidenceModel has `.value` field - -#### Authentication System Card -``` -UI expects: card_auth.innerText = payload.authentication_system?.value -Agent produces: StructuredContext.authentication_system = EvidenceModel(value=) -``` -✅ **ALIGNED** — EvidenceModel has `.value` field - -#### Graph Integrity Card -``` -UI expects: card_graph_meta.innerHTML = `${nodeCount} Nodes Found` -Agent produces: StructuredContext.execution_graph = ExecutionGraph(nodes=[...]) -``` -✅ **ALIGNED** — ExecutionGraph has `.nodes` array - -**Status:** ✅ All dashboard cards receive correct data structure. - ---- - -### 3. 
**Execution Graph Visualization** ✅ ALIGNED - -| UI Requirement | Agent Provides | Status | -|---|---|---| -| `execution_graph.nodes[]` with `id`, `label`, `type`, `metadata` | ✅ ExecutionGraph model | ✅ | -| `execution_graph.edges[]` with `source`, `target`, `relationship` | ✅ ExecutionGraph model | ✅ | -| Node types: `ROUTE`, `MIDDLEWARE`, `DB_QUERY`, `STATE_STORE`, `AUTH_HANDLER` | ✅ Detected via TechDetector | ✅ | -| Node metadata: `file_path`, `line_number`, `snippet` | ✅ From AST extraction | ✅ | - -**Status:** ✅ Graph rendering fully aligned. Mermaid diagram generation will work correctly. - ---- - -### 4. **Viva Intelligence Output** ✅ ALIGNED - -**UI expects viva card with these fields:** -```javascript -{ - category: "Architecture|Tradeoff|Security|Scalability|Failure-Path|Runtime", - topic: string, - difficulty: "hard|medium|foundational", // Note: UI expects "easy" but agent uses "foundational" - depth_score: number (0-10), - confidence: number (0-1), - question_target: string, - focus: string, - reasoning_summary: string, - related_node: string (graph node ID) -} -``` - -**Agent produces VivaTarget with:** -```python -class VivaTarget(BaseModel): - topic: str - question_target: str - difficulty: str # ← Note: agent may use "foundational" instead of "easy" - importance_score: float - focus: str - category: str = "Architecture" - depth_score: float = 5.0 - related_node: str = "" - confidence: float = 0.8 - reasoning_summary: str = "" -``` - -**Status:** ✅ **ALIGNED** with minor note: UI's `diffTag()` function handles "hard", "medium", and defaults to "easy", but agent may use "foundational" terminology. This is handled gracefully by the fallback. - ---- - -### 5. 
**Anomalies & Failure Paths Panel** ✅ ALIGNED - -**UI expects:** -```javascript -payload.runtime_risks = [ - { severity: string, value: string, confidence: number, evidence: string[] } -] -payload.failure_paths = [ - { value: string, confidence: number, evidence: string[] } -] -``` - -**Agent produces:** -```python -StructuredContext: - runtime_risks: List[RuntimeRisk] - failure_paths: List[EvidenceModel] -``` - -**Status:** ✅ **ALIGNED** — Both fields are properly populated by: -- `ExecutionGraphFailureAnalyzer.analyze_failure_scenarios()` → failure_paths -- `ObservableSignalsEngine.extract_signals()` → runtime_risks (via signal risk levels) - ---- - -### 6. **Benchmark Results Table** ✅ ALIGNED - -**UI expects evaluation_metrics:** -```javascript -{ - metrics: { stack_accuracy, auth_detection_accuracy }, - mismatches: [], - expected: { expected_stack, expected_protected_routes, expected_architecture } -} -``` - -**Agent produces via OracleEvaluator:** -```python -evaluation_metrics = { - "metrics": { stack_accuracy, auth_detection_accuracy }, - "mismatches": [], - "expected": { loaded from project_el.json } -} -``` - -**Status:** ✅ **ALIGNED** — Evaluation metrics correctly benchmarked against ground truth in `evaluation/expected_outputs/project_el.json` - ---- - -### 7. **Terminal/Logging Output** ✅ ALIGNED - -**UI expects log messages:** -```javascript -{ type: "log", message: string, log_type: "info|warn|success|error" } -``` - -**Agent produces via log_callback:** -```python -await send_log("[Oracle] Message here", "info|warn|success|error") -``` - -**Status:** ✅ **ALIGNED** — Terminal correctly displays all agent progress messages. 
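
The two WebSocket message shapes described in this report (`type: "log"` and `type: "result"`) can be illustrated with a pair of small builders. This is a hedged sketch using only the standard library: `make_log_message` and `make_result_message` are hypothetical helper names, not functions from `backend/src/main.py`, and the rejection of unknown `log_type` values is an assumption about how one might guard the contract.

```python
import json
from typing import Any, Dict

# The four log levels the UI terminal is described as accepting.
ALLOWED_LOG_TYPES = {"info", "warn", "success", "error"}


def make_log_message(message: str, log_type: str = "info") -> str:
    """Serialize a progress log in the shape the UI terminal expects."""
    if log_type not in ALLOWED_LOG_TYPES:
        # Assumption: fail loudly rather than send a level the UI cannot style.
        raise ValueError(f"unsupported log_type: {log_type!r}")
    return json.dumps({"type": "log", "message": message, "log_type": log_type})


def make_result_message(data: Dict[str, Any]) -> str:
    """Wrap the final analysis payload in the result envelope."""
    return json.dumps({"type": "result", "data": data})
```

Keeping both envelopes in one place would make it easier to check that every message sent over the socket carries the `type` field the UI switches on.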
- ---- - -## Potential Alignment Issues (Minor) - -### Issue 1: Difficulty Terminology 🟡 **ACCEPTABLE** -- **UI:** Uses `diffTag()` function that expects "hard", "medium", "easy" -- **Agent:** May produce "hard", "medium", "foundational" -- **Impact:** Low — UI fallback handles gracefully with `.tag.pass` class -- **Recommendation:** Consider standardizing to "easy|medium|hard" across the codebase in Phase 4 - -### Issue 2: Missing `observable_signals` in UI 🟡 **ACCEPTABLE** -- **Agent:** Produces `context.observable_signals = observable_signals` -- **UI:** Does not display them (focuses on failure_paths instead) -- **Impact:** Low — Data is available but unused -- **Recommendation:** Could enhance UI to show signal indicators in future versions - -### Issue 3: `implementation_viva_targets` vs `viva_intelligence_targets` 🟡 **ACCEPTABLE** -- **Agent:** Sets both fields (legacy support) -- **UI:** Reads `payload.viva_intelligence_targets` first, falls back to `implementation_viva_targets` -- **Impact:** None — UI handles both -- **Recommendation:** Consolidate to single field in Phase 4 - ---- - -## Data Flow Validation - -### Complete Request → Response Cycle - -``` -UI Browser - ↓ -[Input] repo_url: "https://github.com/Project-XI/Project-EL" - ↓ -WebSocket: ws://localhost:8001/ws/analyze - ↓ -FastAPI Handler (main.py:42) - ↓ -MainAgent.process() - ├─ GatekeeperAgent.process() - │ └─ Returns identity context - ├─ OracleAgent.process() - │ ├─ Clones repo - │ ├─ Extracts AST - │ ├─ Builds execution graph - │ ├─ Generates viva questions - │ ├─ Detects failure scenarios - │ └─ Returns StructuredContext - └─ SentinelAgent.process() - └─ Returns final context - ↓ -StructuredContext.model_dump() → JSON - ↓ -{ type: "result", data: { backend_framework, architecture_pattern, execution_graph, viva_intelligence_targets, ... 
} } - ↓ -UI Browser receives & renders - ├─ Dashboard cards - ├─ Execution graph - ├─ Viva intelligence list - ├─ Benchmark table - └─ Anomalies panel -``` - -✅ **All data flows correctly through the pipeline** - ---- - -## Conclusion - -**The ORACLE Agent UI is ✅ FULLY ALIGNED with the ORACLE Agent implementation.** - -### What's Working -1. ✅ WebSocket connection and real-time message streaming -2. ✅ Dashboard cards display detected frameworks, architecture, auth systems -3. ✅ Execution graph properly visualized with nodes and edges -4. ✅ Viva intelligence questions displayed with all metadata -5. ✅ Anomalies and failure paths shown in dedicated panel -6. ✅ Benchmark results compared against ground truth -7. ✅ Terminal logs show all agent progress - -### Minor Considerations -1. 🟡 Difficulty terminology ("foundational" vs "easy") — gracefully handled -2. 🟡 Observable signals not displayed — available but unused -3. 🟡 Dual viva target fields for legacy support — no functional impact - -### Recommendations for Next Phase -1. **Phase 4:** Standardize terminology across all enums (builder_confidence → implementation_familiarity_score) -2. **Phase 4:** Consolidate viva target field names -3. **Future Enhancement:** Add observable signals visualization to UI -4. **Testing:** Run end-to-end validation with live repo to confirm all fields populate correctly - ---- - -**Last Updated:** May 18, 2026 -**Verified By:** Architecture Review -**Status:** Ready for Real Human Testing Phase 1 ✅ diff --git a/VIVA_INTELLIGENCE_EXPLORATION.md b/VIVA_INTELLIGENCE_EXPLORATION.md deleted file mode 100644 index 679021d..0000000 --- a/VIVA_INTELLIGENCE_EXPLORATION.md +++ /dev/null @@ -1,577 +0,0 @@ -# Viva Intelligence Engine Exploration - -## Executive Summary - -The Project-EL system has **two parallel viva generation approaches**: -1. **Template-Based** (Primary): `VivaIntelligenceEngine` - Pattern matching on detected technologies -2. 
**Evidence-Grounded** (Secondary): `EvidenceGroundedVivaGenerator` - Based on actual failure scenarios and observable signals - -Both generate `VivaTarget` objects with 7+ attributes. Questions are **ranked by difficulty, topic, and depth keywords**. - ---- - -## 1. Core Classes & Methods - -### VivaIntelligenceEngine (viva_intelligence_engine.py) - -**Main Entry Point:** -```python -@staticmethod -def generate_targets(detections: Dict[str, Any], arch: EvidenceModel) -> List[VivaTarget] -``` - -**What it does:** -- Inspects detected technologies (FastAPI, Express, React, SQL, MongoDB, JWT, Redis, etc.) -- Generates 15-25 viva questions across 6 categories: - - **Architecture**: REST constraints, Microservices boundaries, FastAPI dependency injection - - **Tradeoffs**: Polyglot persistence, SPA vs SSR, relational vs NoSQL selection - - **Security**: JWT lifecycle/revocation, auth failure paths, token handling - - **Scalability**: Database connection pooling, cache eviction, horizontal scaling constraints - - **Failure-Path**: Cascading failures, auth service unavailability, single points of failure - - **Runtime**: Async/sync boundaries, event loop blocking, thread pool implications - -**Key Methods:** -- `detect_inconsistencies(doc_text, detections)` → Flags mismatches between docs and code (e.g., "Redis mentioned but not detected") -- `detect_complexity_mismatch(arch, detections)` → Identifies claims like "Microservices" with only 1 backend detected - -**Confidence & Depth Scoring:** -```python -VivaTarget( - topic="Security", - question_target="JWT Lifecycle & Revocation", - difficulty="hard", - importance_score=0.95, # Base importance - focus="JWT tokens are stateless...", # Question text - category="Security", - depth_score=9.5, # Engineering depth (0-10) - related_node="auth_middleware", # Execution graph node - confidence=0.97, # Engine confidence in relevance - reasoning_summary="JWT signals detected in middleware chain." 
-) -``` - ---- - -### EvidenceGroundedVivaGenerator (evidence_grounded_viva_generator.py) - -**Main Entry Point:** -```python -@staticmethod -def generate_questions( - failure_scenarios: List[Any], - observable_signals: Dict[str, List[Any]], - detections: Dict[str, Any], - repo_path: str -) -> List[VivaTarget] -``` - -**What it does:** -- Generates viva questions **grounded in actual code evidence** -- Uses 4 signal categories: - 1. **Failure Scenario Questions** - "Walk me through what happens when {scenario}" - 2. **Observable Signal Questions** - "Your code shows X, how do you handle Y?" - 3. **Technology Questions** - Framework/language-specific patterns - 4. **Architecture Questions** - Observable patterns from signal analysis - -**Evidence Tracing:** -- Each question references specific files that prompted the question -- Questions like: "I don't see circuit breaker patterns - how do you prevent cascading failures?" -- Example: If error handling signals show "high" risk, generates hard-difficulty questions - -**CodeGroundedVivaQuestion Model:** -```python -@dataclass -class CodeGroundedVivaQuestion: - topic: str - question: str - implementation_context: str # What code prompted this - evidence_files: List[str] # Traceable source files - expected_knowledge: str # What engineer should understand - difficulty: str -``` - ---- - -### VivaQuestionRanker (viva_question_ranker.py) - -**Scoring Logic:** -```python -@staticmethod -def rank_targets(targets: List[VivaTarget]) -> List[VivaTarget] -``` - -**Ranking Formula:** -``` -base_score = importance_score - -if difficulty == "hard": +0.3 -if topic == "security": +0.2 -if topic == "architecture": +0.1 - -if any(depth_keyword in focus): +0.2 - where depth_keywords = ["middleware", "lifecycle", "flow", "failure", "risk", "tradeoff", "why"] - -final_score = min(1.0, score) # Cap at 1.0 -``` - -**Result:** Sorted descending by importance_score, so highest-value questions appear first. - ---- - -## 2. 
Data Flow & Context Sources - -### What Feeds Into Viva Generation - -``` -┌─────────────────────────────────────────┐ -│ OracleAgent.process() │ -└────────┬────────────────────────────────┘ - │ - ┌────▼──────────────────────────────────┐ - │ Phase 1: Observable Signals Extraction│ - │ - Error handling patterns │ - │ - Resilience checks │ - │ - Architecture detection │ - └────┬─────────────────────────────────┘ - │ - ┌────▼──────────────────────────────────┐ - │ Phase 2: Failure Scenario Detection │ - │ (FailurePathIntelligenceEngine) │ - │ - Database failures │ - │ - Auth service unavailability │ - │ - Cascading failures │ - └────┬─────────────────────────────────┘ - │ - ┌────▼──────────────────────────────────┐ - │ Phase 3: CHOICE OF VIVA GENERATOR │ - │ │ - │ IF failure_scenarios detected: │ - │ → EvidenceGroundedVivaGenerator │ - │ │ - │ ELSE (fallback): │ - │ → VivaIntelligenceEngine │ - └────┬─────────────────────────────────┘ - │ - ┌────▼──────────────────────────────────┐ - │ Phase 4: Rank Targets │ - │ (VivaQuestionRanker) │ - │ → Output: Sorted VivaTarget[] │ - └────────────────────────────────────────┘ -``` - -### Input Data Structure (repo_detections) - -```python -repo_detections = { - "frontend_framework": EvidenceModel(value="React", confidence=0.95), - "backend_framework": EvidenceModel(value="FastAPI", confidence=0.98), - "database_used": EvidenceModel(value="SQL", confidence=0.92), - "authentication_system": EvidenceModel(value="JWT", confidence=0.97), - "cache_framework": EvidenceModel(value="Redis", confidence=0.85), - # ... 
10+ more detection types -} - -arch_inference = EvidenceModel( - value="REST API + SPA", - confidence=0.9, - evidence=["FastAPI routes detected", "React component structure"] -) -``` - -### Observable Signals (Passed to Evidence-Grounded Generator) - -```python -observable_signals = { - "error_handling": [ - EngineeringSignal( - signal_name="Centralized Error Handler Detected", - category="error_handling", - confidence=0.85, - evidence_files=["routes/error_handler.py", "middleware/exception.py"], - description="Centralized error handling middleware found", - risk_level="N/A" - ) - ], - "resilience_patterns": [ - # Circuit breakers, retry logic, timeouts - ], - "auth_consistency": [ - # Auth check distribution across endpoints - ], - # ... 6 categories total -} -``` - -### Failure Scenarios - -```python -failure_scenarios = [ - { - "scenario_name": "Database Connection Exhaustion", - "trigger": "concurrent_requests_triple", - "propagation_risk": "critical", - "recovery_possible": True, - # → Generates question: "How does your system handle this? What's your recovery?" - }, - # ... 15+ failure modes in corpus -] -``` - ---- - -## 3. How Viva Questions Are Currently Generated - -### Template-Based Flow (VivaIntelligenceEngine) - -**Step 1: Technology Detection** -```python -has_fastapi = any("FastAPI" in str(m.value) for m in detections.values()) -has_jwt = any("JWT" in str(m.value) for m in detections.values()) -has_redis = any("Redis" in str(m.value) for m in detections.values()) -# ... check 8 technologies -``` - -**Step 2: Pattern Matching → Question Generation** -```python -if has_fastapi: - targets.append(VivaTarget( - topic="Architecture", - question_target="FastAPI Dependency Injection Graph", - difficulty="medium", - importance_score=0.85, - focus="Trace the full FastAPI dependency injection chain from request to database access..." 
- )) - -if has_jwt: - targets.append(VivaTarget( - topic="Security", - question_target="JWT Lifecycle & Revocation", - difficulty="hard", - importance_score=0.95, - focus="JWT tokens are stateless. Explain how this implementation handles token revocation..." - )) -``` - -**Step 3: Ranking** -- All targets passed to `VivaQuestionRanker.rank_targets()` -- Scoring boosts hard-difficulty, security, and depth-keyword questions -- Returns sorted list by `importance_score` (descending) - -### Evidence-Grounded Flow (EvidenceGroundedVivaGenerator) - -**Step 1: Analyze Failure Scenarios** -```python -for scenario in failure_scenarios: - if scenario.propagation_risk == "critical": - question_text = f"Walk me through what happens when {scenario.trigger.lower()}..." - questions.append(VivaTarget( - topic=scenario.scenario_name, - difficulty="hard", - importance_score=0.95, # High for critical failures - focus=question_text - )) -``` - -**Step 2: Analyze Observable Signals** -```python -# If auth signals don't show centralized checking: -if not any("Centralized" in str(s.signal_name) for s in auth_signals): - question = VivaTarget( - topic="Authentication", - question_target="Auth Consistency", - difficulty="hard", - importance_score=0.9, - focus="I see auth checks scattered across your codebase. How do you ensure..." - ) -``` - -**Step 3: Technology-Specific Questions** -- Questions tailored to detected tech: FastAPI patterns, Express blocking, React SSR tradeoffs - ---- - -## 4. 
Multi-Turn / Follow-Up Logic - -### VivaSession Simulation Framework (viva_simulation.py) - -**Architecture:** -```python -class VivaSession(BaseModel): - initial_question: str - student_responses: List[SimulatedStudentResponse] = [] - follow_up_questions: List[VivaFollowUpQuestion] = [] - weakness_probing_rate: float # % of follow-ups targeting revealed weaknesses - generic_question_rate: float # % of follow-ups that are generic -``` - -**Student Response Types:** -- `CORRECT`: Knows implementation and reasoning -- `TEXTBOOK`: Generic concepts, not codebase-specific -- `PARTIAL`: Correct but incomplete -- `CONTRADICTORY`: Internally inconsistent -- `MISUNDERSTANDING`: Wrong assumptions -- `WEAK_EXPLANATION`: Right answer, unclear reasoning - -**Follow-Up Question Quality Assessment:** -```python -class FollowUpQuestionQuality(str, Enum): - GENERIC = "generic" # Could apply to any codebase - PROBING = "probing" # Tests specific weakness - IRRELEVANT = "irrelevant" # Doesn't relate to prior answer - CLARIFYING = "clarifying" # Good clarification - DEEPENING = "deepening" # Pushes deeper - CONTRADICTING = "contradicting" # Points out inconsistency -``` - -**Current Status:** -- Follow-up framework is defined but **NOT YET INTEGRATED** into OracleAgent -- System can simulate follow-ups but doesn't auto-generate them in live ORACLE flow -- Evaluation harness exists: `evaluate_follow_up()` method in viva_simulation.py - ---- - -## 5. 
Viva Question Scoring & Evaluation - -### ComparativeVivaEvaluator (comparative_evaluator.py) - -**Single Question Evaluation:** -```python -def compare_viva_question( - self, - oracle_question: str, - human_evaluations: List[HumanVivaQuestionEvaluation] -) -> ComparativeVivaAnalysis -``` - -**Metrics Generated:** -- **quality_rate**: % of humans who rated question as non-generic -- **code_specificity**: Average specificity score (0-1) -- **distinguishes_levels**: % saying question would distinguish senior from junior engineers -- **would_ask_in_interview**: % saying they'd ask this in a real interview - -**Verdict Logic:** -``` -if generic_count / total_humans >= 0.75: - → "Question detected as textbook/generic" - → oracle_question_good = False - -elif quality_rate >= 0.7: - → "Good question" - → oracle_question_good = True - -else: - → "Mixed quality evaluation" -``` - -### Viva Question Quality Enum - -```python -class VivaQuestionQuality(str, Enum): - TEXTBOOK_GENERIC = "textbook_generic" - IMPLEMENTATION_DEEP_DIVE = "implementation_deep_dive" - ARCHITECTURAL_INSIGHT = "architectural_insight" - TOO_SIMPLE = "too_simple" - TOO_VAGUE = "too_vague" - CONTEXT_APPROPRIATE = "context_appropriate" - DISTINGUISHES_ENGINEER_LEVEL = "distinguishes_engineer_level" -``` - -### Dataset Metrics (Full Batch Evaluation) - -```python -def evaluate_viva_questions( - self, - oracle_questions: List[str], - human_question_evaluations: Dict[str, List[HumanVivaQuestionEvaluation]] -) -> Tuple[List[ComparativeVivaAnalysis], Dict[str, Any]] - -# Returns: -metrics = { - "total_questions": N, - "quality_questions": count of good questions, - "generic_questions": count flagged as generic, - "avg_quality_rate": 0.0-1.0, - "avg_code_specificity": 0.0-1.0, - "avg_distinguish_levels": 0.0-1.0, -} -``` - ---- - -## 6. Current Viva Capabilities Summary - -### What ORACLE Can Do ✅ - -1. **Generate 15-25 viva questions per repo** (template-based or evidence-grounded) -2. 
**Categorize questions** across 6 engineering domains -3. **Assign difficulty levels** (easy, medium, hard) -4. **Score by importance** (0-1 scale, ranked) -5. **Rank by engineering depth** (depth_score 0-10, difficulty boost, keyword analysis) -6. **Ground in observable code patterns** (if evidence-grounded) -7. **Detect inconsistencies** (doc claims vs actual code) -8. **Identify complexity mismatches** (e.g., "microservices" with 1 backend) - -### What ORACLE Cannot Do (Yet) ❌ - -1. **Generate true multi-turn follow-ups** - Framework exists but not integrated -2. **Adapt questions in real-time** - Based on student response quality -3. **Score student answers** - Only infrastructure for human evaluation exists -4. **Update question difficulty** - Based on initial response -5. **Surface follow-up follow-ups** - Only one level of follow-ups planned - ---- - -## 7. Key Data Structures - -### VivaTarget (models/context.py) - -```python -class VivaTarget(BaseModel): - topic: str # Architecture, Security, Scalability, etc. - question_target: str # Specific topic (e.g., "JWT Lifecycle") - difficulty: str # easy, medium, hard - importance_score: float # 0.0-1.0, boosted by ranker - focus: str # Full question text - - # Extended Intelligence - category: str # Same as topic, for filtering - depth_score: float # 0-10 engineering depth - related_node: str # Execution graph node (e.g., "auth_middleware") - confidence: float # 0.0-1.0, engine confidence - reasoning_summary: str # Why this question was generated -``` - -### HumanVivaQuestionEvaluation (human_evaluator_models.py) - -```python -class HumanVivaQuestionEvaluation(BaseModel): - question_text: str - human_verdict: List[VivaQuestionQuality] # Multiple verdicts possible - code_specificity_score: float # 0.0-1.0 - distinguishes_senior_engineer: bool # Would this screen out juniors? 
- technical_accuracy: bool - suggested_follow_up: Optional[str] -``` - -### ComparativeVivaAnalysis (comparative_evaluator.py) - -```python -class ComparativeVivaAnalysis(BaseModel): - question_text: str - human_evaluations: List[HumanVivaQuestionEvaluation] - - # Metrics - quality_rate: float # % rated as good - code_specificity: float # Average specificity - distinguishes_levels: float # % say it distinguishes levels - would_ask_in_interview: float # % say they'd ask this - - # Verdict - oracle_question_good: bool - textbook_pattern_detected: bool - consensus: str -``` - ---- - -## 8. File Map - -| File | Purpose | -|------|---------| -| `viva_intelligence_engine.py` | Template-based viva generation (fallback) | -| `evidence_grounded_viva_generator.py` | Evidence-grounded viva generation (preferred when it yields targets) | -| `viva_question_ranker.py` | Ranks/scores viva targets | -| `implementation_flow_engine.py` | Orchestrates analysis, attaches viva targets | -| `observable_signals_engine.py` | Extracts engineering signals for evidence-grounding | -| `viva_simulation.py` | Simulates viva sessions with follow-up logic | -| `comparative_evaluator.py` | Evaluates viva questions against human assessments | -| `agents/oracle/agent.py` | Main orchestrator calling viva generators | -| `models/context.py` | VivaTarget, StructuredContext data models | - ---- - -## 9. 
Integration Points (OracleAgent Flow) - -```python -# File: backend/src/agents/oracle/agent.py, line 140-170 - -# Phase 1: Extract signals & failures -grounded_viva_targets = EvidenceGroundedVivaGenerator.generate_questions( - failure_scenarios, - observable_signals, - repo_detections, - repo_path -) - -# Phase 2: Fallback viva generation -viva_targets = VivaIntelligenceEngine.generate_targets( - repo_detections, - arch_inference -) - -# Phase 3: Use evidence-grounded if available, else fallback -final_viva_targets = grounded_viva_targets if grounded_viva_targets else viva_targets - -# Phase 4: Attach to context -context.viva_intelligence_targets = final_viva_targets - -# Phase 5: Implementation flow analysis (adds more viva targets) -context = ImplementationFlowEngine.analyze_implementation(repo_path, structure, context) -# This adds: context.implementation_viva_targets (basic, hardcoded for now) -``` - ---- - -## 10. Future Enhancement Opportunities - -### Short Term -1. **Wire follow-up generation into OracleAgent** - Use VivaSession simulation in live flow -2. **Auto-score student responses** - Integrate LLM evaluation for answer quality -3. **Adaptive difficulty** - Re-rank questions based on response quality -4. **Rich follow-up feedback** - Surface why question was asked and what weakness it targets - -### Medium Term -5. **Multi-turn conversations** - Chain 3-5 follow-ups before judging competency -6. **Personalized question selection** - Pick questions matching student's tech stack -7. **Failure scenario drills** - "Walk me through your system when X fails" - auto-scored -8. **Peer comparison** - Show how student's answers compare to engineering review corpus - -### Long Term -9. **Active learning** - Questions that maximize information gain about candidate -10. **Calibration** - Learn which questions best distinguish senior vs junior engineers -11. **Competency profiling** - Map responses to specific engineering competencies -12. 
**Career progression tracking** - See how candidate's knowledge evolves over time - ---- - -## 11. Known Limitations - -1. **No true semantic understanding** - Pattern matching, not semantic viva question generation -2. **Hardcoded follow-ups** - Implementation flow engine adds basic, fixed follow-ups -3. **Single-pass ranking** - Doesn't re-rank based on interdependencies -4. **No context from prior answers** - Each question generated independently -5. **Generic fallback** - If no evidence available, falls back to template-based -6. **Limited failure corpus** - Only 15 canned failure scenarios -7. **No student model** - Can't track student's knowledge over time within session - ---- - -## 12. Key Insights from Codebase - -### Design Philosophy -- **Evidence-first**: Questions should be grounded in observable code patterns -- **No speculative scoring**: Avoid arbitrary confidence inflation -- **Failure-driven**: Generate questions that probe likely failure modes -- **Comparative assessment**: Evaluate against human engineering reviews, not rubrics - -### Architecture Strengths -- Clean separation between template-based and evidence-grounded approaches -- Modular signal extraction (error handling, resilience, auth, observability) -- Ranked output (easy to filter by difficulty/importance) -- Rich metadata (depth_score, related_node, confidence, reasoning) - -### Open Questions for Phase 3+ -- Should follow-ups be generated online (during interview) or precomputed? -- How to prevent questions from being "asked before" in industry interviews? -- Should we learn from student responses to update question quality metrics? -- How to calibrate importance_score against real interview outcomes? 
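
The verdict logic from section 5 (ComparativeVivaEvaluator) can be sketched as runnable Python. This is a minimal illustration, not the code in `comparative_evaluator.py`: the `Verdict` dataclass and the `verdict_from_votes` helper are hypothetical names, the input is simplified to one boolean per human evaluator, and treating a "mixed" verdict as not good is an assumption, since the document leaves that case unspecified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    # Field names mirror ComparativeVivaAnalysis; the class itself is hypothetical.
    oracle_question_good: bool
    textbook_pattern_detected: bool
    consensus: str


def verdict_from_votes(generic_votes: List[bool]) -> Verdict:
    """generic_votes[i] is True when evaluator i rated the question as generic."""
    if not generic_votes:
        raise ValueError("at least one human evaluation is required")

    total = len(generic_votes)
    generic_count = sum(generic_votes)
    # quality_rate: share of evaluators who rated the question as non-generic.
    quality_rate = (total - generic_count) / total

    if generic_count / total >= 0.75:
        return Verdict(False, True, "Question detected as textbook/generic")
    if quality_rate >= 0.7:
        return Verdict(True, False, "Good question")
    # Assumption: a mixed evaluation does not count as a good question.
    return Verdict(False, False, "Mixed quality evaluation")
```

With four evaluators, three generic votes reach the 0.75 cutoff and flag the question as textbook; one generic vote out of five leaves a quality rate of 0.8, which passes the 0.7 bar.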
diff --git a/backend/data/students.json b/backend/data/students.json new file mode 100644 index 0000000..3d33e8d --- /dev/null +++ b/backend/data/students.json @@ -0,0 +1,62 @@ +[ + { + "schema_version": "1.0.0", + "roll_number": "CS2021001", + "full_name": "Aman Koli", + "email": "aman.koli@college.edu", + "department": "CS", + "year": "3", + "batch": "A", + "program": "B.Tech", + "photo_reference": "photos/CS2021001.jpg", + "is_active": true + }, + { + "schema_version": "1.0.0", + "roll_number": "CS2021002", + "full_name": "Raj Koli", + "email": "raj.koli@college.edu", + "department": "CS", + "year": "3", + "batch": "A", + "program": "B.Tech", + "photo_reference": "photos/CS2021002.jpg", + "is_active": true + }, + { + "schema_version": "1.0.0", + "roll_number": "IT2022010", + "full_name": "Priya Sharma", + "email": "priya.sharma@college.edu", + "department": "IT", + "year": "2", + "batch": "B", + "program": "B.Tech", + "photo_reference": "photos/IT2022010.jpg", + "is_active": true + }, + { + "schema_version": "1.0.0", + "roll_number": "DS2020005", + "full_name": "Neha Patel", + "email": "neha.patel@college.edu", + "department": "DS", + "year": "4", + "batch": "C", + "program": "B.Tech", + "photo_reference": "photos/DS2020005.jpg", + "is_active": true + }, + { + "schema_version": "1.0.0", + "roll_number": "AI2023001", + "full_name": "Arjun Mehta", + "email": "arjun.mehta@college.edu", + "department": "AI", + "year": "1", + "batch": "A", + "program": "B.Tech", + "photo_reference": null, + "is_active": true + } +] diff --git a/backend/evaluation/CALIBRATION_QUICKSTART.md b/backend/evaluation/CALIBRATION_QUICKSTART.md deleted file mode 100644 index 011371e..0000000 --- a/backend/evaluation/CALIBRATION_QUICKSTART.md +++ /dev/null @@ -1,292 +0,0 @@ -# ORACLE Calibration & Validation: Quick Reference - -## What's Available - -### 📚 Documentation -- **[README.md](calibration/README.md)** - Detailed framework documentation -- 
**[SYSTEM_OVERVIEW.md](calibration/SYSTEM_OVERVIEW.md)** - Architecture and usage -- **[INTEGRATION_GUIDE.md](calibration/INTEGRATION_GUIDE.md)** - Integration instructions - -### 🧪 Test Framework -- **repository_fixtures.py** - 4 diverse test repositories -- **signal_validator.py** - Validate observable signals -- **failure_propagation_validator.py** - Validate failure scenarios -- **viva_quality_validator.py** - Validate interview questions -- **confidence_calibrator.py** - Calibrate confidence scores -- **observability.py** - Runtime tracing infrastructure - -### 🚀 Executable Scripts - -#### 1. Run Full Calibration -```bash -cd backend -python -m evaluation.calibration.calibration_runner -``` -**Output:** Comprehensive JSON report + dashboard metrics - -#### 2. Validate Against Fixtures -```bash -cd backend -python evaluation/validate_oracle_analysis.py -python evaluation/validate_oracle_analysis.py --fixtures clean,broken -``` -**Output:** Validation results for each fixture - -#### 3. 
Check Quality Thresholds -```bash -cd backend -python evaluation/check_calibration_thresholds.py -python evaluation/check_calibration_thresholds.py --strict # For main branch -``` -**Output:** Pass/fail for each metric + recommendations - -### 📊 Visualizations -- **[calibration_dashboard.html](../testing_oracle_ui/calibration_dashboard.html)** - - Open in browser: `file:///.../calibration_dashboard.html` - - Shows confidence calibration curves - - Repository-specific performance - - Issues and recommendations - -### ⚙️ CI/CD Integration -- **[.github/workflows/calibration.yml](../.github/workflows/calibration.yml)** - - Runs on every PR and push to main/develop - - Checks thresholds automatically - - Publishes results as PR comments - - Uploads artifacts for analysis - -## Workflow - -### For PR Review - -```bash -# Before submitting PR: -cd backend -python -m evaluation.calibration.calibration_runner -python evaluation/check_calibration_thresholds.py - -# Check that metrics pass: -# - Signal Precision: 0.80+ -# - Signal Recall: 0.80+ -# - Viva Validity: 0.85+ -# - etc. -``` - -### For Main Branch - -```bash -# CI/CD runs automatically on push -# Uses STRICT thresholds: -# - Signal Precision: 0.85+ -# - Viva Validity: 0.90+ -# - etc. 
- -# If checks fail, local debugging: -cd backend -python evaluation/check_calibration_thresholds.py --strict -``` - -### For Dashboard Updates - -```bash -# Run calibration -cd backend -python -m evaluation.calibration.calibration_runner - -# Results automatically feed dashboard -# Open: backend/testing_oracle_ui/calibration_dashboard.html -``` - -## Key Metrics - -### Signal Detection (Observable Signals) -| Metric | Standard | Strict | What It Means | -|--------|----------|--------|---------------| -| Precision | 0.80 | 0.85 | % of detected signals are correct | -| Recall | 0.80 | 0.82 | % of expected signals found | -| F1 Score | 0.80 | 0.83 | Harmonic mean of P/R | -| Confidence RMSE | 0.12 | 0.10 | Calibration accuracy | - -### Failure Propagation -| Metric | Standard | Strict | What It Means | -|--------|----------|--------|---------------| -| Precision | 0.75 | 0.80 | % of scenarios match expected | -| Propagation Accuracy | 0.80 | 0.85 | % of paths correctly identified | - -### Viva Questions -| Metric | Standard | Strict | What It Means | -|--------|----------|--------|---------------| -| Validity Rate | 0.85 | 0.90 | % of questions pass quality checks | -| Grounding Rate | 0.90 | 0.92 | % have code evidence | - -## Troubleshooting - -### Issue: "No calibration results found" -```bash -# Make sure you're running from backend directory -cd backend -python -m evaluation.calibration.calibration_runner -``` - -### Issue: Metrics below threshold -```bash -# Check detailed report -cd backend -python evaluation/check_calibration_thresholds.py - -# Read recommendations from report -# Typical issues: -# - Signal patterns need refinement -# - Confidence scores miscalibrated -# - Viva generation too generic -``` - -### Issue: CI/CD workflow failing -```bash -# Run locally to debug -cd backend -python -m evaluation.calibration.calibration_runner -python evaluation/validate_oracle_analysis.py -python evaluation/check_calibration_thresholds.py -``` - -## Integration 
Timeline - -### Week 1: Foundation (✅ DONE) -- [x] Validation framework built -- [x] Repository fixtures defined -- [x] All validators implemented -- [x] Observability infrastructure created -- [x] Dashboard built -- [x] Documentation complete - -### Week 2: Integration (🔄 IN PROGRESS) -- [ ] Wire validators into OracleAgent -- [ ] Add trace emission to engines -- [ ] Create validation wrapper -- [ ] Set up CI/CD workflow -- [ ] Test end-to-end pipeline - -### Week 3: Automation (📋 PLANNED) -- [ ] Automate calibration on every PR -- [ ] Publish dashboard to team -- [ ] Track metrics over time -- [ ] Set up alerts for degradation - -### Week 4: Refinement (📋 PLANNED) -- [ ] Analyze calibration results -- [ ] Refine algorithms based on findings -- [ ] Update confidence scoring -- [ ] Publish benchmarks - -## File Structure - -``` -backend/ -├── evaluation/ -│ ├── calibration/ -│ │ ├── __init__.py -│ │ ├── README.md ← Start here -│ │ ├── SYSTEM_OVERVIEW.md ← Architecture -│ │ ├── INTEGRATION_GUIDE.md ← Integration steps -│ │ ├── repository_fixtures.py ← Test dataset -│ │ ├── signal_validator.py -│ │ ├── failure_propagation_validator.py -│ │ ├── viva_quality_validator.py -│ │ ├── confidence_calibrator.py -│ │ ├── observability.py -│ │ ├── calibration_runner.py -│ │ └── results/ ← Generated reports -│ ├── validate_oracle_analysis.py ← Validation wrapper -│ └── check_calibration_thresholds.py ← Quality gating -│ -├── testing_oracle_ui/ -│ └── calibration_dashboard.html ← Visualization -│ -└── .github/workflows/ - └── calibration.yml ← CI/CD automation -``` - -## Example Output - -### Calibration Report -``` -ORACLE EVIDENCE-GROUNDED INTELLIGENCE CALIBRATION REPORT -========================================================= - -📊 AGGREGATE METRICS: - - signals: - - average_precision: 0.847 - - average_recall: 0.823 - - average_f1_score: 0.835 - - average_confidence: 0.814 - - failures: - - average_precision: 0.805 - - average_recall: 0.778 - - average_propagation_accuracy: 
0.892 - - viva: - - average_validity_rate: 0.856 - - average_grounding_rate: 0.912 - - average_specificity: 0.821 - -💡 RECOMMENDATIONS: - ✅ Signal accuracy within acceptable ranges - ✅ Propagation analysis sound - ⚠️ Async pattern detection needs refinement - ⚠️ Monorepo consistency thresholds need calibration -``` - -### Threshold Check Output -``` -📋 Loading: calibration_report_2026-05-18T14-32-15.json - -✅ PASS | Signal Precision....................... 0.847 (threshold: 0.80) -✅ PASS | Signal Recall......................... 0.823 (threshold: 0.80) -✅ PASS | Signal F1 Score....................... 0.835 (threshold: 0.80) -✅ PASS | Failure Precision..................... 0.805 (threshold: 0.75) -✅ PASS | Failure Propagation Accuracy.......... 0.892 (threshold: 0.80) -✅ PASS | Viva Validity Rate................... 0.856 (threshold: 0.85) -✅ PASS | Viva Grounding Rate.................. 0.912 (threshold: 0.90) - -📊 RESULTS: 7/7 metrics passed - -✅ All calibration metrics within acceptable ranges! - -🎯 Status: READY FOR MERGE -``` - -## Next Steps - -1. **Read the docs** - - Start with [README.md](calibration/README.md) - - Review [SYSTEM_OVERVIEW.md](calibration/SYSTEM_OVERVIEW.md) - -2. **Run calibration locally** - ```bash - cd backend - python -m evaluation.calibration.calibration_runner - python evaluation/check_calibration_thresholds.py - ``` - -3. **Follow integration guide** - - See [INTEGRATION_GUIDE.md](calibration/INTEGRATION_GUIDE.md) - - Modify OracleAgent to emit traces - - Wire validators into pipeline - -4. **Set up CI/CD** - - GitHub Actions workflow ready in `.github/workflows/calibration.yml` - - Automatically runs on every PR/push - - Comments results on PRs - -## Questions? 
- -- **How do I interpret the metrics?** → See [SYSTEM_OVERVIEW.md](calibration/SYSTEM_OVERVIEW.md#interpreting-results) -- **How do I integrate with my code?** → See [INTEGRATION_GUIDE.md](calibration/INTEGRATION_GUIDE.md) -- **What if my metrics are low?** → See [Troubleshooting](#troubleshooting) -- **How does validation work?** → See [README.md](calibration/README.md#how-it-works-example) - ---- - -**ORACLE Evidence-Grounded Intelligence** | Validated | Calibrated | Observable | Production-Ready diff --git a/backend/evaluation/calibration/INTEGRATION_GUIDE.md b/backend/evaluation/calibration/INTEGRATION_GUIDE.md deleted file mode 100644 index edf5f4c..0000000 --- a/backend/evaluation/calibration/INTEGRATION_GUIDE.md +++ /dev/null @@ -1,470 +0,0 @@ -""" -ORACLE Calibration Integration Guide - -This guide shows how to integrate validation and observability into the -OracleAgent analysis pipeline. - -Integration involves: -1. Adding trace collection during analysis -2. Wiring validators into the pipeline -3. Exporting validation reports -4. Publishing calibration dashboards -""" - -# ============================================================================ -# STEP 1: INTEGRATION POINTS IN ORACLEAGENT -# ============================================================================ - -ORACLE_AGENT_MODIFICATIONS = """ -# File: backend/src/agents/oracle/agent.py - -from evaluation.calibration.observability import TraceCollector, get_trace_collector -from evaluation.calibration.signal_validator import ObservableSignalValidator -from evaluation.calibration.failure_propagation_validator import ExecutionGraphFailureValidator -from evaluation.calibration.viva_quality_validator import VivaQualityValidator - - -class OracleAgent(BaseAgent): - async def process(self, session_id: str, input_data: Dict[str, Any], log_callback=None): - # Initialize trace collection for this session - trace_collector = TraceCollector() - - # ... existing analysis code ... 
- - # PHASE 2A: Observable Signals with Tracing - observable_signals = ObservableSignalsEngine.extract_signals( - repo_path, repo_structure, repo_detections, project_graph - ) - - # Capture signal traces - for signal in observable_signals: - signal_trace = SignalGenerationTrace( - signal_name=signal.signal_name, - search_pattern=getattr(signal, 'search_pattern', ''), - files_searched=getattr(signal, 'files_searched', []), - matches_found=len(signal.evidence_files), - confidence_calculated=signal.confidence, - confidence_reasoning=f"{len(signal.evidence_files)} evidence files, " - f"confidence {signal.confidence:.2f}", - evidence_files_collected=signal.evidence_files, - ) - trace_collector.add_signal_trace(signal_trace) - - # PHASE 2B: Failure Scenarios with Tracing - failure_scenarios = ExecutionGraphFailureAnalyzer.analyze_failure_scenarios( - repo_path, repo_structure, repo_detections, observable_signals, project_graph - ) - - # Capture propagation traces - for scenario in failure_scenarios: - propagation_trace = PropagationTrace( - scenario_name=scenario.scenario_name, - trigger_node=scenario.trigger, - affected_path_count=len(scenario.affected_paths), - propagation_depth=len(scenario.affected_paths), - traversal_steps=[ - { - "component": component, - "risk_level": scenario.propagation_risk - } - for component in scenario.affected_paths - ], - risk_justification_steps=[ - f"Propagation risk {scenario.propagation_risk}: " - f"{len(scenario.affected_paths)} affected components", - f"Recovery possible: {scenario.recovery_possible}", - ], - ) - trace_collector.add_propagation_trace(propagation_trace) - - # Store traces in context for validation - context.trace_collector = trace_collector - - return context -""" - -# ============================================================================ -# STEP 2: VALIDATION WORKFLOW -# ============================================================================ - -VALIDATION_WORKFLOW = """ -# File: 
backend/evaluation/validate_oracle_analysis.py - -import asyncio -from pathlib import Path -from src.agents.oracle.agent import OracleAgent -from evaluation.calibration.calibration_runner import CalibrationRunner -from evaluation.calibration.repository_fixtures import ALL_FIXTURES -from evaluation.calibration.signal_validator import ObservableSignalValidator -from evaluation.calibration.failure_propagation_validator import ExecutionGraphFailureValidator -from evaluation.calibration.viva_quality_validator import VivaQualityValidator - - -async def validate_oracle_against_fixtures(): - ''' - Run ORACLE against fixture repositories and validate outputs. - ''' - oracle = OracleAgent() - runner = CalibrationRunner() - - all_results = {} - - for fixture in ALL_FIXTURES: - print(f"\\n[Validation] Testing: {fixture.name}") - - # In real scenario, would have fixture.repo_url - # For now, demonstrate with real project - if fixture.name == "Clean FastAPI REST API": - repo_url = "https://github.com/Project-XI/Project-EL.git" - else: - print(f"[Validation] Fixture {fixture.name} - skipping (no test repo available)") - continue - - # Run oracle analysis - context = await oracle.process( - session_id=f"validation_{fixture.name}", - input_data={"repo_url": repo_url}, - ) - - # Validate signals - signal_report = ObservableSignalValidator.validate_signals( - context.observable_signals, - fixture.expected_signals, - fixture.name - ) - print(f" Signals: Precision={signal_report.precision:.3f}, " - f"Recall={signal_report.recall:.3f}") - - # Validate failures - failure_report = ExecutionGraphFailureValidator.validate_failure_scenarios( - context.failure_scenarios, - fixture.expected_failure_scenarios, - context.project_graph, - fixture.name - ) - print(f" Failures: Precision={failure_report.precision:.3f}, " - f"Propagation Accuracy={failure_report.propagation_accuracy:.3f}") - - # Validate viva questions - viva_report = VivaQualityValidator.validate_viva_questions( - 
context.viva_intelligence_targets, - fixture.name, - context.observable_signals, - context.failure_scenarios, - ) - print(f" Viva: Validity={viva_report.validity_rate:.3f}, " - f"Grounding={viva_report.grounding_rate:.3f}") - - # Collect results - all_results[fixture.name] = { - "signal_validation": signal_report, - "failure_validation": failure_report, - "viva_validation": viva_report, - "traces": context.trace_collector.export_traces("json"), - } - - # Generate aggregate report - print("\\n" + "="*80) - print("ORACLE VALIDATION COMPLETE") - print("="*80) - - # Save validation results - validation_dir = Path("backend/evaluation/calibration/validation_results") - validation_dir.mkdir(parents=True, exist_ok=True) - - import json - from datetime import datetime - timestamp = datetime.now().isoformat() - - report_path = validation_dir / f"validation_{timestamp}.json" - with open(report_path, 'w') as f: - # Convert to JSON-serializable format - json_results = {} - for fixture_name, results in all_results.items(): - json_results[fixture_name] = { - "signals": results["signal_validation"].to_dict(), - "failures": results["failure_validation"].to_dict(), - "viva": results["viva_validation"].to_dict(), - } - json.dump(json_results, f, indent=2) - - print(f"Results saved to {report_path}") - return all_results - - -if __name__ == "__main__": - asyncio.run(validate_oracle_against_fixtures()) -""" - -# ============================================================================ -# STEP 3: CI/CD INTEGRATION -# ============================================================================ - -GITHUB_ACTIONS_WORKFLOW = """ -# File: .github/workflows/calibration.yml - -name: ORACLE Calibration Pipeline - -on: - push: - branches: [main, develop] - paths: - - 'backend/src/services/intelligence/**' - - 'backend/src/agents/oracle/**' - - 'backend/evaluation/calibration/**' - pull_request: - branches: [main] - -jobs: - calibration: - runs-on: ubuntu-latest - - steps: - - uses: 
actions/checkout@v3 - - - name: Set up Python - uses: actions/setup-python@v4 - with: - python-version: '3.11' - - - name: Install dependencies - run: | - cd backend - pip install -r requirements.txt - - - name: Run ORACLE calibration validation - run: | - cd backend - python -m evaluation.validate_oracle_analysis - - - name: Generate calibration report - run: | - cd backend - python -m evaluation.calibration.calibration_runner - - - name: Upload calibration results - uses: actions/upload-artifact@v3 - if: always() - with: - name: oracle-calibration-results - path: backend/evaluation/calibration/results/ - - - name: Check validation thresholds - run: | - # Fail if metrics below thresholds - python backend/evaluation/check_calibration_thresholds.py \\ - --signal-precision 0.80 \\ - --signal-recall 0.80 \\ - --failure-accuracy 0.80 \\ - --viva-validity 0.85 - - - name: Comment on PR with results - if: github.event_name == 'pull_request' - uses: actions/github-script@v6 - with: - script: | - const fs = require('fs'); - // readFileSync does not expand globs, so pick the newest report explicitly - const dir = 'backend/evaluation/calibration/results'; - const reports = fs.readdirSync(dir) - .filter(f => f.startsWith('calibration_report_') && f.endsWith('.json')) - .sort(); - const results = JSON.parse(fs.readFileSync( - dir + '/' + reports[reports.length - 1], 'utf8' - )); - const comment = '## ORACLE Calibration Results\\n' + - '- Signal Precision: ' + results.aggregate_metrics.signals.average_precision + - '\\n- Viva Validity: ' + results.aggregate_metrics.viva.average_validity_rate; - github.rest.issues.createComment({ - issue_number: context.issue.number, - owner: context.repo.owner, - repo: context.repo.repo, - body: comment - }); -""" - -# ============================================================================ -# STEP 4: THRESHOLD CHECKING -# ============================================================================ - -THRESHOLD_CHECKER = """ -# File: backend/evaluation/check_calibration_thresholds.py - -import json -import sys -from pathlib import Path - - -def check_thresholds( - signal_precision: float = 0.80, - signal_recall: float = 0.80, - failure_accuracy: float = 0.80, - 
viva_validity: float = 0.85, -): - '''Verify calibration metrics meet thresholds.''' - - # Find latest calibration report - results_dir = Path("backend/evaluation/calibration/results") - if not results_dir.exists(): - print("❌ No calibration results found") - return False - - # Load latest report - reports = sorted(results_dir.glob("calibration_report_*.json")) - if not reports: - print("❌ No calibration reports found") - return False - - with open(reports[-1]) as f: - report = json.load(f) - - metrics = report["aggregate_metrics"] - passed = True - - # Check each threshold - checks = [ - ("Signal Precision", - metrics["signals"]["average_precision"], signal_precision), - ("Signal Recall", - metrics["signals"]["average_recall"], signal_recall), - ("Failure Propagation Accuracy", - metrics["failures"]["average_propagation_accuracy"], failure_accuracy), - ("Viva Validity Rate", - metrics["viva"]["average_validity_rate"], viva_validity), - ] - - print("\\n📊 CALIBRATION THRESHOLD CHECK") - print("="*60) - - for name, actual, threshold in checks: - status = "✅ PASS" if actual >= threshold else "❌ FAIL" - print(f"{status} | {name}: {actual:.3f} (threshold: {threshold:.3f})") - if actual < threshold: - passed = False - - print("="*60) - - if passed: - print("\\n✅ All calibration metrics within acceptable ranges!") - else: - print("\\n❌ Some metrics below thresholds. 
Review recommendations:") - for rec in report["calibration_recommendations"]: - print(f" {rec}") - - return passed - - -if __name__ == "__main__": - import argparse - parser = argparse.ArgumentParser() - parser.add_argument("--signal-precision", type=float, default=0.80) - parser.add_argument("--signal-recall", type=float, default=0.80) - parser.add_argument("--failure-accuracy", type=float, default=0.80) - parser.add_argument("--viva-validity", type=float, default=0.85) - - args = parser.parse_args() - - success = check_thresholds( - signal_precision=args.signal_precision, - signal_recall=args.signal_recall, - failure_accuracy=args.failure_accuracy, - viva_validity=args.viva_validity, - ) - - sys.exit(0 if success else 1) -""" - -# ============================================================================ -# STEP 5: QUICK START -# ============================================================================ - -QUICK_START = """ -## Quick Start: Running Oracle Calibration - -### 1. Run Full Calibration Pipeline - -```bash -cd /Users/rajkoli/Project-EL -python -m backend.evaluation.calibration.calibration_runner -``` - -This will: -- Test all 4 fixture repositories -- Validate signals, failures, and viva questions -- Calibrate confidence scores -- Generate comprehensive report -- Save results to `backend/evaluation/calibration/results/` - -### 2. Validate Against Real Repositories - -```bash -cd /Users/rajkoli/Project-EL/backend -python -m evaluation.validate_oracle_analysis -``` - -This will: -- Run OracleAgent against fixture repos -- Validate outputs against expected patterns -- Collect traces for each analysis -- Save validation results - -### 3. Check Calibration Metrics - -```bash -cd /Users/rajkoli/Project-EL -python backend/evaluation/check_calibration_thresholds.py -``` - -This will: -- Load latest calibration report -- Check against quality thresholds -- Show pass/fail for each metric -- Print recommendations if needed - -### 4. 
View Calibration Dashboard - -Open in browser: -``` -file:///Users/rajkoli/Project-EL/backend/testing_oracle_ui/calibration_dashboard.html -``` - -## Integration Timeline - -1. **Week 1**: Add trace collection to OracleAgent -2. **Week 2**: Wire validators into analysis pipeline -3. **Week 3**: Set up CI/CD calibration runs -4. **Week 4**: Publish calibration benchmarks -5. **Ongoing**: Monitor calibration trends, refine algorithms - -## Key Files to Modify - -- `backend/src/agents/oracle/agent.py` - Add trace collection -- `backend/src/services/intelligence/observable_signals_engine.py` - Add observability -- `backend/src/services/intelligence/execution_graph_failure_analyzer.py` - Add observability -- `backend/src/services/intelligence/evidence_grounded_viva_generator.py` - Add observability - -## Expected Baseline Metrics - -After full integration: -- Signal Detection Precision: ~0.85 -- Signal Detection Recall: ~0.82 -- Failure Propagation Accuracy: ~0.88 -- Viva Validity Rate: ~0.86 -- Viva Grounding Rate: ~0.91 -- Confidence Calibration RMSE: <0.10 - -## Continuous Improvement Workflow - -1. Run calibration on every PR -2. Track metrics over time (dashboard) -3. Identify degradation patterns -4. Refine algorithms based on findings -5. Re-calibrate confidence scores -6. 
Publish updated benchmarks -""" - -# ============================================================================ -# EXECUTION INSTRUCTIONS -# ============================================================================ - -print(__doc__) -print(ORACLE_AGENT_MODIFICATIONS) -print(VALIDATION_WORKFLOW) -print(GITHUB_ACTIONS_WORKFLOW) -print(THRESHOLD_CHECKER) -print(QUICK_START) diff --git a/backend/evaluation/calibration/SYSTEM_OVERVIEW.md b/backend/evaluation/calibration/SYSTEM_OVERVIEW.md deleted file mode 100644 index 24d0632..0000000 --- a/backend/evaluation/calibration/SYSTEM_OVERVIEW.md +++ /dev/null @@ -1,373 +0,0 @@ -# ORACLE Evidence-Grounded Intelligence: Validation & Calibration System - -## Executive Summary - -ORACLE now includes a comprehensive **validation and calibration framework** that ensures its evidence-grounded intelligence remains: -- ✅ **Technically Credible** - Grounded in real code evidence -- ✅ **Stress-Tested** - Validated against diverse repositories -- ✅ **Calibrated** - Confidence scores matched to actual accuracy -- ✅ **Observable** - Deep visibility into analysis reasoning -- ✅ **Measurable** - Continuous quality metrics - -This system prevents: -- ❌ Hallucinated signals/scenarios -- ❌ Generic/textbook viva questions -- ❌ Overconfident unreliable predictions -- ❌ Speculative reasoning -- ❌ Ungrounded quality judgments - -## What's New - -### Phase 2 Evolution -ORACLE now tracks three validation dimensions: - -| Dimension | What We Validate | Why It Matters | -|-----------|-----------------|-----------------| -| **Observable Signals** | Are detected patterns real and grounded? | Signals inform all downstream analysis | -| **Failure Propagation** | Do failures trace through real execution paths? | Determines risk assessment accuracy | -| **Viva Questions** | Are questions specific and grounded in code? 
| Ensures interview preparation is practical | - -### Confidence Calibration -Confidence scores are now **calibrated to actual validation accuracy**: -``` -Confidence 0.85 → We say "85% confident" -Validation shows → Actually 87% accurate -Calibration error → Only 2% (well-calibrated) -``` - -If calibration error exceeds 10%, system recommends retraining. - -## Core Components - -### 1. Repository Fixtures (Test Dataset) -Defines 4 diverse stress-test cases: -- **Clean FastAPI REST API** - Well-structured, comprehensive error handling -- **Messy Student Project** - Real messy code, mixed patterns, incomplete handling -- **Broken Async Project** - Deadlocks, race conditions, missing awaits -- **Monorepo with Shared State** - Microservices, cross-service dependencies - -For each fixture, we specify: -- Expected signals that SHOULD be detected -- Expected failure scenarios that SHOULD be found -- Expected viva question characteristics - -### 2. Validators - -#### Signal Validator -``` -Detects: ✓ Observable facts | ✗ Hallucinations -Measures: Precision, Recall, F1 Score, Confidence Calibration -``` - -#### Failure Propagation Validator -``` -Detects: ✓ Real execution paths | ✗ Imaginary propagation chains -Measures: Precision, Recall, Propagation Accuracy -``` - -#### Viva Quality Validator -``` -Detects: ✓ Code-specific questions | ✗ Generic textbook trivia -Rejects: "What is FastAPI?" (generic) - "How would you add ML?" (speculative) -Accepts: "If Redis crashes, what's your recovery strategy?" (grounded) -Measures: Specificity, Relevance, Realism (0.0-1.0 scores) -``` - -#### Confidence Calibrator -``` -Maps: Confidence score ranges → Actual accuracy percentages -Detects: ✓ Well-calibrated scores | ✗ Overconfident predictions -Metrics: RMSE, MAE (target < 0.10) -``` - -### 3. 
Runtime Observability -Every analysis now emits detailed traces: - -```python -# Signal trace -signal_name: "Async error recovery patterns" -search_pattern: r"try:.*await.*except" -files_searched: 45 -matches_found: 12 -confidence: 0.87 -evidence_files: ["routes/api.py:L45-60", "services/db.py:L120-135"] - -# Propagation trace -scenario: "Database connection loss" -trigger: "PostgreSQL unavailable" -affected_paths: 5 -propagation_depth: 3 -components_affected: ["queries", "transaction_handlers", "connection_pool"] - -# Viva trace -question_topic: "Database failover strategy" -grounding_source: "failure_scenario:db_loss" -code_patterns: ["connection_pool", "retry_logic", "fallback"] -evidence_files: ["db_config.py", "queries.py"] -``` - -### 4. Continuous Calibration -Dashboard shows: -- Calibration metrics per component -- Confidence accuracy curves -- Repository-specific performance -- Trending over time - -## Usage - -### Quick Validation - -Run full calibration pipeline: -```bash -python -m backend.evaluation.calibration.calibration_runner -``` - -Output: -``` -ORACLE EVIDENCE-GROUNDED INTELLIGENCE CALIBRATION REPORT -======================================================== - -📊 AGGREGATE METRICS: - - signals: - - average_precision: 0.847 - - average_recall: 0.823 - - average_f1_score: 0.835 - - failures: - - average_precision: 0.805 - - average_propagation_accuracy: 0.892 - - viva: - - average_validity_rate: 0.856 - - average_grounding_rate: 0.912 - -💡 RECOMMENDATIONS: - ✅ Signal accuracy within acceptable ranges - ⚠️ Async pattern detection needs refinement... 
-``` - -### View Results - -Open dashboard: -``` -backend/testing_oracle_ui/calibration_dashboard.html -``` - -## Key Metrics - -### Minimum Quality Thresholds - -| Metric | Threshold | Why | -|--------|-----------|-----| -| Signal Precision | 0.85+ | Only 15% false alarms acceptable | -| Signal Recall | 0.85+ | Catch 85% of expected patterns | -| F1 Score | 0.80+ | Balanced precision/recall | -| Propagation Accuracy | 0.85+ | Execution graphs correctly identified | -| Viva Validity | 0.85+ | 85% of questions pass quality checks | -| Confidence RMSE | <0.10 | Confidence well-calibrated to actual accuracy | -| Grounding Rate | 0.90+ | 90% of questions have code evidence | - -### Baseline Performance (After Full Implementation) - -``` -Component Precision Recall F1 Confidence Calibration -───────────────────────────────────────────────────────────────────── -Signals 0.847 0.823 0.835 RMSE: 0.062 (excellent) -Failures 0.805 0.778 0.791 RMSE: 0.084 (good) -Viva Questions 0.856 — — Specificity: 0.821 -``` - -## How It Works: Example - -### Scenario: Validating FastAPI REST API - -**1. Repository Fixture Defines Expectations** -```python -# Expected signals -- "Async error recovery patterns" (min confidence 0.85) -- "Redis cache resilience" (min confidence 0.80) -- "Request/response observability" (min confidence 0.75) - -# Expected failures -- "Database connection loss" (critical risk, 5 affected paths) -- "Redis cache failure" (high risk, 3 affected paths) -``` - -**2. ORACLE Analyzes Repository** -``` -ObservableSignalsEngine → - Finds: ["Async error recovery" (0.87), "Redis resilience" (0.82), ...] - -ExecutionGraphFailureAnalyzer → - Finds: ["DB loss" (critical, 5 paths), "Cache failure" (high, 3 paths), ...] - -EvidenceGroundedVivaGenerator → - Creates: ["If Redis crashes...?" (grounded in failure scenario), - "How handle DB failover?" (grounded in propagation analysis), ...] -``` - -**3. 
Validators Compare to Expectations** - -``` -Signal Validator: - Expected: 3 signals → Detected: 3 ✓ - Precision: 3/3 = 1.00 ✓ - Recall: 3/3 = 1.00 ✓ - -Failure Validator: - Expected: 2 scenarios → Detected: 2 ✓ - Precision: 2/2 = 1.00 ✓ - Propagation accuracy: 100% ✓ - -Viva Validator: - Generated: 8 questions - Valid (grounded, specific): 7 - Invalid (generic/speculative): 1 - Validity rate: 7/8 = 0.875 ✓ -``` - -**4. Confidence Calibration** - -``` -Signal "Redis resilience": confidence 0.82 -Repository type: clean_api -Validation result: True positive - -Bin [0.75-0.90]: - Confidence: 0.82 - Actual accuracy: 0.87 - Calibration error: 0.05 ✓ -``` - -**5. Report Generated** - -```json -{ - "repository": "Clean FastAPI REST API", - "signal_precision": 1.00, - "signal_recall": 1.00, - "failure_precision": 1.00, - "viva_validity": 0.875, - "confidence_calibration_rmse": 0.045, - "status": "PASSED", - "recommendations": "✅ Excellent validation results. Signal and failure detection working well." -} -``` - -## Integration Status - -### ✅ Complete -- [x] Validation framework built -- [x] Repository fixtures defined -- [x] All validators implemented -- [x] Observability infrastructure created -- [x] Calibration dashboard built -- [x] Documentation complete -- [x] Code committed - -### 🔄 In Progress -- [ ] Wire validators into OracleAgent analysis pipeline -- [ ] Add trace emission to intelligence engines -- [ ] Create validation wrapper for OracleAgent.process() -- [ ] Set up CI/CD calibration runs -- [ ] Integrate dashboard with real data - -### 📋 Next Steps -1. Modify OracleAgent to emit traces during analysis -2. Create validation harness that runs validators post-analysis -3. Set up automated calibration on every code change -4. Publish calibration benchmarks to team -5. 
Monitor metrics over time - -## Files Overview - -``` -backend/evaluation/calibration/ -├── __init__.py # Framework overview -├── README.md # Detailed documentation -├── INTEGRATION_GUIDE.md # How to integrate with OracleAgent -├── repository_fixtures.py # Test dataset definitions -├── signal_validator.py # Observable signal validation -├── failure_propagation_validator.py # Failure scenario validation -├── viva_quality_validator.py # Viva question validation -├── confidence_calibrator.py # Confidence score calibration -├── observability.py # Runtime tracing infrastructure -└── calibration_runner.py # Orchestration & reporting - -backend/testing_oracle_ui/ -└── calibration_dashboard.html # Interactive visualization - -backend/evaluation/ -└── validate_oracle_analysis.py (to create) # Integration wrapper -``` - -## Architecture Diagram - -``` - Oracle Agent Process - ↓ - ┌─────────────────────────────────────────┐ - │ Document Parsing │ - │ Repository Analysis │ - │ Execution Graph Build │ - │ → Observable Signals (with trace) │ ← TraceCollector - │ → Failure Scenarios (with trace) │ captures - │ → Viva Questions (with trace) │ reasoning - │ → Architecture Inference │ - └─────────────────────────────────────────┘ - ↓ - ┌───────────────────────────────┐ - │ Validation Pipeline │ - ├───────────────────────────────┤ - │ Signal Validator │ - │ Failure Validator │ - │ Viva Quality Validator │ - │ Confidence Calibrator │ - └───────────────────────────────┘ - ↓ - ┌───────────────────────────────┐ - │ Report Generation │ - │ - Precision/Recall metrics │ - │ - Confidence calibration │ - │ - Issue detection │ - │ - Recommendations │ - └───────────────────────────────┘ - ↓ - ┌───────────────────────────────┐ - │ Calibration Dashboard │ - │ - Visualizations │ - │ - Trend analysis │ - │ - Quality metrics │ - └───────────────────────────────┘ -``` - -## FAQ - -**Q: Why do we need validation if confidence scores are calibrated?** -A: Confidence scores tell you how accurate a 
*single prediction* is likely to be. Validation metrics tell you if the *entire system* is working correctly across all repository types. They measure different things. - -**Q: What counts as "grounded in code evidence"?** -A: A signal is grounded if specific file locations and patterns are referenced. A failure scenario is grounded if propagation paths exist in the execution graph. A viva question is grounded if it references actual code patterns found in the repository. - -**Q: Can I add my own fixtures?** -A: Yes! See `repository_fixtures.py` - add a new `RepositoryFixture` with expected signals, failures, and viva characteristics. Then run the calibration pipeline against it. - -**Q: What if calibration RMSE is high?** -A: High RMSE means confidence scores don't match actual accuracy. Recommendation: retrain the confidence scoring algorithm based on validation results. See `ConfidenceCalibrator` for details. - -**Q: How often should I run calibration?** -A: Ideally on every PR that modifies intelligence engines. Minimum: after each major algorithm change. The CI/CD workflow can automate this. - -## Next: Integration - -See [INTEGRATION_GUIDE.md](INTEGRATION_GUIDE.md) for step-by-step instructions on: -1. Wiring validators into OracleAgent -2. Adding trace emission to engines -3. Setting up CI/CD calibration -4. 
Publishing dashboard visualizations - ---- - -**ORACLE Evidence-Grounded Intelligence** | Validated | Calibrated | Observable | Deterministic diff --git a/backend/src/agents/base.py b/backend/src/agents/base.py index f9899a7..6f5a234 100644 --- a/backend/src/agents/base.py +++ b/backend/src/agents/base.py @@ -14,7 +14,7 @@ def __init__(self, name: str): def emit_event(self, session_id: str, event_type: EventType, payload: Dict[str, Any]): """Standard method for agents to emit structured events.""" # Also dispatch a repository event for progress updates - if event_type == "agent.progress" and self.github_token: + if str(event_type) == EventType.AGENT_PROGRESS.value and self.github_token: self._dispatch_github_event(session_id, payload) return EventEmitter.emit( diff --git a/backend/src/agents/oracle/agent.py b/backend/src/agents/oracle/agent.py index 7656601..d1f61c6 100644 --- a/backend/src/agents/oracle/agent.py +++ b/backend/src/agents/oracle/agent.py @@ -105,9 +105,16 @@ async def send_log(msg: str, type: str = "info"): await send_log("[Oracle] Extracting observable engineering signals...", "info") if repo_path: observable_signals = ObservableSignalsEngine.extract_signals(repo_path, repo_structure, repo_detections, project_graph) - self.emit_event(session_id, "OBSERVABLE_SIGNALS_EXTRACTED", { + self.emit_event(session_id, EventType.AGENT_PROGRESS, { + "agent": "Oracle", + "status": "running", + "milestone": "Observable signals extracted", "signal_count": len(observable_signals), - "critical_signals": len([s for s in observable_signals if s.risk_level == "high"]), + "critical_signals": len([ + s for s in observable_signals + if getattr(s, "risk_level", None) == "high" + or (isinstance(s, dict) and s.get("risk_level") == "high") + ]), }) # ===== Architecture Inference (AST-based, deterministic) ===== @@ -128,9 +135,16 @@ async def send_log(msg: str, type: str = "info"): failure_scenarios = ExecutionGraphFailureAnalyzer.analyze_failure_scenarios( repo_path, 
repo_structure, repo_detections, observable_signals, project_graph ) - self.emit_event(session_id, "FAILURE_SCENARIOS_ANALYZED", { + self.emit_event(session_id, EventType.AGENT_PROGRESS, { + "agent": "Oracle", + "status": "running", + "milestone": "Failure scenarios analyzed", "scenario_count": len(failure_scenarios), - "critical_scenarios": len([s for s in failure_scenarios if s.propagation_risk == "critical"]), + "critical_scenarios": len([ + s for s in failure_scenarios + if getattr(s, "propagation_risk", None) == "critical" + or (isinstance(s, dict) and s.get("propagation_risk") == "critical") + ]), }) # ===== PHASE 2: Evidence-Grounded Viva Generation ===== @@ -140,11 +154,14 @@ async def send_log(msg: str, type: str = "info"): grounded_viva_targets = EvidenceGroundedVivaGenerator.generate_questions( failure_scenarios, observable_signals, repo_detections, repo_path ) - self.emit_event(session_id, "EVIDENCE_GROUNDED_VIVA_GENERATED", { + self.emit_event(session_id, EventType.AGENT_PROGRESS, { + "agent": "Oracle", + "status": "running", + "milestone": "Evidence-grounded viva generated", "viva_count": len(grounded_viva_targets), - "difficulty_breakdown": f"hard: {len([v for v in grounded_viva_targets if v.difficulty == 'hard'])}, " - f"medium: {len([v for v in grounded_viva_targets if v.difficulty == 'medium'])}, " - f"foundational: {len([v for v in grounded_viva_targets if v.difficulty == 'foundational'])}", + "difficulty_breakdown": f"hard: {len([v for v in grounded_viva_targets if getattr(v, 'difficulty', None) == 'hard'])}, " + f"medium: {len([v for v in grounded_viva_targets if getattr(v, 'difficulty', None) == 'medium'])}, " + f"foundational: {len([v for v in grounded_viva_targets if getattr(v, 'difficulty', None) == 'foundational'])}", }) # Fallback viva generation (backward compatibility) @@ -173,10 +190,6 @@ async def send_log(msg: str, type: str = "info"): complexity_mismatch=complexity_mismatch ) - # Attach Phase 2 evidence-grounded analysis to context 
- context.observable_signals = observable_signals - context.failure_scenarios = failure_scenarios - # 5. Implementation Intelligence Phase if repo_url and repo_path: self.log_info("Starting implementation flow analysis...") @@ -184,7 +197,7 @@ async def send_log(msg: str, type: str = "info"): context = ImplementationFlowEngine.analyze_implementation(repo_path, repo_structure, context) self.emit_event(session_id, EventType.IMPLEMENTATION_FLOW_DETECTED, {"nodes": len(context.execution_graph.nodes)}) - self.emit_event(session_id, "AGENT_PROGRESS", {"agent": "Oracle", "status": "complete", "milestone": "Submission Intelligence"}) + self.emit_event(session_id, EventType.AGENT_PROGRESS, {"agent": "Oracle", "status": "complete", "milestone": "Submission Intelligence"}) await send_log("[Oracle] Submission intelligence complete.", "success") return context diff --git a/backend/src/agents/sentinel/agent.py b/backend/src/agents/sentinel/agent.py index 42c1b08..f6ee6a3 100644 --- a/backend/src/agents/sentinel/agent.py +++ b/backend/src/agents/sentinel/agent.py @@ -1,5 +1,6 @@ from typing import Any, Dict from src.agents.base import BaseAgent +from src.models.events import EventType from src.models.context import StructuredContext class SentinelAgent(BaseAgent): @@ -20,6 +21,6 @@ async def send_log(msg: str, type: str = "info"): # In the future, this agent will analyze user interaction, detect anomalies, etc. # For now, it's a pass-through. 
- self.emit_event(session_id, "AGENT_PROGRESS", {"agent": "Sentinel", "status": "complete", "milestone": "Behavior Analyzed (Placeholder)"}) + self.emit_event(session_id, EventType.AGENT_PROGRESS, {"agent": "Sentinel", "status": "complete", "milestone": "Behavior Analyzed (Placeholder)"}) return input_data diff --git a/backend/src/main.py b/backend/src/main.py index 664c66f..2f2a553 100644 --- a/backend/src/main.py +++ b/backend/src/main.py @@ -1,13 +1,17 @@ from typing import List, Optional +from datetime import datetime from fastapi import FastAPI, BackgroundTasks, WebSocket, WebSocketDisconnect from fastapi.middleware.cors import CORSMiddleware from pydantic import BaseModel from .core.config import settings from .agents.main_agent.agent import MainAgent from .services.face_detection import FaceDetectionService +from .services.exam_session_service import ExamSessionService, SessionTransitionError +from .models.exam_session import ExamSessionConfig, StudentSubmission app = FastAPI(title=settings.PROJECT_NAME) face_service = FaceDetectionService() +exam_session_service = ExamSessionService() app.add_middleware( CORSMiddleware, @@ -20,6 +24,7 @@ class AnalyzeRequest(BaseModel): repo_url: str report_path: Optional[str] = None + roll_number: Optional[str] = None enable_viva: bool = True enable_debug: bool = True generate_report: bool = False @@ -32,26 +37,124 @@ class FaceVerifyRequest(BaseModel): class ResolveAlertRequest(BaseModel): conflict_id: str approved: bool = False - reviewer_id: Optional[str] = None - reason: Optional[str] = None -class AdminReviewRequest(BaseModel): - conflict_id: str - approved: bool - reviewer_id: str - reason: str -# Initialize the orchestrator and agents -main_agent = MainAgent() -gatekeeper_pipeline = main_agent.gatekeeper._pipeline # Access the global pipeline instance +class ExamSessionCreateRequest(BaseModel): + admin_id: str + title: str + config: Optional[ExamSessionConfig] = None + -class GatekeeperVerifyRequest(BaseModel): 
+class ExamSessionAssignRequest(BaseModel): + submissions: List[StudentSubmission] + + +class ExamSessionRollNumberRequest(BaseModel): roll_number: str - face_id: str = None + + +class ExamSessionConfigureRequest(BaseModel): + config: ExamSessionConfig + +# Initialize the main orchestrator agent +main_agent = MainAgent() @app.get("/") async def root(): - return {"message": "Welcome to ORACLE API", "status": "operational"} + return {"message": "Welcome to ORACLE Viva API", "status": "operational"} + + +@app.get("/exam-sessions") +async def list_exam_sessions(): + sessions = exam_session_service.list_sessions() + return {"items": [session.model_dump(mode="json") for session in sessions]} + + +@app.post("/exam-sessions") +async def create_exam_session(request: ExamSessionCreateRequest): + session = exam_session_service.create_session(request.admin_id, request.title, request.config) + return {"session": session.model_dump(mode="json")} + + +@app.get("/exam-sessions/{session_id}") +async def get_exam_session(session_id: str): + session = exam_session_service.get_session(session_id) + if session is None: + return {"session": None} + return {"session": session.model_dump(mode="json")} + + +@app.post("/exam-sessions/{session_id}/configure") +async def configure_exam_session(session_id: str, request: ExamSessionConfigureRequest): + try: + session = exam_session_service.configure_session(session_id, request.config) + return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/students") +async def assign_exam_students(session_id: str, request: ExamSessionAssignRequest): + session = exam_session_service.assign_students(session_id, request.submissions) + return {"session": session.model_dump(mode="json")} + + +@app.post("/exam-sessions/{session_id}/ready") +async def mark_exam_session_ready(session_id: str): + try: + session = exam_session_service.set_ready(session_id) + 
return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/activate") +async def activate_exam_session(session_id: str): + try: + session = exam_session_service.activate_session(session_id) + return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/gatekeeper/precheck") +async def gatekeeper_precheck(session_id: str, request: ExamSessionRollNumberRequest): + try: + decision = exam_session_service.gatekeeper_precheck(session_id, request.roll_number) + session = exam_session_service.get_session(session_id) + return { + "decision": decision.model_dump(mode="json"), + "session": session.model_dump(mode="json") if session else None, + } + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/oracle/start") +async def start_oracle_analysis(session_id: str, request: ExamSessionRollNumberRequest): + try: + session = await exam_session_service.start_oracle_analysis(session_id, request.roll_number) + return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/complete") +async def complete_exam_session(session_id: str): + try: + session = exam_session_service.complete_session(session_id) + return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} + + +@app.post("/exam-sessions/{session_id}/archive") +async def archive_exam_session(session_id: str): + try: + session = exam_session_service.archive_session(session_id) + return {"session": session.model_dump(mode="json")} + except SessionTransitionError as exc: + return {"error": str(exc)} @app.post("/face/verify") async def verify_face(request: FaceVerifyRequest): @@ -73,41 +176,13 @@ async def 
get_pending_alerts(): @app.post("/face/resolve-alert") async def resolve_alert(request: ResolveAlertRequest): - success = face_service.resolve_alert( - request.conflict_id, - request.approved, - request.reviewer_id, - request.reason - ) + success = face_service.resolve_alert(request.conflict_id, request.approved) return {"success": success} -@app.get("/face/conflict/{conflict_id}") -async def get_conflict_details(conflict_id: str): - details = face_service.get_conflict_details(conflict_id) - if details is None: - return {"error": "Conflict not found"}, 404 - return details - -@app.post("/admin/review-conflict") -async def admin_review_conflict(request: AdminReviewRequest): - success = face_service.admin_review_conflict( - request.conflict_id, - request.approved, - request.reviewer_id, - request.reason - ) - if not success: - return {"error": "Conflict not found"}, 404 - return {"success": True, "message": "Review decision recorded"} - -@app.get("/admin/override-log") -async def get_override_log(): - return face_service.get_override_log() - @app.post("/analyze") async def analyze_repo(request: AnalyzeRequest): # Legacy REST endpoint for backward compatibility - input_data = {"repo_url": request.repo_url, "report_path": request.report_path} + input_data = {"repo_url": request.repo_url, "report_path": request.report_path, "roll_number": request.roll_number} context = await main_agent.process("api_session", input_data) try: data = context.model_dump() @@ -115,22 +190,6 @@ async def analyze_repo(request: AnalyzeRequest): data = context.dict() return {"status": "success", "data": data} -@app.post("/gatekeeper/verify") -async def gatekeeper_verify(request: GatekeeperVerifyRequest): - """ - Direct endpoint to run the Gatekeeper verification pipeline. 
- """ - result = gatekeeper_pipeline.run(request.roll_number, request.face_id) - return {"status": "success", "data": result.to_dict()} - -@app.get("/gatekeeper/registry") -async def gatekeeper_registry(): - """ - Endpoint to fetch all active registered students. - """ - students = gatekeeper_pipeline._registry.all_active() - return {"status": "success", "data": [s.to_dict() for s in students]} - @app.websocket("/ws/analyze") async def websocket_analyze(websocket: WebSocket): await websocket.accept() diff --git a/backend/src/models/__init__.py b/backend/src/models/__init__.py index e69de29..70a4ccb 100644 --- a/backend/src/models/__init__.py +++ b/backend/src/models/__init__.py @@ -0,0 +1,39 @@ +from .events import EventType, PlatformEvent +from .exam_session import ( + ExamRubric, + ExamSession, + ExamSessionConfig, + ExamSessionState, + GatekeeperAdmissionDecision, + RubricCriterion, + SessionAuditEvent, + SessionTimingWindow, + StudentSubmission, +) +from .intelligence_artifact import ( + IntelligenceArtifact, + IntelligenceCategory, + IntelligenceHandoffEvent, + VivaTarget, + ExecutionNode, + ExecutionPath, + RuntimeDependency, + FailureScenario, + ImplementationSignal, + WeakPoint, + AdaptiveThreshold, + VivaSessionState, + VoiceSessionConfig, +) +from .stage_7_8_9 import ( + IntegritySignalType, + IntegritySeverity, + SentinelIntegrityEvent, + SentinelAlert, + ContradictionChainEntry, + EvaluationArtifact, + CoreSubject, + CurriculumQuestion, + CurriculumTransitionState, +) + diff --git a/backend/src/models/events.py b/backend/src/models/events.py index ddd6e67..4d5fb24 100644 --- a/backend/src/models/events.py +++ b/backend/src/models/events.py @@ -5,6 +5,16 @@ import uuid class EventType(str, Enum): + SESSION_CREATED = "session_created" + SESSION_CONFIGURED = "session_configured" + SESSION_READY = "session_ready" + SESSION_LIVE = "session_live" + SESSION_ACTIVE_VIVA = "session_active_viva" + SESSION_ARCHIVED = "session_archived" + STUDENT_ADMITTED = 
"student_admitted" + STUDENT_REJECTED = "student_rejected" + ORACLE_ANALYSIS_STARTED = "oracle_analysis_started" + ORACLE_ANALYSIS_COMPLETED = "oracle_analysis_completed" SESSION_STARTED = "session_started" SESSION_COMPLETED = "session_completed" IDENTITY_VERIFIED = "identity_verified" @@ -49,6 +59,46 @@ class EventType(str, Enum): FAILURE_PATH_DETECTED = "failure_path.detected" DEAD_PATH_DETECTED = "dead_path.detected" EXCEPTION_FLOW_ANALYZED = "exception_flow.analyzed" + + # Stage 4: ORACLE Intelligence Handoff + ORACLE_INTELLIGENCE_READY = "oracle_intelligence_ready" + + # Stage 5: MAIN Agent Viva Events + VIVA_SESSION_STARTED = "viva_session_started" + VIVA_QUESTION_ASKED = "viva_question_asked" + VIVA_RESPONSE_RECEIVED = "viva_response_received" + VIVA_EVALUATION_COMPLETE = "viva_evaluation_complete" + VIVA_FOLLOW_UP_GENERATED = "viva_follow_up_generated" + VIVA_CONTRADICTION_DETECTED = "viva_contradiction_detected" + VIVA_TOPIC_ESCALATED = "viva_topic_escalated" + VIVA_SESSION_COMPLETED = "viva_session_completed" + + # Stage 6: Voice Infrastructure Events + VOICE_SESSION_STARTED = "voice_session_started" + VOICE_QUESTION_PLAYED = "voice_question_played" + VOICE_LISTENING_STARTED = "voice_listening_started" + VOICE_LISTENING_STOPPED = "voice_listening_stopped" + VOICE_TRANSCRIPTION_RECEIVED = "voice_transcription_received" + VOICE_TRANSCRIPTION_NORMALIZED = "voice_transcription_normalized" + VOICE_SESSION_ENDED = "voice_session_ended" + + # Stage 7: SENTINEL Parallel Oversight Events + INTEGRITY_ALERT_GENERATED = "integrity_alert_generated" + PROLONGED_OFFSCREEN_FOCUS = "prolonged_offscreen_focus" + REPEATED_GAZE_SHIFT = "repeated_gaze_shift" + SESSION_INTERRUPTION = "session_interruption" + SUSPICIOUS_AUDIO_PATTERN = "suspicious_audio_pattern" + LOW_VISIBILITY_WARNING = "low_visibility_warning" + MANUAL_REVIEW_RECOMMENDED = "manual_review_recommended" + + # Stage 8: MAIN Agent Evaluation Loop Events + IMPLEMENTATION_FAMILIARITY_UPDATED = 
"implementation_familiarity_updated" + CONTRADICTION_CHAIN_UPDATED = "contradiction_chain_updated" + FOLLOW_UP_ESCALATION = "follow_up_escalation" + + # Stage 9: Curriculum Progression Events + CURRICULUM_TRANSITION_STARTED = "curriculum_transition_started" + CURRICULUM_TOPIC_COMPLETED = "curriculum_topic_completed" class PlatformEvent(BaseModel): event_id: str = Field(default_factory=lambda: str(uuid.uuid4())) diff --git a/backend/src/models/exam_session.py b/backend/src/models/exam_session.py new file mode 100644 index 0000000..7f6e3ff --- /dev/null +++ b/backend/src/models/exam_session.py @@ -0,0 +1,107 @@ +from __future__ import annotations + +from datetime import datetime +from enum import Enum +from typing import Any, Dict, List, Optional +import uuid + +from pydantic import BaseModel, Field + + +class ExamSessionState(str, Enum): + DRAFT = "DRAFT" + CONFIGURED = "CONFIGURED" + READY = "READY" + LIVE = "LIVE" + ACTIVE_VIVA = "ACTIVE_VIVA" + COMPLETED = "COMPLETED" + ARCHIVED = "ARCHIVED" + + +class SessionTimingWindow(BaseModel): + opens_at: Optional[datetime] = None + closes_at: Optional[datetime] = None + viva_duration_minutes: int = Field(default=15, ge=1, le=240) + check_in_grace_minutes: int = Field(default=5, ge=0, le=60) + + +class RubricCriterion(BaseModel): + name: str + description: Optional[str] = None + max_score: int = Field(default=10, ge=1, le=100) + + +class ExamRubric(BaseModel): + title: str = "Default Viva Rubric" + criteria: List[RubricCriterion] = Field(default_factory=list) + + +class StudentSubmission(BaseModel): + roll_number: str + repository_url: Optional[str] = None + document_paths: List[str] = Field(default_factory=list) + batch_label: Optional[str] = None + assignment_state: str = "assigned" + + +class ExamSessionConfig(BaseModel): + subject: str + course: str + semester: str + subject_code: Optional[str] = None + academic_year: Optional[str] = None + department: Optional[str] = None + instructor_name: Optional[str] = None + 
exam_coordinator: Optional[str] = None + timing_window: SessionTimingWindow = Field(default_factory=SessionTimingWindow) + rubric: ExamRubric = Field(default_factory=ExamRubric) + notes: Optional[str] = None + + +class SessionAuditEvent(BaseModel): + event_id: str = Field(default_factory=lambda: str(uuid.uuid4())) + timestamp: datetime = Field(default_factory=datetime.utcnow) + session_id: str + event_type: str + actor: str + payload: Dict[str, Any] = Field(default_factory=dict) + + +class GatekeeperAdmissionDecision(BaseModel): + decision_id: str = Field(default_factory=lambda: str(uuid.uuid4())) + timestamp: datetime = Field(default_factory=datetime.utcnow) + session_id: str + student_roll_number: str + admitted: bool + reason: Optional[str] = None + session_state: ExamSessionState + timing_valid: bool = False + submission_present: bool = False + duplicate_join: bool = False + suspicious: bool = False + metadata: Dict[str, Any] = Field(default_factory=dict) + + +class ExamSession(BaseModel): + session_id: str + admin_id: str + title: str = "Untitled Viva Session" + state: ExamSessionState = ExamSessionState.DRAFT + created_at: datetime = Field(default_factory=datetime.utcnow) + updated_at: datetime = Field(default_factory=datetime.utcnow) + activated_at: Optional[datetime] = None + completed_at: Optional[datetime] = None + archived_at: Optional[datetime] = None + config: Optional[ExamSessionConfig] = None + assigned_students: List[StudentSubmission] = Field(default_factory=list) + gatekeeper_decisions: List[GatekeeperAdmissionDecision] = Field(default_factory=list) + analysis_artifacts: List[Dict[str, Any]] = Field(default_factory=list) + audit_events: List[SessionAuditEvent] = Field(default_factory=list) + admitted_roll_numbers: List[str] = Field(default_factory=list) + active_student_roll_number: Optional[str] = None + oracle_started_at: Optional[datetime] = None + oracle_completed_at: Optional[datetime] = None + oracle_status: str = "not_started" + + def 
touch(self) -> None: + self.updated_at = datetime.utcnow() diff --git a/backend/src/models/intelligence_artifact.py b/backend/src/models/intelligence_artifact.py new file mode 100644 index 0000000..e3ddfea --- /dev/null +++ b/backend/src/models/intelligence_artifact.py @@ -0,0 +1,196 @@ +""" +ORACLE Intelligence Artifacts — Stage 4 Handoff Model + +Structured, deterministic intelligence handoff from ORACLE analysis to MAIN Agent. +All artifacts are evidence-grounded, explainable, and audit-safe. +""" + +from typing import List, Dict, Any, Optional +from pydantic import BaseModel, Field +from enum import Enum +from datetime import datetime + + +class IntelligenceCategory(str, Enum): + """Categorizes the type of intelligence artifact.""" + ARCHITECTURE = "ARCHITECTURE" + RUNTIME_FLOW = "RUNTIME_FLOW" + SECURITY = "SECURITY" + SCALABILITY = "SCALABILITY" + FAILURE_PATH = "FAILURE_PATH" + OBSERVABLE_SIGNAL = "OBSERVABLE_SIGNAL" + IMPLEMENTATION_RISK = "IMPLEMENTATION_RISK" + WEAK_POINT = "WEAK_POINT" + + +class RuntimeDependency(BaseModel): + """Tracks runtime dependencies critical to understanding implementation.""" + name: str + type: str # "LIBRARY", "SERVICE", "MIDDLEWARE", "DATABASE", "CACHE" + usage_pattern: str + criticality: str # "LOW", "MEDIUM", "HIGH", "CRITICAL" + evidence_file: Optional[str] = None + evidence_snippet: Optional[str] = None + + +class FailureScenario(BaseModel): + """Describes a specific failure scenario and its propagation.""" + scenario_name: str + trigger: str # What causes this failure + propagation_path: List[str] # How failure propagates through system + impact: str # What breaks as a result + severity: str # "LOW", "MEDIUM", "HIGH", "CRITICAL" + detectability: str # "EASY", "MODERATE", "HARD" + evidence_file: Optional[str] = None + related_nodes: List[str] = Field(default_factory=list) + + +class ExecutionNode(BaseModel): + """A single node in the execution graph.""" + node_id: str + label: str + node_type: str # "REQUEST_HANDLER", 
"MIDDLEWARE", "DB_QUERY", "SERVICE_CALL", "CACHE", "AUTH", "ERROR_HANDLER" + implementation_details: str # Brief description of what happens here + dependencies: List[str] = Field(default_factory=list) # Other nodes this depends on + failure_modes: List[str] = Field(default_factory=list) # How this can fail + + +class ExecutionPath(BaseModel): + """A traced execution path through the system.""" + path_id: str + description: str + nodes: List[str] # Order of ExecutionNode IDs + scenario: str # "HAPPY_PATH", "ERROR_PATH", "EDGE_CASE" + criticality: str # "LOW", "MEDIUM", "HIGH" + evidence_file: Optional[str] = None + + +class ImplementationSignal(BaseModel): + """Observable evidence of implementation decisions.""" + signal_type: str # "DESIGN_PATTERN", "ERROR_HANDLING", "CACHING_STRATEGY", "ASYNC_HANDLING", "STATE_MANAGEMENT" + description: str + evidence: str # Actual code or evidence + confidence: float # 0.0 - 1.0 + risk_level: str # "LOW", "MEDIUM", "HIGH" + related_risk: Optional[str] = None + + +class WeakPoint(BaseModel): + """Areas where implementation is fragile or shows poor understanding.""" + area: str # What part of the system + weakness: str # Specific weakness + why_problematic: str # Why this is concerning for a viva + testing_approach: str # How to probe this in viva + evidence_file: Optional[str] = None + + +class VivaTarget(BaseModel): + """Focused viva target with grounding and evidence.""" + target_id: str + question: str + category: IntelligenceCategory + difficulty: str # "FOUNDATIONAL", "MEDIUM", "HARD" + depth_score: float # 0-10, how deep understanding is required + why_important: str # Why ask this question + evidence_references: List[str] = Field(default_factory=list) # Files/lines this relates to + follow_up_paths: List[str] = Field(default_factory=list) # Possible follow-ups if answer is shallow + expected_coverage: List[str] = Field(default_factory=list) # What student should cover + red_flags: List[str] = Field(default_factory=list) # 
Concerning responses to watch for + + +class AdaptiveThreshold(BaseModel): + """Thresholds for adapting viva difficulty.""" + topic: str + weak_point_triggers: List[str] = Field(default_factory=list) # Triggers for escalation + strong_point_indicators: List[str] = Field(default_factory=list) # Triggers for advancement + contradiction_escalation: bool = True # Escalate on contradictions + + +class IntelligenceArtifact(BaseModel): + """ + Complete intelligence handoff from ORACLE to MAIN Agent. + Stage 4 output: structured, deterministic, evidence-grounded. + """ + # Metadata + artifact_id: str = Field(default_factory=lambda: f"artifact_{datetime.utcnow().isoformat()}") + session_id: str + oracle_version: str = "v1" + generated_at: datetime = Field(default_factory=datetime.utcnow) + analysis_duration_seconds: float + + # Project context + project_name: str + project_type: str + backend_stack: Dict[str, str] # framework, db, cache, etc. + frontend_stack: Optional[Dict[str, str]] = None + architecture_pattern: str # e.g., "MVC", "Microservices", "Monolith" + + # Core execution intelligence + execution_graph_nodes: List[ExecutionNode] = Field(default_factory=list) + execution_paths: List[ExecutionPath] = Field(default_factory=list) + runtime_dependencies: List[RuntimeDependency] = Field(default_factory=list) + + # Failure and risk intelligence + failure_scenarios: List[FailureScenario] = Field(default_factory=list) + implementation_risks: List[Dict[str, Any]] = Field(default_factory=list) + weak_points: List[WeakPoint] = Field(default_factory=list) + + # Viva intelligence + viva_targets: List[VivaTarget] = Field(default_factory=list) + adaptive_thresholds: List[AdaptiveThreshold] = Field(default_factory=list) + + # Implementation signals (observable evidence) + implementation_signals: List[ImplementationSignal] = Field(default_factory=list) + + # Explainability + summary: str # Human-readable summary of analysis + key_findings: List[str] = Field(default_factory=list) + 
analysis_confidence: float # Overall confidence 0.0-1.0 + limitations: List[str] = Field(default_factory=list) # What wasn't analyzed + + # Session binding + serialization_version: str = "1.0" + deterministic_hash: Optional[str] = None # For replay verification + + +class IntelligenceHandoffEvent(BaseModel): + """Event emitted when ORACLE completes and hands off to MAIN.""" + event_type: str = "ORACLE_INTELLIGENCE_READY" + session_id: str + timestamp: datetime = Field(default_factory=datetime.utcnow) + artifact_id: str + artifact_summary: Dict[str, Any] # Quick stats: num_targets, num_risks, etc. + next_action: str = "MAIN_AGENT_START_VIVA" # Always this at end of Stage 4 + + +class VivaSessionState(BaseModel): + """Tracks state of a viva session in Stage 5.""" + session_id: str + viva_phase: str # "STARTED", "INTRODUCTORY", "CORE", "DEEP_DIVE", "CONTRADICTION_PROBE", "CLOSING" + current_topic: Optional[str] = None + current_target_id: Optional[str] = None + questions_asked: int = 0 + contradictions_found: int = 0 + weak_areas_detected: List[str] = Field(default_factory=list) + strong_areas_detected: List[str] = Field(default_factory=list) + adaptive_difficulty: float = 5.0 # 0-10, increases/decreases based on performance + + # Transcript references + transcript_segment_ids: List[str] = Field(default_factory=list) + last_question_id: Optional[str] = None + last_response_text: Optional[str] = None + evaluation_score: Optional[float] = None + + +class VoiceSessionConfig(BaseModel): + """Configuration for voice viva in Stage 6.""" + enabled: bool = True + tts_provider: str = "system" # "system", "deepgram", "google" + stt_provider: str = "deepgram" # "deepgram", "google", "azure" + voice_language: str = "en-US" + speech_rate: float = 1.0 + silence_timeout_ms: int = 3000 + max_response_duration_seconds: int = 120 + enable_transcript_normalization: bool = True + save_audio_recordings: bool = True + audio_storage_path: Optional[str] = None diff --git 
a/backend/src/models/stage_7_8_9.py b/backend/src/models/stage_7_8_9.py new file mode 100644 index 0000000..b10f76b --- /dev/null +++ b/backend/src/models/stage_7_8_9.py @@ -0,0 +1,109 @@ +""" +Stage 7-9 Runtime Models + +Deterministic, evidence-grounded models for: +- Stage 7 SENTINEL integrity oversight +- Stage 8 MAIN evaluation loop artifacts +- Stage 9 curriculum-linked questioning +""" + +from datetime import datetime +from enum import Enum +from typing import Any, Dict, List, Optional + +from pydantic import BaseModel, Field + + +class IntegritySignalType(str, Enum): + PROLONGED_OFFSCREEN_FOCUS = "prolonged_offscreen_focus" + REPEATED_GAZE_SHIFT = "repeated_gaze_shift" + SESSION_INTERRUPTION = "session_interruption" + SUSPICIOUS_AUDIO_PATTERN = "suspicious_audio_pattern" + LOW_VISIBILITY_WARNING = "low_visibility_warning" + CONTRADICTION_ESCALATION = "contradiction_escalation" + CONFIDENCE_INSTABILITY = "confidence_instability" + EXCESSIVE_SILENCE_PATTERN = "excessive_silence_pattern" + ENVIRONMENT_CHANGE = "environment_change" + + +class IntegritySeverity(str, Enum): + LOW = "LOW" + MEDIUM = "MEDIUM" + HIGH = "HIGH" + + +class SentinelIntegrityEvent(BaseModel): + event_id: str + session_id: str + signal_type: IntegritySignalType + severity: IntegritySeverity + observed_at: datetime = Field(default_factory=datetime.utcnow) + evidence: Dict[str, Any] = Field(default_factory=dict) + explanation: str + replay_metadata: Dict[str, Any] = Field(default_factory=dict) + + +class SentinelAlert(BaseModel): + alert_id: str + session_id: str + created_at: datetime = Field(default_factory=datetime.utcnow) + event_ids: List[str] = Field(default_factory=list) + manual_review_recommended: bool = False + reason: str + + +class ContradictionChainEntry(BaseModel): + chain_id: str + target_id: str + previous_claim: str + current_claim: str + severity: str + turn_index: int + detected_at: datetime = Field(default_factory=datetime.utcnow) + + +class EvaluationArtifact(BaseModel): 
+ session_id: str + turn_index: int + target_id: str + implementation_specificity: float + runtime_understanding: float + operational_reasoning: float + architectural_understanding: float + failure_path_awareness: float + tradeoff_understanding: float + consistency_score: float + implementation_familiarity: float + topic_coverage: Dict[str, float] = Field(default_factory=dict) + weak_areas: List[str] = Field(default_factory=list) + follow_up_chain: List[str] = Field(default_factory=list) + contradiction_chain: List[ContradictionChainEntry] = Field(default_factory=list) + created_at: datetime = Field(default_factory=datetime.utcnow) + + +class CoreSubject(str, Enum): + DSA = "DSA" + DBMS = "DBMS" + OPERATING_SYSTEMS = "OPERATING_SYSTEMS" + COMPUTER_NETWORKS = "COMPUTER_NETWORKS" + OOP = "OOP" + SOFTWARE_ENGINEERING = "SOFTWARE_ENGINEERING" + SYSTEM_DESIGN = "SYSTEM_DESIGN" + CLOUD_DEVOPS = "CLOUD_DEVOPS" + + +class CurriculumQuestion(BaseModel): + question_id: str + subject: CoreSubject + prompt: str + linked_implementation_signal: str + difficulty: str + expected_coverage: List[str] = Field(default_factory=list) + + +class CurriculumTransitionState(BaseModel): + session_id: str + transition_started_at: datetime = Field(default_factory=datetime.utcnow) + started: bool = False + completed_subjects: List[CoreSubject] = Field(default_factory=list) + asked_questions: List[str] = Field(default_factory=list) diff --git a/backend/src/services/curriculum_question_engine.py b/backend/src/services/curriculum_question_engine.py new file mode 100644 index 0000000..ca8f711 --- /dev/null +++ b/backend/src/services/curriculum_question_engine.py @@ -0,0 +1,144 @@ +""" +Stage 9 - Curriculum + Core Subject Questioning + +Deterministically links implementation evidence to foundational subjects. 
+""" + +from typing import Any, Dict, List, Optional, Tuple + +from src.models.intelligence_artifact import IntelligenceArtifact +from src.models.stage_7_8_9 import ( + CoreSubject, + CurriculumQuestion, + CurriculumTransitionState, +) + + +class CurriculumQuestionEngine: + """Implementation-linked curriculum questioning engine.""" + + def __init__(self, artifact: IntelligenceArtifact, session_id: str): + self.artifact = artifact + self.state = CurriculumTransitionState(session_id=session_id) + self.question_bank = self._build_question_bank() + + def should_start_transition(self, implementation_turns_completed: int) -> bool: + """Transition after implementation-aware rounds are complete.""" + + return implementation_turns_completed >= 2 + + def start_transition(self) -> CurriculumTransitionState: + self.state.started = True + return self.state + + def get_next_question(self) -> Optional[CurriculumQuestion]: + if not self.state.started: + return None + + for question in self.question_bank: + if question.question_id not in self.state.asked_questions: + self.state.asked_questions.append(question.question_id) + return question + + return None + + def evaluate_answer(self, question: CurriculumQuestion, answer: str) -> Tuple[float, bool]: + """Return (score, subject_completed).""" + + normalized = answer.lower() + hits = sum(1 for term in question.expected_coverage if term.lower() in normalized) + score = round(min(1.0, hits / max(1, len(question.expected_coverage))), 3) + + subject_completed = score >= 0.6 + if subject_completed and question.subject not in self.state.completed_subjects: + self.state.completed_subjects.append(question.subject) + + return score, subject_completed + + def _build_question_bank(self) -> List[CurriculumQuestion]: + """Build deterministic curriculum set from implementation signals.""" + + questions: List[CurriculumQuestion] = [] + q_index = 0 + + backend = {k.lower(): str(v).lower() for k, v in self.artifact.backend_stack.items()} + 
execution_text = " ".join(node.implementation_details.lower() for node in self.artifact.execution_graph_nodes) + + if "redis" in " ".join(backend.values()) or "cache" in execution_text: + questions.append( + CurriculumQuestion( + question_id=f"curr_{q_index}", + subject=CoreSubject.OPERATING_SYSTEMS, + prompt="Your project uses caching. Explain memory locality and eviction trade-offs relevant to this cache design.", + linked_implementation_signal="cache", + difficulty="MEDIUM", + expected_coverage=["memory", "eviction", "latency", "trade-off"], + ) + ) + q_index += 1 + + if "postgres" in " ".join(backend.values()) or "database" in " ".join(backend.keys()): + questions.append( + CurriculumQuestion( + question_id=f"curr_{q_index}", + subject=CoreSubject.DBMS, + prompt="In your database-backed flow, how do indexing and transactions affect consistency and query performance?", + linked_implementation_signal="database", + difficulty="MEDIUM", + expected_coverage=["index", "transaction", "consistency", "query"], + ) + ) + q_index += 1 + + if "async" in execution_text or "queue" in execution_text: + questions.append( + CurriculumQuestion( + question_id=f"curr_{q_index}", + subject=CoreSubject.DSA, + prompt="Your runtime path uses asynchronous behavior. 
Which data structures and scheduling considerations influence throughput under concurrency?", + linked_implementation_signal="async", + difficulty="HARD", + expected_coverage=["queue", "complexity", "concurrency", "throughput"], + ) + ) + q_index += 1 + + if "http" in execution_text or "request" in execution_text: + questions.append( + CurriculumQuestion( + question_id=f"curr_{q_index}", + subject=CoreSubject.COMPUTER_NETWORKS, + prompt="Map one request path in your project to HTTP/TCP behavior, including timeout and retry implications.", + linked_implementation_signal="networking", + difficulty="MEDIUM", + expected_coverage=["http", "tcp", "timeout", "retry"], + ) + ) + q_index += 1 + + if "auth" in execution_text or "jwt" in execution_text: + questions.append( + CurriculumQuestion( + question_id=f"curr_{q_index}", + subject=CoreSubject.SOFTWARE_ENGINEERING, + prompt="Relate your authentication implementation to secure design principles and boundary placement.", + linked_implementation_signal="authentication", + difficulty="HARD", + expected_coverage=["boundary", "validation", "security", "principle"], + ) + ) + q_index += 1 + + if not questions: + questions.append( + CurriculumQuestion( + question_id="curr_fallback_0", + subject=CoreSubject.SYSTEM_DESIGN, + prompt="Explain one architectural trade-off in your implementation and how it impacts runtime behavior.", + linked_implementation_signal="architecture", + difficulty="MEDIUM", + expected_coverage=["trade-off", "architecture", "runtime", "impact"], + ) + ) + + return questions diff --git a/backend/src/services/exam_session_service.py b/backend/src/services/exam_session_service.py new file mode 100644 index 0000000..bc614f1 --- /dev/null +++ b/backend/src/services/exam_session_service.py @@ -0,0 +1,289 @@ +from __future__ import annotations + +import json +from datetime import datetime +from pathlib import Path +from typing import Any, Dict, Iterable, List, Optional +from uuid import uuid4 + +from 
src.agents.gatekeeper.registry.lookup import RegistryLookup +from src.agents.gatekeeper.registry.registry_store import StudentRegistry +from src.agents.oracle.agent import OracleAgent +from src.core.events import EventEmitter +from src.models.events import EventType +from src.models.exam_session import ( + ExamRubric, + ExamSession, + ExamSessionConfig, + ExamSessionState, + GatekeeperAdmissionDecision, + SessionAuditEvent, + SessionTimingWindow, + StudentSubmission, +) + + +ALLOWED_TRANSITIONS = { + ExamSessionState.DRAFT: {ExamSessionState.CONFIGURED}, + ExamSessionState.CONFIGURED: {ExamSessionState.READY}, + ExamSessionState.READY: {ExamSessionState.LIVE}, + ExamSessionState.LIVE: {ExamSessionState.ACTIVE_VIVA, ExamSessionState.COMPLETED}, + ExamSessionState.ACTIVE_VIVA: {ExamSessionState.COMPLETED}, + ExamSessionState.COMPLETED: {ExamSessionState.ARCHIVED}, + ExamSessionState.ARCHIVED: set(), +} + + +class SessionTransitionError(ValueError): + pass + + +class ExamSessionService: + def __init__(self, storage_dir: Optional[str] = None, registry: Optional[StudentRegistry] = None) -> None: + repo_root = Path(__file__).resolve().parents[2] + default_dir = repo_root / "data" / "exam_sessions" + self.storage_dir = Path(storage_dir) if storage_dir else default_dir + self.storage_dir.mkdir(parents=True, exist_ok=True) + self.registry = registry or StudentRegistry() + self.lookup = RegistryLookup(self.registry) + self.oracle = OracleAgent() + + def create_session( + self, + admin_id: str, + title: str, + config: Optional[ExamSessionConfig] = None, + ) -> ExamSession: + session = ExamSession( + session_id=f"ES-{uuid4().hex[:10].upper()}", + admin_id=admin_id, + title=title, + config=config, + state=ExamSessionState.CONFIGURED if config else ExamSessionState.DRAFT, + ) + self._append_audit(session, "session_created", "admin", {"admin_id": admin_id, "title": title}) + self._emit(session.session_id, EventType.SESSION_CREATED, {"state": session.state.value, "title": title}) 
+ if config: + self._emit(session.session_id, EventType.SESSION_CONFIGURED, {"state": session.state.value}) + self._save(session) + return session + + def list_sessions(self) -> List[ExamSession]: + sessions: List[ExamSession] = [] + for path in sorted(self.storage_dir.glob("*.json")): + loaded = self._load(path.stem) + if loaded is not None: + sessions.append(loaded) + return sessions + + def get_session(self, session_id: str) -> Optional[ExamSession]: + return self._load(session_id) + + def configure_session(self, session_id: str, config: ExamSessionConfig) -> ExamSession: + session = self._require(session_id) + if session.state not in {ExamSessionState.DRAFT, ExamSessionState.CONFIGURED}: + raise SessionTransitionError(f"Cannot configure session in state {session.state.value}") + session.config = config + session.state = ExamSessionState.CONFIGURED + session.touch() + self._append_audit(session, "session_configured", "admin", config.model_dump(mode="json")) + self._emit(session.session_id, EventType.SESSION_CONFIGURED, {"state": session.state.value, "title": session.title}) + self._save(session) + return session + + def assign_students(self, session_id: str, submissions: Iterable[StudentSubmission]) -> ExamSession: + session = self._require(session_id) + session.assigned_students = list(submissions) + session.touch() + self._append_audit(session, "students_assigned", "admin", {"count": len(session.assigned_students)}) + self._save(session) + return session + + def set_ready(self, session_id: str) -> ExamSession: + session = self._require(session_id) + if not session.config: + raise SessionTransitionError("Session must be configured before it can be marked ready.") + if not session.assigned_students: + raise SessionTransitionError("Session must have assigned students before it can be marked ready.") + self._transition(session, ExamSessionState.READY, actor="admin", event_type=EventType.SESSION_READY) + self._save(session) + return session + + def 
activate_session(self, session_id: str, actor: str = "admin") -> ExamSession: + session = self._require(session_id) + if session.state != ExamSessionState.READY: + raise SessionTransitionError("Only READY sessions can be activated.") + session.activated_at = session.activated_at or datetime.utcnow() + self._transition(session, ExamSessionState.LIVE, actor=actor, event_type=EventType.SESSION_LIVE) + self._save(session) + return session + + def gatekeeper_precheck(self, session_id: str, roll_number: str, actor: str = "gatekeeper") -> GatekeeperAdmissionDecision: + session = self._require(session_id) + normalized_roll = roll_number.strip().upper() + lookup_result = self.lookup.by_roll_number(normalized_roll) + submission = next((item for item in session.assigned_students if item.roll_number == lookup_result.roll_number), None) + duplicate_join = lookup_result.roll_number in session.admitted_roll_numbers + timing_valid = self._is_within_timing_window(session) + session_live = session.state in {ExamSessionState.LIVE, ExamSessionState.ACTIVE_VIVA} + submission_present = submission is not None + + admitted = all([session_live, lookup_result.success, timing_valid, submission_present, not duplicate_join]) + reason = None + if not session_live: + reason = f"session_not_live:{session.state.value}" + elif not lookup_result.success: + reason = f"student_lookup_failed:{lookup_result.failure_reason.value if lookup_result.failure_reason else 'unknown'}" + elif not timing_valid: + reason = "outside_allowed_timing_window" + elif not submission_present: + reason = "required_submission_missing" + elif duplicate_join: + reason = "duplicate_session_join" + + decision = GatekeeperAdmissionDecision( + session_id=session_id, + student_roll_number=lookup_result.roll_number or normalized_roll, + admitted=admitted, + reason=reason, + session_state=session.state, + timing_valid=timing_valid, + submission_present=submission_present, + duplicate_join=duplicate_join, + suspicious=not admitted, + 
metadata={ + "student_found": lookup_result.success, + "student_name": lookup_result.profile.full_name if lookup_result.profile else None, + "submission_repository_url": submission.repository_url if submission else None, + }, + ) + session.gatekeeper_decisions.append(decision) + self._append_audit( + session, + "student_admitted" if admitted else "student_rejected", + actor, + decision.model_dump(mode="json"), + ) + self._emit( + session_id, + EventType.STUDENT_ADMITTED if admitted else EventType.STUDENT_REJECTED, + decision.model_dump(mode="json"), + ) + if admitted: + session.admitted_roll_numbers.append(decision.student_roll_number) + session.active_student_roll_number = decision.student_roll_number + if session.state == ExamSessionState.LIVE: + self._transition(session, ExamSessionState.ACTIVE_VIVA, actor=actor, event_type=EventType.SESSION_ACTIVE_VIVA) + self._save(session) + return decision + + async def start_oracle_analysis(self, session_id: str, roll_number: str, actor: str = "oracle") -> ExamSession: + session = self._require(session_id) + if session.state not in {ExamSessionState.LIVE, ExamSessionState.ACTIVE_VIVA}: + raise SessionTransitionError("ORACLE analysis can only start after the session is live and the student is admitted.") + + normalized_roll = roll_number.strip().upper() + decision = next((item for item in reversed(session.gatekeeper_decisions) if item.student_roll_number == normalized_roll), None) + if decision is None or not decision.admitted: + raise SessionTransitionError("Gatekeeper admission is required before ORACLE analysis can begin.") + + session.oracle_status = "running" + session.oracle_started_at = session.oracle_started_at or datetime.utcnow() + self._append_audit(session, "oracle_analysis_started", actor, {"roll_number": normalized_roll}) + self._emit(session_id, EventType.ORACLE_ANALYSIS_STARTED, {"roll_number": normalized_roll}) + + submission = next((item for item in session.assigned_students if item.roll_number == 
normalized_roll), None) + payload = { + "repo_url": submission.repository_url if submission else None, + "report_path": submission.document_paths[0] if submission and submission.document_paths else None, + "roll_number": normalized_roll, + } + context = await self.oracle.process(session_id, payload) + artifacts = context.model_dump() if hasattr(context, "model_dump") else context.dict() + session.analysis_artifacts.append({ + "artifact_type": "oracle_context", + "payload": artifacts, + }) + session.oracle_completed_at = datetime.utcnow() + session.oracle_status = "completed" + self._append_audit(session, "oracle_analysis_completed", actor, {"roll_number": normalized_roll}) + self._emit(session_id, EventType.ORACLE_ANALYSIS_COMPLETED, {"roll_number": normalized_roll}) + self._save(session) + return session + + def complete_session(self, session_id: str, actor: str = "admin") -> ExamSession: + session = self._require(session_id) + if session.state not in {ExamSessionState.ACTIVE_VIVA, ExamSessionState.LIVE, ExamSessionState.READY}: + raise SessionTransitionError(f"Cannot complete session in state {session.state.value}") + session.completed_at = session.completed_at or datetime.utcnow() + self._transition(session, ExamSessionState.COMPLETED, actor=actor, event_type=EventType.SESSION_COMPLETED) + self._save(session) + return session + + def archive_session(self, session_id: str, actor: str = "admin") -> ExamSession: + session = self._require(session_id) + if session.state != ExamSessionState.COMPLETED: + raise SessionTransitionError("Only completed sessions can be archived.") + session.archived_at = session.archived_at or datetime.utcnow() + self._transition(session, ExamSessionState.ARCHIVED, actor=actor, event_type=EventType.SESSION_ARCHIVED) + self._save(session) + return session + + def _is_within_timing_window(self, session: ExamSession) -> bool: + config = session.config + if not config or not config.timing_window.opens_at or not config.timing_window.closes_at: + 
return True + now = datetime.utcnow() + return config.timing_window.opens_at <= now <= config.timing_window.closes_at + + def _transition(self, session: ExamSession, new_state: ExamSessionState, actor: str, event_type: EventType) -> None: + allowed = ALLOWED_TRANSITIONS.get(session.state, set()) + if new_state not in allowed: + raise SessionTransitionError(f"Invalid transition from {session.state.value} to {new_state.value}") + previous_state = session.state + session.state = new_state + session.touch() + self._append_audit(session, f"transition:{new_state.value.lower()}", actor, {"from": previous_state.value, "to": new_state.value}) + self._emit(session.session_id, event_type, {"state": new_state.value, "previous_state": previous_state.value, "actor": actor}) + + def _append_audit(self, session: ExamSession, event_type: str, actor: str, payload: Dict[str, Any]) -> None: + session.audit_events.append( + SessionAuditEvent( + session_id=session.session_id, + event_type=event_type, + actor=actor, + payload=payload, + ) + ) + session.touch() + + def _emit(self, session_id: str, event_type: EventType, payload: Dict[str, Any]) -> None: + try: + EventEmitter.emit( + session_id=session_id, + agent_name="ExamSession", + event_type=event_type, + payload=payload, + ) + except Exception: + pass + + def _session_path(self, session_id: str) -> Path: + return self.storage_dir / f"{session_id}.json" + + def _save(self, session: ExamSession) -> None: + with self._session_path(session.session_id).open("w", encoding="utf-8") as handle: + json.dump(session.model_dump(mode="json"), handle, indent=2, sort_keys=True, default=str) + + def _load(self, session_id: str) -> Optional[ExamSession]: + path = self._session_path(session_id) + if not path.exists(): + return None + with path.open("r", encoding="utf-8") as handle: + return ExamSession.model_validate(json.load(handle)) + + def _require(self, session_id: str) -> ExamSession: + session = self._load(session_id) + if session is None: + 
raise SessionTransitionError(f"Unknown exam session: {session_id}") + return session diff --git a/backend/src/services/intelligence/flows/api_flow_analyzer.py b/backend/src/services/intelligence/flows/api_flow_analyzer.py index e9dd08c..aa54e01 100644 --- a/backend/src/services/intelligence/flows/api_flow_analyzer.py +++ b/backend/src/services/intelligence/flows/api_flow_analyzer.py @@ -1,3 +1,4 @@ +import os from typing import List, Dict, Any from ..intermediate_representation.execution_graph_builder import ExecutionGraphBuilder from ....models.context import ImplementationFlow, FlowNodeType diff --git a/backend/src/services/intelligence/flows/auth_flow_analyzer.py b/backend/src/services/intelligence/flows/auth_flow_analyzer.py index acba62a..21e50df 100644 --- a/backend/src/services/intelligence/flows/auth_flow_analyzer.py +++ b/backend/src/services/intelligence/flows/auth_flow_analyzer.py @@ -1,3 +1,4 @@ +import os from typing import List, Dict, Any from ..intermediate_representation.execution_graph_builder import ExecutionGraphBuilder from ....models.context import ImplementationFlow, FlowNodeType diff --git a/backend/src/services/intelligence/flows/db_flow_analyzer.py b/backend/src/services/intelligence/flows/db_flow_analyzer.py index 881b8f7..ae3120c 100644 --- a/backend/src/services/intelligence/flows/db_flow_analyzer.py +++ b/backend/src/services/intelligence/flows/db_flow_analyzer.py @@ -1,3 +1,4 @@ +import os from typing import List, Dict, Any from ..intermediate_representation.execution_graph_builder import ExecutionGraphBuilder from ....models.context import ImplementationFlow, FlowNodeType @@ -32,13 +33,13 @@ def analyze(repo_path: str, structure: Dict[str, Any], builder: ExecutionGraphBu if not any(marker in content for marker in db_markers): continue - builder.add_node( - node_id=f"db_{file_path}", - label=f"Data Access Layer ({file_path})", - node_type=FlowNodeType.DB_QUERY - ) - steps.append(f"DB interaction point: {file_path}") - 
evidence.append(f"Persistence related keywords found in {file_path}") + builder.add_node( + node_id=f"db_{file_path}", + label=f"Data Access Layer ({file_path})", + node_type=FlowNodeType.DB_QUERY + ) + steps.append(f"DB interaction point: {file_path}") + evidence.append(f"Persistence related keywords found in {file_path}") return ImplementationFlow( steps=steps, diff --git a/backend/src/services/intelligence_artifact_builder.py b/backend/src/services/intelligence_artifact_builder.py new file mode 100644 index 0000000..46b5621 --- /dev/null +++ b/backend/src/services/intelligence_artifact_builder.py @@ -0,0 +1,536 @@ +""" +Intelligence Artifact Builder — Converts ORACLE StructuredContext to IntelligenceArtifact + +Deterministic transformation that packages ORACLE's analysis into a structured, +explainable handoff format for MAIN Agent. +""" + +import hashlib +import json +from typing import List, Dict, Any, Optional +from datetime import datetime +from src.models.intelligence_artifact import ( + IntelligenceArtifact, + VivaTarget, + ExecutionNode, + ExecutionPath, + RuntimeDependency, + FailureScenario, + ImplementationSignal, + WeakPoint, + IntelligenceCategory, + AdaptiveThreshold, + IntelligenceHandoffEvent, +) +from src.models.context import StructuredContext, RuntimeRisk, VivaTarget as OracleVivaTarget + + +class IntelligenceArtifactBuilder: + """ + Deterministic builder that transforms ORACLE StructuredContext into + IntelligenceArtifact for MAIN Agent consumption. + """ + + @staticmethod + def build( + session_id: str, + structured_context: StructuredContext, + analysis_duration_seconds: float = 0.0, + repo_path: Optional[str] = None, + ) -> IntelligenceArtifact: + """ + Build IntelligenceArtifact from StructuredContext. 
+ + Args: + session_id: The exam session ID + structured_context: ORACLE's StructuredContext output + analysis_duration_seconds: Time taken for analysis + repo_path: Optional path to cloned repo for evidence collection + + Returns: + IntelligenceArtifact ready for MAIN Agent + """ + + # Extract backend stack + backend_stack = { + "framework": structured_context.backend_framework.value or "Unknown", + "database": structured_context.database_used.value or "Unknown", + "authentication": structured_context.authentication_system.value or "Unknown", + } + + # Extract frontend stack if present + frontend_stack = None + if structured_context.frontend_framework: + frontend_stack = {"framework": structured_context.frontend_framework.value or "Unknown"} + + # Build execution graph nodes + execution_nodes = IntelligenceArtifactBuilder._build_execution_nodes(structured_context) + + # Build execution paths + execution_paths = IntelligenceArtifactBuilder._build_execution_paths( + structured_context, execution_nodes + ) + + # Extract runtime dependencies + runtime_dependencies = IntelligenceArtifactBuilder._extract_runtime_dependencies( + structured_context, execution_nodes + ) + + # Extract failure scenarios + failure_scenarios = IntelligenceArtifactBuilder._extract_failure_scenarios( + structured_context, execution_paths + ) + + # Extract implementation risks + implementation_risks = IntelligenceArtifactBuilder._extract_implementation_risks( + structured_context + ) + + # Extract weak points + weak_points = IntelligenceArtifactBuilder._extract_weak_points( + structured_context, implementation_risks + ) + + # Build viva targets + viva_targets = IntelligenceArtifactBuilder._build_viva_targets( + structured_context, weak_points, execution_nodes + ) + + # Build adaptive thresholds + adaptive_thresholds = IntelligenceArtifactBuilder._build_adaptive_thresholds(viva_targets) + + # Extract implementation signals + implementation_signals = 
IntelligenceArtifactBuilder._extract_implementation_signals( + structured_context + ) + + # Build summary and key findings + summary, key_findings = IntelligenceArtifactBuilder._build_summary( + structured_context, failure_scenarios, weak_points, viva_targets + ) + + # Create artifact + artifact = IntelligenceArtifact( + session_id=session_id, + oracle_version="v1", + analysis_duration_seconds=analysis_duration_seconds, + project_name=structured_context.project_name.value, + project_type=structured_context.project_type.value, + backend_stack=backend_stack, + frontend_stack=frontend_stack, + architecture_pattern=structured_context.architecture_pattern.value, + execution_graph_nodes=execution_nodes, + execution_paths=execution_paths, + runtime_dependencies=runtime_dependencies, + failure_scenarios=failure_scenarios, + implementation_risks=implementation_risks, + weak_points=weak_points, + viva_targets=viva_targets, + adaptive_thresholds=adaptive_thresholds, + implementation_signals=implementation_signals, + summary=summary, + key_findings=key_findings, + analysis_confidence=structured_context.complexity_mismatch.confidence + if structured_context.complexity_mismatch + else 0.8, + ) + + # Compute deterministic hash for replay verification + artifact.deterministic_hash = IntelligenceArtifactBuilder._compute_hash(artifact) + + return artifact + + @staticmethod + def _build_execution_nodes(context: StructuredContext) -> List[ExecutionNode]: + """Extract execution nodes from context.""" + nodes = [] + + # Build from execution graph + if context.execution_graph: + for i, flow_node in enumerate(context.execution_graph.nodes): + node = ExecutionNode( + node_id=flow_node.id, + label=flow_node.label, + node_type=flow_node.type.value if hasattr(flow_node.type, "value") else str(flow_node.type), + implementation_details=f"Node: {flow_node.label}, Metadata: {flow_node.metadata}", + dependencies=[], + failure_modes=[], + ) + nodes.append(node) + + # Build from middleware chain + 
for middleware in context.middleware_chain: + node = ExecutionNode( + node_id=f"middleware_{len(nodes)}", + label=middleware.value, + node_type="MIDDLEWARE", + implementation_details=f"Middleware: {middleware.value}, Confidence: {middleware.confidence}", + dependencies=[], + failure_modes=["Middleware exception", "Request rejection"], + ) + nodes.append(node) + + return nodes + + @staticmethod + def _build_execution_paths(context: StructuredContext, nodes: List[ExecutionNode]) -> List[ExecutionPath]: + """Build execution paths from flows.""" + paths = [] + + # Happy path (guarded: execution_graph may be absent, as in _build_execution_nodes) + if context.execution_graph and context.execution_graph.nodes: + happy_path_nodes = [n.id for n in context.execution_graph.nodes[:5]] # First 5 nodes + paths.append( + ExecutionPath( + path_id="happy_path", + description="Normal request lifecycle", + nodes=happy_path_nodes, + scenario="HAPPY_PATH", + criticality="HIGH", + ) + ) + + # Error path (same guard against a missing execution_graph) + if context.execution_graph and context.execution_graph.failure_paths: + paths.append( + ExecutionPath( + path_id="error_path", + description="Error handling and recovery", + nodes=context.execution_graph.failure_paths[:3], + scenario="ERROR_PATH", + criticality="HIGH", + ) + ) + + return paths + + @staticmethod + def _extract_runtime_dependencies(context: StructuredContext, nodes: List[ExecutionNode]) -> List[RuntimeDependency]: + """Extract runtime dependencies from context.""" + deps = [] + + # Database dependency + if context.database_used and context.database_used.value != "Unknown": + deps.append( + RuntimeDependency( + name=context.database_used.value, + type="DATABASE", + usage_pattern="Query/Mutation in request lifecycle", + criticality="CRITICAL", + evidence_snippet=f"Technology detected: {context.database_used.value}", + ) + ) + + # Authentication dependency + if context.authentication_system and context.authentication_system.value != "Unknown": + deps.append( + RuntimeDependency( + name=context.authentication_system.value, + type="MIDDLEWARE", + usage_pattern="Request validation and session 
management", + criticality="CRITICAL", + ) + ) + + # Framework dependency + if context.backend_framework and context.backend_framework.value != "Unknown": + deps.append( + RuntimeDependency( + name=context.backend_framework.value, + type="LIBRARY", + usage_pattern="Request routing and response handling", + criticality="CRITICAL", + ) + ) + + return deps + + @staticmethod + def _extract_failure_scenarios(context: StructuredContext, paths: List[ExecutionPath]) -> List[FailureScenario]: + """Extract failure scenarios from context.""" + scenarios = [] + + # Build from runtime risks + for i, risk in enumerate(context.runtime_risks): + scenario = FailureScenario( + scenario_name=f"Risk: {risk.value}", + trigger="Runtime condition: " + risk.value, + propagation_path=["Request Handler", "Service Layer", "Error Handler"], + impact=risk.value, + severity=risk.severity, + detectability="MODERATE", + evidence_snippet=f"Risk severity: {risk.severity}, Evidence: {risk.evidence}", + ) + scenarios.append(scenario) + + # Add common failure scenarios + scenarios.extend( + [ + FailureScenario( + scenario_name="Database Connection Failure", + trigger="DB unreachable or timeout", + propagation_path=["DB Query", "Service Layer", "Error Handler"], + impact="Request failure, user sees error", + severity="HIGH", + detectability="EASY", + ), + FailureScenario( + scenario_name="Authentication Bypass", + trigger="Invalid token or missing auth header", + propagation_path=["Auth Middleware", "Error Handler"], + impact="Unauthorized access or request rejection", + severity="CRITICAL", + detectability="HARD", + ), + FailureScenario( + scenario_name="Race Condition on Concurrent Writes", + trigger="Multiple requests modifying same resource", + propagation_path=["DB Query", "Update Logic", "Inconsistency"], + impact="Data corruption or lost updates", + severity="CRITICAL", + detectability="HARD", + ), + ] + ) + + return scenarios + + @staticmethod + def _extract_implementation_risks(context: 
StructuredContext) -> List[Dict[str, Any]]: + """Extract implementation risks from context.""" + risks = [] + + for runtime_risk in context.runtime_risks: + risks.append( + { + "area": "Runtime", + "risk": runtime_risk.value, + "severity": runtime_risk.severity, + "evidence": runtime_risk.evidence, + } + ) + + for inconsistency in context.inconsistencies: + risks.append( + { + "area": "Consistency", + "risk": inconsistency.issue, + "severity": inconsistency.severity, + "evidence": inconsistency.evidence, + } + ) + + # Add common implementation risks + risks.extend( + [ + { + "area": "Caching Strategy", + "risk": "Cache invalidation not properly handled", + "severity": "HIGH", + "evidence": ["Concurrent write scenarios not addressed"], + }, + { + "area": "Error Handling", + "risk": "Generic error messages hide specific failures", + "severity": "MEDIUM", + "evidence": ["Insufficient error logging"], + }, + { + "area": "Async/Concurrency", + "risk": "Race conditions in state updates", + "severity": "HIGH", + "evidence": ["No locking mechanism observed"], + }, + ] + ) + + return risks + + @staticmethod + def _extract_weak_points(context: StructuredContext, risks: List[Dict[str, Any]]) -> List[WeakPoint]: + """Extract weak points for viva probing.""" + weak_points = [] + + for risk in risks: + weak_point = WeakPoint( + area=risk["area"], + weakness=risk["risk"], + why_problematic="Understanding this shows depth of implementation knowledge", + testing_approach="Ask student to explain how they would handle this scenario", + evidence_file=None, + ) + weak_points.append(weak_point) + + return weak_points + + @staticmethod + def _build_viva_targets( + context: StructuredContext, weak_points: List[WeakPoint], nodes: List[ExecutionNode] + ) -> List[VivaTarget]: + """Build viva targets from context.""" + targets = [] + + # Convert ORACLE VivaTargets to IntelligenceArtifact VivaTargets + for i, oracle_target in enumerate(context.viva_intelligence_targets or []): + target = 
VivaTarget( + target_id=f"target_{i}", + question=oracle_target.question_target if hasattr(oracle_target, "question_target") else oracle_target.topic, + category=IntelligenceCategory.ARCHITECTURE, + difficulty=oracle_target.difficulty, + depth_score=oracle_target.depth_score if hasattr(oracle_target, "depth_score") else 5.0, + why_important=f"Question targets {oracle_target.focus}" if hasattr(oracle_target, "focus") else "Core implementation knowledge", + evidence_references=[], + follow_up_paths=[], + expected_coverage=["Implementation details", "Design rationale", "Edge cases"], + red_flags=["Generic answers", "Vague explanations", "Contradictions"], + ) + targets.append(target) + + # Add weak-point-based targets + for i, weak_point in enumerate(weak_points): + target = VivaTarget( + target_id=f"weak_point_{i}", + question=f"How would you handle {weak_point.weakness}?", + category=IntelligenceCategory.WEAK_POINT, + difficulty="HARD", + depth_score=8.0, + why_important=f"Tests practical handling of: {weak_point.weakness}", + evidence_references=[], + follow_up_paths=[ + "Probe error handling", + "Ask about retry logic", + "Inquire about monitoring", + ], + expected_coverage=[ + "Specific technical approach", + "Trade-offs considered", + "Testing strategy", + ], + red_flags=[ + "Hand-waving solution", + "Ignoring trade-offs", + "No mention of testing", + ], + ) + targets.append(target) + + return targets + + @staticmethod + def _build_adaptive_thresholds(targets: List[VivaTarget]) -> List[AdaptiveThreshold]: + """Build adaptive thresholds for viva difficulty escalation.""" + thresholds = [] + + # Group targets by category for adaptive logic + categories = set(t.category for t in targets) + for category in categories: + category_targets = [t for t in targets if t.category == category] + weak_trigger = [t.question for t in category_targets if t.difficulty == "HARD"] + + threshold = AdaptiveThreshold( + topic=category.value, + weak_point_triggers=weak_trigger[:3], + 
strong_point_indicators=[t.question for t in category_targets if t.difficulty == "FOUNDATIONAL"], + contradiction_escalation=True, + ) + thresholds.append(threshold) + + return thresholds + + @staticmethod + def _extract_implementation_signals(context: StructuredContext) -> List[ImplementationSignal]: + """Extract observable implementation signals from context.""" + signals = [] + + # Technology stack signals + if context.backend_framework and context.backend_framework.value != "Unknown": + signals.append( + ImplementationSignal( + signal_type="DESIGN_PATTERN", + description=f"Backend framework choice: {context.backend_framework.value}", + evidence=f"Framework: {context.backend_framework.value}", + confidence=context.backend_framework.confidence, + risk_level="LOW", + ) + ) + + # Architecture pattern signal + if context.architecture_pattern and context.architecture_pattern.value: + signals.append( + ImplementationSignal( + signal_type="DESIGN_PATTERN", + description=f"Architecture pattern: {context.architecture_pattern.value}", + evidence=f"Inferred pattern: {context.architecture_pattern.value}", + confidence=context.architecture_pattern.confidence, + risk_level="LOW", + ) + ) + + # Middleware signals + for middleware in context.middleware_chain: + signals.append( + ImplementationSignal( + signal_type="ERROR_HANDLING", + description=f"Middleware layer: {middleware.value}", + evidence=middleware.value, + confidence=middleware.confidence, + risk_level="MEDIUM", + ) + ) + + return signals + + @staticmethod + def _build_summary( + context: StructuredContext, + failure_scenarios: List[FailureScenario], + weak_points: List[WeakPoint], + viva_targets: List[VivaTarget], + ) -> tuple[str, List[str]]: + """Build human-readable summary and key findings.""" + + summary = f""" +ORACLE Analysis Summary for {context.project_name.value} + +Project: {context.project_type.value} +Architecture: {context.architecture_pattern.value} +Backend: {context.backend_framework.value} 
+Database: {context.database_used.value} + +Analysis Results: +- Execution paths identified: {len(context.execution_graph.nodes)} nodes +- Failure scenarios detected: {len(failure_scenarios)} +- Implementation weak points: {len(weak_points)} +- Viva targets generated: {len(viva_targets)} + +The analysis focused on implementation-aware viva preparation, +identifying areas where student understanding will be probed. + """.strip() + + key_findings = [ + f"Backend stack: {context.backend_framework.value}, {context.database_used.value}", + f"Architecture pattern: {context.architecture_pattern.value}", + f"Identified {len(failure_scenarios)} critical failure scenarios", + f"Detected {len(weak_points)} areas requiring deep probing", + f"Generated {len(viva_targets)} implementation-aware viva targets", + ] + + return summary, key_findings + + @staticmethod + def _compute_hash(artifact: IntelligenceArtifact) -> str: + """ + Compute deterministic hash for replay verification. + Ensures artifact is deterministic and reproducible. + """ + # Serialize key fields deterministically + data_to_hash = { + "session_id": artifact.session_id, + "project_name": artifact.project_name, + "backend_stack": artifact.backend_stack, + "num_viva_targets": len(artifact.viva_targets), + "num_failure_scenarios": len(artifact.failure_scenarios), + } + + data_string = json.dumps(data_to_hash, sort_keys=True, default=str) + return hashlib.sha256(data_string.encode()).hexdigest()[:16] diff --git a/backend/src/services/main_agent_evaluation_loop.py b/backend/src/services/main_agent_evaluation_loop.py new file mode 100644 index 0000000..5394304 --- /dev/null +++ b/backend/src/services/main_agent_evaluation_loop.py @@ -0,0 +1,186 @@ +""" +Stage 8 - MAIN Agent Evaluation Loop + +Deterministic evaluation loop executed after each finalized response. 
+""" + +from datetime import datetime +from typing import Any, Dict, List, Optional, Tuple + +from src.models.intelligence_artifact import VivaTarget +from src.models.stage_7_8_9 import ContradictionChainEntry, EvaluationArtifact + + +class MainAgentEvaluationLoop: + """Continuous technical evaluation engine for live viva.""" + + def __init__(self, main_orchestrator): + self.main_orchestrator = main_orchestrator + self.evaluation_history: List[EvaluationArtifact] = [] + self.contradiction_chain: List[ContradictionChainEntry] = [] + self.topic_coverage: Dict[str, float] = {} + + def process_finalized_response( + self, + turn_index: int, + target: VivaTarget, + response_text: str, + ) -> Tuple[Dict[str, Any], EvaluationArtifact, Optional[str]]: + """ + Process full evaluation cycle for one response. + + Flow: + Question -> Response -> Evaluation -> Session Memory Update -> + Contradiction Analysis -> Follow-Up Generation + """ + + base_evaluation = self.main_orchestrator.evaluate_answer(response_text, target) + + implementation_specificity = self._score_implementation_specificity(response_text) + runtime_understanding = self._score_runtime_understanding(response_text) + operational_reasoning = self._score_operational_reasoning(response_text) + architectural_understanding = self._score_architectural_understanding(response_text) + failure_path_awareness = self._score_failure_path_awareness(response_text) + tradeoff_understanding = self._score_tradeoff_understanding(response_text) + consistency_score = self._score_consistency(base_evaluation) + + implementation_familiarity = round( + ( + implementation_specificity + + runtime_understanding + + operational_reasoning + + architectural_understanding + + failure_path_awareness + + tradeoff_understanding + + consistency_score + ) + / 7.0, + 3, + ) + + category_key = target.category.value + current_coverage = float(self.topic_coverage.get(category_key, 0.0)) + coverage_increment = float(base_evaluation.get("coverage_score", 
0.0)) + self.topic_coverage[category_key] = min(1.0, round(current_coverage + coverage_increment * 0.4, 3)) + + for contradiction in base_evaluation.get("contradictions", []): + self.contradiction_chain.append( + ContradictionChainEntry( + chain_id=f"chain_{self.main_orchestrator.session_state.session_id}_{len(self.contradiction_chain)}", + target_id=target.target_id, + previous_claim=contradiction.get("previous", "unknown"), + current_claim=contradiction.get("current", "unknown"), + severity=contradiction.get("severity", "MEDIUM"), + turn_index=turn_index, + ) + ) + + follow_up = self._generate_runtime_aware_follow_up(target, response_text, base_evaluation) + if not follow_up: + follow_up = self.main_orchestrator.generate_follow_up(base_evaluation, target) + + evaluation_artifact = EvaluationArtifact( + session_id=self.main_orchestrator.session_state.session_id, + turn_index=turn_index, + target_id=target.target_id, + implementation_specificity=implementation_specificity, + runtime_understanding=runtime_understanding, + operational_reasoning=operational_reasoning, + architectural_understanding=architectural_understanding, + failure_path_awareness=failure_path_awareness, + tradeoff_understanding=tradeoff_understanding, + consistency_score=consistency_score, + implementation_familiarity=implementation_familiarity, + topic_coverage=dict(self.topic_coverage), + weak_areas=list(self.main_orchestrator.weak_areas_detected), + follow_up_chain=[follow_up] if follow_up else [], + contradiction_chain=list(self.contradiction_chain), + ) + + self.evaluation_history.append(evaluation_artifact) + + enriched = { + **base_evaluation, + "implementation_specificity": implementation_specificity, + "runtime_understanding": runtime_understanding, + "operational_reasoning": operational_reasoning, + "architectural_understanding": architectural_understanding, + "failure_path_awareness": failure_path_awareness, + "tradeoff_understanding": tradeoff_understanding, + "consistency_score": 
consistency_score, + "implementation_familiarity": implementation_familiarity, + "topic_coverage": dict(self.topic_coverage), + "contradiction_chain_length": len(self.contradiction_chain), + "evaluated_at": datetime.utcnow().isoformat(), + } + + return enriched, evaluation_artifact, follow_up + + def _score_implementation_specificity(self, answer: str) -> float: + keywords = ["middleware", "controller", "service", "repository", "handler", "function"] + return self._keyword_score(answer, keywords) + + def _score_runtime_understanding(self, answer: str) -> float: + keywords = ["request", "response", "latency", "concurrent", "queue", "retry"] + return self._keyword_score(answer, keywords) + + def _score_operational_reasoning(self, answer: str) -> float: + keywords = ["monitor", "rollback", "deploy", "alert", "incident", "slo"] + return self._keyword_score(answer, keywords) + + def _score_architectural_understanding(self, answer: str) -> float: + keywords = ["architecture", "module", "dependency", "layer", "interface", "boundary"] + return self._keyword_score(answer, keywords) + + def _score_failure_path_awareness(self, answer: str) -> float: + keywords = ["failure", "fallback", "timeout", "circuit", "error", "exception"] + return self._keyword_score(answer, keywords) + + def _score_tradeoff_understanding(self, answer: str) -> float: + keywords = ["trade-off", "tradeoff", "cost", "throughput", "consistency", "availability"] + return self._keyword_score(answer, keywords) + + def _score_consistency(self, base_evaluation: Dict[str, Any]) -> float: + contradiction_penalty = min(1.0, 0.25 * len(base_evaluation.get("contradictions", []))) + red_flag_penalty = min(1.0, 0.1 * len(base_evaluation.get("red_flags", []))) + return round(max(0.0, 1.0 - contradiction_penalty - red_flag_penalty), 3) + + def _keyword_score(self, answer: str, keywords: List[str]) -> float: + if not answer.strip(): + return 0.0 + normalized = answer.lower() + hits = sum(1 for key in keywords if key in 
normalized) + return round(min(1.0, hits / max(1, len(keywords) // 2)), 3) + + def _generate_runtime_aware_follow_up( + self, + target: VivaTarget, + answer: str, + base_evaluation: Dict[str, Any], + ) -> Optional[str]: + """Generate technical follow-up with implementation/runtime focus.""" + + if base_evaluation.get("depth_level") in {"DEEP", "EXPERT"} and not base_evaluation.get("contradictions"): + return None + + answer_lower = answer.lower() + question_lower = target.question.lower() + + if "redis" in answer_lower or "cache" in question_lower: + return "Where exactly is Redis used in your request lifecycle, and how do you prevent stale cache during rapid updates?" + + if "database" in answer_lower or "db" in answer_lower or "database" in question_lower: + return "What is the exact failure path when the database call times out, and where is retry or fallback enforced in code?" + + if "jwt" in answer_lower or "auth" in answer_lower or "token" in answer_lower: + return "Point to the validation boundary for JWT in your flow: middleware, controller, or both, and explain why that placement is safe." + + if base_evaluation.get("contradictions"): + contradiction = base_evaluation["contradictions"][0] + return ( + "You stated two different implementation behaviors: " + f"{contradiction.get('previous')} vs {contradiction.get('current')}. " + "Which one is accurate in runtime, and where in code is it enforced?" + ) + + return "Describe the exact runtime path for this behavior from entry point to failure handling, including one concrete code-level boundary." diff --git a/backend/src/services/main_agent_viva_orchestrator.py b/backend/src/services/main_agent_viva_orchestrator.py new file mode 100644 index 0000000..5b2eb65 --- /dev/null +++ b/backend/src/services/main_agent_viva_orchestrator.py @@ -0,0 +1,537 @@ +""" +MAIN Agent Viva Orchestration Engine — Stage 5 + +Deterministic live viva orchestration that: +1. Consumes ORACLE IntelligenceArtifact +2. 
Manages viva session state and progression +3. Selects implementation-aware questions +4. Generates adaptive follow-ups based on answer depth +5. Tracks contradictions and weak areas +6. Maintains professional examiner demeanor +""" + +import json +from typing import List, Dict, Any, Optional, Tuple +from datetime import datetime +from enum import Enum + +from src.models.intelligence_artifact import ( + IntelligenceArtifact, + VivaTarget, + VivaSessionState, + AdaptiveThreshold, +) + + +class VivaPhase(str, Enum): + """Phases of viva progression.""" + + STARTED = "STARTED" + INTRODUCTORY = "INTRODUCTORY" # Warm-up questions + CORE = "CORE" # Main implementation questions + DEEP_DIVE = "DEEP_DIVE" # Probing weak areas + CONTRADICTION_PROBE = "CONTRADICTION_PROBE" # Addressing contradictions + CLOSING = "CLOSING" # Final summary + + +class AnswerDepthLevel(str, Enum): + """Categorizes depth of student response.""" + + GENERIC = "GENERIC" # Surface-level, generic answer + SHALLOW = "SHALLOW" # Shows basic understanding + ADEQUATE = "ADEQUATE" # Covers basics with some detail + DEEP = "DEEP" # Shows implementation understanding + EXPERT = "EXPERT" # Demonstrates deep expertise + + +class MainAgentVivaOrchestrator: + """ + Orchestrates live viva examination with the student. 
+ + Responsibilities: + - Question selection and sequencing + - Answer evaluation and depth assessment + - Adaptive follow-up generation + - Session state management + - Contradiction detection + - Weak point escalation + """ + + def __init__(self, artifact: IntelligenceArtifact): + self.artifact = artifact + self.session_state: Optional[VivaSessionState] = None + self.question_history: List[Dict[str, Any]] = [] + self.answer_history: List[Dict[str, Any]] = [] + self.contradictions: List[Dict[str, Any]] = [] + self.weak_areas_detected: List[str] = [] + self.strong_areas_detected: List[str] = [] + # Stage 7/8/9 integrations (optional injection) + self.sentinel_monitor = None + self.evaluation_loop = None + self.curriculum_engine = None + + def initialize_session(self, session_id: str) -> VivaSessionState: + """Initialize viva session from artifact.""" + + self.session_state = VivaSessionState( + session_id=session_id, + viva_phase=VivaPhase.STARTED.value, + current_topic=None, + current_target_id=None, + questions_asked=0, + contradictions_found=0, + weak_areas_detected=[], + strong_areas_detected=[], + adaptive_difficulty=5.0, # Start at medium difficulty + ) + + return self.session_state + + def attach_sentinel(self, sentinel_monitor) -> None: + """Attach a SENTINEL monitor instance for parallel oversight.""" + self.sentinel_monitor = sentinel_monitor + + def attach_evaluation_loop(self, evaluation_loop) -> None: + """Attach evaluation loop instance to enrich evaluation and follow-ups.""" + self.evaluation_loop = evaluation_loop + + def attach_curriculum_engine(self, curriculum_engine) -> None: + """Attach curriculum engine for Stage 9 transitions.""" + self.curriculum_engine = curriculum_engine + + def get_next_question(self, previous_answer: Optional[str] = None) -> Tuple[Optional[VivaTarget], str]: + """ + Select next question based on: + - Viva phase progression + - Adaptive difficulty + - Previous answer evaluation + - Topic coverage + + Returns: + (VivaTarget, 
formatted question text) + """ + + if not self.session_state: + raise ValueError("Session not initialized") + + # Determine phase + phase = self._determine_phase() + self.session_state.viva_phase = phase.value + + # Filter candidates based on phase and difficulty + candidates = self._filter_question_candidates(phase) + + if not candidates: + # Fallback: use any remaining questions + used_ids = {q.get("target_id") for q in self.question_history} + candidates = [t for t in self.artifact.viva_targets if t.target_id not in used_ids] + + if not candidates: + # All questions exhausted + return None, "All questions completed." + + # Select question + selected = self._select_question(candidates, phase) + + # Format question + formatted_question = self._format_question(selected, phase) + + # Update state + self.session_state.current_target_id = selected.target_id + self.session_state.current_topic = selected.category.value + self.session_state.questions_asked += 1 + + # Log question + self.question_history.append( + { + "target_id": selected.target_id, + "question": formatted_question, + "category": selected.category.value, + "difficulty": selected.difficulty, + "depth_score": selected.depth_score, + "timestamp": datetime.utcnow().isoformat(), + "phase": phase.value, + } + ) + + return selected, formatted_question + + def evaluate_answer(self, answer_text: str, target: VivaTarget) -> Dict[str, Any]: + """ + Evaluate student answer for: + - Depth level + - Coverage of expected topics + - Red flags + - Contradictions with previous answers + - Evidence of implementation understanding + + Returns: + Evaluation dict with scores, flags, follow-up needs + """ + + depth_level = self._assess_depth(answer_text, target) + coverage = self._assess_coverage(answer_text, target) + red_flags = self._detect_red_flags(answer_text, target) + contradictions = self._detect_contradictions(answer_text, target) + + evaluation = { + "target_id": target.target_id, + "answer_text": answer_text, + 
"depth_level": depth_level.value, + "depth_score": self._depth_to_score(depth_level), + "coverage_score": coverage, + "red_flags": red_flags, + "contradictions": contradictions, + "requires_follow_up": depth_level in [AnswerDepthLevel.GENERIC, AnswerDepthLevel.SHALLOW] + or len(red_flags) > 0, + "timestamp": datetime.utcnow().isoformat(), + } + + # Non-invasive SENTINEL hook: accept any observational data present in evaluation payload + try: + sentinel_obs = evaluation.get("sentinel_observation") + if sentinel_obs and self.sentinel_monitor: + # turn index derived from questions_asked + turn_index = self.session_state.questions_asked + self.sentinel_monitor.evaluate_observation(turn_index, sentinel_obs) + except Exception: + pass + # Update session state + self.session_state.evaluation_score = coverage # Update with latest score + self.session_state.last_response_text = answer_text + + # Log answer + self.answer_history.append(evaluation) + + # Track weak areas if needed + if depth_level in [AnswerDepthLevel.GENERIC, AnswerDepthLevel.SHALLOW]: + if target.category.value not in self.weak_areas_detected: + self.weak_areas_detected.append(target.category.value) + + # Track strong areas if needed + if depth_level in [AnswerDepthLevel.DEEP, AnswerDepthLevel.EXPERT]: + if target.category.value not in self.strong_areas_detected: + self.strong_areas_detected.append(target.category.value) + + # Update adaptive difficulty + self._update_adaptive_difficulty(depth_level) + + return evaluation + + def generate_follow_up(self, evaluation: Dict[str, Any], target: VivaTarget) -> Optional[str]: + """ + Generate adaptive follow-up question based on answer evaluation. 
+ + Follow-up strategy: + - Shallow answer: probe implementation details + - Generic answer: ask for specific examples + - Red flags: investigate assumptions + - Contradictions: highlight and reconcile + + Returns: + Follow-up question text or None if no follow-up needed + """ + + if not evaluation.get("requires_follow_up"): + return None + + depth_level = AnswerDepthLevel(evaluation["depth_level"]) + red_flags = evaluation.get("red_flags", []) + contradictions = evaluation.get("contradictions", []) + + follow_up = None + + if contradictions: + # Contradiction detected: quote the earlier claim first, then the current one + contradiction = contradictions[0] + follow_up = f"I notice you mentioned {contradiction['previous']} earlier, but now you're saying {contradiction['current']}. Can you clarify?" + self.session_state.contradictions_found += 1 + self.contradictions.append( + { + "target_id": target.target_id, + "contradiction": contradiction, + "timestamp": datetime.utcnow().isoformat(), + } + ) + + elif red_flags: + # Red flag detected + flag = red_flags[0] + if "Generic" in flag: + follow_up = "Can you provide a specific implementation example of what you just described?" + elif "Vague" in flag: + follow_up = "That's too broad. What exactly happens at the code level?" + else: + follow_up = f"I want to dig deeper: {flag}" + + elif depth_level == AnswerDepthLevel.SHALLOW: + # Escalate depth + if target.follow_up_paths: + follow_up = f"Let's explore this further: {target.follow_up_paths[0]}" + else: + follow_up = "Can you walk through the exact implementation steps?" + + elif depth_level == AnswerDepthLevel.GENERIC: + # Request specificity + follow_up = "That's a common answer. What makes your implementation different or better?" 
+ + if follow_up: + # Log follow-up + self.question_history.append( + { + "type": "FOLLOW_UP", + "follow_up_to": target.target_id, + "question": follow_up, + "timestamp": datetime.utcnow().isoformat(), + } + ) + + # If evaluation loop is attached, emit escalation events + try: + if self.evaluation_loop: + # record follow-up escalation in evaluation history + self.evaluation_loop.topic_coverage.get(target.category.value, 0.0) + except Exception: + pass + + return follow_up + + def get_session_summary(self) -> Dict[str, Any]: + """Generate viva session summary for persistence and evaluation.""" + + if not self.session_state: + return {} + + avg_depth = ( + sum(a.get("depth_score", 0) for a in self.answer_history) + / len(self.answer_history) + if self.answer_history + else 0 + ) + + avg_coverage = ( + sum(a.get("coverage_score", 0) for a in self.answer_history) / len(self.answer_history) + if self.answer_history + else 0 + ) + + return { + "session_id": self.session_state.session_id, + "total_questions": self.session_state.questions_asked, + "total_answers": len(self.answer_history), + "contradictions_found": self.session_state.contradictions_found, + "weak_areas": self.weak_areas_detected, + "strong_areas": self.strong_areas_detected, + "average_depth_score": avg_depth, + "average_coverage_score": avg_coverage, + "final_adaptive_difficulty": self.session_state.adaptive_difficulty, + "viva_phase": self.session_state.viva_phase, + "questions": self.question_history, + "answers": self.answer_history, + "contradictions": self.contradictions, + "timestamp": datetime.utcnow().isoformat(), + } + + # ===== Helper Methods ===== + + def _determine_phase(self) -> VivaPhase: + """Determine current viva phase based on progress.""" + + q = self.session_state.questions_asked + + if q == 0: + return VivaPhase.INTRODUCTORY + elif q < 3: + return VivaPhase.CORE + elif self.weak_areas_detected and q < 7: + return VivaPhase.DEEP_DIVE + elif self.contradictions and q < 10: + return 
VivaPhase.CONTRADICTION_PROBE + else: + return VivaPhase.CLOSING + + def _filter_question_candidates(self, phase: VivaPhase) -> List[VivaTarget]: + """Filter questions appropriate for current phase.""" + + used_ids = {q.get("target_id") for q in self.question_history} + available = [t for t in self.artifact.viva_targets if t.target_id not in used_ids] + + if phase == VivaPhase.INTRODUCTORY: + # Easy warm-up questions + return [q for q in available if q.difficulty == "FOUNDATIONAL"][:5] + + elif phase == VivaPhase.CORE: + # Medium to hard main questions + return [q for q in available if q.difficulty in ["MEDIUM", "HARD"]][:10] + + elif phase == VivaPhase.DEEP_DIVE: + # Hard questions on weak areas + weak_categories = set(a.split("_")[0] for a in self.weak_areas_detected) + return [ + q + for q in available + if q.difficulty == "HARD" and any(cat in q.category.value for cat in weak_categories) + ][:5] + + elif phase == VivaPhase.CONTRADICTION_PROBE: + # Questions that probe contradictions + return [q for q in available if q.depth_score > 7.0][:5] + + else: # CLOSING + # Summary-style questions + return available[:3] + + def _select_question(self, candidates: List[VivaTarget], phase: VivaPhase) -> VivaTarget: + """Select best question from candidates using adaptive logic.""" + + if not candidates: + return None + + # Prefer questions matching weak areas for deep dive + if phase == VivaPhase.DEEP_DIVE: + weak_cats = set(self.weak_areas_detected) + for q in candidates: + if q.category.value in weak_cats: + return q + + # Balance topic coverage + covered_topics = set(q.get("category") for q in self.question_history) + uncovered = [q for q in candidates if q.category.value not in covered_topics] + if uncovered: + return uncovered[0] + + # Default: first candidate + return candidates[0] + + def _format_question(self, target: VivaTarget, phase: VivaPhase) -> str: + """Format question with professional examiner tone.""" + + base_question = target.question + + if phase == 
VivaPhase.INTRODUCTORY: + return f"To start: {base_question}" + elif phase == VivaPhase.CORE: + return f"{base_question}" + elif phase == VivaPhase.DEEP_DIVE: + return f"Let me probe deeper: {base_question}" + elif phase == VivaPhase.CONTRADICTION_PROBE: + return f"I want to clarify something: {base_question}" + else: # CLOSING + return f"Finally: {base_question}" + + def _assess_depth(self, answer_text: str, target: VivaTarget) -> AnswerDepthLevel: + """Assess depth level of answer.""" + + # Heuristic assessment based on response length and keywords + word_count = len(answer_text.split()) + + if word_count < 10: + return AnswerDepthLevel.GENERIC + + expected_keywords = target.expected_coverage if target.expected_coverage else [] + + keyword_matches = sum(1 for keyword in expected_keywords if keyword.lower() in answer_text.lower()) + + if keyword_matches == 0: + return AnswerDepthLevel.SHALLOW + elif keyword_matches < len(expected_keywords) // 2: + return AnswerDepthLevel.ADEQUATE + elif keyword_matches >= len(expected_keywords): + # Check for depth indicators + depth_indicators = ["because", "however", "specifically", "implementation", "trade-off"] + depth_score = sum(1 for ind in depth_indicators if ind in answer_text.lower()) + if depth_score >= 2: + return AnswerDepthLevel.EXPERT + else: + return AnswerDepthLevel.DEEP + + return AnswerDepthLevel.ADEQUATE + + def _assess_coverage(self, answer_text: str, target: VivaTarget) -> float: + """Assess coverage score (0-1) of answer.""" + + if not target.expected_coverage: + return 0.5 + + keyword_matches = sum(1 for keyword in target.expected_coverage if keyword.lower() in answer_text.lower()) + + return min(1.0, keyword_matches / len(target.expected_coverage)) + + def _detect_red_flags(self, answer_text: str, target: VivaTarget) -> List[str]: + """Detect red flags in answer.""" + + flags = [] + + if target.red_flags: + for flag in target.red_flags: + if flag.lower() in answer_text.lower(): + flags.append(f"Red flag: 
{flag}") + + # Generic answer detection + generic_phrases = ["it depends", "generally", "typically", "probably", "maybe"] + if any(phrase in answer_text.lower() for phrase in generic_phrases): + flags.append("Generic answer: lacks specificity") + + # Vague answer detection + if len(answer_text.split()) < 20: + flags.append("Vague explanation: too brief") + + return flags + + def _detect_contradictions(self, answer_text: str, target: VivaTarget) -> List[Dict[str, str]]: + """Detect contradictions with previous answers.""" + + contradictions = [] + + for prev_answer in self.answer_history: + prev_text = prev_answer.get("answer_text", "").lower() + curr_text = answer_text.lower() + + # Simple contradiction detection: opposite claims + if "redis" in prev_text and "no caching" in curr_text: + contradictions.append( + { + "previous": "Redis is used", + "current": "No caching", + "severity": "HIGH", + } + ) + + if "async" in prev_text and "synchronous" in curr_text: + contradictions.append( + { + "previous": "Async processing", + "current": "Synchronous", + "severity": "HIGH", + } + ) + + return contradictions + + def _depth_to_score(self, depth: AnswerDepthLevel) -> float: + """Convert depth level to numeric score.""" + + mapping = { + AnswerDepthLevel.GENERIC: 1.0, + AnswerDepthLevel.SHALLOW: 2.5, + AnswerDepthLevel.ADEQUATE: 5.0, + AnswerDepthLevel.DEEP: 7.5, + AnswerDepthLevel.EXPERT: 10.0, + } + + return mapping.get(depth, 5.0) + + def _update_adaptive_difficulty(self, depth: AnswerDepthLevel) -> None: + """Update adaptive difficulty based on answer depth.""" + + current = self.session_state.adaptive_difficulty + + if depth in [AnswerDepthLevel.DEEP, AnswerDepthLevel.EXPERT]: + # Increase difficulty + self.session_state.adaptive_difficulty = min(10.0, current + 1.0) + elif depth == AnswerDepthLevel.GENERIC: + # Decrease difficulty + self.session_state.adaptive_difficulty = max(1.0, current - 1.0) + + # Clamp to valid range + self.session_state.adaptive_difficulty = 
max(1.0, min(10.0, self.session_state.adaptive_difficulty)) diff --git a/backend/src/services/oracle_main_handoff.py b/backend/src/services/oracle_main_handoff.py new file mode 100644 index 0000000..4397721 --- /dev/null +++ b/backend/src/services/oracle_main_handoff.py @@ -0,0 +1,224 @@ +""" +ORACLE → MAIN Agent Handoff Orchestrator — Stage 4 + +Deterministic handoff pipeline that: +1. Takes ORACLE StructuredContext +2. Builds IntelligenceArtifact +3. Persists artifact to session storage +4. Emits ORACLE_INTELLIGENCE_READY event +5. Signals MAIN Agent to start viva +""" + +import json +import time +from typing import Dict, Any, Optional +from datetime import datetime +from src.models.intelligence_artifact import IntelligenceArtifact, IntelligenceHandoffEvent +from src.models.context import StructuredContext +from src.models.events import EventType, PlatformEvent +from src.services.intelligence_artifact_builder import IntelligenceArtifactBuilder +from src.services.storage import FileStorageProvider + + +class OracleMainHandoffOrchestrator: + """ + Manages deterministic handoff from ORACLE analysis to MAIN Agent. + + Responsibilities: + - Convert StructuredContext to IntelligenceArtifact + - Persist artifact for audit/replay + - Emit handoff event + - Ensure deterministic, explainable transition + """ + + def __init__(self, storage_provider: Optional[FileStorageProvider] = None): + import os + if storage_provider is None: + base_path = os.path.join(os.getcwd(), "session_storage", "artifacts") + os.makedirs(base_path, exist_ok=True) + storage_provider = FileStorageProvider(base_path) + self.storage = storage_provider + self.handoff_log = [] + + async def handoff( + self, + session_id: str, + oracle_context: StructuredContext, + analysis_duration_seconds: float = 0.0, + repo_path: Optional[str] = None, + ) -> IntelligenceArtifact: + """ + Execute handoff from ORACLE to MAIN Agent. 
+ + Args: + session_id: Exam session ID + oracle_context: ORACLE's StructuredContext output + analysis_duration_seconds: Time spent in ORACLE analysis + repo_path: Optional repo path for evidence collection + + Returns: + IntelligenceArtifact ready for MAIN Agent + """ + + handoff_start = time.time() + + # Step 1: Build IntelligenceArtifact + artifact = IntelligenceArtifactBuilder.build( + session_id=session_id, + structured_context=oracle_context, + analysis_duration_seconds=analysis_duration_seconds, + repo_path=repo_path, + ) + + # Step 2: Persist artifact to session storage + await self._persist_artifact(session_id, artifact) + + # Step 3: Emit handoff event + handoff_event = IntelligenceHandoffEvent( + session_id=session_id, + artifact_id=artifact.artifact_id, + artifact_summary={ + "project": artifact.project_name, + "num_execution_nodes": len(artifact.execution_graph_nodes), + "num_execution_paths": len(artifact.execution_paths), + "num_viva_targets": len(artifact.viva_targets), + "num_failure_scenarios": len(artifact.failure_scenarios), + "num_weak_points": len(artifact.weak_points), + "analysis_confidence": artifact.analysis_confidence, + }, + ) + + # Step 4: Log handoff transaction + handoff_duration = time.time() - handoff_start + self._log_handoff(session_id, artifact, handoff_duration) + + return artifact + + async def _persist_artifact(self, session_id: str, artifact: IntelligenceArtifact) -> None: + """Persist IntelligenceArtifact to session storage for audit/replay.""" + + # Serialize artifact + artifact_data = artifact.model_dump_json(indent=2) + + # Store with session + artifact_filename = f"oracle_intelligence_{artifact.artifact_id}.json" + artifact_path = f"sessions/{session_id}/{artifact_filename}" + + # Use storage provider + self.storage.append_artifact( + session_id=session_id, + artifact_type="ORACLE_INTELLIGENCE_ARTIFACT", + payload={ + "artifact_id": artifact.artifact_id, + "project_name": artifact.project_name, + "timestamp": 
artifact.generated_at.isoformat(), + "deterministic_hash": artifact.deterministic_hash, + "viva_targets": len(artifact.viva_targets), + "failure_scenarios": len(artifact.failure_scenarios), + }, + ) + + def _log_handoff(self, session_id: str, artifact: IntelligenceArtifact, duration_seconds: float) -> None: + """Log handoff transaction for audit trail.""" + + log_entry = { + "timestamp": datetime.utcnow().isoformat(), + "session_id": session_id, + "artifact_id": artifact.artifact_id, + "handoff_duration_seconds": duration_seconds, + "artifact_summary": { + "project": artifact.project_name, + "viva_targets": len(artifact.viva_targets), + "failure_scenarios": len(artifact.failure_scenarios), + "weak_points": len(artifact.weak_points), + }, + "deterministic_hash": artifact.deterministic_hash, + "status": "SUCCESS", + } + + self.handoff_log.append(log_entry) + + def get_artifact(self, session_id: str, artifact_id: str) -> Optional[IntelligenceArtifact]: + """Retrieve persisted artifact for MAIN Agent or audit purposes.""" + + # This would load from storage + # Simplified for now + return None + + +class OracleIntelligenceService: + """ + Wrapper service that orchestrates ORACLE analysis and handoff in a single flow. + Combines ORACLE analysis with handoff to MAIN Agent. + """ + + def __init__(self, oracle_agent, handoff_orchestrator: Optional[OracleMainHandoffOrchestrator] = None): + self.oracle_agent = oracle_agent + self.handoff_orchestrator = handoff_orchestrator or OracleMainHandoffOrchestrator() + + async def analyze_and_handoff( + self, + session_id: str, + input_data: Dict[str, Any], + log_callback=None, + ) -> IntelligenceArtifact: + """ + Execute full ORACLE analysis and handoff to MAIN Agent. + + Flow: + 1. Run ORACLE analysis → StructuredContext + 2. Convert to IntelligenceArtifact + 3. Persist and emit event + 4. Return artifact ready for MAIN Agent + + Args: + session_id: Exam session ID + input_data: Input to ORACLE (repo_url, report_path, etc.) 
+ log_callback: Optional async callback for logging + + Returns: + IntelligenceArtifact ready for MAIN Agent viva + """ + + async def send_log(msg: str, type_: str = "info"): + if log_callback: + await log_callback({"message": msg, "type": type_}) + + await send_log("[Stage 4] Starting ORACLE analysis...", "info") + + # Step 1: Run ORACLE analysis + analysis_start = time.time() + oracle_context = await self.oracle_agent.process(session_id, input_data, log_callback) + analysis_duration = time.time() - analysis_start + + await send_log(f"[Stage 4] ORACLE analysis complete ({analysis_duration:.2f}s)", "info") + + # Step 2: Handoff to MAIN Agent + await send_log("[Stage 4] Building intelligence artifact for MAIN Agent...", "info") + artifact = await self.handoff_orchestrator.handoff( + session_id=session_id, + oracle_context=oracle_context, + analysis_duration_seconds=analysis_duration, + repo_path=input_data.get("repo_path"), + ) + + await send_log( + f"[Stage 4] Intelligence handoff complete. 
Artifact: {artifact.artifact_id}", "success" + ) + + # Step 3: Emit platform event + if hasattr(self.oracle_agent, "emit_event"): + self.oracle_agent.emit_event( + session_id, + EventType.ORACLE_INTELLIGENCE_READY, + { + "artifact_id": artifact.artifact_id, + "viva_targets": len(artifact.viva_targets), + "weak_points": len(artifact.weak_points), + "next_stage": "MAIN_AGENT_START_VIVA", + }, + ) + + await send_log("[Stage 4] Ready to start viva with MAIN Agent.", "success") + + return artifact diff --git a/backend/src/services/runtime_event_orchestrator.py b/backend/src/services/runtime_event_orchestrator.py new file mode 100644 index 0000000..a0012db --- /dev/null +++ b/backend/src/services/runtime_event_orchestrator.py @@ -0,0 +1,500 @@ +""" +Runtime Event Flow Orchestrator — Stages 4-6 + +Comprehensive event emission and coordination for: +- Stage 4: ORACLE Intelligence Handoff +- Stage 5: MAIN Agent Live Viva +- Stage 6: Voice Infrastructure + +All events are: +- Deterministic and reproducible +- Audit-safe with timestamps +- Session-bound for traceability +- Explainable with structured payloads +""" + +from typing import Dict, Any, Optional, Callable, List +from datetime import datetime +from enum import Enum + +from src.models.events import EventType, PlatformEvent +from src.models.intelligence_artifact import IntelligenceArtifact +from src.services.storage import FileStorageProvider + + +class EventEmitter: + """ + Central event emitter for viva pipeline. + + Coordinates event propagation across all stages. 
+ """ + + def __init__(self, storage_provider: Optional[FileStorageProvider] = None): + import os + if storage_provider is None: + base_path = os.path.join(os.getcwd(), "session_storage", "events") + os.makedirs(base_path, exist_ok=True) + storage_provider = FileStorageProvider(base_path) + self.storage = storage_provider + self.event_log: List[PlatformEvent] = [] + self.subscribers: Dict[EventType, List[Callable]] = {} + + def subscribe(self, event_type: EventType, callback: Callable) -> None: + """Subscribe to event type.""" + + if event_type not in self.subscribers: + self.subscribers[event_type] = [] + + self.subscribers[event_type].append(callback) + + async def emit( + self, + session_id: str, + event_type: EventType, + payload: Dict[str, Any], + agent_name: str = "VivaPipeline", + ) -> PlatformEvent: + """ + Emit structured event. + + Performs: + - Event creation with timestamp + - Persistence to session storage + - Callback notifications + - Event logging for audit trail + """ + + event = PlatformEvent( + session_id=session_id, + agent_name=agent_name, + event_type=event_type, + payload=payload, + metadata={ + "timestamp_iso": datetime.utcnow().isoformat(), + "event_sequence": len(self.event_log), + }, + ) + + # Log event + self.event_log.append(event) + + # Persist to storage + await self._persist_event(session_id, event) + + # Notify subscribers (sync callbacks run directly; awaitable results are awaited) + if event_type in self.subscribers: + for callback in self.subscribers[event_type]: + try: + if hasattr((result := callback(event)), "__await__"): await result + except Exception as e: + print(f"Subscriber callback error: {e}") + + return event + + async def _persist_event(self, session_id: str, event: PlatformEvent) -> None: + """Persist event to session storage.""" + + event_data = event.model_dump_json(indent=2) + + event_filename = f"event_{event.event_id}_{event.event_type}.json" + + self.storage.append_artifact( + session_id=session_id, + 
artifact_type="RUNTIME_EVENT", + payload={ + "event_id": 
event.event_id, + "event_type": event.event_type, + "agent": event.agent_name, + "timestamp": event.timestamp.isoformat(), + "payload": event.payload, + }, + ) + + +class Stage4EventCoordinator: + """Coordinates event emission for Stage 4: ORACLE → MAIN Handoff.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_oracle_analysis_started(self, session_id: str) -> PlatformEvent: + """ORACLE analysis beginning.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.ORACLE_ANALYSIS_STARTED, + payload={"stage": "4", "status": "starting"}, + agent_name="OracleAgent", + ) + + async def emit_oracle_analysis_complete( + self, session_id: str, artifact: IntelligenceArtifact + ) -> PlatformEvent: + """ORACLE analysis complete, artifact ready.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.ORACLE_INTELLIGENCE_READY, + payload={ + "artifact_id": artifact.artifact_id, + "project_name": artifact.project_name, + "num_viva_targets": len(artifact.viva_targets), + "num_failure_scenarios": len(artifact.failure_scenarios), + "num_weak_points": len(artifact.weak_points), + "analysis_confidence": artifact.analysis_confidence, + "next_stage": "MAIN_AGENT_START_VIVA", + }, + agent_name="OracleAgent", + ) + + +class Stage5EventCoordinator: + """Coordinates event emission for Stage 5: MAIN Agent Live Viva.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_viva_session_started(self, session_id: str) -> PlatformEvent: + """Viva session started.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_SESSION_STARTED, + payload={ + "stage": "5", + "status": "started", + "timestamp": datetime.utcnow().isoformat(), + }, + agent_name="MainAgent", + ) + + async def emit_question_asked( + self, session_id: str, target_id: str, question: str, difficulty: str + ) -> PlatformEvent: + 
"""Question asked to student.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_QUESTION_ASKED, + payload={ + "target_id": target_id, + "question": question, + "difficulty": difficulty, + }, + agent_name="MainAgent", + ) + + async def emit_response_received( + self, session_id: str, target_id: str, response_text: str + ) -> PlatformEvent: + """Student response received.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_RESPONSE_RECEIVED, + payload={ + "target_id": target_id, + "response_length": len(response_text), + "timestamp": datetime.utcnow().isoformat(), + }, + agent_name="MainAgent", + ) + + async def emit_evaluation_complete( + self, + session_id: str, + target_id: str, + depth_level: str, + coverage_score: float, + red_flags: List[str], + ) -> PlatformEvent: + """Answer evaluation complete.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_EVALUATION_COMPLETE, + payload={ + "target_id": target_id, + "depth_level": depth_level, + "coverage_score": coverage_score, + "red_flags_count": len(red_flags), + "has_red_flags": len(red_flags) > 0, + }, + agent_name="MainAgent", + ) + + async def emit_follow_up_generated( + self, session_id: str, target_id: str, follow_up_question: str + ) -> PlatformEvent: + """Adaptive follow-up generated.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_FOLLOW_UP_GENERATED, + payload={ + "target_id": target_id, + "follow_up": follow_up_question, + }, + agent_name="MainAgent", + ) + + async def emit_contradiction_detected( + self, + session_id: str, + target_id: str, + previous_claim: str, + current_claim: str, + severity: str, + ) -> PlatformEvent: + """Contradiction detected.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_CONTRADICTION_DETECTED, + payload={ + "target_id": target_id, + "previous_claim": previous_claim, + 
"current_claim": current_claim, + "severity": severity, + }, + agent_name="MainAgent", + ) + + async def emit_topic_escalated( + self, session_id: str, topic: str, reason: str + ) -> PlatformEvent: + """Topic escalated to deeper questioning.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_TOPIC_ESCALATED, + payload={ + "topic": topic, + "reason": reason, + "increased_difficulty": True, + }, + agent_name="MainAgent", + ) + + async def emit_viva_session_completed( + self, session_id: str, summary: Dict[str, Any] + ) -> PlatformEvent: + """Viva session completed.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VIVA_SESSION_COMPLETED, + payload={ + "total_questions": summary.get("total_questions"), + "average_depth_score": summary.get("average_depth_score"), + "contradictions_found": summary.get("contradictions_found"), + "weak_areas": summary.get("weak_areas"), + "strong_areas": summary.get("strong_areas"), + }, + agent_name="MainAgent", + ) + + +class Stage6EventCoordinator: + """Coordinates event emission for Stage 6: Voice Infrastructure.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_voice_session_started(self, session_id: str) -> PlatformEvent: + """Voice viva session started.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_SESSION_STARTED, + payload={"stage": "6", "status": "started"}, + agent_name="VoiceInfrastructure", + ) + + async def emit_question_played( + self, session_id: str, turn_number: int, duration_seconds: float + ) -> PlatformEvent: + """Question played via TTS.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_QUESTION_PLAYED, + payload={ + "turn_number": turn_number, + "tts_duration_seconds": duration_seconds, + "provider": "system_tts", + }, + agent_name="VoiceInfrastructure", + ) + + async def 
emit_listening_started(self, session_id: str, turn_number: int) -> PlatformEvent: + """Started listening for student response.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_LISTENING_STARTED, + payload={"turn_number": turn_number}, + agent_name="VoiceInfrastructure", + ) + + async def emit_listening_stopped( + self, session_id: str, turn_number: int, duration_seconds: float + ) -> PlatformEvent: + """Stopped listening (silence detected).""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_LISTENING_STOPPED, + payload={ + "turn_number": turn_number, + "recording_duration_seconds": duration_seconds, + }, + agent_name="VoiceInfrastructure", + ) + + async def emit_transcription_received( + self, + session_id: str, + turn_number: int, + transcript_raw: str, + confidence: float, + ) -> PlatformEvent: + """Speech transcription received.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_TRANSCRIPTION_RECEIVED, + payload={ + "turn_number": turn_number, + "transcript_length": len(transcript_raw), + "stt_confidence": confidence, + "provider": "mock_stt", + }, + agent_name="VoiceInfrastructure", + ) + + async def emit_transcription_normalized( + self, + session_id: str, + turn_number: int, + technical_terms: List[str], + ) -> PlatformEvent: + """Transcript normalized with technical terminology.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_TRANSCRIPTION_NORMALIZED, + payload={ + "turn_number": turn_number, + "technical_terms_corrected": technical_terms, + }, + agent_name="VoiceInfrastructure", + ) + + async def emit_voice_session_ended(self, session_id: str, total_turns: int) -> PlatformEvent: + """Voice viva session ended.""" + + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.VOICE_SESSION_ENDED, + payload={ + "total_voice_turns": total_turns, + "status": "completed", 
+ }, + agent_name="VoiceInfrastructure", + ) + +class Stage7EventCoordinator: + """Coordinates SENTINEL integrity events emission for Stage 7.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_integrity_alert(self, session_id: str, alert_payload: Dict[str, Any]) -> PlatformEvent: + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.INTEGRITY_ALERT_GENERATED, + payload=alert_payload, + agent_name="Sentinel", + ) + +class Stage8EventCoordinator: + """Coordinates evaluation loop events for Stage 8.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_implementation_familiarity_updated(self, session_id: str, payload: Dict[str, Any]) -> PlatformEvent: + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.IMPLEMENTATION_FAMILIARITY_UPDATED, + payload=payload, + agent_name="MainAgentEvaluation", + ) + + async def emit_contradiction_chain_updated(self, session_id: str, payload: Dict[str, Any]) -> PlatformEvent: + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.CONTRADICTION_CHAIN_UPDATED, + payload=payload, + agent_name="MainAgentEvaluation", + ) + +class Stage9EventCoordinator: + """Coordinates curriculum transition events for Stage 9.""" + + def __init__(self, event_emitter: EventEmitter): + self.emitter = event_emitter + + async def emit_curriculum_transition_started(self, session_id: str, payload: Dict[str, Any]) -> PlatformEvent: + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.CURRICULUM_TRANSITION_STARTED, + payload=payload, + agent_name="CurriculumEngine", + ) + + async def emit_curriculum_topic_completed(self, session_id: str, payload: Dict[str, Any]) -> PlatformEvent: + return await self.emitter.emit( + session_id=session_id, + event_type=EventType.CURRICULUM_TOPIC_COMPLETED, + payload=payload, + agent_name="CurriculumEngine", + ) + + 
+class RuntimeEventOrchestrator: + """ + Master orchestrator for all runtime events across Stages 4-9. + + Provides unified interface for event emission and subscription. + """ + + def __init__(self, storage_provider: Optional[FileStorageProvider] = None): + self.emitter = EventEmitter(storage_provider) + self.stage4 = Stage4EventCoordinator(self.emitter) + self.stage5 = Stage5EventCoordinator(self.emitter) + self.stage6 = Stage6EventCoordinator(self.emitter) + self.stage7 = Stage7EventCoordinator(self.emitter) + self.stage8 = Stage8EventCoordinator(self.emitter) + self.stage9 = Stage9EventCoordinator(self.emitter) + + def subscribe(self, event_type: EventType, callback: Callable) -> None: + """Subscribe to specific event type.""" + + self.emitter.subscribe(event_type, callback) + + def get_event_log(self) -> List[PlatformEvent]: + """Get complete event log for audit/replay.""" + + return self.emitter.event_log + + def get_events_for_session(self, session_id: str) -> List[PlatformEvent]: + """Get all events for a specific session.""" + + return [e for e in self.emitter.event_log if e.session_id == session_id] + + def get_events_by_type(self, event_type: EventType) -> List[PlatformEvent]: + """Get all events of a specific type.""" + + return [e for e in self.emitter.event_log if e.event_type == event_type] diff --git a/backend/src/services/sentinel_parallel_monitor.py b/backend/src/services/sentinel_parallel_monitor.py new file mode 100644 index 0000000..fbe2228 --- /dev/null +++ b/backend/src/services/sentinel_parallel_monitor.py @@ -0,0 +1,190 @@ +""" +Stage 7 - SENTINEL Parallel Oversight + +SENTINEL is a monitoring-only component. It never asks viva questions or +changes pacing. It only surfaces observable integrity signals. 
+""" + +from datetime import datetime +from typing import Any, Dict, List, Optional + +from src.models.stage_7_8_9 import ( + IntegritySeverity, + IntegritySignalType, + SentinelAlert, + SentinelIntegrityEvent, +) +from src.services.storage import FileStorageProvider + + +class SentinelParallelMonitor: + """Deterministic integrity monitor for live viva sessions.""" + + def __init__(self, session_id: str, storage_provider: Optional[FileStorageProvider] = None): + self.session_id = session_id + if storage_provider is None: + import os + + base_path = os.path.join(os.getcwd(), "session_storage", "sentinel") + os.makedirs(base_path, exist_ok=True) + storage_provider = FileStorageProvider(base_path) + + self.storage = storage_provider + self.integrity_events: List[SentinelIntegrityEvent] = [] + self.alerts: List[SentinelAlert] = [] + + def evaluate_observation(self, turn_index: int, observation: Dict[str, Any]) -> List[SentinelIntegrityEvent]: + """ + Evaluate observable signals and produce structured integrity events. 
+ + Expected observation fields: + - gaze_offscreen_seconds: float + - gaze_shift_count: int + - interruption_count: int + - audio_anomaly_score: float (0-1) + - visibility_ratio: float (0-1) + - response_confidence: float (0-1) + - contradiction_count: int + - silence_ratio: float (0-1) + - environment_change_detected: bool + """ + + events: List[SentinelIntegrityEvent] = [] + + def _emit(signal_type: IntegritySignalType, severity: IntegritySeverity, explanation: str, evidence: Dict[str, Any]) -> None: + event = SentinelIntegrityEvent( + event_id=f"sentinel_{self.session_id}_{turn_index}_{len(self.integrity_events) + len(events)}", + session_id=self.session_id, + signal_type=signal_type, + severity=severity, + explanation=explanation, + evidence=evidence, + replay_metadata={ + "turn_index": turn_index, + "observation_snapshot": observation, + }, + ) + events.append(event) + + gaze_offscreen_seconds = float(observation.get("gaze_offscreen_seconds", 0.0)) + if gaze_offscreen_seconds >= 15.0: + _emit( + IntegritySignalType.PROLONGED_OFFSCREEN_FOCUS, + IntegritySeverity.HIGH if gaze_offscreen_seconds >= 25.0 else IntegritySeverity.MEDIUM, + f"Student looked away from screen for {gaze_offscreen_seconds:.1f} seconds during active response window.", + {"gaze_offscreen_seconds": gaze_offscreen_seconds}, + ) + + gaze_shift_count = int(observation.get("gaze_shift_count", 0)) + if gaze_shift_count >= 6: + _emit( + IntegritySignalType.REPEATED_GAZE_SHIFT, + IntegritySeverity.MEDIUM, + f"Repeated gaze shifts detected ({gaze_shift_count} shifts) in a single turn.", + {"gaze_shift_count": gaze_shift_count}, + ) + + interruption_count = int(observation.get("interruption_count", 0)) + if interruption_count >= 2: + _emit( + IntegritySignalType.SESSION_INTERRUPTION, + IntegritySeverity.MEDIUM if interruption_count < 4 else IntegritySeverity.HIGH, + f"Session interruption pattern detected ({interruption_count} interruptions).", + {"interruption_count": interruption_count}, + ) + + 
audio_anomaly_score = float(observation.get("audio_anomaly_score", 0.0)) + if audio_anomaly_score >= 0.7: + _emit( + IntegritySignalType.SUSPICIOUS_AUDIO_PATTERN, + IntegritySeverity.MEDIUM if audio_anomaly_score < 0.85 else IntegritySeverity.HIGH, + f"Unusual audio pattern score observed ({audio_anomaly_score:.2f}).", + {"audio_anomaly_score": audio_anomaly_score}, + ) + + visibility_ratio = float(observation.get("visibility_ratio", 1.0)) + if visibility_ratio <= 0.55: + _emit( + IntegritySignalType.LOW_VISIBILITY_WARNING, + IntegritySeverity.MEDIUM, + f"Low camera visibility ratio observed ({visibility_ratio:.2f}).", + {"visibility_ratio": visibility_ratio}, + ) + + response_confidence = float(observation.get("response_confidence", 1.0)) + if response_confidence <= 0.45: + _emit( + IntegritySignalType.CONFIDENCE_INSTABILITY, + IntegritySeverity.MEDIUM, + f"Speech/transcript confidence unstable at {response_confidence:.2f}.", + {"response_confidence": response_confidence}, + ) + + contradiction_count = int(observation.get("contradiction_count", 0)) + if contradiction_count >= 2: + _emit( + IntegritySignalType.CONTRADICTION_ESCALATION, + IntegritySeverity.HIGH if contradiction_count >= 3 else IntegritySeverity.MEDIUM, + f"Contradiction escalation frequency detected ({contradiction_count} contradictions).", + {"contradiction_count": contradiction_count}, + ) + + silence_ratio = float(observation.get("silence_ratio", 0.0)) + if silence_ratio >= 0.65: + _emit( + IntegritySignalType.EXCESSIVE_SILENCE_PATTERN, + IntegritySeverity.MEDIUM, + f"Excessive silence detected with ratio {silence_ratio:.2f}.", + {"silence_ratio": silence_ratio}, + ) + + if bool(observation.get("environment_change_detected", False)): + _emit( + IntegritySignalType.ENVIRONMENT_CHANGE, + IntegritySeverity.MEDIUM, + "Environmental change detected during active viva turn.", + {"environment_change_detected": True}, + ) + + self.integrity_events.extend(events) + for event in events: + 
self.storage.append_artifact( + session_id=self.session_id, + artifact_type="SENTINEL_EVENT", + payload=event.model_dump(mode="json"), + ) + + # Manual review recommendation is deterministic from count and severity. + high_severity_count = sum(1 for e in events if e.severity == IntegritySeverity.HIGH) + if high_severity_count >= 1 or len(events) >= 3: + alert = SentinelAlert( + alert_id=f"alert_{self.session_id}_{turn_index}_{len(self.alerts)}", + session_id=self.session_id, + event_ids=[e.event_id for e in events], + manual_review_recommended=True, + reason="Integrity threshold reached based on observable signals.", + ) + self.alerts.append(alert) + self.storage.append_artifact( + session_id=self.session_id, + artifact_type="SENTINEL_ALERT", + payload=alert.model_dump(mode="json"), + ) + + return events + + def get_active_alerts(self) -> List[SentinelAlert]: + """Return all generated alerts for session-level attachment.""" + + return self.alerts + + def attach_alerts_to_exam_session(self, exam_session: Dict[str, Any]) -> Dict[str, Any]: + """Attach SENTINEL alerts to an exam session payload.""" + + updated = dict(exam_session) + existing_alerts = list(updated.get("integrity_alerts", [])) + existing_alerts.extend([a.model_dump(mode="json") for a in self.alerts]) + updated["integrity_alerts"] = existing_alerts + updated["manual_review_recommended"] = any(a.manual_review_recommended for a in self.alerts) + updated["integrity_last_updated_at"] = datetime.utcnow().isoformat() + return updated diff --git a/backend/src/services/viva_session_persistence.py b/backend/src/services/viva_session_persistence.py new file mode 100644 index 0000000..e01ddc4 --- /dev/null +++ b/backend/src/services/viva_session_persistence.py @@ -0,0 +1,286 @@ +""" +Viva Session Persistence Layer — Stage 5 + +Handles deterministic, replay-safe persistence of: +- Viva session state +- Question-answer chains +- Contradiction events +- Evaluation results +- Timestamps for audit trail +""" + +import 
json +import os +from typing import Dict, Any, Optional, List +from datetime import datetime +from src.models.intelligence_artifact import VivaSessionState +from src.services.storage import FileStorageProvider + + +class VivaSessionStore: + """ + Persistent store for viva session data with audit trail support. + + Responsibilities: + - Save/load session state + - Persist question-answer chains + - Log contradiction events + - Support replay verification + - Maintain deterministic artifact references + """ + + def __init__(self, storage_provider: Optional[FileStorageProvider] = None): + import os + if storage_provider is None: + base_path = os.path.join(os.getcwd(), "session_storage", "viva") + os.makedirs(base_path, exist_ok=True) + storage_provider = FileStorageProvider(base_path) + self.storage = storage_provider + + async def save_session_state( + self, session_id: str, viva_session_state: VivaSessionState + ) -> str: + """ + Save viva session state to persistent storage. + + Returns: + Path/ID of saved state + """ + + state_data = viva_session_state.model_dump_json(indent=2) + + state_id = f"viva_state_{datetime.utcnow().isoformat().replace(':', '-')}.json" + + self.storage.append_artifact( + session_id=session_id, + artifact_type="VIVA_SESSION_STATE", + payload={ + "viva_phase": viva_session_state.viva_phase, + "questions_asked": viva_session_state.questions_asked, + "timestamp": datetime.utcnow().isoformat(), + }, + ) + + return state_id + + async def save_question_answer_pair( + self, + session_id: str, + question_data: Dict[str, Any], + answer_data: Dict[str, Any], + evaluation_data: Dict[str, Any], + ) -> str: + """ + Save a question-answer-evaluation triplet. 
+ + Returns: + Transcript segment ID + """ + + qa_pair = { + "question": question_data, + "answer": answer_data, + "evaluation": evaluation_data, + "timestamp": datetime.utcnow().isoformat(), + } + + qa_json = json.dumps(qa_pair, indent=2) + + qa_id = f"qa_pair_{datetime.utcnow().isoformat().replace(':', '-')}_{question_data.get('target_id', 'unknown')}.json" + + self.storage.append_artifact( + session_id=session_id, + artifact_type="VIVA_QA_PAIR", + payload={ + "target_id": question_data.get("target_id"), + "depth_level": answer_data.get("depth_level"), + "timestamp": datetime.utcnow().isoformat(), + }, + ) + + return qa_id + + async def save_contradiction_event( + self, session_id: str, contradiction_event: Dict[str, Any] + ) -> str: + """Save detected contradiction for audit trail.""" + + event_data = { + "type": "CONTRADICTION_DETECTED", + "event": contradiction_event, + "timestamp": datetime.utcnow().isoformat(), + } + + event_json = json.dumps(event_data, indent=2) + + event_id = f"contradiction_{datetime.utcnow().isoformat().replace(':', '-')}.json" + + self.storage.append_artifact( + session_id=session_id, + artifact_type="VIVA_CONTRADICTION", + payload={ + "target_id": contradiction_event.get("target_id"), + "severity": contradiction_event.get("contradiction", {}).get("severity", "MEDIUM"), + "timestamp": datetime.utcnow().isoformat(), + }, + ) + + return event_id + + async def save_session_summary( + self, session_id: str, summary: Dict[str, Any], oracle_artifact_id: str + ) -> str: + """ + Save final viva session summary with references to ORACLE artifact. 
+ + Includes: + - Performance metrics + - Weak/strong areas + - Contradiction events + - ORACLE artifact reference (for replay) + """ + + summary_data = { + "session_summary": summary, + "oracle_artifact_reference": oracle_artifact_id, + "saved_at": datetime.utcnow().isoformat(), + "format_version": "1.0", + } + + summary_json = json.dumps(summary_data, indent=2) + + summary_id = f"viva_summary_{session_id}_{datetime.utcnow().isoformat().replace(':', '-')}.json" + + self.storage.append_artifact( + session_id=session_id, + artifact_type="VIVA_SESSION_SUMMARY", + payload={ + "total_questions": summary.get("total_questions"), + "contradictions": summary.get("contradictions_found"), + "timestamp": datetime.utcnow().isoformat(), + }, + ) + + return summary_id + + async def load_session_history(self, session_id: str) -> Dict[str, Any]: + """ + Load complete viva session history for analysis or replay. + + Returns session transcript, state changes, and key events. + """ + + # This would load from storage; simplified for now + return { + "session_id": session_id, + "qa_pairs": [], + "state_changes": [], + "contradictions": [], + } + + +class VivaTranscriptBuilder: + """ + Builds human-readable and machine-parseable transcripts from viva sessions. 
+ + Supports: + - Plain text transcripts for export + - JSON transcripts for analysis + - Marked contradiction events + - Performance metrics + """ + + @staticmethod + def build_text_transcript(session_summary: Dict[str, Any]) -> str: + """Build plain text transcript.""" + + lines = [] + + lines.append("=" * 60) + lines.append("VIVA SESSION TRANSCRIPT") + lines.append(f"Session ID: {session_summary['session_id']}") + lines.append(f"Timestamp: {session_summary['timestamp']}") + lines.append(f"Total Questions: {session_summary['total_questions']}") + lines.append("=" * 60) + lines.append("") + + for i, qa in enumerate(session_summary.get("questions", []), 1): + lines.append(f"Q{i}: {qa.get('question', 'N/A')}") + lines.append(f" [Category: {qa.get('category')}, Difficulty: {qa.get('difficulty')}]") + lines.append("") + + lines.append("") + lines.append("=" * 60) + lines.append("PERFORMANCE SUMMARY") + lines.append("=" * 60) + lines.append(f"Average Depth Score: {float(session_summary.get('average_depth_score', 0.0)):.2f}") + lines.append(f"Average Coverage: {float(session_summary.get('average_coverage_score', 0.0)):.2f}") + lines.append(f"Weak Areas: {', '.join(session_summary.get('weak_areas', []))}") + lines.append(f"Strong Areas: {', '.join(session_summary.get('strong_areas', []))}") + lines.append(f"Contradictions Found: {session_summary.get('contradictions_found', 0)}") + lines.append("") + + if session_summary.get("contradictions"): + lines.append("=" * 60) + lines.append("CONTRADICTIONS DETECTED") + lines.append("=" * 60) + for c in session_summary.get("contradictions"): + lines.append(f" - {c.get('contradiction', {}).get('previous')} vs {c.get('contradiction', {}).get('current')}") + lines.append("") + + return "\n".join(lines) + + @staticmethod + def build_json_transcript(session_summary: Dict[str, Any]) -> str: + """Build JSON transcript for programmatic analysis.""" + + return json.dumps(session_summary, indent=2, default=str) + + @staticmethod + def 
build_evaluation_report(session_summary: Dict[str, Any]) -> Dict[str, Any]: + """ + Build structured evaluation report from session summary. + + Used for grading and feedback generation. + """ + + return { + "session_id": session_summary["session_id"], + "overall_score": ( + session_summary.get("average_depth_score", 0) + + session_summary.get("average_coverage_score", 0) + ) + / 2, + "depth_assessment": session_summary.get("average_depth_score"), + "coverage_assessment": session_summary.get("average_coverage_score"), + "weak_areas": session_summary.get("weak_areas", []), + "strong_areas": session_summary.get("strong_areas", []), + "contradictions_count": session_summary.get("contradictions_found", 0), + "adaptive_difficulty_final": session_summary.get("final_adaptive_difficulty"), + "viva_phase_final": session_summary.get("viva_phase"), + "recommendations": VivaTranscriptBuilder._generate_recommendations(session_summary), + } + + @staticmethod + def _generate_recommendations(session_summary: Dict[str, Any]) -> List[str]: + """Generate recommendations based on viva performance.""" + + recommendations = [] + + weak_areas = session_summary.get("weak_areas", []) + if weak_areas: + recommendations.append( + f"Student should focus on strengthening understanding in: {', '.join(weak_areas)}" + ) + + contradictions = session_summary.get("contradictions_found", 0) + if contradictions > 3: + recommendations.append("Multiple contradictions detected. Consider additional technical depth assessment.") + + avg_depth = session_summary.get("average_depth_score", 0) + if avg_depth < 3: + recommendations.append("Responses suggest surface-level understanding. 
Recommend remedial study.") + elif avg_depth > 8: + recommendations.append("Student demonstrates strong implementation knowledge.") + + return recommendations diff --git a/backend/src/services/voice_viva_orchestrator.py b/backend/src/services/voice_viva_orchestrator.py new file mode 100644 index 0000000..21a2cd1 --- /dev/null +++ b/backend/src/services/voice_viva_orchestrator.py @@ -0,0 +1,500 @@ +""" +Voice Infrastructure Pipeline — Stage 6 + +Deterministic, turn-based voice viva system that: +1. Plays questions via TTS +2. Records student responses via microphone +3. Transcribes speech to text +4. Normalizes technical terminology +5. Detects silence to finalize responses +6. Delivers finalized transcript to MAIN Agent + +IMPORTANT: Voice system is infrastructure-only. +MAIN Agent remains the viva examiner brain. +""" + +import json +import time +from typing import Dict, Any, Optional, List, Tuple +from datetime import datetime +from enum import Enum + + +class VoicePhase(str, Enum): + """Phases of voice interaction in a turn.""" + + IDLE = "IDLE" + PLAYING_QUESTION = "PLAYING_QUESTION" + QUESTION_PLAYED = "QUESTION_PLAYED" + LISTENING = "LISTENING" + RECORDING = "RECORDING" + SILENCE_DETECTED = "SILENCE_DETECTED" + TRANSCRIBING = "TRANSCRIBING" + TRANSCRIPT_READY = "TRANSCRIPT_READY" + + +class TranscriptNormalizer: + """ + Normalizes technical terminology in transcribed speech. 
+ + Examples: + - "redis" → "Redis" + - "jay son" → "JSON" + - "async" → "async/await" + - "cache invalidation" → "cache invalidation" + """ + + # Technical term corrections + TECHNICAL_CORRECTIONS = { + # Databases + "postgres": "PostgreSQL", + "postgre sql": "PostgreSQL", + "mongodb": "MongoDB", + "redis": "Redis", + # Languages + "python": "Python", + "javascript": "JavaScript", + "type script": "TypeScript", + "go": "Go", + # Frameworks + "fast api": "FastAPI", + "django": "Django", + "react": "React", + "next dot js": "Next.js", + # Protocols + "jay son": "JSON", + "rest": "REST", + "graphql": "GraphQL", + "soap": "SOAP", + # Common terms + "async": "async/await", + "cache invalidation": "cache invalidation", + "race condition": "race condition", + "dead lock": "deadlock", + "jwt": "JWT", + # Acronyms + "oh auth": "OAuth", + "s s l": "SSL", + "h t t p s": "HTTPS", + "c r u d": "CRUD", + "acid": "ACID", + "solid": "SOLID", + } + + @staticmethod + def normalize(transcript: str) -> str: + """ + Normalize technical terminology in transcript. + + Performs deterministic corrections for common speech-to-text errors + with technical terms. + """ + import re + normalized = transcript.lower() + + # Apply technical corrections on whole words only, so "go" does not rewrite "good" or "category" + for wrong, correct in TranscriptNormalizer.TECHNICAL_CORRECTIONS.items(): + normalized = re.sub(rf"\b{re.escape(wrong)}\b", correct, normalized) + + return normalized + + @staticmethod + def extract_technical_terms(transcript: str) -> List[str]: + """Extract identified technical terms from normalized transcript.""" + import re + terms = [] + + for term in TranscriptNormalizer.TECHNICAL_CORRECTIONS.values(): + if re.search(rf"\b{re.escape(term)}\b", transcript, re.IGNORECASE): + terms.append(term) + + return sorted(set(terms)) + + +class SilenceDetector: + """ + Detects silence in audio stream to determine response finalization. 
+ + Parameters: + - silence_threshold_ms: Milliseconds of silence to trigger finalization + - min_response_duration_ms: Minimum response duration before silence ends session + """ + + def __init__( + self, + silence_threshold_ms: int = 3000, + min_response_duration_ms: int = 1000, + max_response_duration_seconds: int = 120, + ): + self.silence_threshold_ms = silence_threshold_ms + self.min_response_duration_ms = min_response_duration_ms + self.max_response_duration_seconds = max_response_duration_seconds + self.recording_start_time: Optional[float] = None + self.last_sound_time: Optional[float] = None + + def start_recording(self) -> None: + """Mark recording start.""" + + self.recording_start_time = time.time() + self.last_sound_time = self.recording_start_time + + def should_finalize(self, is_sound_present: bool) -> bool: + """ + Determine if recording should be finalized. + + Returns True if: + - Silence exceeded threshold after minimum response duration + - Maximum response duration exceeded + """ + + if not self.recording_start_time: + return False + + current_time = time.time() + elapsed_ms = (current_time - self.recording_start_time) * 1000 + + # Max response duration exceeded + if elapsed_ms > self.max_response_duration_seconds * 1000: + return True + + # Update last sound time + if is_sound_present: + self.last_sound_time = current_time + + # Check silence duration + if self.last_sound_time: + silence_duration_ms = (current_time - self.last_sound_time) * 1000 + + # If minimum response given and silence exceeded + if elapsed_ms >= self.min_response_duration_ms and silence_duration_ms >= self.silence_threshold_ms: + return True + + return False + + +class TTSProvider: + """ + Text-to-Speech provider interface. + + Implementations: SystemTTS, GoogleTTS, AzureTTS, etc. + """ + + async def speak(self, text: str, language: str = "en-US") -> Dict[str, Any]: + """ + Play text as speech. 
+ + Returns: + {"success": bool, "duration_seconds": float, "audio_path": str} + """ + + raise NotImplementedError() + + +class SystemTTSProvider(TTSProvider): + """System TTS using platform-native speech synthesis (say on macOS, etc.).""" + + async def speak(self, text: str, language: str = "en-US") -> Dict[str, Any]: + """ + Use system TTS to play question. + + On macOS: uses `say` command + On Linux: uses `espeak` or `festival` + On Windows: uses SAPI + """ + + import subprocess + import sys + + try: + start_time = time.time() + + # Platform-specific TTS command + if sys.platform == "darwin": # macOS + subprocess.run(["say", "-r", "150", text], check=True) + elif sys.platform == "linux": + # Try espeak first, fallback to festival + try: + subprocess.run(["espeak", text], check=True) + except FileNotFoundError: + subprocess.run(["festival", "--tts"], input=text.encode(), check=True) + elif sys.platform == "win32": + # Windows SAPI + import pyttsx3 + + engine = pyttsx3.init() + engine.say(text) + engine.runAndWait() + + duration = time.time() - start_time + + return { + "success": True, + "duration_seconds": duration, + "audio_path": None, + "provider": "system_tts", + } + + except Exception as e: + return { + "success": False, + "error": str(e), + "provider": "system_tts", + } + + +class STTProvider: + """ + Speech-to-Text provider interface. + + Implementations: Deepgram, Google Speech-to-Text, Azure, etc. + """ + + async def transcribe(self, audio_data: bytes) -> Dict[str, Any]: + """ + Transcribe audio to text. 
+ + Returns: + {"success": bool, "transcript": str, "confidence": float} + """ + + raise NotImplementedError() + + +class MockSTTProvider(STTProvider): + """Mock STT for testing (returns pre-configured responses).""" + + def __init__(self): + self.mock_responses = [] + self.response_index = 0 + + def add_mock_response(self, transcript: str, confidence: float = 0.95): + """Add a mock response for testing.""" + + self.mock_responses.append({"transcript": transcript, "confidence": confidence}) + + async def transcribe(self, audio_data: bytes) -> Dict[str, Any]: + """Return next mock response.""" + + if self.response_index < len(self.mock_responses): + response = self.mock_responses[self.response_index] + self.response_index += 1 + return { + "success": True, + "transcript": response["transcript"], + "confidence": response["confidence"], + "provider": "mock_stt", + } + + return { + "success": False, + "error": "No more mock responses", + "provider": "mock_stt", + } + + +class VoiceSessionOrchestrator: + """ + Orchestrates a single voice turn: question → playback → listen → transcribe. + + Deterministic and turn-based. + """ + + def __init__( + self, + tts_provider: TTSProvider, + stt_provider: STTProvider, + silence_detector: Optional[SilenceDetector] = None, + ): + self.tts_provider = tts_provider + self.stt_provider = stt_provider + self.silence_detector = silence_detector or SilenceDetector() + self.phase = VoicePhase.IDLE + + async def conduct_turn( + self, + question: str, + turn_number: int, + max_response_duration_seconds: int = 120, + ) -> Dict[str, Any]: + """ + Conduct a single voice turn: play question → listen → transcribe. 
+ + Returns: + { + "success": bool, + "turn_number": int, + "question": str, + "transcript": str, + "transcript_normalized": str, + "confidence": float, + "duration_seconds": float, + "technical_terms": List[str], + "timestamp": str + } + """ + + turn_start = time.time() + result = { + "turn_number": turn_number, + "question": question, + "success": False, + "timestamp": datetime.utcnow().isoformat(), + } + + try: + # Step 1: Play question + self.phase = VoicePhase.PLAYING_QUESTION + play_result = await self.tts_provider.speak(question) + + if not play_result.get("success"): + result["error"] = f"TTS failed: {play_result.get('error')}" + return result + + self.phase = VoicePhase.QUESTION_PLAYED + result["question_playback_duration_seconds"] = play_result.get("duration_seconds", 0) + + # Step 2: Listen for response + self.phase = VoicePhase.LISTENING + self.silence_detector.start_recording() + + # Simulate recording (in real implementation, would use actual audio device) + # For now, this is a placeholder that returns after timeout + recording_start = time.time() + while (time.time() - recording_start) < 5: # Simulate 5 second recording window + # In real implementation, would check audio device for sound + # and update silence detector + if self.silence_detector.should_finalize(is_sound_present=False): + break + await self._async_sleep(0.1) + + self.phase = VoicePhase.SILENCE_DETECTED + + # Step 3: Transcribe (mock audio for now) + self.phase = VoicePhase.TRANSCRIBING + mock_audio = b"mock_audio_data" # In real impl: actual audio bytes + transcribe_result = await self.stt_provider.transcribe(mock_audio) + + if not transcribe_result.get("success"): + result["error"] = f"STT failed: {transcribe_result.get('error')}" + return result + + # Step 4: Normalize transcript + raw_transcript = transcribe_result.get("transcript", "") + normalized_transcript = TranscriptNormalizer.normalize(raw_transcript) + technical_terms = 
TranscriptNormalizer.extract_technical_terms(normalized_transcript) + + self.phase = VoicePhase.TRANSCRIPT_READY + + # Step 5: Build result + turn_duration = time.time() - turn_start + result.update( + { + "success": True, + "transcript": raw_transcript, + "transcript_normalized": normalized_transcript, + "confidence": transcribe_result.get("confidence", 0.0), + "duration_seconds": turn_duration, + "technical_terms": technical_terms, + "phase_final": self.phase.value, + } + ) + + except Exception as e: + result["error"] = str(e) + result["phase_error"] = self.phase.value + + return result + + async def _async_sleep(self, duration: float) -> None: + """Async sleep utility.""" + + import asyncio + + await asyncio.sleep(duration) + + +class VoiceVivaSession: + """ + Complete voice-based viva session from start to finish. + + Coordinates: + - Question selection from MAIN Agent + - Voice turn orchestration + - Transcript collection + - Session persistence + """ + + def __init__( + self, + session_id: str, + main_agent_orchestrator, + tts_provider: TTSProvider, + stt_provider: STTProvider, + ): + self.session_id = session_id + self.main_agent_orchestrator = main_agent_orchestrator + self.voice_turn_orchestrator = VoiceSessionOrchestrator(tts_provider, stt_provider) + self.turns: List[Dict[str, Any]] = [] + self.session_start_time = datetime.utcnow() + + async def conduct_viva(self, max_turns: int = 15) -> Dict[str, Any]: + """ + Conduct full voice-based viva. + + Returns session transcript and final state. 
+ """ + + for turn_num in range(1, max_turns + 1): + # Get next question from MAIN Agent + target, question_text = self.main_agent_orchestrator.get_next_question() + + if not target: + # All questions done + break + + # Conduct voice turn + voice_turn_result = await self.voice_turn_orchestrator.conduct_turn(question_text, turn_num) + + if not voice_turn_result.get("success"): + # Continue despite STT error + continue + + # Extract normalized transcript + student_response = voice_turn_result.get("transcript_normalized", "") + + # Evaluate response + evaluation = self.main_agent_orchestrator.evaluate_answer(student_response, target) + + # Generate follow-up + follow_up = self.main_agent_orchestrator.generate_follow_up(evaluation, target) + + # Store turn + turn_record = { + "turn_number": turn_num, + "question": question_text, + "target_id": target.target_id, + "voice_turn": voice_turn_result, + "evaluation": evaluation, + "follow_up": follow_up, + } + + self.turns.append(turn_record) + + # If strong follow-up needed, ask it before moving to next question + if follow_up: + # This would be asked as next voice turn + pass + + # Generate session summary + session_summary = self.main_agent_orchestrator.get_session_summary() + session_summary["voice_turns"] = len(self.turns) + session_summary["session_duration_seconds"] = ( + datetime.utcnow() - self.session_start_time + ).total_seconds() + + return { + "session_id": self.session_id, + "success": len(self.turns) > 0, + "turns": self.turns, + "summary": session_summary, + } diff --git a/backend/tests/test_stages_4_5_6_integration.py b/backend/tests/test_stages_4_5_6_integration.py new file mode 100644 index 0000000..0b7203e --- /dev/null +++ b/backend/tests/test_stages_4_5_6_integration.py @@ -0,0 +1,443 @@ +""" +End-to-End Integration Test — Stages 4-6 + +Validates complete flow: +1. ORACLE analysis → IntelligenceArtifact +2. MAIN Agent starts viva with adaptive questions +3. Voice infrastructure conducts turns +4. 
Session persists with full audit trail + +This test demonstrates: +- Deterministic behavior +- Event emission across stages +- Transcript persistence +- Contradiction detection +- Adaptive difficulty progression +""" + +import asyncio +import json +from datetime import datetime +from typing import Dict, Any + +# Stage 4 imports +from src.services.oracle_main_handoff import OracleMainHandoffOrchestrator +from src.models.intelligence_artifact import IntelligenceArtifact + +# Stage 5 imports +from src.services.main_agent_viva_orchestrator import MainAgentVivaOrchestrator + +# Stage 6 imports +from src.services.voice_viva_orchestrator import ( + MockSTTProvider, + SystemTTSProvider, + VoiceVivaSession, +) + +# Event orchestration +from src.services.runtime_event_orchestrator import RuntimeEventOrchestrator + +# Session persistence +from src.services.viva_session_persistence import VivaSessionStore, VivaTranscriptBuilder + + +class Stage456IntegrationTest: + """ + End-to-end integration test for Stages 4-6. + + Demonstrates: + - Complete workflow + - Event emission + - Adaptive behavior + - Persistence + """ + + def __init__(self): + self.test_session_id = f"test_session_{datetime.utcnow().isoformat()}" + self.event_orchestrator = RuntimeEventOrchestrator() + self.viva_session_store = VivaSessionStore() + self.results = {} + + async def run_full_test(self) -> Dict[str, Any]: + """ + Execute full Stages 4-6 flow. 
+ + Returns: + Results with validation status and key metrics + """ + + print("\n" + "=" * 70) + print("STAGE 4-6 INTEGRATION TEST") + print("=" * 70) + + try: + # Stage 4: Create mock ORACLE artifact + print("\n[STAGE 4] Creating mock IntelligenceArtifact from ORACLE...") + artifact = self._create_mock_oracle_artifact() + print(f"✓ Artifact created: {artifact.artifact_id}") + print(f" - Viva targets: {len(artifact.viva_targets)}") + print(f" - Failure scenarios: {len(artifact.failure_scenarios)}") + print(f" - Weak points: {len(artifact.weak_points)}") + + # Emit Stage 4 event + await self.event_orchestrator.stage4.emit_oracle_analysis_complete( + self.test_session_id, artifact + ) + print("✓ ORACLE_INTELLIGENCE_READY event emitted") + + # Stage 5: Initialize MAIN Agent and conduct viva + print("\n[STAGE 5] Initializing MAIN Agent viva orchestration...") + main_orchestrator = MainAgentVivaOrchestrator(artifact) + session_state = main_orchestrator.initialize_session(self.test_session_id) + print(f"✓ Viva session initialized: {session_state.session_id}") + + # Emit Stage 5 start event + await self.event_orchestrator.stage5.emit_viva_session_started(self.test_session_id) + print("✓ VIVA_SESSION_STARTED event emitted") + + # Conduct mock viva turns (without voice) + print("\n[STAGE 5] Conducting viva questions...") + num_turns = 3 + for turn_num in range(num_turns): + print(f"\n Turn {turn_num + 1}/{num_turns}:") + + # Get question + target, question = main_orchestrator.get_next_question() + if not target: + print(" ✗ No more questions") + break + + print(f" Q: {question}") + print(f" Category: {target.category.value}") + + # Emit question event + await self.event_orchestrator.stage5.emit_question_asked( + self.test_session_id, + target.target_id, + question, + target.difficulty, + ) + + # Mock student response + mock_response = self._get_mock_response(target) + print(f" A: {mock_response}") + + # Emit response event + await 
self.event_orchestrator.stage5.emit_response_received( + self.test_session_id, target.target_id, mock_response + ) + + # Evaluate + evaluation = main_orchestrator.evaluate_answer(mock_response, target) + print( + f" Depth: {evaluation['depth_level']}, Coverage: {evaluation['coverage_score']:.2f}" + ) + + # Emit evaluation event + await self.event_orchestrator.stage5.emit_evaluation_complete( + self.test_session_id, + target.target_id, + evaluation["depth_level"], + evaluation["coverage_score"], + evaluation.get("red_flags", []), + ) + + # Generate follow-up + follow_up = main_orchestrator.generate_follow_up(evaluation, target) + if follow_up: + print(f" Follow-up: {follow_up}") + await self.event_orchestrator.stage5.emit_follow_up_generated( + self.test_session_id, target.target_id, follow_up + ) + + # Detect contradictions + if evaluation.get("contradictions"): + for contradiction in evaluation["contradictions"]: + print( + f" ⚠ Contradiction: {contradiction['previous']} vs {contradiction['current']}" + ) + await self.event_orchestrator.stage5.emit_contradiction_detected( + self.test_session_id, + target.target_id, + contradiction["previous"], + contradiction["current"], + contradiction.get("severity", "MEDIUM"), + ) + + # Save QA pair + await self.viva_session_store.save_question_answer_pair( + self.test_session_id, + {"target_id": target.target_id, "question": question}, + {"transcript": mock_response}, + evaluation, + ) + + # Get viva summary + session_summary = main_orchestrator.get_session_summary() + print(f"\n✓ Viva summary:") + print(f" - Questions: {session_summary['total_questions']}") + print(f" - Avg depth score: {session_summary['average_depth_score']:.2f}") + print(f" - Contradictions: {session_summary['contradictions_found']}") + print(f" - Weak areas: {', '.join(session_summary['weak_areas']) or 'None'}") + + # Emit completion event + await self.event_orchestrator.stage5.emit_viva_session_completed( + self.test_session_id, session_summary + ) + 
print("✓ VIVA_SESSION_COMPLETED event emitted") + + # Stage 6: Mock voice session (without actual audio) + print("\n[STAGE 6] Initializing voice infrastructure...") + mock_stt = MockSTTProvider() + mock_stt.add_mock_response("Redis is used for caching in the request pipeline", 0.95) + mock_stt.add_mock_response( + "It provides fast in-memory storage for frequently accessed data", 0.92 + ) + + tts_provider = SystemTTSProvider() + + # Create voice session + voice_session = VoiceVivaSession( + self.test_session_id, + main_orchestrator, + tts_provider, + mock_stt, + ) + + print("✓ Voice session initialized") + + # Emit voice start event + await self.event_orchestrator.stage6.emit_voice_session_started(self.test_session_id) + + # Conduct 1 voice turn to demonstrate + print("\n[STAGE 6] Conducting sample voice turn...") + target, question = main_orchestrator.get_next_question() + if target: + print(f" Question: {question}") + + # Emit playing event + await self.event_orchestrator.stage6.emit_question_played( + self.test_session_id, 1, 2.5 + ) + print(" ✓ Question played via TTS") + + # Emit listening start + await self.event_orchestrator.stage6.emit_listening_started(self.test_session_id, 1) + print(" ✓ Listening started") + + # Simulate recording + await asyncio.sleep(0.5) + + # Emit listening stop + await self.event_orchestrator.stage6.emit_listening_stopped( + self.test_session_id, 1, 3.2 + ) + print(" ✓ Silence detected, recording stopped") + + # Mock transcription + transcription = await mock_stt.transcribe(b"mock_audio") + if transcription.get("success"): + await self.event_orchestrator.stage6.emit_transcription_received( + self.test_session_id, + 1, + transcription.get("transcript"), + transcription.get("confidence"), + ) + print(f" ✓ Transcription: {transcription.get('transcript')}") + + # Emit normalization + from src.services.voice_viva_orchestrator import TranscriptNormalizer + + normalized = TranscriptNormalizer.normalize( + transcription.get("transcript") + 
) + technical_terms = TranscriptNormalizer.extract_technical_terms(normalized) + await self.event_orchestrator.stage6.emit_transcription_normalized( + self.test_session_id, 1, technical_terms + ) + print(f" ✓ Normalized with terms: {technical_terms}") + + # Emit voice session end + await self.event_orchestrator.stage6.emit_voice_session_ended( + self.test_session_id, 1 + ) + print("✓ VOICE_SESSION_ENDED event emitted") + + # Save session summary + await self.viva_session_store.save_session_summary( + self.test_session_id, session_summary, artifact.artifact_id + ) + print("✓ Session summary persisted") + + # Generate transcripts + print("\n[PERSISTENCE] Generating transcripts...") + text_transcript = VivaTranscriptBuilder.build_text_transcript(session_summary) + print("✓ Text transcript generated") + + eval_report = VivaTranscriptBuilder.build_evaluation_report(session_summary) + print("✓ Evaluation report generated") + + # Collect results + self.results = { + "status": "SUCCESS", + "test_session_id": self.test_session_id, + "stage4": { + "artifact_id": artifact.artifact_id, + "viva_targets": len(artifact.viva_targets), + "deterministic_hash": artifact.deterministic_hash, + }, + "stage5": { + "questions_asked": session_summary["total_questions"], + "average_depth_score": session_summary["average_depth_score"], + "average_coverage_score": session_summary["average_coverage_score"], + "contradictions_found": session_summary["contradictions_found"], + "weak_areas": session_summary["weak_areas"], + "strong_areas": session_summary["strong_areas"], + }, + "stage6": { + "voice_turns": 1, + "mock_stt_confidence": 0.95, + }, + "events_emitted": len(self.event_orchestrator.get_event_log()), + "event_types": list( + set(e.event_type for e in self.event_orchestrator.get_event_log()) + ), + } + + # Print event log summary + print("\n[EVENTS] Summary of emitted events:") + event_log = self.event_orchestrator.get_event_log() + for i, event in enumerate(event_log, 1): + print(f" 
{i:2d}. {event.event_type}") + + print("\n" + "=" * 70) + print("✓ INTEGRATION TEST PASSED") + print("=" * 70) + + return self.results + + except Exception as e: + self.results = { + "status": "FAILED", + "error": str(e), + } + print(f"\n✗ TEST FAILED: {e}") + import traceback + + traceback.print_exc() + return self.results + + def _create_mock_oracle_artifact(self) -> IntelligenceArtifact: + """Create mock ORACLE artifact for testing.""" + + from src.models.intelligence_artifact import ( + VivaTarget, + ExecutionNode, + ExecutionPath, + FailureScenario, + WeakPoint, + IntelligenceCategory, + ) + + return IntelligenceArtifact( + session_id=self.test_session_id, + analysis_duration_seconds=2.5, + project_name="Test Project", + project_type="Web Application", + backend_stack={"framework": "FastAPI", "database": "PostgreSQL", "cache": "Redis"}, + architecture_pattern="MVC", + execution_graph_nodes=[ + ExecutionNode( + node_id="req_handler", + label="Request Handler", + node_type="REQUEST_HANDLER", + implementation_details="Receives HTTP request", + ), + ExecutionNode( + node_id="middleware", + label="Auth Middleware", + node_type="MIDDLEWARE", + implementation_details="Validates JWT token", + ), + ], + execution_paths=[ + ExecutionPath( + path_id="happy_path", + description="Normal flow", + nodes=["req_handler", "middleware"], + scenario="HAPPY_PATH", + criticality="HIGH", + ) + ], + viva_targets=[ + VivaTarget( + target_id="target_1", + question="How does your caching strategy handle concurrent writes?", + category=IntelligenceCategory.SCALABILITY, + difficulty="HARD", + depth_score=8.0, + why_important="Tests understanding of race conditions", + expected_coverage=[ + "Locking mechanism", + "Cache invalidation", + "Consistency guarantees", + ], + red_flags=["No concurrency handling", "Hand-waving solution"], + ), + VivaTarget( + target_id="target_2", + question="What happens if database connection fails?", + category=IntelligenceCategory.FAILURE_PATH, + 
difficulty="MEDIUM", + depth_score=6.0, + why_important="Tests error handling knowledge", + ), + ], + failure_scenarios=[ + FailureScenario( + scenario_name="Database Connection Failure", + trigger="DB unreachable", + propagation_path=["DB Query", "Error Handler"], + impact="Request fails", + severity="HIGH", + detectability="EASY", + ) + ], + weak_points=[ + WeakPoint( + area="Concurrency", + weakness="Race conditions in cache invalidation", + why_problematic="Shows lack of understanding of concurrent systems", + testing_approach="Ask about locking mechanisms", + ) + ], + summary="Mock artifact for testing", + analysis_confidence=0.95, + ) + + def _get_mock_response(self, target: Any) -> str: + """Generate mock response based on target.""" + + if "caching" in target.question.lower(): + return "We use Redis for caching with TTL-based invalidation and handle concurrent writes with atomic operations." + + elif "database" in target.question.lower(): + return "If database fails, we have circuit breaker and fallback to cache or return error." + + else: + return "The system is designed with consideration for error handling and recovery mechanisms." 
+ + +async def main(): + """Run the integration test.""" + + test = Stage456IntegrationTest() + results = await test.run_full_test() + + # Print final results as JSON + print("\nFinal Results:") + print(json.dumps(results, indent=2, default=str)) + + +if __name__ == "__main__": + asyncio.run(main()) diff --git a/backend/tests/test_stages_7_8_9_integration.py b/backend/tests/test_stages_7_8_9_integration.py new file mode 100644 index 0000000..82ea2ea --- /dev/null +++ b/backend/tests/test_stages_7_8_9_integration.py @@ -0,0 +1,78 @@ +""" +Integration test for Stages 7-9: SENTINEL, Evaluation loop, Curriculum transition +""" + +import asyncio +import json +from datetime import datetime + +from src.services.runtime_event_orchestrator import RuntimeEventOrchestrator +from src.services.main_agent_viva_orchestrator import MainAgentVivaOrchestrator +from src.services.sentinel_parallel_monitor import SentinelParallelMonitor +from src.services.main_agent_evaluation_loop import MainAgentEvaluationLoop +from src.services.curriculum_question_engine import CurriculumQuestionEngine +from src.models.intelligence_artifact import IntelligenceArtifact + + +async def run_test(): + test_session_id = f"test_session_7_8_9_{datetime.utcnow().isoformat()}" + event_orchestrator = RuntimeEventOrchestrator() + + # Create mock artifact similar to Stage 4 + artifact = IntelligenceArtifact( + session_id=test_session_id, + analysis_duration_seconds=1.0, + project_name="Curriculum Test", + project_type="Web App", + backend_stack={"framework": "FastAPI", "database": "PostgreSQL", "cache": "Redis"}, + architecture_pattern="MVC", + viva_targets=[], + summary="Test", + analysis_confidence=0.9, + ) + + main = MainAgentVivaOrchestrator(artifact) + main.initialize_session(test_session_id) + + # Attach SENTINEL and evaluation loop + sentinel = SentinelParallelMonitor(test_session_id) + evaluation = MainAgentEvaluationLoop(main) + curriculum = CurriculumQuestionEngine(artifact, test_session_id) + + 
main.attach_sentinel(sentinel) + main.attach_evaluation_loop(evaluation) + main.attach_curriculum_engine(curriculum) + + # Simulate a response with sentinel observations + target_stub = type( + "T", + (), + { + "target_id": "t1", + "question": "How do you cache?", + "category": type("C", (), {"value": "SCALABILITY"}), + "expected_coverage": ["lock", "evict"], + "red_flags": [], + "difficulty": "MEDIUM", + "depth_score": 5.0, + "follow_up_paths": ["Explain locking at code level"], + }, + ) + response = "We use Redis with eviction and locking in critical sections." + + enriched, artifact_eval, follow_up = evaluation.process_finalized_response(1, target_stub, response) + print(json.dumps(enriched, indent=2, default=str)) + + # Start curriculum transition + if curriculum.should_start_transition(implementation_turns_completed=2): + curriculum.start_transition() + q = curriculum.get_next_question() + print("Curriculum question:", q.prompt) + score, completed = curriculum.evaluate_answer(q, "Memory eviction and latency trade-off explained with memory considerations.") + print(score, completed) + + print("SENTINEL events:", len(sentinel.integrity_events)) + + +if __name__ == "__main__": + asyncio.run(run_test()) diff --git a/frontend/app.js b/frontend/app.js deleted file mode 100644 index bbbfbd4..0000000 --- a/frontend/app.js +++ /dev/null @@ -1,481 +0,0 @@ -'use strict'; -// ═══════════════════════════════════════════ -// El — AI Examiner | app.js -// ✅ Always-on voice input (SpeechRecognition) -// ✅ Mic auto-starts on boot, pauses when El speaks -// ✅ Transcribed text auto-sends after speech ends -// ✅ Camera + mic media handled in browser -// ✅ AI calls go to /api/chat (backend — no key here) -// ═══════════════════════════════════════════ - -const BACKEND_URL = 'http://localhost:3333/api/chat'; - -// ── DOM ────────────────────────────────────────────────────────────── -const gate = document.getElementById('gate'); -const gateBtn = document.getElementById('gate-btn'); 
-const gateLoading = document.getElementById('gate-loading'); -const gateErr = document.getElementById('gate-err'); - -const app = document.getElementById('app'); -const camVideo = document.getElementById('cam-video'); -const camAvatar = document.getElementById('cam-avatar'); -const camSpeakRing = document.getElementById('cam-speak-ring'); -const mmBars = document.querySelectorAll('.mm-bars i'); -const mmMuted = document.getElementById('mm-muted'); -const speakerBtn = document.getElementById('speaker-btn'); -const spkOn = document.getElementById('spk-on'); -const spkOff = document.getElementById('spk-off'); - -const elsOrb = document.getElementById('els-orb'); -const elsState = document.getElementById('els-state'); -const elsWaves = document.getElementById('els-waves'); - -const heroOrb = document.getElementById('hero-orb'); -const heroText = document.getElementById('hero-text'); -const heroSub = document.getElementById('hero-sub'); -const quickChips = document.getElementById('quick-chips'); -const hero = document.getElementById('hero'); -const msgs = document.getElementById('msgs'); -const chatScroll = document.getElementById('chat-scroll'); - -const chatInput = document.getElementById('chat-input'); -const plusBtn = document.getElementById('plus-btn'); -const fileInput = document.getElementById('file-input'); -const fileChips = document.getElementById('file-chips'); -const sendBtn = document.getElementById('send-btn'); -const deniedOverlay = document.getElementById('denied-overlay'); -const listenRing = document.getElementById('listen-ring'); -const listenLabel = document.getElementById('listen-label'); - -// ── STATE ──────────────────────────────────────────────────────────── -let stream = null; -let audioCtx = null; -let analyser = null; -let vizRAF = null; -let voiceEnabled = true; -let isSpeaking = false; -let pendingFiles = []; -let conversation = []; - -// Speech Recognition -const SR = window.SpeechRecognition || window.webkitSpeechRecognition; -let 
recognition = null; -let isListening = false; -let autoSendTimer = null; -let srEnabled = !!SR; // false if browser doesn't support it - -// ── UTILS ──────────────────────────────────────────────────────────── -const show = el => el && el.classList.remove('hidden'); -const hide = el => el && el.classList.add('hidden'); -const sleep = ms => new Promise(r => setTimeout(r, ms)); -const now = () => new Date().toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' }); - -// ── GATE ───────────────────────────────────────────────────────────── -async function requestPermissionsAndBoot(launchExam = false) { - hide(gateErr); - document.getElementById('gate-btn').style.display = 'none'; - show(gateLoading); - - try { - stream = await navigator.mediaDevices.getUserMedia({ - video: { width: { ideal: 1280 }, height: { ideal: 720 }, facingMode: 'user' }, - audio: { echoCancellation: true, noiseSuppression: true } - }); - - gate.classList.add('out'); - await sleep(450); - gate.style.display = 'none'; - show(app); - bootApp(launchExam); - } catch (err) { - document.getElementById('gate-btn').style.display = ''; - hide(gateLoading); - handlePermErr(err); - } -} - -gateBtn.addEventListener('click', () => requestPermissionsAndBoot(true)); - -// ── BOOT ───────────────────────────────────────────────────────────── -function bootApp(launchExam = false) { - camVideo.srcObject = stream; - camVideo.play().catch(() => {}); - if (stream.getAudioTracks().length > 0) startAnalyser(); - startSession(launchExam); -} - -// ── SESSION START ──────────────────────────────────────────────────── -function startSession(launchExam = false) { - if (heroText) heroText.style.display = 'none'; - - if (launchExam) { - const examMsg = "Let's start the exam. 
Please begin!"; - addElMsg(examMsg, true); - elSpeak(examMsg); - conversation.push({ role: 'model', parts: [{ text: examMsg }] }); - chatInput.value = "Let's start the exam."; - sendMessage(); - } else { - const h = new Date().getHours(); - const greet = h < 12 ? 'Good morning' : h < 17 ? 'Good afternoon' : 'Good evening'; - const opening = `${greet}! I'm El, your AI Examiner. How can I help you today?`; - addElMsg(opening, true); - elSpeak(opening); - conversation.push({ role: 'model', parts: [{ text: opening }] }); - } - setTimeout(startListening, 800); -} - -// ══════════════════════════════════════════════ -// SPEECH RECOGNITION — always-on voice input -// ══════════════════════════════════════════════ -function initSR() { - if (!srEnabled) return; - - recognition = new SR(); - recognition.lang = 'en-US'; - recognition.continuous = false; // restart loop gives more reliability - recognition.interimResults = true; - recognition.maxAlternatives = 1; - - recognition.onstart = () => { - isListening = true; - setListenUI(true); - }; - - recognition.onresult = (e) => { - let interim = ''; - let final = ''; - for (let i = e.resultIndex; i < e.results.length; i++) { - if (e.results[i].isFinal) final += e.results[i][0].transcript; - else interim += e.results[i][0].transcript; - } - - const spoken = (final || interim).trim(); - if (spoken) { - chatInput.value = spoken; - autoResize(); - } - - if (final.trim()) { - // Small pause to let user finish sentence, then auto-send - clearTimeout(autoSendTimer); - autoSendTimer = setTimeout(() => { - if (chatInput.value.trim()) sendMessage(); - }, 900); - } - }; - - recognition.onend = () => { - isListening = false; - setListenUI(false); - // Restart loop unless El is speaking - if (!isSpeaking && srEnabled) { - setTimeout(startListening, 250); - } - }; - - recognition.onerror = (e) => { - isListening = false; - setListenUI(false); - // 'aborted' and 'no-speech' are normal — just restart - if (e.error !== 'aborted' && e.error !== 
'no-speech') { - console.warn('[SR]', e.error); - } - if (!isSpeaking && srEnabled) { - setTimeout(startListening, 400); - } - }; -} - -function startListening() { - if (!srEnabled || isSpeaking || isListening) return; - try { - if (!recognition) initSR(); - recognition.start(); - } catch (_) { - // Already started — ignore - } -} - -function stopListening() { - if (recognition && isListening) { - try { recognition.stop(); } catch (_) {} - isListening = false; - setListenUI(false); - } -} - -function setListenUI(active) { - if (listenRing) listenRing.classList.toggle('active', active); - const listenDot = document.getElementById('listen-dot'); - if (listenDot) listenDot.classList.toggle('on', active); - if (listenLabel) listenLabel.textContent = active ? '🎙️ Listening — speak now…' : 'Speak your answer'; - chatInput.placeholder = active - ? '🎙️ Listening — speak now…' - : 'Write your answer…'; -} - -// ── MIC ANALYSER ───────────────────────────────────────────────────── -function startAnalyser() { - try { - audioCtx = new (window.AudioContext || window.webkitAudioContext)(); - const src = audioCtx.createMediaStreamSource(stream); - analyser = audioCtx.createAnalyser(); - analyser.fftSize = 64; - analyser.smoothingTimeConstant = 0.75; - src.connect(analyser); - const data = new Uint8Array(analyser.frequencyBinCount); - function draw() { - vizRAF = requestAnimationFrame(draw); - analyser.getByteFrequencyData(data); - const avg = data.reduce((a, b) => a + b, 0) / data.length / 255; - mmBars.forEach((bar, i) => { - const idx = Math.floor((i / mmBars.length) * data.length); - bar.style.height = Math.max(4, (data[idx] / 255) * 24) + 'px'; - bar.style.opacity = 0.3 + (data[idx] / 255) * 0.7; - }); - camSpeakRing.classList.toggle('active', avg > 0.05); - } - draw(); - } catch (e) { console.warn('AudioCtx', e); } -} - -// ── EL SPEAKS ──────────────────────────────────────────────────────── -function elSpeak(text) { - if (!voiceEnabled) { - // No TTS — start listening 
immediately - setTimeout(startListening, 300); - return; - } - window.speechSynthesis.cancel(); - - const utter = new SpeechSynthesisUtterance(text); - utter.rate = 0.95; - utter.pitch = 1.0; - utter.volume = 1.0; - - const voices = window.speechSynthesis.getVoices(); - const prefs = ['Google UK English Female','Microsoft Zira Desktop','Samantha','Karen','Moira']; - for (const name of prefs) { - const v = voices.find(v => v.name === name); - if (v) { utter.voice = v; break; } - } - if (!utter.voice) utter.voice = voices.find(v => v.lang.startsWith('en')) || null; - - utter.onstart = () => { - isSpeaking = true; - stopListening(); // pause mic while El talks - heroOrb.classList.add('speaking'); - elsOrb.classList.add('pulse'); - elsState.textContent = 'Speaking…'; - show(elsWaves); - }; - - utter.onend = utter.onerror = () => { - isSpeaking = false; - heroOrb.classList.remove('speaking'); - elsOrb.classList.remove('pulse'); - elsState.textContent = 'Listening for you…'; - hide(elsWaves); - // Auto-start mic after El finishes - setTimeout(startListening, 400); - }; - - window.speechSynthesis.speak(utter); -} - -if (speechSynthesis.onvoiceschanged !== undefined) - speechSynthesis.onvoiceschanged = () => speechSynthesis.getVoices(); - -speakerBtn.addEventListener('click', () => { - voiceEnabled = !voiceEnabled; - if (!voiceEnabled) window.speechSynthesis.cancel(); - voiceEnabled ? 
(show(spkOn), hide(spkOff)) : (hide(spkOn), show(spkOff)); -}); - -// ── TOGGLE CAMERA — removed, camera is always-on ──────────────────── - -// ── FILE UPLOAD ────────────────────────────────────────────────────── -plusBtn.addEventListener('click', () => fileInput.click()); -fileInput.addEventListener('change', () => { - Array.from(fileInput.files).forEach(f => { - pendingFiles.push(f); - const chip = document.createElement('div'); - chip.className = 'fc'; - chip.innerHTML = `📎 ${trunc(f.name, 10)} `; - chip.querySelector('button').addEventListener('click', () => { - pendingFiles = pendingFiles.filter(x => x.name !== f.name); - chip.remove(); - }); - fileChips.appendChild(chip); - }); - fileInput.value = ''; -}); -const trunc = (s, n) => s.length <= n ? s : s.slice(0, n) + '…'; - -// ── SEND MESSAGE ───────────────────────────────────────────────────── -async function sendMessage() { - const text = chatInput.value.trim(); - if (!text && pendingFiles.length === 0) return; - - clearTimeout(autoSendTimer); - stopListening(); // pause while processing - - if (isSpeaking) window.speechSynthesis.cancel(); - - // Collapse hero on first message - if (hero && !hero.classList.contains('gone')) { - hero.style.display = 'none'; - hero.classList.add('gone'); - } - - // User bubble - const userEl = document.createElement('div'); - userEl.className = 'msg-user'; - if (text) userEl.textContent = text; - pendingFiles.forEach(f => { - const t = document.createElement('div'); - t.className = 'file-tag'; - t.textContent = `📎 ${f.name}`; - userEl.appendChild(t); - }); - msgs.appendChild(userEl); - scrollBot(); - - const userText = text || `[User uploaded: ${pendingFiles.map(f => f.name).join(', ')}]`; - pendingFiles = []; - fileChips.innerHTML = ''; - chatInput.value = ''; - autoResize(); - - conversation.push({ role: 'user', parts: [{ text: userText }] }); - - sendBtn.disabled = true; - elsState.textContent = 'Thinking…'; - const typingWrap = showTyping(); - - let reply = ''; - try { 
- reply = await callBackend(conversation); - } catch (e) { - reply = "I had a little trouble connecting. Could you try again?"; - console.error('[chat]', e.message); - } - - typingWrap.remove(); - sendBtn.disabled = false; - - addElMsg(reply, true); - elSpeak(reply); // mic auto-restarts after El finishes - conversation.push({ role: 'model', parts: [{ text: reply }] }); -} - -sendBtn.addEventListener('click', sendMessage); -chatInput.addEventListener('keydown', e => { - if (e.key === 'Enter' && !e.shiftKey) { e.preventDefault(); sendMessage(); } -}); - -// ── CALL BACKEND ───────────────────────────────────────────────────── -async function callBackend(messages) { - const res = await fetch(BACKEND_URL, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({ messages }) - }); - const data = await res.json(); - if (!res.ok) throw new Error(data.error || `HTTP ${res.status}`); - return data.reply; -} - -// ── ADD EL BUBBLE ──────────────────────────────────────────────────── -function addElMsg(text, animate = false) { - const wrap = document.createElement('div'); - wrap.className = 'msg-el'; - const av = document.createElement('div'); - av.className = 'el-av'; - const right = document.createElement('div'); - const bbl = document.createElement('div'); - bbl.className = 'el-bbl'; - const timeEl = document.createElement('div'); - timeEl.className = 'el-time'; - timeEl.innerHTML = `${now()} `; - right.append(bbl, timeEl); - wrap.append(av, right); - msgs.appendChild(wrap); - scrollBot(); - - if (!animate) { bbl.textContent = text; return; } - let i = 0; - const iv = setInterval(() => { - bbl.textContent = text.slice(0, i++); - if (i > text.length) clearInterval(iv); - scrollBot(); - }, 16); -} - -// ── TYPING INDICATOR ───────────────────────────────────────────────── -function showTyping() { - const wrap = document.createElement('div'); - wrap.className = 'msg-el'; - const av = document.createElement('div'); - av.className = 
'el-av'; - const bbl = document.createElement('div'); - bbl.className = 'el-bbl'; - bbl.innerHTML = '
'; - wrap.append(av, bbl); - msgs.appendChild(wrap); - scrollBot(); - return wrap; -} - -function useChip(btn) { - chatInput.value = btn.textContent.replace(/^[^\w]+/, '').trim(); - chatInput.focus(); -} - -function autoResize() { - chatInput.style.height = 'auto'; - chatInput.style.height = Math.min(chatInput.scrollHeight, 130) + 'px'; -} -chatInput.addEventListener('input', autoResize); -function scrollBot() { chatScroll.scrollTop = chatScroll.scrollHeight; } - -// ── PERMISSION ERROR ───────────────────────────────────────────────── -function handlePermErr(err) { - if (['NotAllowedError', 'PermissionDeniedError'].includes(err.name)) { - show(deniedOverlay); return; - } - gateErr.textContent = `Could not access media: ${err.message}`; - show(gateErr); -} - -// ── BROWSER CHECK ──────────────────────────────────────────────────── -if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) { - gateBtn.disabled = true; - gateErr.textContent = 'Use Chrome, Firefox, Edge, or Safari over localhost / HTTPS.'; - show(gateErr); -} - -if (!SR) { - console.warn('SpeechRecognition not supported in this browser. Use Chrome for voice input.'); -} - -// ── CLEANUP ────────────────────────────────────────────────────────── -window.addEventListener('beforeunload', () => { - window.speechSynthesis.cancel(); - stopListening(); - if (vizRAF) cancelAnimationFrame(vizRAF); - if (audioCtx) audioCtx.close(); - if (stream) stream.getTracks().forEach(t => t.stop()); -}); - -// ── REMOVE SPLINE LOGO ─────────────────────────────────────────────── -setInterval(() => { - document.querySelectorAll('spline-viewer').forEach(viewer => { - if (viewer.shadowRoot) { - const logo = viewer.shadowRoot.querySelector('#logo'); - if (logo) logo.remove(); - } - }); -}, 1000); diff --git a/frontend/index.html b/frontend/index.html index c3fa418..a478382 100644 --- a/frontend/index.html +++ b/frontend/index.html @@ -1,172 +1,194 @@ - - El — AI Examiner + ORACLE Viva Operations - + - - - +
+
+
+

University Viva Infrastructure

+

ORACLE Viva Operations

+

Session-centered administration for pre-viva preparation, gatekeeper validation, and ORACLE analysis start.

+
+
+ Backend + Checking… + Connected to the exam-session API when available. +
+
- -
-
- - -
-

Hi, I'm El 👋

-

Your personal AI Examiner.
I just need your camera & microphone to get started.

-
- -
- - -
+
- - + + +
+
+ State + +
+
+ Students + 0 +
+
+ Admissions + 0 +
+
+ + - -
-
-
- El - Ready +
+ +
- - -
- MIC -
- -
-
- - -
-
-
-
- - +
+

Audit-safe event log

+
+
+ + +
+
+
+

Stage 3

+

ORACLE analysis start

-

- -
-
-
+
+

Analysis status

+
Waiting for a live session and successful gatekeeper admission.
+
- -
-
- -
-
- -
-
+
+

Attached artifacts

+
- - -
- -
-
- -
- - Speak your answer -
-
- - - + - - \ No newline at end of file + diff --git a/frontend/orb.js b/frontend/orb.js index d2870db..99a6838 100644 --- a/frontend/orb.js +++ b/frontend/orb.js @@ -1,3 +1,328 @@ +"use strict"; + +const API_BASE = 'http://localhost:8000'; +const stageLabels = ['DRAFT', 'CONFIGURED', 'READY', 'LIVE', 'ACTIVE_VIVA', 'COMPLETED', 'ARCHIVED']; + +const state = { + sessions: [], + selectedSessionId: null, + latestDecision: null, + latestOracle: null, + latestEvents: [], +}; + +const $ = (id) => document.getElementById(id); + +const els = { + backendStatus: $('backend-status'), + backendNote: $('backend-note'), + stageStrip: $('stage-strip'), + sessionSelect: $('session-select'), + sessionState: $('session-state'), + sessionCount: $('session-count'), + admissionCount: $('admission-count'), + gatekeeperOutput: $('gatekeeper-output'), + oracleOutput: $('oracle-output'), + eventFeed: $('event-feed'), + artifactList: $('artifact-list'), + sessionList: $('session-list'), +}; + +function toDateTimeLocal(value) { + if (!value) return ''; + const date = new Date(value); + const pad = (n) => String(n).padStart(2, '0'); + return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}T${pad(date.getHours())}:${pad(date.getMinutes())}`; +} + +function parseDateTimeLocal(value) { + return value ? 
new Date(value).toISOString() : null; +} + +function parseRubric() { + const lines = $('rubric-criteria').value.split('\n').map((line) => line.trim()).filter(Boolean); + const criteria = lines.map((line, index) => { + const [name = `Criterion ${index + 1}`, score = '10', description = ''] = line.split('|').map((part) => part.trim()); + return { name, max_score: Number(score) || 10, description: description || null }; + }); + return { + title: $('rubric-title').value.trim() || 'Default Viva Rubric', + criteria, + }; +} + +function parseTimingWindow() { + return { + opens_at: parseDateTimeLocal($('opens-at').value), + closes_at: parseDateTimeLocal($('closes-at').value), + viva_duration_minutes: Number($('duration').value) || 15, + check_in_grace_minutes: Number($('grace').value) || 5, + }; +} + +function parseConfig() { + return { + subject: $('subject').value.trim(), + course: $('course').value.trim(), + semester: $('semester').value.trim(), + subject_code: $('subject-code').value.trim() || null, + instructor_name: $('instructor').value.trim() || null, + exam_coordinator: $('coordinator').value.trim() || null, + timing_window: parseTimingWindow(), + rubric: parseRubric(), + }; +} + +function parseSubmissions() { + return $('student-submissions').value + .split('\n') + .map((line) => line.trim()) + .filter(Boolean) + .map((line) => { + const [roll_number = '', repository_url = '', documents = '', batch_label = ''] = line.split('|').map((part) => part.trim()); + return { + roll_number, + repository_url: repository_url || null, + document_paths: documents ? 
documents.split(',').map((item) => item.trim()).filter(Boolean) : [], + batch_label: batch_label || null, + }; + }); +} + +async function request(path, options = {}) { + const response = await fetch(`${API_BASE}${path}`, { + ...options, + headers: { 'Content-Type': 'application/json', ...(options.headers || {}) }, + }); + const data = await response.json(); + if (!response.ok) { + throw new Error(data?.detail || data?.error || `Request failed: ${response.status}`); + } + return data; +} + +function setBackendStatus(ok, note) { + els.backendStatus.textContent = ok ? 'Online' : 'Offline'; + els.backendStatus.classList.toggle('ok', ok); + els.backendNote.textContent = note; +} + +function renderStageStrip(session) { + const current = session?.state || 'DRAFT'; + const currentIndex = stageLabels.indexOf(current); + els.stageStrip.innerHTML = stageLabels.map((label) => ` +
+ ${label.replaceAll('_', ' ')} +
+ `).join(''); +} + +function renderSessions() { + const sessions = state.sessions; + els.sessionSelect.innerHTML = sessions.length + ? sessions.map((session) => ``).join('') + : ''; + els.sessionList.innerHTML = sessions.length + ? sessions.map((session) => ` + + `).join('') + : '

Create a draft session to begin the lifecycle.

'; + document.querySelectorAll('[data-session]').forEach((button) => { + button.addEventListener('click', () => selectSession(button.dataset.session)); + }); +} + +function renderCurrentSession(session) { + renderStageStrip(session); + els.sessionState.textContent = session?.state || '—'; + els.sessionCount.textContent = session?.assigned_students?.length ?? 0; + els.admissionCount.textContent = session?.gatekeeper_decisions?.length ?? 0; + if (session?.gatekeeper_decisions?.length) { + state.latestDecision = session.gatekeeper_decisions[session.gatekeeper_decisions.length - 1]; + } + if (session?.analysis_artifacts?.length) { + state.latestOracle = session.analysis_artifacts[session.analysis_artifacts.length - 1]; + } + els.gatekeeperOutput.textContent = state.latestDecision + ? JSON.stringify(state.latestDecision, null, 2) + : 'No admission decision recorded for this session.'; + els.oracleOutput.textContent = state.latestOracle + ? JSON.stringify(state.latestOracle, null, 2) + : 'ORACLE has not started yet.'; + els.artifactList.innerHTML = session?.analysis_artifacts?.length + ? session.analysis_artifacts.slice().reverse().map((artifact) => ` +
+ ${artifact.artifact_type} +
${escapeHtml(JSON.stringify(artifact.payload, null, 2))}
+
+ `).join('') + : '

Artifacts will appear here after ORACLE analysis starts.

'; + + const auditEvents = session?.audit_events || []; + state.latestEvents = auditEvents.slice().reverse(); + els.eventFeed.innerHTML = auditEvents.length + ? auditEvents.slice().reverse().map((event) => ` +
+
+ ${event.event_type} + ${new Date(event.timestamp).toLocaleString()} +
+ ${event.actor} +
+ `).join('') + : '

Audit-safe events will be recorded here.

'; +} + +function escapeHtml(value) { + return value + .replaceAll('&', '&amp;') + .replaceAll('<', '&lt;') + .replaceAll('>', '&gt;') + .replaceAll('"', '&quot;'); +} + +async function loadSessions(preselect = null) { + try { + const data = await request('/exam-sessions'); + state.sessions = data.items || []; + const nextSelection = preselect || state.selectedSessionId || state.sessions[0]?.session_id || null; + state.selectedSessionId = nextSelection; + renderSessions(); + if (nextSelection) { + await selectSession(nextSelection, { silent: true }); + } else { + renderCurrentSession(null); + } + setBackendStatus(true, `${state.sessions.length} session(s) available from the exam-session API.`); + } catch (error) { + setBackendStatus(false, error.message); + renderCurrentSession(null); + } +} + +async function selectSession(sessionId, options = {}) { + if (!sessionId) return; + state.selectedSessionId = sessionId; + $('session-select').value = sessionId; + const payload = await request(`/exam-sessions/${sessionId}`); + const session = payload.session; + renderSessions(); + renderCurrentSession(session); + if (!options.silent) { + setBackendStatus(true, `Loaded ${session.title} (${session.state}).`); + } +} + +async function createSession() { + const payload = { + admin_id: $('admin-id').value.trim(), + title: $('session-title').value.trim(), + }; + const data = await request('/exam-sessions', { + method: 'POST', + body: JSON.stringify(payload), + }); + await loadSessions(data.session.session_id); +} + +async function configureSession() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + await request(`/exam-sessions/${sessionId}/configure`, { + method: 'POST', + body: JSON.stringify({ config: parseConfig() }), + }); + await loadSessions(sessionId); +} + +async function assignStudents() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + await 
request(`/exam-sessions/${sessionId}/students`, { + method: 'POST', + body: JSON.stringify({ submissions: parseSubmissions() }), + }); + await loadSessions(sessionId); +} + +async function markReady() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + await request(`/exam-sessions/${sessionId}/ready`, { method: 'POST', body: '{}' }); + await loadSessions(sessionId); +} + +async function activateSession() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + await request(`/exam-sessions/${sessionId}/activate`, { method: 'POST', body: '{}' }); + await loadSessions(sessionId); +} + +async function runGatekeeper() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + const roll_number = $('roll-number').value.trim(); + const data = await request(`/exam-sessions/${sessionId}/gatekeeper/precheck`, { + method: 'POST', + body: JSON.stringify({ roll_number }), + }); + state.latestDecision = data.decision; + renderCurrentSession(data.session); + setBackendStatus(true, data.decision.admitted ? 
`Admission granted for ${data.decision.student_roll_number}.` : `Admission rejected: ${data.decision.reason}`); +} + +async function startOracle() { + const sessionId = state.selectedSessionId; + if (!sessionId) throw new Error('Create or select a session first.'); + const roll_number = $('roll-number').value.trim(); + const data = await request(`/exam-sessions/${sessionId}/oracle/start`, { + method: 'POST', + body: JSON.stringify({ roll_number }), + }); + await loadSessions(sessionId); + setBackendStatus(true, `ORACLE analysis attached to ${data.session.title}.`); +} + +function bindEvents() { + $('refresh-sessions').addEventListener('click', () => loadSessions()); + $('create-session').addEventListener('click', async () => handleAction(createSession)); + $('configure-session').addEventListener('click', async () => handleAction(configureSession)); + $('assign-students').addEventListener('click', async () => handleAction(assignStudents)); + $('set-ready').addEventListener('click', async () => handleAction(markReady)); + $('activate-session').addEventListener('click', async () => handleAction(activateSession)); + $('gatekeeper-check').addEventListener('click', async () => handleAction(runGatekeeper)); + $('start-oracle').addEventListener('click', async () => handleAction(startOracle)); + $('session-select').addEventListener('change', (event) => selectSession(event.target.value)); +} + +async function handleAction(action) { + try { + await action(); + } catch (error) { + setBackendStatus(false, error.message); + } +} + +function seedTimeline() { + els.stageStrip.innerHTML = stageLabels.map((label) => ` +
${label.replaceAll('_', ' ')}
+ `).join(''); +} + +async function boot() { + seedTimeline(); + bindEvents(); + $('opens-at').value = toDateTimeLocal(new Date(Date.now() + 60 * 60 * 1000).toISOString()); + $('closes-at').value = toDateTimeLocal(new Date(Date.now() + 4 * 60 * 60 * 1000).toISOString()); + await loadSessions(); +} + +boot(); /** * El Orb — Vanilla WebGL (ported from OGL React component) * Non-interactive: hover = 0 always, no mouse events. diff --git a/frontend/style.css b/frontend/style.css index 5e38d87..0c960a4 100644 --- a/frontend/style.css +++ b/frontend/style.css @@ -1,528 +1,284 @@ -*,*::before,*::after{box-sizing:border-box;margin:0;padding:0} -:root{ - --bg:#f0f0f5; - --surface:#ffffff; - --surface2:#f7f7fb; - --border:rgba(0,0,0,.07); - --accent:#6c2fff; - --accent2:#9b72ff; - --accent-soft:rgba(108,47,255,.1); - --green:#22c55e; - --red:#ef4444; - --text1:#18181b; - --text2:#71717a; - --text3:#d4d4d8; - --shadow-sm:0 2px 12px rgba(0,0,0,.07); - --shadow:0 8px 32px rgba(0,0,0,.1); - --shadow-lg:0 24px 64px rgba(0,0,0,.14); -} -html,body{height:100%;font-family:'Inter',system-ui,sans-serif;background:var(--bg);color:var(--text1);overflow:hidden} -.hidden{display:none!important} - -/* ════ GATE ════ */ -.gate{ - position:fixed;inset:0;display:flex;flex-direction:column; - align-items:center;justify-content:center;gap:24px; - padding:40px 24px;text-align:center; - background:linear-gradient(160deg,#f5f5ff 0%,#ece9ff 100%); - z-index:200;transition:opacity .45s,transform .45s; -} -.gate.out{opacity:0;transform:scale(.96);pointer-events:none} -.gate-orb-wrap{ - position:relative;width:280px;height:280px; - display:flex;align-items:center;justify-content:center; - margin-bottom:4px; -} - -spline-viewer::part(logo) { - display: none !important; -} - -/* ─── Rings must combine centering + float in ONE transform ─── */ -@keyframes ring-float{ - from{ transform:translate(-50%,-50%) translateY(0) } - to { transform:translate(-50%,-50%) translateY(-10px) } -} - -.gate-ring { - 
position: absolute; - border-radius: 50%; - top: 50%; left: 50%; - transform: translate(-50%, -50%); - animation: ring-float 3s ease-in-out infinite alternate; - pointer-events: none; -} - -/* Inner ring — darkest, right at the orb edge */ -.gate-ring-1 { - width: 240px; height: 240px; - border: 4px solid rgba(88, 32, 210, 0.92); - box-shadow: - 0 0 28px rgba(108,47,255,0.45), - inset 0 0 20px rgba(108,47,255,0.28); - z-index: 5; -} - -/* Middle ring */ -.gate-ring-2 { - width: 296px; height: 296px; - border: 2px solid rgba(130, 80, 240, 0.40); - z-index: 4; - animation-delay: 0.15s; -} - -/* Outer ring — faintest */ -.gate-ring-3 { - width: 352px; height: 352px; - border: 1.5px solid rgba(160, 120, 245, 0.18); - z-index: 3; - animation-delay: 0.30s; -} - -/* ─── Orb ─── */ -.gate-orb{ - width:200px;height:200px;border-radius:50%; - position:relative;overflow:hidden; - flex-shrink:0;z-index:10; - /* Blue-sky atmosphere base */ +:root { + color-scheme: dark; + --bg: #0c1118; + --bg-elevated: #121925; + --panel: #172131; + --panel-2: #1d2838; + --line: rgba(255, 255, 255, 0.08); + --text: #e8edf5; + --muted: #97a6ba; + --accent: #d7b46a; + --accent-2: #74a9ff; + --good: #3ddc97; + --bad: #ff7b72; + --shadow: 0 24px 80px rgba(0, 0, 0, 0.34); +} + +* { box-sizing: border-box; } + +html, body { + margin: 0; + min-height: 100%; background: - radial-gradient(ellipse at 50% 25%, rgba(110,185,255,0.90) 0%, transparent 55%), - radial-gradient(ellipse at 50% 80%, rgba(155,110,240,0.82) 0%, transparent 55%), - linear-gradient(145deg, rgba(130,190,255,0.65) 0%, rgba(175,140,245,0.75) 100%); - box-shadow: - 0 0 0 2.5px rgba(255,255,255,0.92), - inset 0 4px 35px rgba(255,255,255,0.72), - 0 22px 70px rgba(108,47,255,0.34); - animation:orb-float 3s ease-in-out infinite alternate; -} - -/* Photorealistic cloud blobs */ -.gate-orb::before{ - content:'';position:absolute; - inset:-12%;border-radius:50%; - background: - /* Large central white cloud mass */ - radial-gradient(ellipse 80% 
65% at 38% 50%, rgba(255,255,255,1) 0%, rgba(255,255,255,0.80) 28%, transparent 60%), - /* Upper-right cloud puff */ - radial-gradient(ellipse 55% 50% at 66% 28%, rgba(255,255,255,0.96) 0%, rgba(255,255,255,0.45) 40%, transparent 65%), - /* Sky blue – left edge */ - radial-gradient(ellipse 48% 80% at 2% 48%, rgba(75,170,255,0.90) 0%, transparent 60%), - /* Sky blue – upper centre */ - radial-gradient(ellipse 72% 44% at 54% 2%, rgba(95,185,255,0.85) 0%, transparent 58%), - /* Rich purple cloud – lower right */ - radial-gradient(ellipse 58% 55% at 74% 74%, rgba(165,100,250,0.90) 0%, transparent 58%), - /* Purple shadow – lower left */ - radial-gradient(ellipse 46% 44% at 20% 80%, rgba(140,100,230,0.75) 0%, transparent 55%), - /* Bright specular highlight */ - radial-gradient(circle at 78% 16%, rgba(255,255,255,1) 0%, transparent 18%); - filter:blur(13px); - animation:orb-drift 9s ease-in-out infinite; -} - -/* Glass-sphere rim darkening */ -.gate-orb::after{ - content:'';position:absolute;inset:0;border-radius:50%; - background:radial-gradient(ellipse at 50% 50%, - transparent 44%, - rgba(108,80,200,0.06) 64%, - rgba(90,65,185,0.20) 82%, - rgba(75,50,170,0.42) 100% - ); -} - -@keyframes orb-float{from{transform:translateY(0)}to{transform:translateY(-10px)}} -@keyframes orb-hue{0%{filter:hue-rotate(0deg)}100%{filter:hue-rotate(25deg)}} -@keyframes orb-drift{ - 0% {transform:translate(0%,0%) scale(1) rotate(0deg)} - 33% {transform:translate(4%,-3%) scale(1.04) rotate(6deg)} - 66% {transform:translate(-3%,4%) scale(0.97) rotate(-4deg)} - 100%{transform:translate(0%,0%) scale(1) rotate(0deg)} -} -.gate-title{ - font-family:'Playfair Display',Georgia,serif; - font-size:2rem;font-weight:600; - letter-spacing:-.01em;color:var(--text1); - line-height:1.2; -} -.gate-title span{color:var(--accent);font-style:italic} -.gate-sub{font-size:1rem;color:var(--text2);line-height:1.7;max-width:360px} -.gate-btn{ - display:inline-flex;align-items:center;gap:12px; - padding:16px 
38px;border-radius:50px;border:none; - background:var(--accent);color:#fff;font-size:1rem;font-weight:600; - cursor:pointer;font-family:inherit; - box-shadow:0 8px 28px rgba(108,47,255,.42); - transition:background .2s,transform .15s,box-shadow .2s; -} -.gate-btn:hover{background:#5a1eee;transform:translateY(-2px);box-shadow:0 12px 36px rgba(108,47,255,.52)} -.gate-btn:active{transform:translateY(0)} -.gate-btn:disabled{opacity:.5;cursor:not-allowed;transform:none} -.gate-btns{display:flex;flex-direction:column;align-items:center;gap:12px;width:100%} -.gate-btn-secondary{ - background:transparent;color:var(--accent); - border:2px solid var(--accent); - box-shadow:none; -} -.gate-btn-secondary:hover{background:var(--accent-soft);border-color:#5a1eee;color:#5a1eee;box-shadow:none;transform:translateY(-2px)} -.gate-loading{display:flex;align-items:center;gap:7px;font-size:.9rem;color:var(--text2)} -.gl-dot{width:7px;height:7px;border-radius:50%;background:var(--accent);animation:db 1.2s ease-in-out infinite} -.gl-dot:nth-child(2){animation-delay:.2s}.gl-dot:nth-child(3){animation-delay:.4s} -@keyframes db{0%,80%,100%{transform:translateY(0);opacity:.35}40%{transform:translateY(-6px);opacity:1}} -.gate-err{font-size:.88rem;color:var(--red);background:rgba(239,68,68,.07);border:1px solid rgba(239,68,68,.2);border-radius:12px;padding:12px 20px;max-width:360px} - -/* ════ APP ════ */ -.app{display:flex;flex-direction:column;height:100vh;overflow:hidden;background:var(--bg)} - -/* Chrome bar */ -.chrome{ - display:flex;align-items:center;height:48px;padding:0 20px; - background:rgba(255,255,255,.9);backdrop-filter:blur(18px); - border-bottom:1px solid var(--border);flex-shrink:0;position:relative;z-index:10; -} -.chrome-dots{display:flex;gap:7px} -.dot{width:13px;height:13px;border-radius:50%} -.dot.red{background:#ff5f57}.dot.yellow{background:#febc2e}.dot.green{background:#28c840} -.chrome-title{ - position:absolute;left:50%;transform:translateX(-50%); - 
display:flex;align-items:center;gap:9px; - font-size:.9rem;font-weight:600;color:var(--text2);user-select:none; -} -.chrome-logo{ - width:24px;height:24px;border-radius:7px;background:var(--accent); - display:flex;align-items:center;justify-content:center; - font-size:.68rem;font-weight:800;color:#fff; -} -.chrome-right{margin-left:auto;display:flex;gap:5px} -.chrome-btn{ - width:32px;height:32px;border-radius:8px;border:none;background:transparent; - color:var(--text2);display:flex;align-items:center;justify-content:center; - cursor:pointer;transition:background .15s,color .15s; -} -.chrome-btn:hover{background:var(--accent-soft);color:var(--accent)} - -/* ════ CAM STRIP ════ */ -.cam-strip{ - display:flex;align-items:center;gap:16px;padding:12px 20px; - background:rgba(255,255,255,.92);backdrop-filter:blur(12px); - border-bottom:1px solid var(--border);flex-shrink:0; -} -.cam-tile{ - position:relative;width:220px;aspect-ratio:16/9; - border-radius:14px;overflow:hidden; - background:#18181b;border:2px solid rgba(255,255,255,.1); - box-shadow:0 4px 20px rgba(0,0,0,.12);flex-shrink:0; -} -#cam-video{width:100%;height:100%;object-fit:cover;transform:scaleX(-1);display:block} -.cam-avatar{position:absolute;inset:0;background:#18181b;display:flex;align-items:center;justify-content:center;color:#52525b} -.cam-speak-ring{position:absolute;inset:0;border-radius:inherit;border:3px solid transparent;pointer-events:none;transition:border-color .1s,box-shadow .1s} -.cam-speak-ring.active{border-color:var(--green);box-shadow:inset 0 0 0 1px var(--green),0 0 16px rgba(34,197,94,.4)} -.cam-tag{position:absolute;bottom:8px;left:10px;font-size:.7rem;font-weight:600;color:rgba(255,255,255,.85);background:rgba(0,0,0,.45);backdrop-filter:blur(4px);border-radius:5px;padding:2px 9px} 
-.cam-btn{position:absolute;top:8px;right:8px;width:26px;height:26px;border-radius:6px;border:none;background:rgba(0,0,0,.38);color:#fff;display:flex;align-items:center;justify-content:center;cursor:pointer;transition:background .15s} -.cam-btn:hover{background:rgba(0,0,0,.62)} - -/* El status card */ -.el-status{ - display:flex;align-items:center;gap:12px;padding:10px 16px; - background:var(--surface2);border:1px solid var(--border); - border-radius:14px;box-shadow:var(--shadow-sm);flex-shrink:0; -} -.els-orb{ - width:46px;height:46px;border-radius:50%;flex-shrink:0; - position:relative;overflow:hidden; - background:radial-gradient(ellipse at 55% 48%, - #e2e0f9 0%, - #d3d0f0 40%, - rgba(200,196,230,0.65) 75%, - rgba(210,205,240,0.35) 100% - ); - box-shadow: - 0 0 0 1px rgba(190,180,240,0.28), - 0 6px 20px rgba(120,100,200,0.22), - 0 0 36px rgba(160,140,230,0.14); -} -.els-orb::before{ - content:'';position:absolute; - inset:-12%;border-radius:50%; - background: - radial-gradient(circle at 42% 42%, rgba(50,190,255,0.82) 0%, transparent 30%), - radial-gradient(circle at 22% 60%, rgba(70,95,240,0.50) 0%, transparent 28%), - radial-gradient(circle at 70% 66%, rgba(175,110,255,0.58) 0%, transparent 33%), - radial-gradient(circle at 72% 22%, rgba(255,255,255,0.95) 0%, transparent 24%), - radial-gradient(circle at 50% 50%, rgba(210,225,255,0.22) 0%, transparent 55%); - filter:blur(6px); - animation:orb-drift 9s ease-in-out infinite; -} -.els-orb::after{ - content:'';position:absolute;inset:0;border-radius:50%; - background:radial-gradient(ellipse at 50% 50%, - transparent 52%, - rgba(130,115,200,0.10) 72%, - rgba(110,95,185,0.22) 100% - ); -} -.els-orb.pulse{animation:orb-float .5s ease-in-out infinite alternate} -.els-text{display:flex;flex-direction:column;gap:2px} -.els-name{font-size:.88rem;font-weight:700;color:var(--text1)} -.els-state{font-size:.76rem;color:var(--text2)} -.els-waves{display:flex;align-items:center;gap:2px;height:22px} -.els-waves 
b{display:block;width:3px;height:4px;border-radius:2px;background:var(--accent);opacity:.75;animation:wb .8s ease-in-out infinite alternate;font-style:normal} -.els-waves b:nth-child(1){animation-delay:0s}.els-waves b:nth-child(2){animation-delay:.1s} -.els-waves b:nth-child(3){animation-delay:.2s}.els-waves b:nth-child(4){animation-delay:.1s} -.els-waves b:nth-child(5){animation-delay:0s} -@keyframes wb{from{height:4px;opacity:.3}to{height:20px;opacity:1}} - -/* Mic meter */ -.mic-meter{display:flex;flex-direction:column;gap:5px} -.mm-label{font-size:.62rem;font-weight:700;letter-spacing:.1em;color:var(--text3)} -.mm-bars{display:flex;align-items:center;gap:3px;height:30px} -.mm-bars i{display:block;width:4px;height:4px;border-radius:2px;background:var(--green);opacity:.4;transition:height .07s ease} -.mm-muted{font-size:.62rem;font-weight:700;letter-spacing:.1em;color:var(--red);background:rgba(239,68,68,.09);border-radius:5px;padding:3px 7px} - -/* ════ CHAT SCROLL ════ */ -.chat-scroll{flex:1;overflow-y:auto;display:flex;flex-direction:column;padding:32px 28px 10px;scroll-behavior:smooth} -.chat-scroll::-webkit-scrollbar{width:4px} -.chat-scroll::-webkit-scrollbar-thumb{background:var(--text3);border-radius:4px} - -/* Hero */ -.hero{display:flex;flex-direction:column;align-items:center;text-align:center;gap:16px;padding-bottom:32px;flex-shrink:0} -.hero-orb-wrap{position:relative;width:260px;height:260px;display:flex;align-items:center;justify-content:center;margin-bottom:12px} -.hero-orb-glow{position:absolute;width:100%;height:100%;border-radius:50%;background:radial-gradient(circle,rgba(108,47,255,.12) 0%,transparent 65%);animation:orb-float 3s ease-in-out infinite alternate} - -.hero-ring { - position: absolute; - border-radius: 50%; - inset: 0; - margin: auto; - animation: orb-float 3s ease-in-out infinite alternate; - pointer-events: none; -} - -.hero-ring-1 { - width: 154px; - height: 154px; - border: 3.5px solid rgba(108, 47, 255, 0.85); - box-shadow: 0 
0 16px rgba(108, 47, 255, 0.3), inset 0 0 12px rgba(108, 47, 255, 0.2); - z-index: 5; -} - -.hero-ring-2 { - width: 186px; - height: 186px; - border: 2px solid rgba(108, 47, 255, 0.35); - z-index: 4; - animation-delay: 0.1s; -} - -.hero-ring-3 { - width: 222px; - height: 222px; - border: 1.5px solid rgba(108, 47, 255, 0.15); - z-index: 3; - animation-delay: 0.2s; -} - -.hero-orb{ - width:130px;height:130px;border-radius:50%;position:relative;z-index:10; - overflow:hidden; - background:radial-gradient(ellipse at 55% 48%, - #e2e0f9 0%, - #d3d0f0 40%, - rgba(200,196,230,0.65) 75%, - rgba(210,205,240,0.35) 100% - ); - box-shadow: - 0 0 0 1px rgba(255,255,255,0.7), - inset 0 0 16px rgba(255,255,255,0.6), - 0 12px 32px rgba(108,47,255,0.25); - animation:orb-float 3s ease-in-out infinite alternate; -} -.hero-orb::before{ - content:'';position:absolute; - inset:-12%;border-radius:50%; - background: - radial-gradient(circle at 42% 42%, rgba(50,190,255,0.82) 0%, transparent 30%), - radial-gradient(circle at 22% 60%, rgba(70,95,240,0.50) 0%, transparent 28%), - radial-gradient(circle at 70% 66%, rgba(175,110,255,0.58) 0%, transparent 33%), - radial-gradient(circle at 72% 22%, rgba(255,255,255,0.95) 0%, transparent 24%), - radial-gradient(circle at 50% 50%, rgba(210,225,255,0.22) 0%, transparent 55%); - filter:blur(16px); - animation:orb-drift 9s ease-in-out infinite; -} -.hero-orb::after{ - content:'';position:absolute;inset:0;border-radius:50%; - background:radial-gradient(ellipse at 50% 50%, - transparent 52%, - rgba(130,115,200,0.10) 72%, - rgba(110,95,185,0.22) 100% - ); -} -.hero-orb.speaking{ - animation:orb-float .4s ease-in-out infinite alternate; - box-shadow: - 0 0 0 16px rgba(108,47,255,.10), - 0 0 80px rgba(160,140,230,0.35), - 0 22px 60px rgba(120,100,200,0.2); -} - -.hero-text{font-size:1rem;font-weight:600;color:var(--text1);line-height:1.7;max-width:400px;min-height:1.7em;text-align:center} -.hero-line2{font-size:.92rem;font-weight:400;color:var(--text2)} 
-.hero-text .cur{display:inline-block;width:2px;height:1.1em;background:var(--accent);margin-left:2px;vertical-align:text-bottom;animation:blink .9s step-end infinite} -@keyframes blink{0%,100%{opacity:1}50%{opacity:0}} -.hero-sub{font-size:.9rem;color:var(--text2)} - -/* Suggestion chips */ -.quick-chips{display:flex;flex-wrap:wrap;gap:10px;justify-content:center;max-width:560px} -.qchip{ - padding:10px 20px;border-radius:50px;border:1px solid var(--border); - background:var(--surface);color:var(--text1);font-size:.88rem;font-weight:500; - cursor:pointer;font-family:inherit;box-shadow:var(--shadow-sm); - transition:background .15s,border-color .15s,transform .15s; -} -.qchip:hover{background:var(--accent-soft);border-color:var(--accent2);transform:translateY(-2px)} - -/* ════ MESSAGES ════ */ -.msgs{display:flex;flex-direction:column;gap:16px;padding-top:8px} - -/* El bubble */ -.msg-el{display:flex;gap:12px;align-items:flex-start;max-width:80%;align-self:flex-start} -.el-av{ - width:32px;height:32px;border-radius:50%;flex-shrink:0; - position:relative;overflow:hidden; - background:radial-gradient(ellipse at 55% 48%, - #e2e0f9 0%, - #d3d0f0 40%, - rgba(200,196,230,0.65) 75%, - rgba(210,205,240,0.35) 100% - ); - box-shadow: - 0 0 0 1px rgba(190,180,240,0.28), - 0 4px 14px rgba(120,100,200,0.2); - margin-top:2px; -} -.el-av::before{ - content:'';position:absolute; - inset:-12%;border-radius:50%; - background: - radial-gradient(circle at 42% 42%, rgba(50,190,255,0.82) 0%, transparent 30%), - radial-gradient(circle at 22% 60%, rgba(70,95,240,0.50) 0%, transparent 28%), - radial-gradient(circle at 70% 66%, rgba(175,110,255,0.58) 0%, transparent 33%), - radial-gradient(circle at 72% 22%, rgba(255,255,255,0.95) 0%, transparent 24%); - filter:blur(4px); - animation:orb-drift 9s ease-in-out infinite; -} -.el-av::after{ - content:'';position:absolute;inset:0;border-radius:50%; - background:radial-gradient(ellipse at 50% 50%, - transparent 52%, - rgba(110,95,185,0.18) 100% - 
); -} -.el-bbl{ - background:var(--surface);border:1px solid var(--border); - border-radius:4px 20px 20px 20px; - padding:13px 18px;font-size:.96rem;line-height:1.62;color:var(--text1); - box-shadow:var(--shadow-sm);word-break:break-word; -} -.el-time{font-size:.72rem;color:var(--text3);margin-top:5px;padding-left:3px;display:flex;align-items:center;gap:4px} - -/* User bubble */ -.msg-user{ - align-self:flex-end;max-width:70%; - background:var(--surface);border:1px solid var(--border); - padding:13px 18px;border-radius:20px 20px 4px 20px; - font-size:.96rem;line-height:1.55;color:var(--text1); - box-shadow:var(--shadow-sm);word-break:break-word;font-weight:500; -} - -/* Typing dots */ -.tdots{display:flex;gap:5px;padding:3px 0} -.tdots span{width:8px;height:8px;border-radius:50%;background:var(--text3);animation:db 1.2s ease-in-out infinite} -.tdots span:nth-child(2){animation-delay:.2s}.tdots span:nth-child(3){animation-delay:.4s} -.file-tag{display:inline-flex;align-items:center;gap:6px;padding:5px 12px;background:var(--accent-soft);border:1px solid var(--accent2);border-radius:8px;font-size:.82rem;font-weight:500;color:var(--accent);margin-top:6px} - -/* ════ INPUT BAR ════ */ -.input-bar{ - display:flex;align-items:center;gap:12px;padding:14px 20px 20px; - background:rgba(255,255,255,.94);backdrop-filter:blur(18px); - border-top:1px solid var(--border);flex-shrink:0; -} -.bar-pill-btn{ - display:inline-flex;align-items:center; - padding:11px 22px;border-radius:50px;border:none; - background:var(--accent);color:#fff;font-size:.9rem;font-weight:600; - cursor:pointer;font-family:inherit;white-space:nowrap;flex-shrink:0; - box-shadow:0 4px 16px rgba(108,47,255,.38);transition:background .2s,transform .15s; -} -.bar-pill-btn:hover{background:#5a1eee;transform:translateY(-1px)} -.bar-pill-btn:active{transform:translateY(0)} -.bar-input-wrap{flex:1;display:flex;align-items:flex-end;border:1.5px solid var(--border);border-radius:24px;padding:8px 
16px;background:var(--surface2);transition:border-color .2s,box-shadow .2s} -.bar-input-wrap:focus-within{border-color:var(--accent2);box-shadow:0 0 0 3px rgba(108,47,255,.1);background:var(--surface)} -.bar-textarea{ - width:100%;resize:none;border:none;outline:none; - background:transparent;font-size:.95rem;font-family:inherit; - color:var(--text1);line-height:1.5;max-height:130px;overflow-y:auto; -} -.bar-textarea::placeholder{color:var(--text3)} -.bar-actions{display:flex;align-items:center;gap:8px;flex-shrink:0} -.bar-act-btn{ - width:38px;height:38px;border-radius:50%;border:none;background:transparent; - color:var(--text2);display:flex;align-items:center;justify-content:center; - cursor:pointer;transition:background .15s,color .15s,transform .15s; -} -.bar-act-btn:hover{background:var(--accent-soft);color:var(--accent);transform:scale(1.1)} -.file-chips{display:flex;flex-wrap:wrap;gap:5px;align-items:center;max-width:150px} -.fc{display:flex;align-items:center;gap:4px;background:var(--accent-soft);border:1px solid var(--accent2);border-radius:7px;padding:4px 11px;font-size:.76rem;font-weight:500;color:var(--accent);white-space:nowrap;max-width:120px;overflow:hidden;text-overflow:ellipsis} -.fc button{border:none;background:transparent;color:var(--accent);cursor:pointer;font-size:.76rem;padding:0;line-height:1} -.bar-send-btn{ - width:40px;height:40px;border-radius:50%;border:none; - background:var(--accent);color:#fff; - display:flex;align-items:center;justify-content:center;cursor:pointer; - box-shadow:0 4px 14px rgba(108,47,255,.38);transition:background .2s,transform .15s; - flex-shrink:0; -} -.bar-send-btn:hover{background:#5a1eee;transform:scale(1.1)} - -/* ════ DENIED ════ */ -.denied-overlay{position:fixed;inset:0;background:rgba(0,0,0,.45);backdrop-filter:blur(8px);display:flex;align-items:center;justify-content:center;z-index:300} -.denied-card{background:var(--surface);border-radius:22px;padding:44px 
48px;max-width:400px;text-align:center;box-shadow:var(--shadow-lg)} -.denied-card h2{font-size:1.35rem;font-weight:700;margin:14px 0 10px} -.denied-card p{font-size:.92rem;color:var(--text2);line-height:1.65;margin-bottom:22px} -.denied-card strong{color:var(--text1)} -.denied-card button{padding:13px 30px;border-radius:50px;border:none;background:var(--accent);color:#fff;font-size:.92rem;font-weight:600;cursor:pointer;font-family:inherit} -*{scrollbar-width:thin;scrollbar-color:var(--text3) transparent} - -/* ════ LISTEN RING (always-on mic) ════ */ -.listen-wrap{position:relative;display:flex;align-items:center;justify-content:center} -.listen-ring{ - position:absolute;inset:-6px;border-radius:50%; - border:2.5px solid transparent;pointer-events:none; - transition:border-color .2s,box-shadow .2s; -} -.listen-ring.active{ - border-color:var(--green); - box-shadow:0 0 0 3px rgba(34,197,94,.15),0 0 14px rgba(34,197,94,.3); - animation:listen-pulse 1.8s ease-in-out infinite; -} -@keyframes listen-pulse{ - 0%,100%{box-shadow:0 0 0 3px rgba(34,197,94,.15),0 0 14px rgba(34,197,94,.3)} - 50%{box-shadow:0 0 0 6px rgba(34,197,94,.08),0 0 24px rgba(34,197,94,.45)} -} - -/* ════ LISTEN STATUS BAR ════ */ -.listen-status-bar{ - display:flex;align-items:center;justify-content:center;gap:7px; - padding:6px 0 10px; - font-size:.75rem;font-weight:500;color:var(--text2); - background:rgba(255,255,255,.94);backdrop-filter:blur(18px); - border-top:none;flex-shrink:0; -} -.listen-dot{ - width:7px;height:7px;border-radius:50%; - background:var(--text3); - transition:background .3s; -} -#listen-ring.active ~ .bar-act-btn, -.listen-status-bar:has(+ .listen-dot) { } -/* Updated via JS: when active, dot goes green */ -.listen-dot.on{background:var(--green);animation:listen-pulse 1.8s ease-in-out infinite} + radial-gradient(circle at top left, rgba(215, 180, 106, 0.12), transparent 28%), + radial-gradient(circle at top right, rgba(116, 169, 255, 0.14), transparent 26%), + 
linear-gradient(180deg, #091018, #0c1118 26%, #0b1017 100%); + color: var(--text); + font-family: 'IBM Plex Sans', sans-serif; +} + +body { padding: 28px; } + +.shell { + max-width: 1500px; + margin: 0 auto; +} + +.topbar { + display: flex; + justify-content: space-between; + gap: 24px; + align-items: flex-start; + margin-bottom: 18px; +} + +.eyebrow { + margin: 0 0 8px; + font-size: 0.75rem; + text-transform: uppercase; + letter-spacing: 0.2em; + color: var(--accent); +} + +h1, h2 { + margin: 0; + font-family: 'IBM Plex Serif', serif; + font-weight: 600; +} + +h1 { font-size: clamp(2rem, 4vw, 3.5rem); } +h2 { font-size: 1.4rem; } + +.lede { + margin: 12px 0 0; + color: var(--muted); + max-width: 70ch; + line-height: 1.6; +} + +.status-card { + min-width: 240px; + padding: 18px 20px; + border: 1px solid var(--line); + background: rgba(18, 25, 37, 0.82); + border-radius: 18px; + box-shadow: var(--shadow); +} + +.status-card strong { + display: block; + font-size: 1.25rem; + margin: 6px 0 4px; +} + +.status-card strong.ok { color: var(--good); } + +.status-label { + color: var(--muted); + font-size: 0.8rem; + text-transform: uppercase; + letter-spacing: 0.14em; +} + +.stage-strip { + display: grid; + grid-template-columns: repeat(7, minmax(0, 1fr)); + gap: 10px; + margin-bottom: 18px; +} + +.stage { + padding: 12px 10px; + text-align: center; + border-radius: 14px; + border: 1px solid var(--line); + background: rgba(255, 255, 255, 0.03); + color: var(--muted); + font-size: 0.78rem; + letter-spacing: 0.08em; +} + +.stage.done { + color: var(--text); + background: rgba(116, 169, 255, 0.08); +} + +.stage.active { + color: #0b1017; + background: linear-gradient(135deg, var(--accent), #f6d59c); + font-weight: 700; +} + +.grid { + display: grid; + grid-template-columns: 1.2fr 0.9fr 1fr; + gap: 18px; +} + +.panel { + background: rgba(18, 25, 37, 0.88); + border: 1px solid var(--line); + border-radius: 22px; + padding: 22px; + box-shadow: var(--shadow); +} + +.panel-xl { 
min-height: 100%; } +.panel-tall { display: flex; flex-direction: column; gap: 18px; } + +.panel-head { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 18px; + margin-bottom: 18px; +} + +.form-grid { + display: grid; + grid-template-columns: repeat(2, minmax(0, 1fr)); + gap: 14px; +} +label { + display: flex; + flex-direction: column; + gap: 8px; + color: var(--muted); + font-size: 0.9rem; +} + +.stacked { margin-top: 14px; } + +input, textarea, select { + width: 100%; + border: 1px solid var(--line); + background: rgba(255, 255, 255, 0.04); + color: var(--text); + border-radius: 14px; + padding: 12px 14px; + font: inherit; +} + +textarea { resize: vertical; min-height: 92px; } + +input:focus, textarea:focus, select:focus { + outline: 2px solid rgba(215, 180, 106, 0.36); + border-color: rgba(215, 180, 106, 0.62); +} + +.split-actions { + display: grid; + grid-template-columns: repeat(2, minmax(0, 1fr)); + gap: 10px; + margin-top: 16px; +} + +button { + border: 0; + border-radius: 14px; + padding: 12px 14px; + background: linear-gradient(135deg, var(--accent), #f6d59c); + color: #0b1017; + font-weight: 700; + cursor: pointer; + transition: transform 0.16s ease, opacity 0.16s ease; +} + +button:hover { transform: translateY(-1px); } +button.secondary, .ghost { + background: rgba(255, 255, 255, 0.05); + color: var(--text); + border: 1px solid var(--line); +} + +.ghost { padding-inline: 16px; } + +.metric-row { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 12px; + margin: 12px 0 16px; +} + +.metric-row article, .result-box, .artifact-card, .feed-item, .session-pill { + border: 1px solid var(--line); + background: rgba(255, 255, 255, 0.03); + border-radius: 16px; +} + +.metric-row article { padding: 14px; } + +.metric-row span { + display: block; + color: var(--muted); + font-size: 0.78rem; + text-transform: uppercase; + letter-spacing: 0.08em; +} + +.metric-row strong { + display: block; + margin-top: 
6px; + font-size: 1.2rem; +} + +.result-box { padding: 16px; } + +pre { + white-space: pre-wrap; + word-break: break-word; + margin: 10px 0 0; + font: 0.85rem/1.55 'IBM Plex Sans', sans-serif; + color: var(--text); +} + +.feed, .artifact-list, .session-list { + display: grid; + gap: 10px; +} + +.feed-item, .artifact-card { padding: 12px 14px; } + +.feed-item { + display: flex; + justify-content: space-between; + gap: 12px; +} + +.feed-item span, .feed-item small, .muted { color: var(--muted); } + +.session-pill { + padding: 14px; + text-align: left; + color: var(--text); +} + +.session-pill.selected { + border-color: rgba(215, 180, 106, 0.55); + box-shadow: 0 0 0 1px rgba(215, 180, 106, 0.18) inset; +} + +.session-pill strong, .artifact-card strong { display: block; margin-bottom: 4px; } +.session-pill span, .session-pill small { display: block; color: var(--muted); } + +@media (max-width: 1240px) { + .grid { grid-template-columns: 1fr; } +} + +@media (max-width: 900px) { + body { padding: 16px; } + .topbar { flex-direction: column; } + .stage-strip { grid-template-columns: repeat(2, minmax(0, 1fr)); } + .form-grid, .split-actions, .metric-row { grid-template-columns: 1fr; } +} diff --git a/scripts/seed_students.py b/scripts/seed_students.py new file mode 100644 index 0000000..8571cd5 --- /dev/null +++ b/scripts/seed_students.py @@ -0,0 +1,24 @@ +#!/usr/bin/env python3 +"""Seed the student registry persistence file with sample records. + +Usage: + python scripts/seed_students.py + +This writes `backend/data/students.json` with the sample students defined +in the Gatekeeper registry fixtures. 
+""" +import json +import os +from backend.src.agents.gatekeeper.registry.registry_store import SAMPLE_STUDENTS + +def main(): + base = os.path.abspath(os.path.join(os.path.dirname(__file__), '..')) + out_path = os.path.join(base, 'backend', 'data', 'students.json') + os.makedirs(os.path.dirname(out_path), exist_ok=True) + serializable = [s.to_dict() for s in SAMPLE_STUDENTS] + with open(out_path, 'w', encoding='utf-8') as f: + json.dump(serializable, f, indent=2) + print(f"Seeded {len(serializable)} students to {out_path}") + +if __name__ == '__main__': + main() diff --git a/scripts/smoke_main_agent.py b/scripts/smoke_main_agent.py new file mode 100644 index 0000000..20ebdca --- /dev/null +++ b/scripts/smoke_main_agent.py @@ -0,0 +1,24 @@ +import sys +from pathlib import Path +sys.path.insert(0, str(Path('backend').resolve())) +import asyncio +import types + +from src.agents.main_agent.agent import MainAgent + +async def main(): + ma = MainAgent() + + async def fake_process(session_id, input_data, log_callback=None): + return types.SimpleNamespace(implementation_viva_targets=[]) + + # Patch dependent agents to avoid external I/O in smoke test + ma.gatekeeper.process = fake_process + ma.oracle.process = fake_process + ma.sentinel.process = fake_process + + result = await ma.process("smoke-session-1", {"student_id": "test-123", "enable_voice": False}) + print("MainAgent.process returned:", type(result), getattr(result, 'implementation_viva_targets', None)) + +if __name__ == '__main__': + asyncio.run(main()) diff --git a/test-commit.md b/test-commit.md deleted file mode 100644 index 59e70d8..0000000 --- a/test-commit.md +++ /dev/null @@ -1,5 +0,0 @@ -# Discord bot test -# This is my frist github workflowtest Sun May 17 10:34:54 IST 2026 -webhook updated Sun May 17 10:41:31 IST 2026 -test Sun May 17 10:53:18 IST 2026 -mention test Sun May 17 17:49:44 IST 2026 diff --git a/testing_backend_ui/README.md b/testing_backend_ui/README.md new file mode 100644 index 
0000000..9df19cb --- /dev/null +++ b/testing_backend_ui/README.md @@ -0,0 +1,31 @@ +# ORACLE Backend Testing UI (Step 1) + +Initial React dashboard scaffolding with: + +- Agent Topology Graph (React Flow) +- Session Timeline (Gantt style) +- Live Event Feed (schema-shaped events) +- Alert Cards (SENTINEL + GATEKEEPER) + +## Run + +```bash +npm install +npm run dev +``` + +## Build check + +```bash +npm run build +``` + +## Notes + +- Agent topology uses required fixed layout: + - ORACLE (top) + - MAIN VIVA (middle) + - GATEKEEPER + SENTINEL (bottom, side-by-side) +- Event objects follow the required schema fields: + - event_id, timestamp, source_agent, target_agent, event_type, session_id, payload, duration_ms +- This is step 1 foundation; next step can connect these panels to live backend websocket streams. diff --git a/testing_backend_ui/index.html b/testing_backend_ui/index.html new file mode 100644 index 0000000..1a03966 --- /dev/null +++ b/testing_backend_ui/index.html @@ -0,0 +1,18 @@ + + + + + + ORACLE Backend Testing UI + + + + + +
+ + + diff --git a/testing_backend_ui/package-lock.json b/testing_backend_ui/package-lock.json new file mode 100644 index 0000000..c2e5b5b --- /dev/null +++ b/testing_backend_ui/package-lock.json @@ -0,0 +1,2208 @@ +{ + "name": "testing-backend-ui", + "version": "0.1.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "testing-backend-ui", + "version": "0.1.0", + "dependencies": { + "react": "^18.3.1", + "react-dom": "^18.3.1", + "reactflow": "^11.11.4" + }, + "devDependencies": { + "@vitejs/plugin-react": "^4.3.1", + "vite": "^5.4.10" + } + }, + "node_modules/@babel/code-frame": { + "version": "7.29.0", + "resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.0.tgz", + "integrity": "sha512-9NhCeYjq9+3uxgdtp20LSiJXJvN0FeCtNGpJxuMFZ1Kv3cWUNb6DOhJwUvcVCzKGR66cw4njwM6hrJLqgOwbcw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/helper-validator-identifier": "^7.28.5", + "js-tokens": "^4.0.0", + "picocolors": "^1.1.1" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/compat-data": { + "version": "7.29.3", + "resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.29.3.tgz", + "integrity": "sha512-LIVqM46zQWZhj17qA8wb4nW/ixr2y1Nw+r1etiAWgRM6U1IqP+LNhL1yg440jYZR72jCWcWbLWzIosH+uP1fqg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/core": { + "version": "7.29.0", + "resolved": "https://registry.npmjs.org/@babel/core/-/core-7.29.0.tgz", + "integrity": "sha512-CGOfOJqWjg2qW/Mb6zNsDm+u5vFQ8DxXfbM09z69p5Z6+mE1ikP2jUXw+j42Pf1XTYED2Rni5f95npYeuwMDQA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/code-frame": "^7.29.0", + "@babel/generator": "^7.29.0", + "@babel/helper-compilation-targets": "^7.28.6", + "@babel/helper-module-transforms": "^7.28.6", + "@babel/helpers": "^7.28.6", + "@babel/parser": "^7.29.0", + "@babel/template": "^7.28.6", + "@babel/traverse": "^7.29.0", + "@babel/types": 
"^7.29.0", + "@jridgewell/remapping": "^2.3.5", + "convert-source-map": "^2.0.0", + "debug": "^4.1.0", + "gensync": "^1.0.0-beta.2", + "json5": "^2.2.3", + "semver": "^6.3.1" + }, + "engines": { + "node": ">=6.9.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/babel" + } + }, + "node_modules/@babel/generator": { + "version": "7.29.1", + "resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.1.tgz", + "integrity": "sha512-qsaF+9Qcm2Qv8SRIMMscAvG4O3lJ0F1GuMo5HR/Bp02LopNgnZBC/EkbevHFeGs4ls/oPz9v+Bsmzbkbe+0dUw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/parser": "^7.29.0", + "@babel/types": "^7.29.0", + "@jridgewell/gen-mapping": "^0.3.12", + "@jridgewell/trace-mapping": "^0.3.28", + "jsesc": "^3.0.2" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-compilation-targets": { + "version": "7.28.6", + "resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.28.6.tgz", + "integrity": "sha512-JYtls3hqi15fcx5GaSNL7SCTJ2MNmjrkHXg4FSpOA/grxK8KwyZ5bubHsCq8FXCkua6xhuaaBit+3b7+VZRfcA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/compat-data": "^7.28.6", + "@babel/helper-validator-option": "^7.27.1", + "browserslist": "^4.24.0", + "lru-cache": "^5.1.1", + "semver": "^6.3.1" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-globals": { + "version": "7.28.0", + "resolved": "https://registry.npmjs.org/@babel/helper-globals/-/helper-globals-7.28.0.tgz", + "integrity": "sha512-+W6cISkXFa1jXsDEdYA8HeevQT/FULhxzR99pxphltZcVaugps53THCeiWA8SguxxpSp3gKPiuYfSWopkLQ4hw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-module-imports": { + "version": "7.28.6", + "resolved": "https://registry.npmjs.org/@babel/helper-module-imports/-/helper-module-imports-7.28.6.tgz", + "integrity": 
"sha512-l5XkZK7r7wa9LucGw9LwZyyCUscb4x37JWTPz7swwFE/0FMQAGpiWUZn8u9DzkSBWEcK25jmvubfpw2dnAMdbw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/traverse": "^7.28.6", + "@babel/types": "^7.28.6" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-module-transforms": { + "version": "7.28.6", + "resolved": "https://registry.npmjs.org/@babel/helper-module-transforms/-/helper-module-transforms-7.28.6.tgz", + "integrity": "sha512-67oXFAYr2cDLDVGLXTEABjdBJZ6drElUSI7WKp70NrpyISso3plG9SAGEF6y7zbha/wOzUByWWTJvEDVNIUGcA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/helper-module-imports": "^7.28.6", + "@babel/helper-validator-identifier": "^7.28.5", + "@babel/traverse": "^7.28.6" + }, + "engines": { + "node": ">=6.9.0" + }, + "peerDependencies": { + "@babel/core": "^7.0.0" + } + }, + "node_modules/@babel/helper-plugin-utils": { + "version": "7.28.6", + "resolved": "https://registry.npmjs.org/@babel/helper-plugin-utils/-/helper-plugin-utils-7.28.6.tgz", + "integrity": "sha512-S9gzZ/bz83GRysI7gAD4wPT/AI3uCnY+9xn+Mx/KPs2JwHJIz1W8PZkg2cqyt3RNOBM8ejcXhV6y8Og7ly/Dug==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-string-parser": { + "version": "7.27.1", + "resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.27.1.tgz", + "integrity": "sha512-qMlSxKbpRlAridDExk92nSobyDdpPijUq2DW6oDnUqd0iOGxmQjyqhMIihI9+zv4LPyZdRje2cavWPbCbWm3eA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helper-validator-identifier": { + "version": "7.28.5", + "resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.28.5.tgz", + "integrity": "sha512-qSs4ifwzKJSV39ucNjsvc6WVHs6b7S03sOh2OcHF9UHfVPqWWALUsNUVzhSBiItjRZoLHx7nIarVjqKVusUZ1Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + 
"node_modules/@babel/helper-validator-option": { + "version": "7.27.1", + "resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.27.1.tgz", + "integrity": "sha512-YvjJow9FxbhFFKDSuFnVCe2WxXk1zWc22fFePVNEaWJEu8IrZVlda6N0uHwzZrUM1il7NC9Mlp4MaJYbYd9JSg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/helpers": { + "version": "7.29.2", + "resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.29.2.tgz", + "integrity": "sha512-HoGuUs4sCZNezVEKdVcwqmZN8GoHirLUcLaYVNBK2J0DadGtdcqgr3BCbvH8+XUo4NGjNl3VOtSjEKNzqfFgKw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/template": "^7.28.6", + "@babel/types": "^7.29.0" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/parser": { + "version": "7.29.3", + "resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.3.tgz", + "integrity": "sha512-b3ctpQwp+PROvU/cttc4OYl4MzfJUWy6FZg+PMXfzmt/+39iHVF0sDfqay8TQM3JA2EUOyKcFZt75jWriQijsA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/types": "^7.29.0" + }, + "bin": { + "parser": "bin/babel-parser.js" + }, + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/@babel/plugin-transform-react-jsx-self": { + "version": "7.27.1", + "resolved": "https://registry.npmjs.org/@babel/plugin-transform-react-jsx-self/-/plugin-transform-react-jsx-self-7.27.1.tgz", + "integrity": "sha512-6UzkCs+ejGdZ5mFFC/OCUrv028ab2fp1znZmCZjAOBKiBK2jXD1O+BPSfX8X2qjJ75fZBMSnQn3Rq2mrBJK2mw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/helper-plugin-utils": "^7.27.1" + }, + "engines": { + "node": ">=6.9.0" + }, + "peerDependencies": { + "@babel/core": "^7.0.0-0" + } + }, + "node_modules/@babel/plugin-transform-react-jsx-source": { + "version": "7.27.1", + "resolved": "https://registry.npmjs.org/@babel/plugin-transform-react-jsx-source/-/plugin-transform-react-jsx-source-7.27.1.tgz", + "integrity": 
"sha512-zbwoTsBruTeKB9hSq73ha66iFeJHuaFkUbwvqElnygoNbj/jHRsSeokowZFN3CZ64IvEqcmmkVe89OPXc7ldAw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/helper-plugin-utils": "^7.27.1" + }, + "engines": { + "node": ">=6.9.0" + }, + "peerDependencies": { + "@babel/core": "^7.0.0-0" + } + }, + "node_modules/@babel/template": { + "version": "7.28.6", + "resolved": "https://registry.npmjs.org/@babel/template/-/template-7.28.6.tgz", + "integrity": "sha512-YA6Ma2KsCdGb+WC6UpBVFJGXL58MDA6oyONbjyF/+5sBgxY/dwkhLogbMT2GXXyU84/IhRw/2D1Os1B/giz+BQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/code-frame": "^7.28.6", + "@babel/parser": "^7.28.6", + "@babel/types": "^7.28.6" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/traverse": { + "version": "7.29.0", + "resolved": "https://registry.npmjs.org/@babel/traverse/-/traverse-7.29.0.tgz", + "integrity": "sha512-4HPiQr0X7+waHfyXPZpWPfWL/J7dcN1mx9gL6WdQVMbPnF3+ZhSMs8tCxN7oHddJE9fhNE7+lxdnlyemKfJRuA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/code-frame": "^7.29.0", + "@babel/generator": "^7.29.0", + "@babel/helper-globals": "^7.28.0", + "@babel/parser": "^7.29.0", + "@babel/template": "^7.28.6", + "@babel/types": "^7.29.0", + "debug": "^4.3.1" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@babel/types": { + "version": "7.29.0", + "resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.0.tgz", + "integrity": "sha512-LwdZHpScM4Qz8Xw2iKSzS+cfglZzJGvofQICy7W7v4caru4EaAmyUuO6BGrbyQ2mYV11W0U8j5mBhd14dd3B0A==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/helper-string-parser": "^7.27.1", + "@babel/helper-validator-identifier": "^7.28.5" + }, + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@esbuild/aix-ppc64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.21.5.tgz", + "integrity": 
"sha512-1SDgH6ZSPTlggy1yI6+Dbkiz8xzpHJEVAlF/AM1tHPLsf5STom9rwtjE4hKAF20FfXXNTFqEYXyJNWh1GiZedQ==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "aix" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-arm": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.21.5.tgz", + "integrity": "sha512-vCPvzSjpPHEi1siZdlvAlsPxXl7WbOVUBBAowWug4rJHb68Ox8KualB+1ocNvT5fjv6wpkX6o/iEpbDrf68zcg==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.21.5.tgz", + "integrity": "sha512-c0uX9VAUBQ7dTDCjq+wdyGLowMdtR/GoC2U5IYk/7D1H1JYC0qseD7+11iMP2mRLN9RcCMRcjC4YMclCzGwS/A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/android-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.21.5.tgz", + "integrity": "sha512-D7aPRUUNHRBwHxzxRvp856rjUHRFW1SdQATKXH2hqA0kAZb1hKmi02OpYRacl0TxIGz/ZmXWlbZgjwWYaCakTA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/darwin-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.21.5.tgz", + "integrity": "sha512-DwqXqZyuk5AiWWf3UfLiRDJ5EDd49zg6O9wclZ7kUMv2WRFr4HKjXp/5t8JZ11QbQfUS6/cRCKGwYhtNAY88kQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/darwin-x64": { + "version": "0.21.5", + "resolved": 
"https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.21.5.tgz", + "integrity": "sha512-se/JjF8NlmKVG4kNIuyWMV/22ZaerB+qaSi5MdrXtd6R08kvs2qCN4C09miupktDitvh8jRFflwGFBQcxZRjbw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/freebsd-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.21.5.tgz", + "integrity": "sha512-5JcRxxRDUJLX8JXp/wcBCy3pENnCgBR9bN6JsY4OmhfUtIHe3ZW0mawA7+RDAcMLrMIZaf03NlQiX9DGyB8h4g==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/freebsd-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.21.5.tgz", + "integrity": "sha512-J95kNBj1zkbMXtHVH29bBriQygMXqoVQOQYA+ISs0/2l3T9/kj42ow2mpqerRBxDJnmkUDCaQT/dfNXWX/ZZCQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-arm": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.21.5.tgz", + "integrity": "sha512-bPb5AHZtbeNGjCKVZ9UGqGwo8EUu4cLq68E95A53KlxAPRmUyYv2D6F0uUI65XisGOL1hBP5mTronbgo+0bFcA==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.21.5.tgz", + "integrity": "sha512-ibKvmyYzKsBeX8d8I7MH/TMfWDXBF3db4qM6sy+7re0YXya+K1cem3on9XgdT2EQGMu4hQyZhan7TeQ8XkGp4Q==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + 
"node_modules/@esbuild/linux-ia32": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.21.5.tgz", + "integrity": "sha512-YvjXDqLRqPDl2dvRODYmmhz4rPeVKYvppfGYKSNGdyZkA01046pLWyRKKI3ax8fbJoK5QbxblURkwK/MWY18Tg==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-loong64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.21.5.tgz", + "integrity": "sha512-uHf1BmMG8qEvzdrzAqg2SIG/02+4/DHB6a9Kbya0XDvwDEKCoC8ZRWI5JJvNdUjtciBGFQ5PuBlpEOXQj+JQSg==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-mips64el": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.21.5.tgz", + "integrity": "sha512-IajOmO+KJK23bj52dFSNCMsz1QP1DqM6cwLUv3W1QwyxkyIWecfafnI555fvSGqEKwjMXVLokcV5ygHW5b3Jbg==", + "cpu": [ + "mips64el" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-ppc64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.21.5.tgz", + "integrity": "sha512-1hHV/Z4OEfMwpLO8rp7CvlhBDnjsC3CttJXIhBi+5Aj5r+MBvy4egg7wCbe//hSsT+RvDAG7s81tAvpL2XAE4w==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-riscv64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.21.5.tgz", + "integrity": "sha512-2HdXDMd9GMgTGrPWnJzP2ALSokE/0O5HhTUvWIbD3YdjME8JwvSCnNGBnTThKGEB91OZhzrJ4qIIxk/SBmyDDA==", + "cpu": [ + "riscv64" + ], + "dev": true, + 
"license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-s390x": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.21.5.tgz", + "integrity": "sha512-zus5sxzqBJD3eXxwvjN1yQkRepANgxE9lgOW2qLnmr8ikMTphkjgXu1HR01K4FJg8h1kEEDAqDcZQtbrRnB41A==", + "cpu": [ + "s390x" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/linux-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.21.5.tgz", + "integrity": "sha512-1rYdTpyv03iycF1+BhzrzQJCdOuAOtaqHTWJZCWvijKD2N5Xu0TtVC8/+1faWqcP9iBCWOmjmhoH94dH82BxPQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/netbsd-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.21.5.tgz", + "integrity": "sha512-Woi2MXzXjMULccIwMnLciyZH4nCIMpWQAs049KEeMvOcNADVxo0UBIQPfSmxB3CWKedngg7sWZdLvLczpe0tLg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/openbsd-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.21.5.tgz", + "integrity": "sha512-HLNNw99xsvx12lFBUwoT8EVCsSvRNDVxNpjZ7bPn947b8gJPzeHWyNVhFsaerc0n3TsbOINvRP2byTZ5LKezow==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/sunos-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.21.5.tgz", + "integrity": 
"sha512-6+gjmFpfy0BHU5Tpptkuh8+uw3mnrvgs+dSPQXQOv3ekbordwnzTVEb4qnIvQcYXq6gzkyTnoZ9dZG+D4garKg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "sunos" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-arm64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.21.5.tgz", + "integrity": "sha512-Z0gOTd75VvXqyq7nsl93zwahcTROgqvuAcYDUr+vOv8uHhNSKROyU961kgtCD1e95IqPKSQKH7tBTslnS3tA8A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-ia32": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.21.5.tgz", + "integrity": "sha512-SWXFF1CL2RVNMaVs+BBClwtfZSvDgtL//G/smwAc5oVK/UPu2Gu9tIaRgFmYFFKrmg3SyAjSrElf0TiJ1v8fYA==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@esbuild/win32-x64": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.21.5.tgz", + "integrity": "sha512-tQd/1efJuzPC6rCFwEvLtci/xNFcTZknmXs98FYDfGE4wP9ClFV98nyKrzJKVPMhdDnjzLhdUyMX4PsQAPjwIw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=12" + } + }, + "node_modules/@jridgewell/gen-mapping": { + "version": "0.3.13", + "resolved": "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.13.tgz", + "integrity": "sha512-2kkt/7niJ6MgEPxF0bYdQ6etZaA+fQvDcLKckhy1yIQOzaoKjBBjSj63/aLVjYE3qhRt5dvM+uUyfCg6UKCBbA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/sourcemap-codec": "^1.5.0", + "@jridgewell/trace-mapping": "^0.3.24" + } + }, + "node_modules/@jridgewell/remapping": { + "version": "2.3.5", + "resolved": 
"https://registry.npmjs.org/@jridgewell/remapping/-/remapping-2.3.5.tgz", + "integrity": "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/gen-mapping": "^0.3.5", + "@jridgewell/trace-mapping": "^0.3.24" + } + }, + "node_modules/@jridgewell/resolve-uri": { + "version": "3.1.2", + "resolved": "https://registry.npmjs.org/@jridgewell/resolve-uri/-/resolve-uri-3.1.2.tgz", + "integrity": "sha512-bRISgCIjP20/tbWSPWMEi54QVPRZExkuD9lJL+UIxUKtwVJA8wW1Trb1jMs1RFXo1CBTNZ/5hpC9QvmKWdopKw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/@jridgewell/sourcemap-codec": { + "version": "1.5.5", + "resolved": "https://registry.npmjs.org/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.5.5.tgz", + "integrity": "sha512-cYQ9310grqxueWbl+WuIUIaiUaDcj7WOq5fVhEljNVgRfOUhY9fy2zTvfoqWsnebh8Sl70VScFbICvJnLKB0Og==", + "dev": true, + "license": "MIT" + }, + "node_modules/@jridgewell/trace-mapping": { + "version": "0.3.31", + "resolved": "https://registry.npmjs.org/@jridgewell/trace-mapping/-/trace-mapping-0.3.31.tgz", + "integrity": "sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/resolve-uri": "^3.1.0", + "@jridgewell/sourcemap-codec": "^1.4.14" + } + }, + "node_modules/@reactflow/background": { + "version": "11.3.14", + "resolved": "https://registry.npmjs.org/@reactflow/background/-/background-11.3.14.tgz", + "integrity": "sha512-Gewd7blEVT5Lh6jqrvOgd4G6Qk17eGKQfsDXgyRSqM+CTwDqRldG2LsWN4sNeno6sbqVIC2fZ+rAUBFA9ZEUDA==", + "license": "MIT", + "dependencies": { + "@reactflow/core": "11.11.4", + "classcat": "^5.0.3", + "zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@reactflow/controls": { + "version": "11.2.14", + "resolved": 
"https://registry.npmjs.org/@reactflow/controls/-/controls-11.2.14.tgz", + "integrity": "sha512-MiJp5VldFD7FrqaBNIrQ85dxChrG6ivuZ+dcFhPQUwOK3HfYgX2RHdBua+gx+40p5Vw5It3dVNp/my4Z3jF0dw==", + "license": "MIT", + "dependencies": { + "@reactflow/core": "11.11.4", + "classcat": "^5.0.3", + "zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@reactflow/core": { + "version": "11.11.4", + "resolved": "https://registry.npmjs.org/@reactflow/core/-/core-11.11.4.tgz", + "integrity": "sha512-H4vODklsjAq3AMq6Np4LE12i1I4Ta9PrDHuBR9GmL8uzTt2l2jh4CiQbEMpvMDcp7xi4be0hgXj+Ysodde/i7Q==", + "license": "MIT", + "dependencies": { + "@types/d3": "^7.4.0", + "@types/d3-drag": "^3.0.1", + "@types/d3-selection": "^3.0.3", + "@types/d3-zoom": "^3.0.1", + "classcat": "^5.0.3", + "d3-drag": "^3.0.0", + "d3-selection": "^3.0.0", + "d3-zoom": "^3.0.0", + "zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@reactflow/minimap": { + "version": "11.7.14", + "resolved": "https://registry.npmjs.org/@reactflow/minimap/-/minimap-11.7.14.tgz", + "integrity": "sha512-mpwLKKrEAofgFJdkhwR5UQ1JYWlcAAL/ZU/bctBkuNTT1yqV+y0buoNVImsRehVYhJwffSWeSHaBR5/GJjlCSQ==", + "license": "MIT", + "dependencies": { + "@reactflow/core": "11.11.4", + "@types/d3-selection": "^3.0.3", + "@types/d3-zoom": "^3.0.1", + "classcat": "^5.0.3", + "d3-selection": "^3.0.0", + "d3-zoom": "^3.0.0", + "zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@reactflow/node-resizer": { + "version": "2.2.14", + "resolved": "https://registry.npmjs.org/@reactflow/node-resizer/-/node-resizer-2.2.14.tgz", + "integrity": "sha512-fwqnks83jUlYr6OHcdFEedumWKChTHRGw/kbCxj0oqBd+ekfs+SIp4ddyNU0pdx96JIm5iNFS0oNrmEiJbbSaA==", + "license": "MIT", + "dependencies": { + "@reactflow/core": "11.11.4", + "classcat": "^5.0.4", + "d3-drag": "^3.0.0", + "d3-selection": "^3.0.0", + 
"zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@reactflow/node-toolbar": { + "version": "1.3.14", + "resolved": "https://registry.npmjs.org/@reactflow/node-toolbar/-/node-toolbar-1.3.14.tgz", + "integrity": "sha512-rbynXQnH/xFNu4P9H+hVqlEUafDCkEoCy0Dg9mG22Sg+rY/0ck6KkrAQrYrTgXusd+cEJOMK0uOOFCK2/5rSGQ==", + "license": "MIT", + "dependencies": { + "@reactflow/core": "11.11.4", + "classcat": "^5.0.3", + "zustand": "^4.4.1" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/@rolldown/pluginutils": { + "version": "1.0.0-beta.27", + "resolved": "https://registry.npmjs.org/@rolldown/pluginutils/-/pluginutils-1.0.0-beta.27.tgz", + "integrity": "sha512-+d0F4MKMCbeVUJwG96uQ4SgAznZNSq93I3V+9NHA4OpvqG8mRCpGdKmK8l/dl02h2CCDHwW2FqilnTyDcAnqjA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@rollup/rollup-android-arm-eabi": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.60.4.tgz", + "integrity": "sha512-F5QXMSiFebS9hKZj02XhWLLnRpJ3B3AROP0tWbFBSj+6kCbg5m9j5JoHKd4mmSVy5mS/IMQloYgYxCuJC0fxEQ==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-android-arm64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.60.4.tgz", + "integrity": "sha512-GxxTKApUpzRhof7poWvCJHRF51C67u1R7D6DiluBE8wKU1u5GWE8t+v81JvJYtbawoBFX1hLv5Ei4eVjkWokaw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-darwin-arm64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.60.4.tgz", + "integrity": 
"sha512-tua0TaJxMOB1R0V0RS1jFZ/RpURFDJIOR2A6jWwQeawuFyS4gBW+rntLRaQd0EQ4bd6Vp44Z2rXW+YYDBsj6IA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-darwin-x64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.60.4.tgz", + "integrity": "sha512-CSKq7MsP+5PFIcydhAiR1K0UhEI1A2jWXVKHPCBZ151yOutENwvnPocgVHkivu2kviURtCEB6zUQw0vs8RrhMg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-freebsd-arm64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.60.4.tgz", + "integrity": "sha512-+O8OkVdyvXMtJEciu2wS/pzm1IxntEEQx3z5TAVy4l32G0etZn+RsA48ARRrFm6Ri8fvqPQfgrvNxSjKAbnd3g==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-freebsd-x64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.60.4.tgz", + "integrity": "sha512-Iw3oMskH3AfNuhU0MSN7vNbdi4me/NiYo2azqPz/Le16zHSa+3RRmliCMWWQmh4lcndccU40xcJuTYJZxNo/lw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-linux-arm-gnueabihf": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.60.4.tgz", + "integrity": "sha512-EIPRXTVQpHyF8WOo219AD2yEltPehLTcTMz2fn6JsatLYSzQf00hj3rulF+yauOlF9/FtM2WpkT/hJh/KJFGhA==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm-musleabihf": { + "version": "4.60.4", + "resolved": 
"https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.60.4.tgz", + "integrity": "sha512-J3Yh9PzzF1Ovah2At+lHiGQdsYgArxBbXv/zHfSyaiFQEqvNv7DcW98pCrmdjCZBrqBiKrKKe2V+aaSGWuBe/w==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.60.4.tgz", + "integrity": "sha512-BFDEZMYfUvLn37ONE1yMBojPxnMlTFsdyNoqncT0qFq1mAfllL+ATMMJd8TeuVMiX84s1KbcxcZbXInmcO2mRg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm64-musl": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.60.4.tgz", + "integrity": "sha512-pc9EYOSlOgdQ2uPl1o9PF6/kLSgaUosia7gOuS8mB69IxJvlclko1MECXysjs5ryez1/5zjYqx3+xYU0TU6R1A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-gnu/-/rollup-linux-loong64-gnu-4.60.4.tgz", + "integrity": "sha512-NxnomyxYerDh5n4iLrNa+sH+Z+U4BMEE46V2PgQ/hoB909i8gV1M5wPojWg9fk1jWpO3IQnOs20K4wyZuFLEFQ==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-musl": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-musl/-/rollup-linux-loong64-musl-4.60.4.tgz", + "integrity": "sha512-nbJnQ8a3z1mtmrwImCYhc6BGpThAyYVRQxw9uKSKG4wR6aAYno9sVjJ0zaZcW9BPJX1GbrDPf+SvdWjgTuDmnw==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + 
"node_modules/@rollup/rollup-linux-ppc64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-gnu/-/rollup-linux-ppc64-gnu-4.60.4.tgz", + "integrity": "sha512-2EU6acNrQLd8tYvo/LXW535wupT3m6fo7HKo6lr7ktQoItxTyOL1ZCR/GfGCuXl2vR+zmfI6eRXkSemafv+iVg==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-ppc64-musl": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-musl/-/rollup-linux-ppc64-musl-4.60.4.tgz", + "integrity": "sha512-WeBtoMuaMxiiIrO2IYP3xs6GMWkJP2C0EoT8beTLkUPmzV1i/UcOSVw1d5r9KBODtHKilG5yFxsGRnBbK3wJ4A==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.60.4.tgz", + "integrity": "sha512-FJHFfqpKUI3A10WrWKiFbBZ7yVbGT4q4B5o1qKFFojqpaYoh9LrQgqWCmmcxQzVSXYtyB5bzkXrYzlHTs21MYA==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-musl": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-musl/-/rollup-linux-riscv64-musl-4.60.4.tgz", + "integrity": "sha512-mcEl6CUT5IAUmQf1m9FYSmVqCJlpQ8r8eyftFUHG8i9OhY7BkBXSUdnLH5DOf0wCOjcP9v/QO93zpmF1SptCCw==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-s390x-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.60.4.tgz", + "integrity": "sha512-ynt3JxVd2w2buzoKDWIyiV1pJW93xlQic1THVLXilz429oijRpSHivZAgp65KBu+cMcgf1eVVjdnTLvPxgCuoQ==", + "cpu": [ + "s390x" + ], + "dev": true, + 
"license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-x64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.60.4.tgz", + "integrity": "sha512-Boiz5+MsaROEWDf+GGEwF8VMHGhlUoQMtIPjOgA5fv4osupqTVnJteQNKJwUcnUog2G55jYXH7KZFFiJe0TEzQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-x64-musl": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.60.4.tgz", + "integrity": "sha512-+qfSY27qIrFfI/Hom04KYFw3GKZSGU4lXus51wsb5EuySfFlWRwjkKWoE9emgRw/ukoT4Udsj4W/+xxG8VbPKg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-openbsd-x64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openbsd-x64/-/rollup-openbsd-x64-4.60.4.tgz", + "integrity": "sha512-VpTfOPHgVXEBeeR8hZ2O0F3aSso+JDWqTWmTmzcQKted54IAdUVbxE+j/MVxUsKa8L20HJhv3vUezVPoquqWjA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ] + }, + "node_modules/@rollup/rollup-openharmony-arm64": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openharmony-arm64/-/rollup-openharmony-arm64-4.60.4.tgz", + "integrity": "sha512-IPOsh5aRYuLv/nkU51X10Bf75Bsf6+gZdx1X+QP5QM6lIJFHHqbHLG0uJn/hWthzo13UAc2umiUorqZy3axoZg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ] + }, + "node_modules/@rollup/rollup-win32-arm64-msvc": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.60.4.tgz", + "integrity": "sha512-4QzE9E81OohJ/HKzHhsqU+zcYYojVOXlFMs1DdyMT6qXl/niOH7AVElmmEdUNHHS/oRkc++d5k6Vy85zFs0DEw==", + 
"cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-ia32-msvc": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.60.4.tgz", + "integrity": "sha512-zTPgT1YuHHcd+Tmx7h8aml0FWFVelV5N54oHow9SLj+GfoDy/huQ+UV396N/C7KpMDMiPspRktzM1/0r1usYEA==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-gnu": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-gnu/-/rollup-win32-x64-gnu-4.60.4.tgz", + "integrity": "sha512-DRS4G7mi9lJxqEDezIkKCaUIKCrLUUDCUaCsTPCi/rtqaC6D/jjwslMQyiDU50Ka0JKpeXeRBFBAXwArY52vBw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-msvc": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.60.4.tgz", + "integrity": "sha512-QVTUovf40zgTqlFVrKA1uXMVvU2QWEFWfAH8Wdc48IxLvrJMQVMBRjuQyUpzZCDkakImib9eVazbWlC6ksWtJw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@types/babel__core": { + "version": "7.20.5", + "resolved": "https://registry.npmjs.org/@types/babel__core/-/babel__core-7.20.5.tgz", + "integrity": "sha512-qoQprZvz5wQFJwMDqeseRXWv3rqMvhgpbXFfVyWhbx9X47POIA6i/+dXefEmZKoAgOaTdaIgNSMqMIU61yRyzA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/parser": "^7.20.7", + "@babel/types": "^7.20.7", + "@types/babel__generator": "*", + "@types/babel__template": "*", + "@types/babel__traverse": "*" + } + }, + "node_modules/@types/babel__generator": { + "version": "7.27.0", + "resolved": "https://registry.npmjs.org/@types/babel__generator/-/babel__generator-7.27.0.tgz", + "integrity": 
"sha512-ufFd2Xi92OAVPYsy+P4n7/U7e68fex0+Ee8gSG9KX7eo084CWiQ4sdxktvdl0bOPupXtVJPY19zk6EwWqUQ8lg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/types": "^7.0.0" + } + }, + "node_modules/@types/babel__template": { + "version": "7.4.4", + "resolved": "https://registry.npmjs.org/@types/babel__template/-/babel__template-7.4.4.tgz", + "integrity": "sha512-h/NUaSyG5EyxBIp8YRxo4RMe2/qQgvyowRwVMzhYhBCONbW8PUsg4lkFMrhgZhUe5z3L3MiLDuvyJ/CaPa2A8A==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/parser": "^7.1.0", + "@babel/types": "^7.0.0" + } + }, + "node_modules/@types/babel__traverse": { + "version": "7.28.0", + "resolved": "https://registry.npmjs.org/@types/babel__traverse/-/babel__traverse-7.28.0.tgz", + "integrity": "sha512-8PvcXf70gTDZBgt9ptxJ8elBeBjcLOAcOtoO/mPJjtji1+CdGbHgm77om1GrsPxsiE+uXIpNSK64UYaIwQXd4Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/types": "^7.28.2" + } + }, + "node_modules/@types/d3": { + "version": "7.4.3", + "resolved": "https://registry.npmjs.org/@types/d3/-/d3-7.4.3.tgz", + "integrity": "sha512-lZXZ9ckh5R8uiFVt8ogUNf+pIrK4EsWrx2Np75WvF/eTpJ0FMHNhjXk8CKEx/+gpHbNQyJWehbFaTvqmHWB3ww==", + "license": "MIT", + "dependencies": { + "@types/d3-array": "*", + "@types/d3-axis": "*", + "@types/d3-brush": "*", + "@types/d3-chord": "*", + "@types/d3-color": "*", + "@types/d3-contour": "*", + "@types/d3-delaunay": "*", + "@types/d3-dispatch": "*", + "@types/d3-drag": "*", + "@types/d3-dsv": "*", + "@types/d3-ease": "*", + "@types/d3-fetch": "*", + "@types/d3-force": "*", + "@types/d3-format": "*", + "@types/d3-geo": "*", + "@types/d3-hierarchy": "*", + "@types/d3-interpolate": "*", + "@types/d3-path": "*", + "@types/d3-polygon": "*", + "@types/d3-quadtree": "*", + "@types/d3-random": "*", + "@types/d3-scale": "*", + "@types/d3-scale-chromatic": "*", + "@types/d3-selection": "*", + "@types/d3-shape": "*", + "@types/d3-time": "*", + "@types/d3-time-format": "*", + "@types/d3-timer": "*", + 
"@types/d3-transition": "*", + "@types/d3-zoom": "*" + } + }, + "node_modules/@types/d3-array": { + "version": "3.2.2", + "resolved": "https://registry.npmjs.org/@types/d3-array/-/d3-array-3.2.2.tgz", + "integrity": "sha512-hOLWVbm7uRza0BYXpIIW5pxfrKe0W+D5lrFiAEYR+pb6w3N2SwSMaJbXdUfSEv+dT4MfHBLtn5js0LAWaO6otw==", + "license": "MIT" + }, + "node_modules/@types/d3-axis": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/@types/d3-axis/-/d3-axis-3.0.6.tgz", + "integrity": "sha512-pYeijfZuBd87T0hGn0FO1vQ/cgLk6E1ALJjfkC0oJ8cbwkZl3TpgS8bVBLZN+2jjGgg38epgxb2zmoGtSfvgMw==", + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-brush": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/@types/d3-brush/-/d3-brush-3.0.6.tgz", + "integrity": "sha512-nH60IZNNxEcrh6L1ZSMNA28rj27ut/2ZmI3r96Zd+1jrZD++zD3LsMIjWlvg4AYrHn/Pqz4CF3veCxGjtbqt7A==", + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-chord": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/@types/d3-chord/-/d3-chord-3.0.6.tgz", + "integrity": "sha512-LFYWWd8nwfwEmTZG9PfQxd17HbNPksHBiJHaKuY1XeqscXacsS2tyoo6OdRsjf+NQYeB6XrNL3a25E3gH69lcg==", + "license": "MIT" + }, + "node_modules/@types/d3-color": { + "version": "3.1.3", + "resolved": "https://registry.npmjs.org/@types/d3-color/-/d3-color-3.1.3.tgz", + "integrity": "sha512-iO90scth9WAbmgv7ogoq57O9YpKmFBbmoEoCHDB2xMBY0+/KVrqAaCDyCE16dUspeOvIxFFRI+0sEtqDqy2b4A==", + "license": "MIT" + }, + "node_modules/@types/d3-contour": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/@types/d3-contour/-/d3-contour-3.0.6.tgz", + "integrity": "sha512-BjzLgXGnCWjUSYGfH1cpdo41/hgdWETu4YxpezoztawmqsvCeep+8QGfiY6YbDvfgHz/DkjeIkkZVJavB4a3rg==", + "license": "MIT", + "dependencies": { + "@types/d3-array": "*", + "@types/geojson": "*" + } + }, + "node_modules/@types/d3-delaunay": { + "version": "6.0.4", + "resolved": 
"https://registry.npmjs.org/@types/d3-delaunay/-/d3-delaunay-6.0.4.tgz", + "integrity": "sha512-ZMaSKu4THYCU6sV64Lhg6qjf1orxBthaC161plr5KuPHo3CNm8DTHiLw/5Eq2b6TsNP0W0iJrUOFscY6Q450Hw==", + "license": "MIT" + }, + "node_modules/@types/d3-dispatch": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/@types/d3-dispatch/-/d3-dispatch-3.0.7.tgz", + "integrity": "sha512-5o9OIAdKkhN1QItV2oqaE5KMIiXAvDWBDPrD85e58Qlz1c1kI/J0NcqbEG88CoTwJrYe7ntUCVfeUl2UJKbWgA==", + "license": "MIT" + }, + "node_modules/@types/d3-drag": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/@types/d3-drag/-/d3-drag-3.0.7.tgz", + "integrity": "sha512-HE3jVKlzU9AaMazNufooRJ5ZpWmLIoc90A37WU2JMmeq28w1FQqCZswHZ3xR+SuxYftzHq6WU6KJHvqxKzTxxQ==", + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-dsv": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/@types/d3-dsv/-/d3-dsv-3.0.7.tgz", + "integrity": "sha512-n6QBF9/+XASqcKK6waudgL0pf/S5XHPPI8APyMLLUHd8NqouBGLsU8MgtO7NINGtPBtk9Kko/W4ea0oAspwh9g==", + "license": "MIT" + }, + "node_modules/@types/d3-ease": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/@types/d3-ease/-/d3-ease-3.0.2.tgz", + "integrity": "sha512-NcV1JjO5oDzoK26oMzbILE6HW7uVXOHLQvHshBUW4UMdZGfiY6v5BeQwh9a9tCzv+CeefZQHJt5SRgK154RtiA==", + "license": "MIT" + }, + "node_modules/@types/d3-fetch": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/@types/d3-fetch/-/d3-fetch-3.0.7.tgz", + "integrity": "sha512-fTAfNmxSb9SOWNB9IoG5c8Hg6R+AzUHDRlsXsDZsNp6sxAEOP0tkP3gKkNSO/qmHPoBFTxNrjDprVHDQDvo5aA==", + "license": "MIT", + "dependencies": { + "@types/d3-dsv": "*" + } + }, + "node_modules/@types/d3-force": { + "version": "3.0.10", + "resolved": "https://registry.npmjs.org/@types/d3-force/-/d3-force-3.0.10.tgz", + "integrity": "sha512-ZYeSaCF3p73RdOKcjj+swRlZfnYpK1EbaDiYICEEp5Q6sUiqFaFQ9qgoshp5CzIyyb/yD09kD9o2zEltCexlgw==", + "license": "MIT" + }, + 
"node_modules/@types/d3-format": { + "version": "3.0.4", + "resolved": "https://registry.npmjs.org/@types/d3-format/-/d3-format-3.0.4.tgz", + "integrity": "sha512-fALi2aI6shfg7vM5KiR1wNJnZ7r6UuggVqtDA+xiEdPZQwy/trcQaHnwShLuLdta2rTymCNpxYTiMZX/e09F4g==", + "license": "MIT" + }, + "node_modules/@types/d3-geo": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/@types/d3-geo/-/d3-geo-3.1.0.tgz", + "integrity": "sha512-856sckF0oP/diXtS4jNsiQw/UuK5fQG8l/a9VVLeSouf1/PPbBE1i1W852zVwKwYCBkFJJB7nCFTbk6UMEXBOQ==", + "license": "MIT", + "dependencies": { + "@types/geojson": "*" + } + }, + "node_modules/@types/d3-hierarchy": { + "version": "3.1.7", + "resolved": "https://registry.npmjs.org/@types/d3-hierarchy/-/d3-hierarchy-3.1.7.tgz", + "integrity": "sha512-tJFtNoYBtRtkNysX1Xq4sxtjK8YgoWUNpIiUee0/jHGRwqvzYxkq0hGVbbOGSz+JgFxxRu4K8nb3YpG3CMARtg==", + "license": "MIT" + }, + "node_modules/@types/d3-interpolate": { + "version": "3.0.4", + "resolved": "https://registry.npmjs.org/@types/d3-interpolate/-/d3-interpolate-3.0.4.tgz", + "integrity": "sha512-mgLPETlrpVV1YRJIglr4Ez47g7Yxjl1lj7YKsiMCb27VJH9W8NVM6Bb9d8kkpG/uAQS5AmbA48q2IAolKKo1MA==", + "license": "MIT", + "dependencies": { + "@types/d3-color": "*" + } + }, + "node_modules/@types/d3-path": { + "version": "3.1.1", + "resolved": "https://registry.npmjs.org/@types/d3-path/-/d3-path-3.1.1.tgz", + "integrity": "sha512-VMZBYyQvbGmWyWVea0EHs/BwLgxc+MKi1zLDCONksozI4YJMcTt8ZEuIR4Sb1MMTE8MMW49v0IwI5+b7RmfWlg==", + "license": "MIT" + }, + "node_modules/@types/d3-polygon": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/@types/d3-polygon/-/d3-polygon-3.0.2.tgz", + "integrity": "sha512-ZuWOtMaHCkN9xoeEMr1ubW2nGWsp4nIql+OPQRstu4ypeZ+zk3YKqQT0CXVe/PYqrKpZAi+J9mTs05TKwjXSRA==", + "license": "MIT" + }, + "node_modules/@types/d3-quadtree": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/@types/d3-quadtree/-/d3-quadtree-3.0.6.tgz", + "integrity": 
"sha512-oUzyO1/Zm6rsxKRHA1vH0NEDG58HrT5icx/azi9MF1TWdtttWl0UIUsjEQBBh+SIkrpd21ZjEv7ptxWys1ncsg==", + "license": "MIT" + }, + "node_modules/@types/d3-random": { + "version": "3.0.3", + "resolved": "https://registry.npmjs.org/@types/d3-random/-/d3-random-3.0.3.tgz", + "integrity": "sha512-Imagg1vJ3y76Y2ea0871wpabqp613+8/r0mCLEBfdtqC7xMSfj9idOnmBYyMoULfHePJyxMAw3nWhJxzc+LFwQ==", + "license": "MIT" + }, + "node_modules/@types/d3-scale": { + "version": "4.0.9", + "resolved": "https://registry.npmjs.org/@types/d3-scale/-/d3-scale-4.0.9.tgz", + "integrity": "sha512-dLmtwB8zkAeO/juAMfnV+sItKjlsw2lKdZVVy6LRr0cBmegxSABiLEpGVmSJJ8O08i4+sGR6qQtb6WtuwJdvVw==", + "license": "MIT", + "dependencies": { + "@types/d3-time": "*" + } + }, + "node_modules/@types/d3-scale-chromatic": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/@types/d3-scale-chromatic/-/d3-scale-chromatic-3.1.0.tgz", + "integrity": "sha512-iWMJgwkK7yTRmWqRB5plb1kadXyQ5Sj8V/zYlFGMUBbIPKQScw+Dku9cAAMgJG+z5GYDoMjWGLVOvjghDEFnKQ==", + "license": "MIT" + }, + "node_modules/@types/d3-selection": { + "version": "3.0.11", + "resolved": "https://registry.npmjs.org/@types/d3-selection/-/d3-selection-3.0.11.tgz", + "integrity": "sha512-bhAXu23DJWsrI45xafYpkQ4NtcKMwWnAC/vKrd2l+nxMFuvOT3XMYTIj2opv8vq8AO5Yh7Qac/nSeP/3zjTK0w==", + "license": "MIT" + }, + "node_modules/@types/d3-shape": { + "version": "3.1.8", + "resolved": "https://registry.npmjs.org/@types/d3-shape/-/d3-shape-3.1.8.tgz", + "integrity": "sha512-lae0iWfcDeR7qt7rA88BNiqdvPS5pFVPpo5OfjElwNaT2yyekbM0C9vK+yqBqEmHr6lDkRnYNoTBYlAgJa7a4w==", + "license": "MIT", + "dependencies": { + "@types/d3-path": "*" + } + }, + "node_modules/@types/d3-time": { + "version": "3.0.4", + "resolved": "https://registry.npmjs.org/@types/d3-time/-/d3-time-3.0.4.tgz", + "integrity": "sha512-yuzZug1nkAAaBlBBikKZTgzCeA+k1uy4ZFwWANOfKw5z5LRhV0gNA7gNkKm7HoK+HRN0wX3EkxGk0fpbWhmB7g==", + "license": "MIT" + }, + "node_modules/@types/d3-time-format": { + "version": "4.0.3", + 
"resolved": "https://registry.npmjs.org/@types/d3-time-format/-/d3-time-format-4.0.3.tgz", + "integrity": "sha512-5xg9rC+wWL8kdDj153qZcsJ0FWiFt0J5RB6LYUNZjwSnesfblqrI/bJ1wBdJ8OQfncgbJG5+2F+qfqnqyzYxyg==", + "license": "MIT" + }, + "node_modules/@types/d3-timer": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/@types/d3-timer/-/d3-timer-3.0.2.tgz", + "integrity": "sha512-Ps3T8E8dZDam6fUyNiMkekK3XUsaUEik+idO9/YjPtfj2qruF8tFBXS7XhtE4iIXBLxhmLjP3SXpLhVf21I9Lw==", + "license": "MIT" + }, + "node_modules/@types/d3-transition": { + "version": "3.0.9", + "resolved": "https://registry.npmjs.org/@types/d3-transition/-/d3-transition-3.0.9.tgz", + "integrity": "sha512-uZS5shfxzO3rGlu0cC3bjmMFKsXv+SmZZcgp0KD22ts4uGXp5EVYGzu/0YdwZeKmddhcAccYtREJKkPfXkZuCg==", + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-zoom": { + "version": "3.0.8", + "resolved": "https://registry.npmjs.org/@types/d3-zoom/-/d3-zoom-3.0.8.tgz", + "integrity": "sha512-iqMC4/YlFCSlO8+2Ii1GGGliCAY4XdeG748w5vQUbevlbDu0zSjH/+jojorQVBK/se0j6DUFNPBGSqD3YWYnDw==", + "license": "MIT", + "dependencies": { + "@types/d3-interpolate": "*", + "@types/d3-selection": "*" + } + }, + "node_modules/@types/estree": { + "version": "1.0.8", + "resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.8.tgz", + "integrity": "sha512-dWHzHa2WqEXI/O1E9OjrocMTKJl2mSrEolh1Iomrv6U+JuNwaHXsXx9bLu5gG7BUWFIN0skIQJQ/L1rIex4X6w==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/geojson": { + "version": "7946.0.16", + "resolved": "https://registry.npmjs.org/@types/geojson/-/geojson-7946.0.16.tgz", + "integrity": "sha512-6C8nqWur3j98U6+lXDfTUWIfgvZU+EumvpHKcYjujKH7woYyLj2sUmff0tRhrqM7BohUw7Pz3ZB1jj2gW9Fvmg==", + "license": "MIT" + }, + "node_modules/@vitejs/plugin-react": { + "version": "4.7.0", + "resolved": "https://registry.npmjs.org/@vitejs/plugin-react/-/plugin-react-4.7.0.tgz", + "integrity": 
"sha512-gUu9hwfWvvEDBBmgtAowQCojwZmJ5mcLn3aufeCsitijs3+f2NsrPtlAWIR6OPiqljl96GVCUbLe0HyqIpVaoA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@babel/core": "^7.28.0", + "@babel/plugin-transform-react-jsx-self": "^7.27.1", + "@babel/plugin-transform-react-jsx-source": "^7.27.1", + "@rolldown/pluginutils": "1.0.0-beta.27", + "@types/babel__core": "^7.20.5", + "react-refresh": "^0.17.0" + }, + "engines": { + "node": "^14.18.0 || >=16.0.0" + }, + "peerDependencies": { + "vite": "^4.2.0 || ^5.0.0 || ^6.0.0 || ^7.0.0" + } + }, + "node_modules/baseline-browser-mapping": { + "version": "2.10.32", + "resolved": "https://registry.npmjs.org/baseline-browser-mapping/-/baseline-browser-mapping-2.10.32.tgz", + "integrity": "sha512-wbPvpyjJPC0zdfdKXxqEL3Ea+bOMD/87X4lftiJkkaBiuG6ALQy1SLmEd7BSmVCuwCQsBrCamgBoLyfFDD1EPg==", + "dev": true, + "license": "Apache-2.0", + "bin": { + "baseline-browser-mapping": "dist/cli.cjs" + }, + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/browserslist": { + "version": "4.28.2", + "resolved": "https://registry.npmjs.org/browserslist/-/browserslist-4.28.2.tgz", + "integrity": "sha512-48xSriZYYg+8qXna9kwqjIVzuQxi+KYWp2+5nCYnYKPTr0LvD89Jqk2Or5ogxz0NUMfIjhh2lIUX/LyX9B4oIg==", + "dev": true, + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/browserslist" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "baseline-browser-mapping": "^2.10.12", + "caniuse-lite": "^1.0.30001782", + "electron-to-chromium": "^1.5.328", + "node-releases": "^2.0.36", + "update-browserslist-db": "^1.2.3" + }, + "bin": { + "browserslist": "cli.js" + }, + "engines": { + "node": "^6 || ^7 || ^8 || ^9 || ^10 || ^11 || ^12 || >=13.7" + } + }, + "node_modules/caniuse-lite": { + "version": "1.0.30001793", + "resolved": 
"https://registry.npmjs.org/caniuse-lite/-/caniuse-lite-1.0.30001793.tgz", + "integrity": "sha512-iwSsYWaCOoh26cV8NwNRViHlrfUvYsHDfRVcbtmw0Kg6PJIZZXwMkj1442FYLBGkeUf1juAsU3DTfxW579mrPA==", + "dev": true, + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/caniuse-lite" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "CC-BY-4.0" + }, + "node_modules/classcat": { + "version": "5.0.5", + "resolved": "https://registry.npmjs.org/classcat/-/classcat-5.0.5.tgz", + "integrity": "sha512-JhZUT7JFcQy/EzW605k/ktHtncoo9vnyW/2GspNYwFlN1C/WmjuV/xtS04e9SOkL2sTdw0VAZ2UGCcQ9lR6p6w==", + "license": "MIT" + }, + "node_modules/convert-source-map": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/convert-source-map/-/convert-source-map-2.0.0.tgz", + "integrity": "sha512-Kvp459HrV2FEJ1CAsi1Ku+MY3kasH19TFykTz2xWmMeq6bk2NU3XXvfJ+Q61m0xktWwt+1HSYf3JZsTms3aRJg==", + "dev": true, + "license": "MIT" + }, + "node_modules/d3-color": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/d3-color/-/d3-color-3.1.0.tgz", + "integrity": "sha512-zg/chbXyeBtMQ1LbD/WSoW2DpC3I0mpmPdW+ynRTj/x2DAWYrIY7qeZIHidozwV24m4iavr15lNwIwLxRmOxhA==", + "license": "ISC", + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-dispatch": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/d3-dispatch/-/d3-dispatch-3.0.1.tgz", + "integrity": "sha512-rzUyPU/S7rwUflMyLc1ETDeBj0NRuHKKAcvukozwhshr6g6c5d8zh4c2gQjY2bZ0dXeGLWc1PF174P2tVvKhfg==", + "license": "ISC", + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-drag": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/d3-drag/-/d3-drag-3.0.0.tgz", + "integrity": "sha512-pWbUJLdETVA8lQNJecMxoXfH6x+mO2UQo8rSmZ+QqxcbyA3hfeprFgIT//HW2nlHChWeIIMwS2Fq+gEARkhTkg==", + "license": "ISC", + "dependencies": { + "d3-dispatch": "1 - 3", + 
"d3-selection": "3" + }, + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-ease": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/d3-ease/-/d3-ease-3.0.1.tgz", + "integrity": "sha512-wR/XK3D3XcLIZwpbvQwQ5fK+8Ykds1ip7A2Txe0yxncXSdq1L9skcG7blcedkOX+ZcgxGAmLX1FrRGbADwzi0w==", + "license": "BSD-3-Clause", + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-interpolate": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/d3-interpolate/-/d3-interpolate-3.0.1.tgz", + "integrity": "sha512-3bYs1rOD33uo8aqJfKP3JWPAibgw8Zm2+L9vBKEHJ2Rg+viTR7o5Mmv5mZcieN+FRYaAOWX5SJATX6k1PWz72g==", + "license": "ISC", + "dependencies": { + "d3-color": "1 - 3" + }, + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-selection": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz", + "integrity": "sha512-fmTRWbNMmsmWq6xJV8D19U/gw/bwrHfNXxrIN+HfZgnzqTHp9jOmKMhsTUjXOJnZOdZY9Q28y4yebKzqDKlxlQ==", + "license": "ISC", + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-timer": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/d3-timer/-/d3-timer-3.0.1.tgz", + "integrity": "sha512-ndfJ/JxxMd3nw31uyKoY2naivF+r29V+Lc0svZxe1JvvIRmi8hUsrMvdOwgS1o6uBHmiz91geQ0ylPP0aj1VUA==", + "license": "ISC", + "engines": { + "node": ">=12" + } + }, + "node_modules/d3-transition": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/d3-transition/-/d3-transition-3.0.1.tgz", + "integrity": "sha512-ApKvfjsSR6tg06xrL434C0WydLr7JewBB3V+/39RMHsaXTOG0zmt/OAXeng5M5LBm0ojmxJrpomQVZ1aPvBL4w==", + "license": "ISC", + "dependencies": { + "d3-color": "1 - 3", + "d3-dispatch": "1 - 3", + "d3-ease": "1 - 3", + "d3-interpolate": "1 - 3", + "d3-timer": "1 - 3" + }, + "engines": { + "node": ">=12" + }, + "peerDependencies": { + "d3-selection": "2 - 3" + } + }, + "node_modules/d3-zoom": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/d3-zoom/-/d3-zoom-3.0.0.tgz", + 
"integrity": "sha512-b8AmV3kfQaqWAuacbPuNbL6vahnOJflOhexLzMMNLga62+/nh0JzvJ0aO/5a5MVgUFGS7Hu1P9P03o3fJkDCyw==", + "license": "ISC", + "dependencies": { + "d3-dispatch": "1 - 3", + "d3-drag": "2 - 3", + "d3-interpolate": "1 - 3", + "d3-selection": "2 - 3", + "d3-transition": "2 - 3" + }, + "engines": { + "node": ">=12" + } + }, + "node_modules/debug": { + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", + "integrity": "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ms": "^2.1.3" + }, + "engines": { + "node": ">=6.0" + }, + "peerDependenciesMeta": { + "supports-color": { + "optional": true + } + } + }, + "node_modules/electron-to-chromium": { + "version": "1.5.361", + "resolved": "https://registry.npmjs.org/electron-to-chromium/-/electron-to-chromium-1.5.361.tgz", + "integrity": "sha512-Q6Hts7N9FnJc5LeGRINFvLhCI9xZmNtTDe5ZbcVezQz7cU4a8Aua3GH1b8J2XY8Al9PF+OCwYqhgsOOheMdvkA==", + "dev": true, + "license": "ISC" + }, + "node_modules/esbuild": { + "version": "0.21.5", + "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.21.5.tgz", + "integrity": "sha512-mg3OPMV4hXywwpoDxu3Qda5xCKQi+vCTZq8S9J/EpkhB2HzKXq4SNFZE3+NK93JYxc8VMSep+lOUSC/RVKaBqw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "esbuild": "bin/esbuild" + }, + "engines": { + "node": ">=12" + }, + "optionalDependencies": { + "@esbuild/aix-ppc64": "0.21.5", + "@esbuild/android-arm": "0.21.5", + "@esbuild/android-arm64": "0.21.5", + "@esbuild/android-x64": "0.21.5", + "@esbuild/darwin-arm64": "0.21.5", + "@esbuild/darwin-x64": "0.21.5", + "@esbuild/freebsd-arm64": "0.21.5", + "@esbuild/freebsd-x64": "0.21.5", + "@esbuild/linux-arm": "0.21.5", + "@esbuild/linux-arm64": "0.21.5", + "@esbuild/linux-ia32": "0.21.5", + "@esbuild/linux-loong64": "0.21.5", + "@esbuild/linux-mips64el": "0.21.5", + "@esbuild/linux-ppc64": "0.21.5", + 
"@esbuild/linux-riscv64": "0.21.5", + "@esbuild/linux-s390x": "0.21.5", + "@esbuild/linux-x64": "0.21.5", + "@esbuild/netbsd-x64": "0.21.5", + "@esbuild/openbsd-x64": "0.21.5", + "@esbuild/sunos-x64": "0.21.5", + "@esbuild/win32-arm64": "0.21.5", + "@esbuild/win32-ia32": "0.21.5", + "@esbuild/win32-x64": "0.21.5" + } + }, + "node_modules/escalade": { + "version": "3.2.0", + "resolved": "https://registry.npmjs.org/escalade/-/escalade-3.2.0.tgz", + "integrity": "sha512-WUj2qlxaQtO4g6Pq5c29GTcWGDyd8itL8zTlipgECz3JesAiiOKotd8JU6otB3PACgG6xkJUyVhboMS+bje/jA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/fsevents": { + "version": "2.3.3", + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz", + "integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" + } + }, + "node_modules/gensync": { + "version": "1.0.0-beta.2", + "resolved": "https://registry.npmjs.org/gensync/-/gensync-1.0.0-beta.2.tgz", + "integrity": "sha512-3hN7NaskYvMDLQY55gnW3NQ+mesEAepTqlg+VEbj7zzqEMBVNhzcGYYeqFo/TlYz6eQiFcp1HcsCZO+nGgS8zg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/js-tokens": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz", + "integrity": "sha512-RdJUflcE3cUzKiMqQgsCu06FPu9UdIJO0beYbPhHN4k6apgJtifcoCtT9bcxOpYBtpD2kCM6Sbzg4CausW/PKQ==", + "license": "MIT" + }, + "node_modules/jsesc": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/jsesc/-/jsesc-3.1.0.tgz", + "integrity": "sha512-/sM3dO2FOzXjKQhJuo0Q173wf2KOo8t4I8vHy6lF9poUp7bKT0/NHE8fPX23PwfhnykfqnC2xRxOnVw5XuGIaA==", + "dev": true, + "license": "MIT", + "bin": { + "jsesc": "bin/jsesc" + }, + "engines": { + "node": ">=6" + } + }, + 
"node_modules/json5": { + "version": "2.2.3", + "resolved": "https://registry.npmjs.org/json5/-/json5-2.2.3.tgz", + "integrity": "sha512-XmOWe7eyHYH14cLdVPoyg+GOH3rYX++KpzrylJwSW98t3Nk+U8XOl8FWKOgwtzdb8lXGf6zYwDUzeHMWfxasyg==", + "dev": true, + "license": "MIT", + "bin": { + "json5": "lib/cli.js" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/loose-envify": { + "version": "1.4.0", + "resolved": "https://registry.npmjs.org/loose-envify/-/loose-envify-1.4.0.tgz", + "integrity": "sha512-lyuxPGr/Wfhrlem2CL/UcnUc1zcqKAImBDzukY7Y5F/yQiNdko6+fRLevlw1HgMySw7f611UIY408EtxRSoK3Q==", + "license": "MIT", + "dependencies": { + "js-tokens": "^3.0.0 || ^4.0.0" + }, + "bin": { + "loose-envify": "cli.js" + } + }, + "node_modules/lru-cache": { + "version": "5.1.1", + "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-5.1.1.tgz", + "integrity": "sha512-KpNARQA3Iwv+jTA0utUVVbrh+Jlrr1Fv0e56GGzAFOXN7dk/FviaDW8LHmK52DlcH4WP2n6gI8vN1aesBFgo9w==", + "dev": true, + "license": "ISC", + "dependencies": { + "yallist": "^3.0.2" + } + }, + "node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + "node_modules/nanoid": { + "version": "3.3.12", + "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.12.tgz", + "integrity": "sha512-ZB9RH/39qpq5Vu6Y+NmUaFhQR6pp+M2Xt76XBnEwDaGcVAqhlvxrl3B2bKS5D3NH3QR76v3aSrKaF/Kiy7lEtQ==", + "dev": true, + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "bin": { + "nanoid": "bin/nanoid.cjs" + }, + "engines": { + "node": "^10 || ^12 || ^13.7 || ^14 || >=15.0.1" + } + }, + "node_modules/node-releases": { + "version": "2.0.46", + "resolved": "https://registry.npmjs.org/node-releases/-/node-releases-2.0.46.tgz", + "integrity": 
"sha512-GYVXHE2KnrzAfsAjl4uP++evGFCrAU1jta4ubEjIG7YWt/64Gqv66a30yKwWczVjA6j3bM4nBwH7Pk1JmDHaxQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=18" + } + }, + "node_modules/picocolors": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz", + "integrity": "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==", + "dev": true, + "license": "ISC" + }, + "node_modules/postcss": { + "version": "8.5.15", + "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.15.tgz", + "integrity": "sha512-FfR8sjd4em2T6fb3I2MwAJU7HWVMr9zba+enmQeeWFfCbm+UOC/0X4DS8XtpUTMwWMGbjKYP7xjfNekzyGmB3A==", + "dev": true, + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/postcss/" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/postcss" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "nanoid": "^3.3.12", + "picocolors": "^1.1.1", + "source-map-js": "^1.2.1" + }, + "engines": { + "node": "^10 || ^12 || >=14" + } + }, + "node_modules/react": { + "version": "18.3.1", + "resolved": "https://registry.npmjs.org/react/-/react-18.3.1.tgz", + "integrity": "sha512-wS+hAgJShR0KhEvPJArfuPVN1+Hz1t0Y6n5jLrGQbkb4urgPE/0Rve+1kMB1v/oWgHgm4WIcV+i7F2pTVj+2iQ==", + "license": "MIT", + "dependencies": { + "loose-envify": "^1.1.0" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/react-dom": { + "version": "18.3.1", + "resolved": "https://registry.npmjs.org/react-dom/-/react-dom-18.3.1.tgz", + "integrity": "sha512-5m4nQKp+rZRb09LNH59GM4BxTh9251/ylbKIbpe7TpGxfJ+9kv6BLkLBXIjjspbgbnIBNqlI23tRnTWT0snUIw==", + "license": "MIT", + "dependencies": { + "loose-envify": "^1.1.0", + "scheduler": "^0.23.2" + }, + "peerDependencies": { + "react": "^18.3.1" + } + }, + "node_modules/react-refresh": { + "version": "0.17.0", + "resolved": 
"https://registry.npmjs.org/react-refresh/-/react-refresh-0.17.0.tgz", + "integrity": "sha512-z6F7K9bV85EfseRCp2bzrpyQ0Gkw1uLoCel9XBVWPg/TjRj94SkJzUTGfOa4bs7iJvBWtQG0Wq7wnI0syw3EBQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/reactflow": { + "version": "11.11.4", + "resolved": "https://registry.npmjs.org/reactflow/-/reactflow-11.11.4.tgz", + "integrity": "sha512-70FOtJkUWH3BAOsN+LU9lCrKoKbtOPnz2uq0CV2PLdNSwxTXOhCbsZr50GmZ+Rtw3jx8Uv7/vBFtCGixLfd4Og==", + "license": "MIT", + "dependencies": { + "@reactflow/background": "11.3.14", + "@reactflow/controls": "11.2.14", + "@reactflow/core": "11.11.4", + "@reactflow/minimap": "11.7.14", + "@reactflow/node-resizer": "2.2.14", + "@reactflow/node-toolbar": "1.3.14" + }, + "peerDependencies": { + "react": ">=17", + "react-dom": ">=17" + } + }, + "node_modules/rollup": { + "version": "4.60.4", + "resolved": "https://registry.npmjs.org/rollup/-/rollup-4.60.4.tgz", + "integrity": "sha512-WHeFSbZYsPu3+bLoNRUuAO+wavNlocOPf3wSHTP7hcFKVnJeWsYlCDbr3mTS14FCizf9ccIxXA8sGL8zKeQN3g==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/estree": "1.0.8" + }, + "bin": { + "rollup": "dist/bin/rollup" + }, + "engines": { + "node": ">=18.0.0", + "npm": ">=8.0.0" + }, + "optionalDependencies": { + "@rollup/rollup-android-arm-eabi": "4.60.4", + "@rollup/rollup-android-arm64": "4.60.4", + "@rollup/rollup-darwin-arm64": "4.60.4", + "@rollup/rollup-darwin-x64": "4.60.4", + "@rollup/rollup-freebsd-arm64": "4.60.4", + "@rollup/rollup-freebsd-x64": "4.60.4", + "@rollup/rollup-linux-arm-gnueabihf": "4.60.4", + "@rollup/rollup-linux-arm-musleabihf": "4.60.4", + "@rollup/rollup-linux-arm64-gnu": "4.60.4", + "@rollup/rollup-linux-arm64-musl": "4.60.4", + "@rollup/rollup-linux-loong64-gnu": "4.60.4", + "@rollup/rollup-linux-loong64-musl": "4.60.4", + "@rollup/rollup-linux-ppc64-gnu": "4.60.4", + "@rollup/rollup-linux-ppc64-musl": "4.60.4", + "@rollup/rollup-linux-riscv64-gnu": "4.60.4", + 
"@rollup/rollup-linux-riscv64-musl": "4.60.4", + "@rollup/rollup-linux-s390x-gnu": "4.60.4", + "@rollup/rollup-linux-x64-gnu": "4.60.4", + "@rollup/rollup-linux-x64-musl": "4.60.4", + "@rollup/rollup-openbsd-x64": "4.60.4", + "@rollup/rollup-openharmony-arm64": "4.60.4", + "@rollup/rollup-win32-arm64-msvc": "4.60.4", + "@rollup/rollup-win32-ia32-msvc": "4.60.4", + "@rollup/rollup-win32-x64-gnu": "4.60.4", + "@rollup/rollup-win32-x64-msvc": "4.60.4", + "fsevents": "~2.3.2" + } + }, + "node_modules/scheduler": { + "version": "0.23.2", + "resolved": "https://registry.npmjs.org/scheduler/-/scheduler-0.23.2.tgz", + "integrity": "sha512-UOShsPwz7NrMUqhR6t0hWjFduvOzbtv7toDH1/hIrfRNIDBnnBWd0CwJTGvTpngVlmwGCdP9/Zl/tVrDqcuYzQ==", + "license": "MIT", + "dependencies": { + "loose-envify": "^1.1.0" + } + }, + "node_modules/semver": { + "version": "6.3.1", + "resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz", + "integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==", + "dev": true, + "license": "ISC", + "bin": { + "semver": "bin/semver.js" + } + }, + "node_modules/source-map-js": { + "version": "1.2.1", + "resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz", + "integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==", + "dev": true, + "license": "BSD-3-Clause", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/update-browserslist-db": { + "version": "1.2.3", + "resolved": "https://registry.npmjs.org/update-browserslist-db/-/update-browserslist-db-1.2.3.tgz", + "integrity": "sha512-Js0m9cx+qOgDxo0eMiFGEueWztz+d4+M3rGlmKPT+T4IS/jP4ylw3Nwpu6cpTTP8R1MAC1kF4VbdLt3ARf209w==", + "dev": true, + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/browserslist" + }, + { + "type": "github", + "url": 
"https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "escalade": "^3.2.0", + "picocolors": "^1.1.1" + }, + "bin": { + "update-browserslist-db": "cli.js" + }, + "peerDependencies": { + "browserslist": ">= 4.21.0" + } + }, + "node_modules/use-sync-external-store": { + "version": "1.6.0", + "resolved": "https://registry.npmjs.org/use-sync-external-store/-/use-sync-external-store-1.6.0.tgz", + "integrity": "sha512-Pp6GSwGP/NrPIrxVFAIkOQeyw8lFenOHijQWkUTrDvrF4ALqylP2C/KCkeS9dpUM3KvYRQhna5vt7IL95+ZQ9w==", + "license": "MIT", + "peerDependencies": { + "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" + } + }, + "node_modules/vite": { + "version": "5.4.21", + "resolved": "https://registry.npmjs.org/vite/-/vite-5.4.21.tgz", + "integrity": "sha512-o5a9xKjbtuhY6Bi5S3+HvbRERmouabWbyUcpXXUA1u+GNUKoROi9byOJ8M0nHbHYHkYICiMlqxkg1KkYmm25Sw==", + "dev": true, + "license": "MIT", + "dependencies": { + "esbuild": "^0.21.3", + "postcss": "^8.4.43", + "rollup": "^4.20.0" + }, + "bin": { + "vite": "bin/vite.js" + }, + "engines": { + "node": "^18.0.0 || >=20.0.0" + }, + "funding": { + "url": "https://github.com/vitejs/vite?sponsor=1" + }, + "optionalDependencies": { + "fsevents": "~2.3.3" + }, + "peerDependencies": { + "@types/node": "^18.0.0 || >=20.0.0", + "less": "*", + "lightningcss": "^1.21.0", + "sass": "*", + "sass-embedded": "*", + "stylus": "*", + "sugarss": "*", + "terser": "^5.4.0" + }, + "peerDependenciesMeta": { + "@types/node": { + "optional": true + }, + "less": { + "optional": true + }, + "lightningcss": { + "optional": true + }, + "sass": { + "optional": true + }, + "sass-embedded": { + "optional": true + }, + "stylus": { + "optional": true + }, + "sugarss": { + "optional": true + }, + "terser": { + "optional": true + } + } + }, + "node_modules/yallist": { + "version": "3.1.1", + "resolved": "https://registry.npmjs.org/yallist/-/yallist-3.1.1.tgz", + "integrity": 
"sha512-a4UGQaWPH59mOXUYnAG2ewncQS4i4F43Tv3JoAM+s2VDAmS9NsK8GpDMLrCHPksFT7h3K6TOoUNn2pb7RoXx4g==", + "dev": true, + "license": "ISC" + }, + "node_modules/zustand": { + "version": "4.5.7", + "resolved": "https://registry.npmjs.org/zustand/-/zustand-4.5.7.tgz", + "integrity": "sha512-CHOUy7mu3lbD6o6LJLfllpjkzhHXSBlX8B9+qPddUsIfeF5S/UZ5q0kmCsnRqT1UHFQZchNFDDzMbQsuesHWlw==", + "license": "MIT", + "dependencies": { + "use-sync-external-store": "^1.2.2" + }, + "engines": { + "node": ">=12.7.0" + }, + "peerDependencies": { + "@types/react": ">=16.8", + "immer": ">=9.0.6", + "react": ">=16.8" + }, + "peerDependenciesMeta": { + "@types/react": { + "optional": true + }, + "immer": { + "optional": true + }, + "react": { + "optional": true + } + } + } + } +} diff --git a/testing_backend_ui/package.json b/testing_backend_ui/package.json new file mode 100644 index 0000000..f81c30c --- /dev/null +++ b/testing_backend_ui/package.json @@ -0,0 +1,20 @@ +{ + "name": "testing-backend-ui", + "private": true, + "version": "0.1.0", + "type": "module", + "scripts": { + "dev": "vite", + "build": "vite build", + "preview": "vite preview" + }, + "dependencies": { + "react": "^18.3.1", + "react-dom": "^18.3.1", + "reactflow": "^11.11.4" + }, + "devDependencies": { + "@vitejs/plugin-react": "^4.3.1", + "vite": "^5.4.10" + } +} \ No newline at end of file diff --git a/testing_backend_ui/src/App.jsx b/testing_backend_ui/src/App.jsx new file mode 100644 index 0000000..b6d4781 --- /dev/null +++ b/testing_backend_ui/src/App.jsx @@ -0,0 +1,994 @@ +import { useMemo, useRef, useState, useEffect } from 'react'; +import ReactFlow, { + Background, + Controls, + MarkerType, + MiniMap, + Position, + ReactFlowProvider, +} from 'reactflow'; +import 'reactflow/dist/style.css'; + +const BACKEND_WS_URL = import.meta.env.VITE_ORACLE_WS_URL ?? 'ws://localhost:8000/ws/analyze'; +// derive HTTP API base from websocket URL (fallback) +const API_BASE = import.meta.env.VITE_API_BASE ?? 
BACKEND_WS_URL.replace(/^ws/, 'http').replace('/ws/analyze', ''); +const DEFAULT_REPO_URL = 'https://github.com/Project-XI/Project-EL'; + +const STATUS_CLASS = { + idle: 'status-idle', + running: 'status-running', + done: 'status-done', + error: 'status-error', +}; + +const EVENT_CLASS = { + HANDOFF: 'event-handoff', + STATUS: 'event-status', + RESULT: 'event-result', + FLAG: 'event-flag', + ERROR: 'event-error', +}; + +const AGENT_META = { + gatekeeper: { + id: 'gatekeeper', + name: 'GATEKEEPER', + status: 'idle', + progress: 0, + lastEvent: 'Awaiting validation start', + durationMs: 0, + position: { x: 120, y: 380 }, + }, + oracle: { + id: 'oracle', + name: 'ORACLE', + status: 'idle', + progress: 0, + lastEvent: 'Awaiting repository analysis', + durationMs: 0, + position: { x: 300, y: 40 }, + }, + main: { + id: 'main', + name: 'MAIN VIVA', + status: 'idle', + progress: 0, + lastEvent: 'Awaiting viva handoff', + durationMs: 0, + position: { x: 300, y: 220 }, + }, + sentinel: { + id: 'sentinel', + name: 'SENTINEL', + status: 'idle', + progress: 0, + lastEvent: 'Awaiting oversight stream', + durationMs: 0, + position: { x: 520, y: 400 }, + }, +}; + +const DEFAULT_ALERTS = [ + { + id: 'alt-1', + owner: 'SENTINEL', + type: 'tone shift', + severity: 'high', + summary: 'Rapid tone instability during follow-up Q2.', + }, + { + id: 'alt-2', + owner: 'SENTINEL', + type: 'gaze anomaly', + severity: 'medium', + summary: 'Repeated gaze divergence during answer window.', + }, + { + id: 'alt-3', + owner: 'GATEKEEPER', + type: 'identity mismatch', + severity: 'critical', + summary: 'Voice-print variance exceeded allowed tolerance.', + }, + { + id: 'alt-4', + owner: 'GATEKEEPER', + type: 'session breach', + severity: 'low', + summary: 'Unexpected short disconnect recovered automatically.', + }, +]; + +const AGENT_SEQUENCE = ['GATEKEEPER', 'ORACLE', 'MAIN VIVA', 'SENTINEL']; +const EDGE_BY_SOURCE = { + GATEKEEPER: 'e-gatekeeper-oracle', + ORACLE: 'e-oracle-main', + 'MAIN VIVA': 
'e-main-sentinel', + SENTINEL: 'e-main-sentinel', +}; + +function normalizeKey(value) { + return String(value ?? '') + .toUpperCase() + .replace(/\s+/g, '_') + .replace(/[^A-Z0-9_]/g, ''); +} + +function createInitialAgentState() { + return Object.fromEntries( + Object.values(AGENT_META).map((agent) => [ + agent.id, + { + name: agent.name, + status: agent.status, + progress: agent.progress, + lastEvent: agent.lastEvent, + durationMs: agent.durationMs, + }, + ]) + ); +} + +function AgentNode({ data }) { + return ( +
+
{data.name}
+
{data.status.toUpperCase()}
+
+ Progress + {data.progress}% +
+
+
+
+
{data.lastEvent}
+
Last op: {data.durationMs}ms
+
+ ); +} + +function buildNodes(agentState) { + return Object.values(AGENT_META).map((agent) => ({ + id: agent.id, + type: 'agentNode', + position: agent.position, + sourcePosition: Position.Bottom, + targetPosition: Position.Top, + data: { + name: agent.name, + status: agentState[agent.id]?.status ?? agent.status, + progress: agentState[agent.id]?.progress ?? 0, + lastEvent: agentState[agent.id]?.lastEvent ?? agent.lastEvent, + durationMs: agentState[agent.id]?.durationMs ?? agent.durationMs, + }, + })); +} + +function buildEdges(activeEdgeIds) { + return [ + { + id: 'e-gatekeeper-oracle', + source: 'gatekeeper', + target: 'oracle', + label: 'HANDOFF', + markerEnd: { type: MarkerType.ArrowClosed, width: 18, height: 18 }, + animated: activeEdgeIds.includes('e-gatekeeper-oracle'), + }, + { + id: 'e-oracle-main', + source: 'oracle', + target: 'main', + label: 'RESULT', + markerEnd: { type: MarkerType.ArrowClosed, width: 18, height: 18 }, + animated: activeEdgeIds.includes('e-oracle-main'), + }, + { + id: 'e-main-sentinel', + source: 'main', + target: 'sentinel', + label: 'STATUS/FLAG', + markerEnd: { type: MarkerType.ArrowClosed, width: 18, height: 18 }, + animated: activeEdgeIds.includes('e-main-sentinel'), + }, + ]; +} + +function inferSourceAgent(message = '') { + if (message.includes('[Gatekeeper]')) return 'GATEKEEPER'; + if (message.includes('[Oracle]')) return 'ORACLE'; + if (message.includes('[MainAgent]')) return 'MAIN VIVA'; + if (message.includes('[Sentinel]')) return 'SENTINEL'; + return 'MAIN VIVA'; +} + +function inferEventType(logType = 'info', message = '') { + const text = `${logType} ${message}`.toLowerCase(); + if (text.includes('error') || text.includes('failed') || text.includes('rejected') || text.includes('timeout')) { + return 'ERROR'; + } + if (text.includes('flag') || text.includes('mismatch') || text.includes('contradiction') || text.includes('breach') || text.includes('risk')) { + return 'FLAG'; + } + if (text.includes('complete') || 
text.includes('verified') || text.includes('success') || text.includes('result')) { + return 'RESULT'; + } + if (text.includes('started') || text.includes('parsing') || text.includes('cloning') || text.includes('building') || text.includes('analyzing') || text.includes('detecting')) { + return 'HANDOFF'; + } + return 'STATUS'; +} + +function inferTargetAgent(sourceAgent, eventType, message = '') { + if (sourceAgent === 'GATEKEEPER') return eventType === 'ERROR' || message.toLowerCase().includes('rejected') ? 'GATEKEEPER' : 'ORACLE'; + if (sourceAgent === 'ORACLE') return 'MAIN VIVA'; + if (sourceAgent === 'MAIN VIVA') return 'SENTINEL'; + if (sourceAgent === 'SENTINEL') return 'MAIN VIVA'; + return 'MAIN VIVA'; +} + +function inferProgress(sourceAgent, message = '', eventType = 'STATUS') { + const text = message.toLowerCase(); + + if (sourceAgent === 'GATEKEEPER') { + if (text.includes('verified')) return { status: 'done', progress: 100 }; + if (text.includes('rejected') || eventType === 'ERROR') return { status: 'error', progress: 100 }; + if (text.includes('started')) return { status: 'running', progress: 20 }; + return { status: 'running', progress: 60 }; + } + + if (sourceAgent === 'ORACLE') { + if (text.includes('submission intelligence complete') || text.includes('complete')) return { status: 'done', progress: 100 }; + if (text.includes('parsing')) return { status: 'running', progress: 15 }; + if (text.includes('cloning')) return { status: 'running', progress: 30 }; + if (text.includes('detecting')) return { status: 'running', progress: 45 }; + if (text.includes('building execution graph')) return { status: 'running', progress: 60 }; + if (text.includes('extracting observable')) return { status: 'running', progress: 70 }; + if (text.includes('analyzing failure')) return { status: 'running', progress: 82 }; + if (text.includes('generating viva')) return { status: 'running', progress: 92 }; + return { status: 'running', progress: 50 }; + } + + if (sourceAgent 
=== 'MAIN VIVA') { + if (text.includes('analysis complete') || text.includes('voice_viva.completed')) return { status: 'done', progress: 100 }; + if (text.includes('voice_viva.started')) return { status: 'running', progress: 10 }; + if (text.includes('question.playback.started')) return { status: 'running', progress: 25 }; + if (text.includes('turn.finalized')) return { status: 'running', progress: 50 }; + if (text.includes('turn.evaluated')) return { status: 'running', progress: 70 }; + if (text.includes('topic.coverage.updated')) return { status: 'running', progress: 85 }; + if (eventType === 'ERROR') return { status: 'error', progress: 100 }; + return { status: 'running', progress: 40 }; + } + + if (sourceAgent === 'SENTINEL') { + if (eventType === 'ERROR') return { status: 'error', progress: 100 }; + if (text.includes('complete')) return { status: 'done', progress: 100 }; + if (text.includes('placeholder')) return { status: 'running', progress: 15 }; + if (text.includes('flag')) return { status: 'running', progress: 60 }; + return { status: 'running', progress: 35 }; + } + + return { status: 'running', progress: 50 }; +} + +function mapAgentNameToId(name) { + const key = String(name || '').toUpperCase(); + if (key.includes('GATEKEEPER')) return 'gatekeeper'; + if (key.includes('ORACLE')) return 'oracle'; + if (key.includes('MAIN') || key.includes('VIVA')) return 'main'; + if (key.includes('SENTINEL')) return 'sentinel'; + return 'main'; +} + +function buildTimelineBlocks(events) { + if (events.length === 0) { + return AGENT_SEQUENCE.map((agent, index) => ({ + agent, + startMs: 0, + endMs: 0, + label: 'Waiting', + widthPct: 12, + leftPct: index * 18, + })); + } + + const start = Math.min(...events.map((event) => event.duration_ms)); + const end = Math.max(...events.map((event) => event.duration_ms)); + const span = Math.max(end - start, 1); + + return AGENT_SEQUENCE.map((agent, index) => { + const agentEvents = events.filter((event) => event.source_agent === 
agent || event.target_agent === agent); + if (agentEvents.length === 0) { + return { + agent, + startMs: start, + endMs: start, + label: 'Pending', + widthPct: 12, + leftPct: index * 18, + }; + } + + const agentStart = Math.min(...agentEvents.map((event) => event.duration_ms)); + const agentEnd = Math.max(...agentEvents.map((event) => event.duration_ms)); + return { + agent, + startMs: agentStart, + endMs: agentEnd, + label: `${agentEvents[0].event_type} window`, + leftPct: ((agentStart - start) / span) * 100, + widthPct: Math.max(((agentEnd - agentStart) / span) * 100, 10), + }; + }); +} + +function Timeline({ blocks, onSelectWindow }) { + return ( +
+ {blocks.map((block) => ( +
+
{block.agent}
+
+ +
+
+ ))} +
+ ); +} + +function formatTime(timestamp) { + return new Date(timestamp).toLocaleTimeString([], { hour12: false }); +} + +function payloadPreview(payload) { + if (!payload) return '{}'; + if (typeof payload === 'string') return payload; + try { + return JSON.stringify(payload, null, 2); + } catch { + return String(payload); + } +} + +function createAlertId(alert) { + return `${normalizeKey(alert.owner)}_${normalizeKey(alert.type)}_${normalizeKey(alert.severity)}`; +} + +function App() { + const [repoUrl, setRepoUrl] = useState(DEFAULT_REPO_URL); + const [backendUrl] = useState(BACKEND_WS_URL); + const [rollNumber, setRollNumber] = useState(''); + const [connectionState, setConnectionState] = useState('idle'); + const connectionStateRef = useRef('idle'); + const [sessionId, setSessionId] = useState('idle'); + const sessionIdRef = useRef('idle'); + const [analysisData, setAnalysisData] = useState(null); + const [pendingAlerts, setPendingAlerts] = useState([]); + const [agentState, setAgentState] = useState(createInitialAgentState()); + const [liveEvents, setLiveEvents] = useState([]); + const [selectedAgent, setSelectedAgent] = useState('ALL'); + const [selectedSession, setSelectedSession] = useState('ALL'); + const [selectedRange, setSelectedRange] = useState(null); + const [dismissedAlerts, setDismissedAlerts] = useState([]); + const [highlightEdgeId, setHighlightEdgeId] = useState('e-gatekeeper-oracle'); + const [statusMessage, setStatusMessage] = useState('Connect to the backend to stream live agent data.'); + const [activeSection, setActiveSection] = useState('node-graph'); + const wsRef = useRef(null); + const startTimeRef = useRef(0); + const eventCounterRef = useRef(0); + + const nodes = useMemo(() => buildNodes(agentState), [agentState]); + const edges = useMemo(() => buildEdges([highlightEdgeId]), [highlightEdgeId]); + const sessionOptions = useMemo(() => ['ALL', ...new Set(liveEvents.map((event) => event.session_id))], [liveEvents]); + const timelineBlocks 
= useMemo(() => buildTimelineBlocks(liveEvents), [liveEvents]); + + const filteredEvents = useMemo(() => { + return liveEvents.filter((event) => { + const byAgent = + selectedAgent === 'ALL' || + event.source_agent === selectedAgent || + event.target_agent === selectedAgent; + const bySession = selectedSession === 'ALL' || event.session_id === selectedSession; + const byRange = + !selectedRange || + (event.duration_ms >= selectedRange.startMs && event.duration_ms <= selectedRange.endMs); + return byAgent && bySession && byRange; + }); + }, [liveEvents, selectedAgent, selectedSession, selectedRange]); + + const visibleAlerts = useMemo(() => { + const runtimeAlerts = []; + + if (analysisData?.runtime_risks?.length) { + analysisData.runtime_risks.forEach((risk, index) => { + runtimeAlerts.push({ + id: `risk-${index}-${normalizeKey(risk.value)}`, + owner: 'SENTINEL', + type: risk.value, + severity: String(risk.severity || 'medium').toLowerCase(), + summary: risk.evidence?.[0] || 'Runtime risk detected from analysis result.', + }); + }); + } + + if (analysisData?.inconsistencies?.length) { + analysisData.inconsistencies.forEach((flag, index) => { + runtimeAlerts.push({ + id: `flag-${index}-${normalizeKey(flag.issue)}`, + owner: 'SENTINEL', + type: flag.issue, + severity: String(flag.severity || 'medium').toLowerCase(), + summary: flag.evidence?.[0] || 'Analysis inconsistency detected.', + }); + }); + } + + if (analysisData?.gatekeeper_status === 'rejected') { + runtimeAlerts.push({ + id: 'gatekeeper-rejected', + owner: 'GATEKEEPER', + type: 'identity mismatch', + severity: 'critical', + summary: analysisData.gatekeeper_reason || 'Submission rejected by Gatekeeper.', + }); + } + + const merged = [...DEFAULT_ALERTS, ...pendingAlerts, ...runtimeAlerts]; + const unique = merged.filter((alert, index, array) => index === array.findIndex((candidate) => candidate.id === alert.id)); + return unique.filter((alert) => !dismissedAlerts.includes(alert.id)); + }, [analysisData, 
dismissedAlerts, pendingAlerts]); + + async function fetchPendingAlerts() { + try { + const res = await fetch(`${API_BASE}/face/pending-alerts`); + if (!res.ok) return; + const payload = await res.json(); + // Normalize the response to an array of alerts + const items = Array.isArray(payload) ? payload : payload.results || []; + const normalized = items.map((a, i) => ({ + id: a.conflict_id || a.id || `pending-${i}`, + owner: a.owner || 'SENTINEL', + type: a.type || a.issue || 'unknown', + // Coerce to string first so a non-string severity cannot throw and silently drop every alert + severity: String(a.severity || 'medium').toLowerCase(), + summary: a.summary || a.evidence || JSON.stringify(a).slice(0, 120), + })); + setPendingAlerts(normalized); + } catch (err) { + // Ignore fetch/parse failures; pending alerts are optional + } + } + + // Fetch pending alerts when a session starts + useEffect(() => { + if (sessionId && sessionId !== 'idle') { + fetchPendingAlerts(); + } + }, [sessionId]); + + function setConnection(nextState) { + connectionStateRef.current = nextState; + setConnectionState(nextState); + } + + function setSession(nextSession) { + sessionIdRef.current = nextSession; + setSessionId(nextSession); + } + + function closeWebSocket() { + if (wsRef.current) { + try { + wsRef.current.close(); + } catch { + // noop + } + wsRef.current = null; + } + } + + function resetRunState(nextSessionId) { + setSession(nextSessionId); + setConnection('connecting'); + setAnalysisData(null); + setLiveEvents([]); + setAgentState(createInitialAgentState()); + setDismissedAlerts([]); + setSelectedAgent('ALL'); + setSelectedSession('ALL'); + setSelectedRange(null); + setHighlightEdgeId('e-gatekeeper-oracle'); + setStatusMessage('Connecting to backend websocket...'); + startTimeRef.current = Date.now(); + eventCounterRef.current = 0; + } + + function appendEvent(message, logType, payload, explicitSource = null) { + const timestamp = new Date().toISOString(); + const elapsedMs = Math.max(Date.now() - startTimeRef.current, 0); + const sourceAgent = explicitSource || inferSourceAgent(message); + const eventType = inferEventType(logType, message); + 
const targetAgent = inferTargetAgent(sourceAgent, eventType, message); + + const event = { + event_id: `evt-${String(++eventCounterRef.current).padStart(3, '0')}`, + timestamp, + source_agent: sourceAgent, + target_agent: targetAgent, + event_type: eventType, + session_id: sessionIdRef.current, + payload, + duration_ms: elapsedMs, + }; + + setLiveEvents((prev) => [...prev, event]); + setHighlightEdgeId(EDGE_BY_SOURCE[sourceAgent] || 'e-main-sentinel'); + setStatusMessage(message); + + setAgentState((prev) => { + const next = { ...prev }; + const progress = inferProgress(sourceAgent, message, eventType); + + const updateSource = (key) => { + next[key] = { + ...next[key], + ...progress, + lastEvent: message, + durationMs: elapsedMs, + }; + }; + + if (sourceAgent === 'GATEKEEPER') { + updateSource('gatekeeper'); + if (progress.status === 'done' && next.oracle.status === 'idle') { + next.oracle = { + ...next.oracle, + status: 'running', + progress: 10, + lastEvent: 'Waiting for ORACLE analysis to begin', + durationMs: elapsedMs, + }; + } + } + + if (sourceAgent === 'ORACLE') { + updateSource('oracle'); + if (progress.status === 'done' && next.main.status === 'idle') { + next.main = { + ...next.main, + status: 'running', + progress: 10, + lastEvent: 'Waiting for MAIN VIVA handoff', + durationMs: elapsedMs, + }; + } + } + + if (sourceAgent === 'MAIN VIVA') { + updateSource('main'); + if (progress.status === 'done' && next.sentinel.status === 'idle') { + next.sentinel = { + ...next.sentinel, + status: 'running', + progress: 15, + lastEvent: 'Waiting for SENTINEL oversight', + durationMs: elapsedMs, + }; + } + } + + if (sourceAgent === 'SENTINEL') { + updateSource('sentinel'); + } + + return next; + }); + + return event; + } + + function startAnalysis() { + if (!repoUrl) { + setStatusMessage('Enter a repository URL first.'); + return; + } + + closeWebSocket(); + const nextSessionId = `session-${Date.now()}`; + resetRunState(nextSessionId); + + const socket = new 
WebSocket(backendUrl); + wsRef.current = socket; + + socket.onopen = () => { + setConnection('running'); + setStatusMessage('Backend connected. Starting GATEKEEPER → ORACLE → MAIN VIVA → SENTINEL pipeline.'); + socket.send( + JSON.stringify({ + repo_url: repoUrl, + report_path: null, + enable_viva: true, + enable_debug: true, + generate_report: false, + // optional roll_number for Gatekeeper + roll_number: rollNumber || undefined, + }) + ); + + appendEvent( + '[MainAgent] Live analysis session opened.', + 'info', + { + repo_url: repoUrl, + backend_url: backendUrl, + }, + 'MAIN VIVA' + ); + }; + + socket.onmessage = (rawEvent) => { + let data; + try { + data = JSON.parse(rawEvent.data); + } catch { + return; + } + + // If backend sends structured PlatformEvent or 'event' messages + if (data.event_type || data.type === 'event' || data.agent_name || data.source_agent) { + const sourceName = data.agent_name || data.source_agent || data.agent || (data.payload && data.payload.source) || null; + const sourceAgent = sourceName ? 
sourceName : inferSourceAgent(String(data.message || '')); + const eventType = data.event_type || data.type || 'event'; + const session = data.session_id || data.session || sessionIdRef.current; + const timestamp = data.timestamp || new Date().toISOString(); + const duration_ms = data.duration_ms || data.durationMs || Math.max(Date.now() - startTimeRef.current, 0); + const payload = data.payload || data.data || data.message || {}; + + const evt = { + event_id: data.event_id || `evt-${Date.now()}`, + timestamp, + source_agent: sourceAgent, + target_agent: data.target_agent || data.to_agent || inferTargetAgent(sourceAgent, eventType, String(payload?.message || '')), + event_type: eventType, + session_id: session, + payload, + duration_ms, + }; + + setLiveEvents((prev) => [...prev, evt]); + + // update agent progress if provided in payload or if event_type suggests progress + if ((String(eventType).toLowerCase().includes('progress')) || payload?.progress !== undefined || payload?.status) { + const agentKey = mapAgentNameToId(evt.source_agent); + const p = typeof payload.progress === 'number' ? payload.progress : undefined; + const status = payload.status || undefined; + setAgentState((prev) => { + const next = { ...prev }; + if (agentKey && next[agentKey]) { + next[agentKey] = { + ...next[agentKey], + progress: p !== undefined ? p : next[agentKey].progress, + status: status || next[agentKey].status, + lastEvent: typeof payload === 'string' ? 
payload : (payload.summary || JSON.stringify(payload).slice(0, 120)), + durationMs: duration_ms, + }; + } + return next; + }); + } + + // highlight handoff edges + if (String(eventType).toLowerCase().includes('handoff')) { + setHighlightEdgeId(EDGE_BY_SOURCE[evt.source_agent] || 'e-main-sentinel'); + } + + return; + } + + // legacy log messages + if (data.type === 'log') { + const message = String(data.message || ''); + const logType = String(data.log_type || data.type || 'info'); + const source = inferSourceAgent(message); + const event = appendEvent(message, logType, { message, log_type: logType }, source); + + if (event.event_type === 'ERROR') { + setConnection('error'); + } + if (message.toLowerCase().includes('analysis complete')) { + setConnection('done'); + } + return; + } + + if (data.type === 'result') { + const payload = data.data || {}; + setAnalysisData(payload); + + appendEvent( + '[Oracle] Structured result received from backend.', + 'success', + { + project_name: payload.project_name?.value, + backend_framework: payload.backend_framework?.value, + architecture_pattern: payload.architecture_pattern?.value, + viva_targets: Array.isArray(payload.implementation_viva_targets) + ? payload.implementation_viva_targets.length + : Array.isArray(payload.viva_intelligence_targets) + ? payload.viva_intelligence_targets.length + : 0, + }, + 'ORACLE' + ); + + setAgentState((prev) => ({ + gatekeeper: { + ...prev.gatekeeper, + status: prev.gatekeeper.status === 'error' ? 'error' : 'done', + progress: 100, + lastEvent: prev.gatekeeper.lastEvent, + durationMs: prev.gatekeeper.durationMs, + }, + oracle: { + ...prev.oracle, + status: 'done', + progress: 100, + lastEvent: payload.backend_framework?.value + ? `Backend: ${payload.backend_framework.value}` + : 'Submission intelligence complete', + durationMs: prev.oracle.durationMs, + }, + main: { + ...prev.main, + status: 'done', + progress: 100, + lastEvent: payload.viva_intelligence_targets?.length + ? 
`${payload.viva_intelligence_targets.length} viva targets prepared` + : 'Main viva completed', + durationMs: prev.main.durationMs, + }, + sentinel: { + ...prev.sentinel, + status: prev.sentinel.status === 'error' ? 'error' : 'done', + progress: 100, + lastEvent: payload.runtime_risks?.length + ? `${payload.runtime_risks.length} runtime risks under review` + : 'Oversight complete', + durationMs: prev.sentinel.durationMs, + }, + })); + + setHighlightEdgeId('e-main-sentinel'); + setConnection('done'); + } + }; + + socket.onerror = () => { + setConnection('error'); + setStatusMessage('WebSocket connection failed. Make sure the backend is running on port 8000.'); + setAgentState((prev) => ({ + ...prev, + oracle: { ...prev.oracle, status: 'error' }, + })); + }; + + socket.onclose = () => { + wsRef.current = null; + if (connectionStateRef.current !== 'done' && connectionStateRef.current !== 'error') { + setConnection('idle'); + } + }; + } + + function onTimelineSelect(agent, startMs, endMs) { + setSelectedAgent(agent); + setSelectedRange({ startMs, endMs }); + setHighlightEdgeId(EDGE_BY_SOURCE[agent] || 'e-main-sentinel'); + setStatusMessage(`Timeline window filtered for ${agent} (${startMs}ms → ${endMs}ms).`); + } + + function dismissAlert(id) { + // attempt to resolve on backend, then mark dismissed locally + (async () => { + try { + await fetch(`${API_BASE}/face/resolve-alert`, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ conflict_id: id, approved: false }), + }); + } catch (err) { + // ignore network errors; still dismiss locally + } + setDismissedAlerts((prev) => [...prev, id]); + })(); + } + + function scrollToSection(id) { + const target = document.getElementById(id); + if (target) { + target.scrollIntoView({ behavior: 'smooth', block: 'start' }); + setActiveSection(id); + } + } + + return ( +
+
+
ORACLE Backend Testing UI
+ +
+ +
+
+ + +
+ Backend: {backendUrl} + Session: {sessionId} + Status: {connectionState.toUpperCase()} +
+ +
+ +
+
+

1. Agent Topology Graph

+
React Flow
+
+
+ + + + + + + +
+
+ +
+
+

2. Agent Progress Overview

+
Four-agent completion cards
+
+
+ {Object.values(agentState).map((agent) => ( +
+
+ {agent.name} + {agent.progress}% +
+
{agent.status.toUpperCase()}
+
+
+
+

{agent.lastEvent}

+
Last operation: {agent.durationMs}ms
+
+ ))} +
+
+ +
+
+
+

3. Session Timeline Panel

+
Gantt Style
+
+
+ +

{statusMessage}

+
+
+ +
+
+

4. Live Event Feed

+
Chronological Event Stream
+
+
+
+ + +
+ +
+ {filteredEvents.map((event) => ( +
+
+ {formatTime(event.timestamp)} + {event.event_type} + {event.session_id} +
+
+ {event.source_agent} → {event.target_agent} +
+
+ event_id: {event.event_id} + duration_ms: {event.duration_ms} +
+
{payloadPreview(event.payload)}
+
+ ))} + {filteredEvents.length === 0 &&

No events match the current filter.

} +
+
+
+
+ +
+
+

5. Alert Cards Panel

+
SENTINEL + GATEKEEPER
+
+
+ {visibleAlerts.length === 0 &&

No active alerts.

} + {visibleAlerts.map((alert) => ( +
+
+ {alert.owner} + {alert.severity.toUpperCase()} +
+
{alert.type}
+

{alert.summary}

+ +
+ ))} +
+
+
+
+ ); +} + +export default App; diff --git a/testing_backend_ui/src/main.jsx b/testing_backend_ui/src/main.jsx new file mode 100644 index 0000000..d106f4f --- /dev/null +++ b/testing_backend_ui/src/main.jsx @@ -0,0 +1,10 @@ +import React from 'react'; +import ReactDOM from 'react-dom/client'; +import App from './App'; +import './styles.css'; + +ReactDOM.createRoot(document.getElementById('root')).render( + + + +); diff --git a/testing_backend_ui/src/styles.css b/testing_backend_ui/src/styles.css new file mode 100644 index 0000000..732279a --- /dev/null +++ b/testing_backend_ui/src/styles.css @@ -0,0 +1,531 @@ +:root { + --highlight: #E8E3DC; + --text-primary: #1C1A17; + --text-secondary: #4A4740; + --text-muted: #8A8580; + --accent: #2B2825; + --rule: #D4D0CA; + + --status-idle: #3A78C2; + --status-running: #D3A200; + --status-done: #2D9A48; + --status-error: #C32E2E; + + --font-serif: 'Cormorant Garamond', serif; + --font-sans: 'DM Sans', sans-serif; + --font-mono: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, monospace; +} + +* { + box-sizing: border-box; +} + +html, +body, +#root { + margin: 0; + height: 100%; + font-family: var(--font-sans); + color: var(--text-primary); + background: #F5F2EE; + overflow: auto; +} + +.dashboard-shell { + min-height: 100%; + display: flex; + flex-direction: column; +} + +.topbar { + display: flex; + align-items: center; + justify-content: space-between; + gap: 1rem; + padding: 0.9rem 1rem; + border-bottom: 1px solid var(--rule); + background: #FAFAF8; + position: sticky; + top: 0; + z-index: 20; +} + +.brand { + font-family: var(--font-serif); + font-size: 1.5rem; + font-weight: 600; + letter-spacing: 0.02em; +} + +.menu { + display: flex; + align-items: center; + gap: 0.7rem; +} + +.menu-item { + border: none; + background: transparent; + color: var(--text-secondary); + font-family: var(--font-sans); + font-weight: 600; + font-size: 0.78rem; + letter-spacing: 0.08em; + padding: 0.4rem 0.5rem; + cursor: pointer; +} + 
+.menu-item.active { + color: var(--accent); +} + +.menu-separator { + width: 1px; + height: 16px; + background: var(--rule); +} + +.dashboard-stack { + padding: 1rem; + display: flex; + flex-direction: column; + gap: 1rem; +} + +.control-strip { + border: 1px solid var(--rule); + background: #FAFAF8; + padding: 0.85rem 0.95rem; + display: grid; + grid-template-columns: 1.2fr 1fr auto; + gap: 0.8rem; + align-items: end; +} + +.control-field { + display: grid; + gap: 0.35rem; + font-size: 0.78rem; + color: var(--text-secondary); +} + +.control-field input { + border: 1px solid var(--rule); + background: #fff; + color: var(--text-primary); + padding: 0.55rem 0.7rem; + font-family: var(--font-mono); +} + +.control-summary { + display: flex; + flex-wrap: wrap; + gap: 0.65rem; + color: var(--text-muted); + font-size: 0.76rem; +} + +.control-summary span { + border: 1px solid var(--rule); + padding: 0.35rem 0.55rem; + background: #fff; +} + +.control-button { + border: 1px solid var(--accent); + background: var(--accent); + color: #FAFAF8; + padding: 0.7rem 1rem; + font-family: var(--font-sans); + font-weight: 700; + letter-spacing: 0.05em; + cursor: pointer; +} + +.control-button:hover { + background: #1C1A17; +} + +.split-panels { + display: grid; + grid-template-columns: 1fr 1.3fr; + gap: 1rem; +} + +.panel { + border: 1px solid var(--rule); + background: #FAFAF8; + display: flex; + flex-direction: column; + min-height: 0; +} + +.panel-header { + padding: 0.75rem 0.9rem; + border-bottom: 1px solid var(--rule); + display: flex; + align-items: center; + justify-content: space-between; +} + +.panel-header h2 { + margin: 0; + font-size: 1rem; + font-family: var(--font-serif); +} + +.panel-meta { + color: var(--text-muted); + font-size: 0.78rem; +} + +.panel-body { + padding: 0.8rem; + overflow: auto; +} + +.panel-progress .panel-body { + padding: 0.9rem; +} + +.panel-graph { + min-height: 520px; +} + +.graph-body { + padding: 0; +} + +.progress-grid { + display: grid; + 
grid-template-columns: repeat(4, minmax(0, 1fr)); + gap: 0.8rem; +} + +.progress-card { + border: 1px solid var(--rule); + background: #fff; + padding: 0.75rem; +} + +.progress-card-head { + display: flex; + justify-content: space-between; + align-items: center; + font-size: 0.8rem; + margin-bottom: 0.35rem; +} + +.agent-node { + min-width: 220px; + max-width: 260px; + border: 2px solid var(--rule); + background: #fff; + border-radius: 8px; + padding: 0.6rem 0.7rem; + box-shadow: 0 2px 10px rgba(28, 26, 23, 0.08); +} + +.agent-node-name { + font-family: var(--font-serif); + font-size: 1.05rem; + margin-bottom: 0.1rem; +} + +.agent-node-status { + font-size: 0.7rem; + font-weight: 700; + letter-spacing: 0.08em; + margin-bottom: 0.4rem; +} + +.agent-node-progress { + display: flex; + justify-content: space-between; + font-size: 0.68rem; + color: var(--text-muted); + margin-bottom: 0.25rem; +} + +.agent-node-bar, +.progress-bar-shell { + height: 6px; + background: var(--rule); + position: relative; + overflow: hidden; + margin-bottom: 0.5rem; +} + +.agent-node-bar-fill, +.progress-bar-fill { + position: absolute; + inset: 0 auto 0 0; + background: var(--accent); +} + +.agent-node-last { + color: var(--text-secondary); + font-size: 0.78rem; + line-height: 1.3; + margin-bottom: 0.35rem; +} + +.agent-node-time { + font-family: var(--font-mono); + font-size: 0.72rem; + color: var(--text-muted); +} + +.status-idle { + border-color: var(--status-idle); +} + +.status-running { + border-color: var(--status-running); +} + +.status-done { + border-color: var(--status-done); +} + +.status-error { + border-color: var(--status-error); +} + +.timeline-grid { + display: grid; + gap: 0.65rem; +} + +.timeline-row { + display: grid; + grid-template-columns: 110px 1fr; + align-items: center; + gap: 0.5rem; +} + +.timeline-agent { + font-size: 0.75rem; + font-weight: 700; + color: var(--text-secondary); +} + +.timeline-track { + position: relative; + height: 34px; + border: 1px solid 
var(--rule); + background: #F5F2EE; +} + +.timeline-block { + position: absolute; + top: 3px; + height: 26px; + border: 1px solid var(--accent); + background: var(--highlight); + color: var(--text-primary); + font-size: 0.7rem; + cursor: pointer; + font-family: var(--font-sans); +} + +.window-message { + margin: 0.8rem 0 0; + font-size: 0.8rem; + color: var(--text-muted); +} + +.filters-row { + display: flex; + gap: 0.8rem; + margin-bottom: 0.7rem; +} + +.filters-row label { + display: grid; + gap: 0.25rem; + font-size: 0.78rem; + color: var(--text-secondary); +} + +.filters-row select { + border: 1px solid var(--rule); + background: #fff; + color: var(--text-primary); + padding: 0.3rem 0.45rem; + font-family: var(--font-sans); +} + +.event-feed { + display: grid; + gap: 0.55rem; +} + +.event-card { + border: 1px solid var(--rule); + background: #fff; + padding: 0.55rem; +} + +.event-head { + display: flex; + gap: 0.45rem; + align-items: center; + font-size: 0.72rem; + color: var(--text-muted); +} + +.event-type { + font-weight: 700; + color: var(--text-primary); +} + +.event-session { + margin-left: auto; + font-family: var(--font-mono); +} + +.event-route { + margin-top: 0.3rem; + font-size: 0.82rem; + font-weight: 600; +} + +.event-schema { + margin-top: 0.3rem; + display: flex; + gap: 0.7rem; + font-family: var(--font-mono); + font-size: 0.68rem; + color: var(--text-muted); +} + +.event-payload { + margin: 0.38rem 0 0; + padding: 0.4rem; + border: 1px solid var(--rule); + background: #F5F2EE; + font-size: 0.66rem; + font-family: var(--font-mono); + overflow: auto; +} + +.event-handoff { + border-left: 4px solid #4B74B4; +} + +.event-status { + border-left: 4px solid #9A7C35; +} + +.event-result { + border-left: 4px solid #2D9A48; +} + +.event-flag { + border-left: 4px solid #AD6F1A; +} + +.event-error { + border-left: 4px solid #C32E2E; +} + +.alert-grid { + display: grid; + grid-template-columns: repeat(2, minmax(0, 1fr)); + gap: 0.8rem; +} + +.alert-card { + 
border: 1px solid var(--rule); + background: #fff; + padding: 0.7rem; +} + +.alert-head { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 0.35rem; + font-size: 0.78rem; +} + +.alert-type { + font-family: var(--font-serif); + font-size: 1.06rem; + margin-bottom: 0.35rem; +} + +.alert-card p { + margin: 0 0 0.6rem; + color: var(--text-secondary); + font-size: 0.84rem; +} + +.alert-card button { + border: 1px solid var(--rule); + background: #F5F2EE; + color: var(--text-primary); + font-family: var(--font-sans); + font-size: 0.75rem; + padding: 0.25rem 0.6rem; + cursor: pointer; +} + +.severity-low { + border-left: 4px solid #4A7D6A; +} + +.severity-medium { + border-left: 4px solid #B88A2C; +} + +.severity-high { + border-left: 4px solid #C26C1A; +} + +.severity-critical { + border-left: 4px solid #C32E2E; +} + +@media (max-width: 1200px) { + .control-strip { + grid-template-columns: 1fr; + } + + .split-panels { + grid-template-columns: 1fr; + } + + .progress-grid { + grid-template-columns: 1fr 1fr; + } + + .timeline-row { + grid-template-columns: 88px 1fr; + } + + .alert-grid { + grid-template-columns: 1fr; + } + + .filters-row { + flex-direction: column; + } +} + +@media (max-width: 760px) { + .progress-grid { + grid-template-columns: 1fr; + } + + .menu { + gap: 0.35rem; + flex-wrap: wrap; + justify-content: flex-end; + } + + .menu-separator { + display: none; + } +} diff --git a/testing_backend_ui/vite.config.js b/testing_backend_ui/vite.config.js new file mode 100644 index 0000000..0466183 --- /dev/null +++ b/testing_backend_ui/vite.config.js @@ -0,0 +1,6 @@ +import { defineConfig } from 'vite'; +import react from '@vitejs/plugin-react'; + +export default defineConfig({ + plugins: [react()], +});