s3ak6i-dev/Cortex

Cortex — Memory-First Multi-Agent Orchestration

"Every existing tool models agents. Cortex models memory. The agents are just the workers."

Built with: Mem0 · FastAPI · React Flow

Cortex is a visual multi-agent orchestration platform where the Shared Memory Pool is the central node — not an afterthought. It inverts the standard design: instead of "how do I wire agents together?", you ask "what does each agent need to remember, and what should they share?"

Built on Mem0 to demonstrate what AI infrastructure looks like when memory is the primary design primitive.

→ Live Demo


The Problem

Every multi-agent tool today — LangGraph, CrewAI, AutoGen — treats agents as the primary objects. Memory is a plugin, a config option, a footnote in the documentation. This is architecturally backwards.

In production systems, agents fail not because of bad orchestration logic, but because they can't share context:

  • A triage agent asks for the customer's name — the customer already gave it twice
  • A knowledge agent gives generic advice because it doesn't know what the triage agent found
  • A resolution agent contradicts what was already promised

Cortex inverts this. The memory pool is the brain. The agents are the workers that read from and write to it.


Architecture

cortex/
├── cortex-frontend/          # React 18 + TypeScript + Vite + React Flow
├── cortex-api/               # FastAPI — primary API + WebSocket server
├── cortex-execution-engine/  # Async agent execution pipeline (Kahn's algorithm)
└── cortex-memory-service/    # Mem0 abstraction layer + conflict detection

The Key Architectural Decision

The execution engine is separated from the API layer. POST /api/execution/start schedules the pipeline with FastAPI's BackgroundTasks and returns {execution_id, session_id} immediately. The frontend then connects to /ws/execution/{session_id} and receives typed events as the pipeline runs.

This means:

  • The API stays fast and responsive even when 5 agents run concurrently
  • Agent failures don't crash the API server — fault isolation per agent
  • WebSocket streaming is completely decoupled from business logic
  • The execution engine scales independently
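
The start-then-stream split above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not Cortex's code: an asyncio.Queue stands in for the WebSocket channel, asyncio.create_task stands in for FastAPI's background task, and the names event_queues, run_pipeline, start_execution, stream_events, and demo are all hypothetical.

```python
import asyncio
import uuid

# Hypothetical registry: session_id -> queue of events for that run.
event_queues: dict[str, asyncio.Queue] = {}
# Hold references so scheduled tasks are not garbage-collected mid-run.
background_tasks: set[asyncio.Task] = set()


async def run_pipeline(session_id: str, agents: list[str]) -> None:
    """Stand-in for the execution engine: emits events while it works."""
    queue = event_queues[session_id]
    await queue.put({"type": "EXECUTION_STARTED"})
    for agent in agents:
        await queue.put({"type": "AGENT_ACTIVATED", "agent": agent})
        await asyncio.sleep(0)  # placeholder for the real LLM call
        await queue.put({"type": "AGENT_COMPLETE", "agent": agent})
    await queue.put({"type": "EXECUTION_COMPLETE"})


async def start_execution(agents: list[str]) -> dict[str, str]:
    """Schedules the run and returns both ids without awaiting it."""
    execution_id, session_id = str(uuid.uuid4()), str(uuid.uuid4())
    event_queues[session_id] = asyncio.Queue()
    task = asyncio.create_task(run_pipeline(session_id, agents))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"execution_id": execution_id, "session_id": session_id}


async def stream_events(session_id: str) -> list[dict]:
    """Plays the role of the frontend: reads events until a terminal one."""
    queue, received = event_queues[session_id], []
    while True:
        event = await queue.get()
        received.append(event)
        if event["type"] in ("EXECUTION_COMPLETE", "EXECUTION_ERROR"):
            return received


async def demo() -> tuple[dict[str, str], list[dict]]:
    ids = await start_execution(["triage", "knowledge"])
    return ids, await stream_events(ids["session_id"])
```

Calling asyncio.run(demo()) returns the two ids together with the ordered event list. The point of the design is that start_execution never awaits the pipeline, so the caller gets its ids back before any agent has run.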

Memory Namespacing

Every Mem0 read/write is namespaced to the authenticated user. No two users share a memory pool unless they explicitly join the same workspace.

Personal canvas:   user_id::canvas_id        (private to the user)
Org canvas:        org_id::canvas_id          (shared across all org members)
Private agent:     user_id::canvas_id::agent_id  (agent-level isolation)

This means org members building on the same pipeline accumulate a shared memory pool — every run makes the collective knowledge richer.
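
The three scope formats above could be produced by one small helper. The sketch below is illustrative only: the function name memory_scope and the rule that rejects ids containing the separator are assumptions, not something the repository is known to implement.

```python
from typing import Optional

SEPARATOR = "::"


def memory_scope(owner_id: str, canvas_id: str, agent_id: Optional[str] = None) -> str:
    """Builds a namespace key.

    owner_id is a user id for personal canvases or an org id for org canvases.
    Passing agent_id narrows the scope to a single agent's private memory.
    """
    parts = [owner_id, canvas_id]
    if agent_id is not None:
        parts.append(agent_id)
    for part in parts:
        # Assumed safeguard: an id containing "::" could collide with another scope.
        if not part or SEPARATOR in part:
            raise ValueError(f"invalid scope component: {part!r}")
    return SEPARATOR.join(parts)
```

For example, memory_scope("user_42", "canvas_7") yields the shared-pool key, and adding agent_id="triage" yields the private agent key.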


Tech Stack

| Layer | Technology | Why |
| --- | --- | --- |
| Frontend | React 18 + TypeScript + Vite | Type safety, fast HMR, production-grade |
| Canvas | React Flow (xyflow) | Production-proven; used by Stripe, Vercel, Typeform |
| State | Zustand + persist | Lightweight, localStorage persistence, no boilerplate |
| Styling | Tailwind CSS + CSS variables | Design tokens system, consistent dark theme |
| Backend | FastAPI (Python) | Async-native, perfect for WebSocket + Mem0 Python SDK |
| Memory | Mem0 Python SDK | Semantic memory with conflict detection |
| LLM | Groq llama-3.3-70b-versatile | Fast streaming inference |
| Vision | Groq llama-3.2-11b-vision-preview | Auto-selected when image attachments present |
| Database | Neon (serverless Postgres via asyncpg) | Canvas persistence, auth, sharing, orgs |
| Realtime | Native WebSockets | Low-latency canvas animation during execution |
| Conflict Detection | Cosine similarity + LLM judgment | Two-stage: similarity gate (0.85) → Groq contradiction check |
| Auth | JWT (FastAPI + bcrypt) | Token-based, expiry-checked client and server side |
| Deploy | Vercel (frontend) + Render (backend) | Zero-config CI/CD from GitHub |

Features

Core Pipeline

  • Visual canvas — drag-and-drop nodes: Trigger, Agent, Shared Memory, Output
  • Real-time execution — WebSocket streaming; agents think visibly, memory edges pulse on read/write
  • Topological execution order — Kahn's algorithm; parallel agents run concurrently per depth level
  • Shared Memory toggle — flip between collaborative (Mem0 ON) and isolated (Mem0 OFF) in one click
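
Grouping nodes by depth level with Kahn's algorithm can be sketched as follows. The function name execution_levels and the example graph are hypothetical; how Cortex actually derives edges from the canvas is not shown in this README.

```python
from collections import defaultdict


def execution_levels(nodes: list[str], edges: list[tuple[str, str]]) -> list[list[str]]:
    """Returns nodes grouped by depth; nodes in one group have no mutual dependencies."""
    indegree = {node: 0 for node in nodes}
    children = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    level = [node for node in nodes if indegree[node] == 0]
    levels, visited = [], 0
    while level:
        levels.append(level)
        visited += len(level)
        next_level = []
        for node in level:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:  # all prerequisites have been scheduled
                    next_level.append(child)
        level = next_level

    if visited != len(nodes):
        raise ValueError("graph contains a cycle; no valid execution order")
    return levels


# Hypothetical edge list loosely modelled on the Customer Support example.
NODES = ["trigger", "triage", "knowledge", "resolution", "output"]
EDGES = [
    ("trigger", "triage"),
    ("trigger", "knowledge"),
    ("triage", "resolution"),
    ("knowledge", "resolution"),
    ("resolution", "output"),
]
```

With this graph, execution_levels(NODES, EDGES) places triage and knowledge in the same level, which is the group an engine may run side by side or one after another.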

Memory

  • Per-user namespacing: user_id::canvas_id ensures no cross-user memory leakage
  • Org shared pools: org_id::canvas_id for teams; all members contribute to and read from the same pool
  • Private agent scope — agents can maintain personal memory separate from the shared pool
  • Conflict detection — two-stage algorithm: cosine similarity gate → LLM contradiction check → visual ConflictPanel
  • Memory dashboard — browse, search, filter, and delete memories; D3 force graph of the memory network; health score
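
The two-stage conflict check listed above can be sketched in plain Python. This is an illustration under assumptions: the embeddings are passed in as plain float lists, the LLM stage is represented by a caller-supplied contradicts function, and the names cosine_similarity and detect_conflict are hypothetical. Only the 0.85 threshold comes from this README.

```python
import math
from typing import Callable, Sequence

SIMILARITY_THRESHOLD = 0.85  # similarity gate stated in this README


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_conflict(
    new_vec: Sequence[float],
    new_text: str,
    old_vec: Sequence[float],
    old_text: str,
    contradicts: Callable[[str, str], bool],
) -> bool:
    """Stage 1: cheap similarity gate. Stage 2: expensive contradiction judgment."""
    if cosine_similarity(new_vec, old_vec) <= SIMILARITY_THRESHOLD:
        return False  # memories are about different things; skip the LLM call
    return contradicts(new_text, old_text)
```

The gate exists so that the costly second stage only runs for memories that are close enough to plausibly disagree about the same fact.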

File & Image Upload

  • Attach .txt, .md, .csv files — content injected as context blocks into every agent's prompt
  • Attach .jpg, .png, .webp images — execution engine automatically switches to vision model
  • No configuration required — model selection is automatic based on attachment type
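
Automatic model selection by attachment type might look like the sketch below. The model names and file extensions are taken from this README; the function name select_model and the case-insensitive extension match are assumptions.

```python
from pathlib import Path

TEXT_MODEL = "llama-3.3-70b-versatile"
VISION_MODEL = "llama-3.2-11b-vision-preview"
IMAGE_EXTENSIONS = {".jpg", ".png", ".webp"}


def select_model(filenames: list[str]) -> str:
    """Returns the vision model if any attachment is an image, else the text model."""
    for name in filenames:
        # Assumed behaviour: extensions are compared case-insensitively.
        if Path(name).suffix.lower() in IMAGE_EXTENSIONS:
            return VISION_MODEL
    return TEXT_MODEL
```

A run with only notes.md attached would stay on the text model, while adding screenshot.png would switch the whole run to the vision model.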

Collaboration

  • Canvas sharing — invite teammates by email with view or edit permission
  • Share links — copy a direct link; recipients land on the exact canvas after login
  • Multi-tenant workspaces — create orgs, invite members, switch context in the topbar
  • Permission enforcement — view-only shares cannot write; enforced at the API level

Developer Experience

  • Onboarding tour — 11-step interactive walkthrough, auto-starts for first-time users
  • Terminal drawer — raw WebSocket event log for every execution
  • Execution summary — agents run, memories written, memories read, conflicts, duration
  • Three example pipelines — Customer Support, Job Screener, Codebase Q&A

Quick Start

Prerequisites

1. Frontend

cd cortex-frontend
npm install
cp .env.example .env.local
# Set VITE_API_BASE_URL=http://127.0.0.1:8001
# Set VITE_WS_BASE_URL=ws://127.0.0.1:8001
npm run dev
# → http://localhost:5173

2. Backend

cd cortex-api
python -m venv .venv
.venv\Scripts\activate      # Windows
# source .venv/bin/activate # macOS/Linux
pip install -r requirements.txt
cp .env.example .env
# Set MEM0_API_KEY, GROQ_API_KEY, DATABASE_URL (Neon connection string)
uvicorn main:app --host 127.0.0.1 --port 8001 --reload
# → http://127.0.0.1:8001

3. Database

Run supabase_schema.sql against your Neon database to create the required tables (canvases, executions, canvas_shares, orgs, org_members).

Both servers running? Open http://localhost:5173 and hit Run.


The Demo

The canvas pre-loads a Customer Support pipeline:

Trigger → Triage Agent ─────┐
                            ├─── Shared Memory Pool ─── Resolution Agent → Output
           Knowledge Agent ─┘

Shared Memory ON — Agents collaborate. Triage reads the customer's billing history. Knowledge pulls the refund policy. Resolution synthesizes both into a personalized response. ~4 seconds.

Shared Memory OFF — Agents are isolated. Triage asks for the customer's name again. Knowledge gives generic advice. Resolution contradicts what Triage said.

The contrast makes Mem0's value proposition viscerally clear in under 10 seconds.


Mem0 SDK Patterns

# Shared pool write — all agents in this user's canvas can read it
memory_service.add(
    content=agent_response,
    user_id=f"{user_id}::{canvas_id}",        # user-scoped shared pool
    metadata={"category": "resolution", "source_agent": agent_id}
)

# Org shared pool — all org members read and write here
memory_service.add(
    content=agent_response,
    user_id=f"{org_id}::{canvas_id}",          # org-scoped shared pool
)

# Private agent memory — only this agent reads it
memory_service.add(
    content=agent_context,
    user_id=f"{user_id}::{canvas_id}::{agent_id}",   # private scope
)

# Semantic search at runtime
results = memory_service.search(
    query=trigger_message,
    user_id=f"{user_id}::{canvas_id}",
    top_k=5,
)
# results[i].memory, results[i].score (confidence), results[i].id

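# Conflict detection: runs after every add()
similar = memory_service.search(query=new_memory_text, user_id=scope)
# Guard against an empty result before reading the top match
if similar and similar[0].score > 0.85:
    # LLM contradiction check → emit CONFLICT_DETECTED WebSocket event
    pass  # placeholder for the contradiction check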

WebSocket Event Protocol

The frontend receives these events during execution:

| Event | Trigger | Frontend Effect |
| --- | --- | --- |
| EXECUTION_STARTED | Pipeline begins | Status pill → "running" |
| AGENT_ACTIVATED | Agent begins work | Node ring → violet pulse |
| MEMORY_READ | mem0.search() called | Memory edge pulses toward agent |
| MEMORY_WRITE | mem0.add() called | Memory edge pulses toward pool, toast fires |
| CONFLICT_DETECTED | Similarity > 0.85 | Red ConflictEdge appears, ConflictPanel opens |
| AGENT_THINKING | LLM streaming token | Token appended to OutputNode |
| AGENT_ERROR | Agent fails | Node ring → red, pipeline continues |
| AGENT_COMPLETE | Agent finishes | Node ring → green |
| EXECUTION_COMPLETE | All agents done | ExecutionSummaryPanel slides up |
| EXECUTION_ERROR | Timeout or fatal error | Status pill → error |
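
One way to keep these events typed on the Python side is an enum plus a small dataclass. The event names below come from the list above; the wire format (a JSON object with type, session_id, and payload keys) and the names ExecutionEvent and parse_event are assumptions, since this README does not document the payload shape.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class EventType(str, Enum):
    EXECUTION_STARTED = "EXECUTION_STARTED"
    AGENT_ACTIVATED = "AGENT_ACTIVATED"
    MEMORY_READ = "MEMORY_READ"
    MEMORY_WRITE = "MEMORY_WRITE"
    CONFLICT_DETECTED = "CONFLICT_DETECTED"
    AGENT_THINKING = "AGENT_THINKING"
    AGENT_ERROR = "AGENT_ERROR"
    AGENT_COMPLETE = "AGENT_COMPLETE"
    EXECUTION_COMPLETE = "EXECUTION_COMPLETE"
    EXECUTION_ERROR = "EXECUTION_ERROR"


# Events after which a client can stop listening.
TERMINAL_EVENTS = {EventType.EXECUTION_COMPLETE, EventType.EXECUTION_ERROR}


@dataclass
class ExecutionEvent:
    type: EventType
    session_id: str
    payload: dict = field(default_factory=dict)  # assumed free-form details

    def to_json(self) -> str:
        return json.dumps(
            {"type": self.type.value, "session_id": self.session_id, "payload": self.payload}
        )


def parse_event(raw: str) -> ExecutionEvent:
    """Parses one message; an unknown type raises ValueError from EventType()."""
    data = json.loads(raw)
    return ExecutionEvent(EventType(data["type"]), data["session_id"], data.get("payload", {}))
```

Rejecting unknown event names at parse time means a protocol drift between backend and frontend shows up as an explicit error instead of a silently ignored message.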

Conflict Resolution

When two agents write contradictory memories, Cortex surfaces this visually and lets you resolve it:

  • Keep A / Keep B — deletes the losing memory via DELETE /api/memory/{id}
  • AI Merge — Groq synthesizes a single accurate statement from both, previews it, then writes the merged memory and deletes both originals
  • Dismiss — removes from view locally, both memories preserved
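
The four resolution options reduce to a plan of which memories to delete and what, if anything, to write. The sketch below only computes that plan; the action strings, ResolutionPlan, and plan_resolution are hypothetical names, and the actual deletes and writes (for example via DELETE /api/memory/{id}) are left to the caller.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResolutionPlan:
    delete_ids: list[str] = field(default_factory=list)
    write_text: Optional[str] = None  # merged memory to store, if any


def plan_resolution(
    action: str, id_a: str, id_b: str, merged_text: Optional[str] = None
) -> ResolutionPlan:
    """Maps a user choice to the memory operations it implies."""
    if action == "keep_a":
        return ResolutionPlan(delete_ids=[id_b])
    if action == "keep_b":
        return ResolutionPlan(delete_ids=[id_a])
    if action == "merge":
        if not merged_text:
            raise ValueError("merge requires the synthesized text")
        # Both originals are replaced by the single merged statement.
        return ResolutionPlan(delete_ids=[id_a, id_b], write_text=merged_text)
    if action == "dismiss":
        return ResolutionPlan()  # local-only: both memories are preserved
    raise ValueError(f"unknown action: {action!r}")
```

Separating the plan from its execution makes the merge case easy to preview before anything is deleted.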

Key Engineering Constraints

  • Sequential execution is intentional on Windows: asyncio.gather() deadlocks in FastAPI BackgroundTasks on Windows. Agents run sequentially per depth level.
  • Fault isolation — individual agent failures emit AGENT_ERROR (ring turns red) but do not stop the pipeline. Other agents continue.
  • 120-second timeout: asyncio.wait_for() wraps the pipeline; EXECUTION_ERROR is emitted on timeout.
  • Mem0 search() takes user_id as a direct kwarg, not inside filters={}.
  • PostHog telemetry patched at mem0.client.main.capture_client_event to prevent startup errors on Render.
  • Render cold starts — free tier sleeps after 15 min inactivity; first execution after a gap takes ~30s to wake.
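
The sequential execution, fault isolation, and timeout constraints above combine roughly as in this sketch. It is an illustration, not the engine's code: run_pipeline_safely, the emit callback signature, and the sample agents are hypothetical, and agents are modelled as zero-argument coroutines.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Emit = Callable[[str, Optional[str]], None]


async def run_pipeline_safely(
    agents: dict[str, Callable[[], Awaitable[None]]],
    emit: Emit,
    timeout: float = 120.0,
) -> None:
    async def pipeline() -> None:
        # Agents run one after another, with no asyncio.gather().
        for name, agent in agents.items():
            try:
                await agent()
                emit("AGENT_COMPLETE", name)
            except Exception:
                # Fault isolation: report the failure and keep going.
                emit("AGENT_ERROR", name)
        emit("EXECUTION_COMPLETE", None)

    try:
        await asyncio.wait_for(pipeline(), timeout=timeout)
    except asyncio.TimeoutError:
        emit("EXECUTION_ERROR", None)


# Sample agents for demonstration only.
async def ok_agent() -> None:
    await asyncio.sleep(0)


async def failing_agent() -> None:
    raise RuntimeError("model call failed")


async def slow_agent() -> None:
    await asyncio.sleep(1)
```

Catching Exception rather than BaseException matters here: the cancellation that asyncio.wait_for() triggers on timeout is not swallowed by the per-agent handler, so the timeout still ends the run.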

Project Structure — Frontend

src/
├── canvas/
│   ├── nodes/          AgentNode, SharedMemoryNode, TriggerNode, OutputNode
│   ├── edges/          MemoryEdge (animated pulse), MessageEdge, ConflictEdge
│   ├── controls/       CanvasToolbar, AgentConfigDrawer, TriggerEditDrawer (+ file upload),
│   │                   ShareModal, OutputExpandModal, ExecutionSummaryPanel, TourOverlay
│   └── hooks/          useCanvasStore (Zustand), useCanvasSync (DB sync), useExecution (WS)
├── shell/              Sidebar, Topbar, OrgSwitcher, OrgPanel, MemoryToast,
│                       TerminalDrawer, ConflictPanel
├── memory-dashboard/   D3 force graph, health score, memory cards, search + filter
└── transcripts/        AI build transcript viewer

Architecture

The full design rationale — the memory-first inversion thesis, three-service split, namespacing strategy, tradeoffs, and what comes next — is documented in ARCHITECTURE.md.


Build Log

This project was built in ~2 days (May 13–15, 2026) using Claude Code. The full decision log — every prompt, every architectural choice, every bug and fix — is in TRANSCRIPTS.md and rendered live inside the app at /transcripts.

10 sessions. 75 features. 36 errors resolved. Every decision documented.


License

MIT — built as a demonstration artifact for the Mem0 Frontend + Backend Engineer role (₹1 Crore).

Stack chosen deliberately. Every decision documented. Memory-first by design.
