Math Mentor: logical-reasoning-agent

A deterministic, multi-agent system for robust mathematical problem solving.

Project Overview

Math Mentor is an autonomous reasoning system designed to solve high-school and undergraduate-level mathematics problems with high reliability. Unlike general-purpose "chat" interfaces, it decouples semantic understanding from deterministic computation.

The system accepts multimodal inputs (text, image, audio) and employs a Human-in-the-Loop (HITL) workflow to handle ambiguity before it propagates to the solver.

Core Philosophy

Separation of Concerns: LLMs are excellent at planning and translation but poor at arithmetic. This system uses Gemini 2.0 Flash solely for semantic understanding and code generation, while deterministic solvers handle computation.
Reflexion: A unified Orchestrator manages a feedback loop where failures in solution verification trigger immediate introspection and strategy adjustment, rather than silent failure.
Episodic Memory: The system persists successful solution patterns. When faced with a new problem, it retrieves semantically similar past successes to guide its current strategy (Self-Learning).
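
The Reflexion loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Orchestrator: `solve_with_reflexion`, `Attempt`, and the `solve`/`verify` callables are hypothetical names, and the toy solver stands in for the LLM-driven Solver Agent.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One solve attempt and the verifier's verdict on it (hypothetical record)."""
    answer: Optional[float]
    ok: bool
    feedback: str


def solve_with_reflexion(
    problem: str,
    solve: Callable[[str, List[str]], Optional[float]],
    verify: Callable[[str, Optional[float]], Tuple[bool, str]],
    max_retries: int = 3,
) -> List[Attempt]:
    """Retry solving, feeding each verifier complaint into the next attempt."""
    reflections: List[str] = []
    history: List[Attempt] = []
    for _ in range(max_retries):
        answer = solve(problem, reflections)
        ok, feedback = verify(problem, answer)
        history.append(Attempt(answer, ok, feedback))
        if ok:
            break
        # Surface the failure to the next attempt instead of dropping it silently.
        reflections.append(feedback)
    return history


# Toy stand-ins: the "solver" only gets x + 2 = 5 right after one reflection.
def toy_solve(problem: str, reflections: List[str]) -> Optional[float]:
    return 3.0 if reflections else 7.0


def toy_verify(problem: str, answer: Optional[float]) -> Tuple[bool, str]:
    if answer is not None and abs(answer + 2 - 5) < 1e-9:
        return True, "substitution check passed"
    return False, f"substituting x={answer} does not satisfy x + 2 = 5"


attempts = solve_with_reflexion("x + 2 = 5", toy_solve, toy_verify)
```

Capping the loop with `max_retries` keeps a persistently failing problem from retrying forever; the returned history preserves every failed attempt for later inspection.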

System Architecture

The application is structured as a pipeline of specialized agents coordinated by a central Orchestrator.

1. Input Processing Layer

OCR Engine: Uses Google Gemini Vision (Program-of-Thought prompting) to transcribe mathematical images into structured text. Includes transparency handling and contrast optimization.
ASR Engine: Uses Google Cloud Speech-to-Text v2 (Chirp 2) to transcribe spoken mathematical queries, converting natural speech into formal problem statements.

2. Cognitive Layer (The Agents)

Parser Agent: Normalizes raw input into a structured schema, identifying variables, constraints, and the quantity to be solved for. Flags ambiguous input for user clarification.
Router Agent: Classifies intent (e.g., Algebra, Probability, Calculus) to select the optimal solving strategy and filter the knowledge base.
Solver Agent: The core reasoning engine. It adopts a "Program-of-Thought" approach, generating SymPy code to solve problems deterministically. It integrates RAG to access mathematical laws and reference material.
Verifier Agent: A "Judge" model that validates solutions by:
1. Numerical Substitution (plugging answers back into equations).
2. Conceptual Sanity Checks (validating units, domains, and bounds).
Explainer (DeckGen + Solver): Rather than a redundant text summarizer, the system uses a specialized Visual Deck Generator. This component takes the Solver's logical trace and transforms it into a step-by-step visual explanation.
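
The two Verifier checks listed above can be illustrated with a small sketch. It assumes the equation is available as a plain Python callable rather than the SymPy code the Solver generates, which keeps the example dependency-free; `verify_by_substitution`, `residual`, and `domain` are hypothetical names chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple


def verify_by_substitution(
    residual: Callable[[float], float],
    candidates: List[float],
    domain: Tuple[float, float] = (-math.inf, math.inf),
    tol: float = 1e-9,
) -> Dict[float, str]:
    """Classify each candidate as 'ok', 'out_of_domain', or 'fails_equation'.

    `residual(x)` is the equation rewritten as lhs - rhs, so a true solution
    gives a value close to zero.
    """
    low, high = domain
    report: Dict[float, str] = {}
    for x in candidates:
        if not (low <= x <= high):
            report[x] = "out_of_domain"      # conceptual sanity check
        elif math.isclose(residual(x), 0.0, abs_tol=tol):
            report[x] = "ok"                 # numerical substitution
        else:
            report[x] = "fails_equation"
    return report


# Example: x**2 - 5*x + 6 = 0 with the side constraint x >= 2.5.
report = verify_by_substitution(
    residual=lambda x: x**2 - 5 * x + 6,
    candidates=[2.0, 3.0, 4.0],
    domain=(2.5, math.inf),
)
```

Here 2.0 solves the equation but violates the constraint, 3.0 passes both checks, and 4.0 fails the substitution.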

3. Memory & Persistence

Vector Store (FAISS): Stores embeddings of past interactions.
Relational DB (SQLite): Logs full conversation history, user feedback, and verification states.
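
Retrieval of similar past solutions can be pictured with the sketch below. It swaps the FAISS index for a brute-force cosine-similarity scan in pure Python, and the three-dimensional embeddings are made up for illustration; `EpisodicMemory` and its methods are hypothetical names, not the project's API.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class EpisodicMemory:
    """Hypothetical in-memory store of (embedding, solved-problem note) pairs."""

    def __init__(self) -> None:
        self._items: List[Tuple[List[float], str]] = []

    def add(self, embedding: Sequence[float], note: str) -> None:
        self._items.append((list(embedding), note))

    def top_k(self, query: Sequence[float], k: int = 1) -> List[Tuple[float, str]]:
        """Return the k most similar notes, best match first."""
        scored = [(cosine(query, emb), note) for emb, note in self._items]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


memory = EpisodicMemory()
memory.add([1.0, 0.0, 0.0], "quadratic: factor, then substitute roots back")
memory.add([0.0, 1.0, 0.0], "probability: enumerate the sample space first")

# A query embedding that points mostly along the first stored vector.
best = memory.top_k([0.9, 0.1, 0.0], k=1)
```

A linear scan like this is only practical for small stores; an approximate-nearest-neighbour index such as FAISS serves the same role at scale.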

System Diagram

flowchart TD
    %% Nodes
    User([User])
    UI[Frontend Interface]
    Orchestrator{Orchestrator}
    DeckGen["Visual Explainer\n(Deck Generator)"]

    subgraph Perception["Perception Layer"]
        OCR[OCR Engine]
        ASR[ASR Engine]
    end

    subgraph Agents["Cognitive Layer"]
        Parser[Parser Agent]
        Router[Router Agent]
        Solver[Solver Agent]
        Verifier[Verifier Agent]
    end

    subgraph Memory["Memory System"]
        RAG[(Knowledge Base)]
        History[(Episodic Memory)]
    end

    %% Edges
    User <--> UI
    UI -->|Image| OCR
    UI -->|Audio| ASR
    UI -->|Text| Orchestrator
    OCR --> Orchestrator
    ASR --> Orchestrator

    %% Core Loop
    Orchestrator --> Parser
    Parser -->|Structured JSON| Orchestrator

    Orchestrator --> Router
    Router -->|Strategy| Orchestrator

    Orchestrator -->|Problem + Context| Solver
    Solver <-->|Retrieve Similar| History
    Solver <-->|Retrieve Knowledge| RAG

    Solver -->|Python Code| Verifier
    Verifier -->|Substitution Check| Orchestrator

    Orchestrator -.->|Reflexion Retry| Solver

    Orchestrator -->|Verified Trace| DeckGen
    DeckGen -->|Visual Deck| UI

Technology Stack

Frontend: Streamlit (Python)
LLM Orchestration: Google Gemini 2.0 Flash & Pro
Symbolic Math: SymPy, NumPy
Vector Search: FAISS
Backend Language: Python 3.11+

Setup & Installation

Prerequisites

Python 3.11 or higher
Google Gemini API Key (required)
Google Cloud credentials (optional, for audio transcription)

Installation

Clone the repository

git clone <repository_url>
cd math-mentor

Install dependencies. Using a virtual environment is recommended.

# Using uv (recommended)
uv sync

# Or using pip
pip install -r requirements.txt

Configure the environment. Copy .env.example to .env and set the following:

cp .env.example .env

Minimum configuration (text input only):

GEMINI_API_KEY=your_gemini_api_key_here

Full configuration (audio + vision):

GEMINI_API_KEY=your_gemini_api_key_here
GOOGLE_APPLICATION_CREDENTIALS=path/to/gcp-credentials.json
GCP_PROJECT_ID=your_project_id
STT_LOCATION=global
STT_RECOGNIZER=_

Run the Application

# Using uv
uv run streamlit run frontend/app.py

# Or with venv activated
streamlit run frontend/app.py
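
Before launching, it can help to confirm that the environment variables from the configuration step are present. The sketch below is one possible check, not part of the project: `check_config` is a hypothetical helper, and treating the two Google Cloud variables as the gate for audio input is an assumption based on the Prerequisites.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names are taken from the configuration listings in this README.
AUDIO_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "GCP_PROJECT_ID")


def check_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report which input modes the given environment can support."""
    env = os.environ if env is None else env
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        "text": True,
        # Assumption: audio needs both Google Cloud variables to be set.
        "audio": all(env.get(name) for name in AUDIO_VARS),
    }


modes = check_config({"GEMINI_API_KEY": "your_gemini_api_key_here"})
```

Calling `check_config()` with no argument reads the real process environment instead of the sample mapping.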

Usage Guide

Select Input Mode: Use the sidebar to switch between Text, Image Upload, or Audio Recording.
Submit Problem: Enter the math problem. The system will first parse and validate the input.
Review Plan: The "Thinking Process" expander visualizes the real-time agent workflow (Parsing -> Retrieval -> Planning -> Verification).
Verify & Explain: The final output includes the computed answer, a step-by-step derivation, and (where applicable) dynamic visual aids.
Feedback: Use the Thumbs Up/Down and "Edit" buttons to provide feedback, which is stored to improve future performance.
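
As a rough picture of how feedback could be persisted in SQLite, consider the sketch below. The `feedback` table, its columns, and `record_feedback` are hypothetical; the project's real schema in math_mentor.db may differ.

```python
import sqlite3
from typing import Optional

# Hypothetical schema, created in an in-memory database for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE feedback (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        problem TEXT NOT NULL,
        answer TEXT NOT NULL,
        rating INTEGER NOT NULL CHECK (rating IN (-1, 1)),
        correction TEXT
    )
    """
)


def record_feedback(
    db: sqlite3.Connection,
    problem: str,
    answer: str,
    thumbs_up: bool,
    correction: Optional[str] = None,
) -> int:
    """Store one thumbs up/down rating, plus an optional user edit."""
    cur = db.execute(
        "INSERT INTO feedback (problem, answer, rating, correction) VALUES (?, ?, ?, ?)",
        (problem, answer, 1 if thumbs_up else -1, correction),
    )
    db.commit()
    return cur.lastrowid


row_id = record_feedback(conn, "x + 2 = 5", "x = 3", thumbs_up=True)
rows = conn.execute("SELECT problem, rating, correction FROM feedback").fetchall()
```

Storing the rating as +1 or -1 makes it easy to aggregate approval per problem type later.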
