LearnMate

An AI-powered, hands-free study assistant built for students with physical disabilities. LearnMate runs as a Chrome extension overlay on any webpage — listening to your voice, watching your eye gestures, and responding with spoken explanations, concept diagrams, and lip-synced avatar teaching videos. No keyboard required.

Status: In active development

The Problem

Students with motor impairments such as limited hand mobility, cerebral palsy, or muscular dystrophy are poorly served by most existing edtech tools. Screen readers address content consumption, not comprehension. LearnMate addresses the gap: a student who can't type should still be able to ask a question, get an explanation, and watch it taught to them, entirely hands-free, on any page they're already reading.

Features

Voice control — say "Hey LearnMate, explain binary search trees" and the panel responds
Eye gesture control — wink left eye (2s) to trigger an explanation; wink right eye (2s) to generate a video
AI explanations — Groq LLM (llama-3.1-8b-instant) generates clear, beginner-friendly explanations
Diagram generation — Mermaid concept maps + Wikipedia images combined into a visual slideshow
Lip-synced avatar video — Wav2Lip animates a face speaking the explanation, streamed directly into the browser panel
Works on any webpage — Chrome extension injects the overlay without disrupting the page
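
As an illustration of the diagram generation feature, here is a minimal sketch of how a concept map could be written as Mermaid source and handed to mmdc. The function names and file names are hypothetical and not taken from app.py; only the `-i` (input) and `-o` (output) options of mmdc are assumed.

```python
import re


def _safe_label(text: str) -> str:
    # Drop characters that would break a quoted Mermaid node label.
    return re.sub(r'["\[\]]', "", text).strip()


def build_concept_map(topic: str, subtopics: list[str]) -> str:
    """Return Mermaid flowchart source linking a topic to its subtopics."""
    lines = ["graph TD", f'    root["{_safe_label(topic)}"]']
    for index, sub in enumerate(subtopics):
        lines.append(f'    root --> n{index}["{_safe_label(sub)}"]')
    return "\n".join(lines)


def mmdc_command(src_path: str, out_path: str) -> list[str]:
    """Build the argv for rendering a .mmd file to an image with Mermaid CLI."""
    return ["mmdc", "-i", src_path, "-o", out_path]
```

The command list can be passed to `subprocess.run(..., check=True)` once the Mermaid source has been written to disk.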

Architecture

Chrome Extension (overlay.js)
    ↓ HTTPS fetch
Flask Backend (app.py)
    ├── Groq LLM       → explanation / summary text
    ├── gTTS           → text-to-speech MP3
    ├── Mermaid (mmdc) → concept diagram PNG
    ├── Wikipedia API  → topic images
    ├── FFmpeg         → composite slideshow + avatar
    └── Wav2Lip        → lip-sync MP3 to avatar face → MP4
         ↓
    Video streamed back → played inline in extension panel

Voice (wake_listener.py)      →  /command  →  overlay_state
Gesture (eye_blink_cursor.py) →  /command  →  overlay_state
    ↓ polled every 1.5s by extension
Panel reacts: explain / video / minimize / navigate
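
The command flow above (voice and gesture scripts post to /command, the extension polls the resulting state) could be backed by something like the following. This is a sketch under assumptions: the class name, field names, and consume-on-read behaviour are illustrative and may differ from what app.py actually does.

```python
import threading
import time
from typing import Optional


class OverlayState:
    """Holds the latest pending command until the extension polls for it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Optional[dict] = None

    def post(self, action: str, payload: str = "", source: str = "voice") -> None:
        # Called by the handler behind /command (hypothetical wiring).
        with self._lock:
            self._pending = {
                "action": action,
                "payload": payload,
                "source": source,
                "ts": time.time(),
            }

    def poll(self) -> Optional[dict]:
        # Return the pending command once, then clear it so that a
        # 1.5 s polling loop does not trigger the same action twice.
        with self._lock:
            command, self._pending = self._pending, None
            return command
```

Clearing the command on read keeps the panel from repeating an action on every poll; a newer command simply overwrites an older unread one.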

Getting Started

Prerequisites

Python 3.10+
Node.js (for mmdc — Mermaid CLI)
FFmpeg on PATH
A Groq API key
Chrome browser
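
A quick way to confirm the command-line prerequisites before installing is sketched below. The helper names are hypothetical; the check only looks for executables on PATH and compares the running Python version.

```python
import shutil
import sys


def missing_tools(names=("ffmpeg", "node", "mmdc")) -> list[str]:
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def python_ok(minimum=(3, 10)) -> bool:
    """True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum
```

Note that mmdc only appears on PATH after the Mermaid CLI has been installed globally with npm.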

Installation

git clone https://github.com/s3ak6i-dev/LearnMate.git
cd LearnMate
pip install -r requirements.txt
npm install -g @mermaid-js/mermaid-cli

Install Wav2Lip dependencies:

pip install mediapipe face-alignment

Configuration

Replace the Groq API key in app.py:

client = Groq(api_key="YOUR_GROQ_API_KEY", ...)
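
If you would rather not keep the key in source control, one alternative is to read it from an environment variable. This is a sketch, not how app.py is currently written; the helper name and the choice of `GROQ_API_KEY` as the variable name are assumptions.

```python
import os


def load_groq_key(env_var: str = "GROQ_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key or key == "YOUR_GROQ_API_KEY":
        raise RuntimeError(f"Set {env_var} before starting the backend")
    return key
```

The result can then be passed in place of the literal string, for example `Groq(api_key=load_groq_key(), ...)`.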

Add your avatar image as avatar.jpg in the project root.
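
For context, avatar.jpg is the face that Wav2Lip animates. A rough sketch of how the backend might build the Wav2Lip inference command is shown below. The checkpoint location and the helper name are assumptions; the flags are the ones exposed by Wav2Lip's inference.py, which is normally run from inside the Wav2Lip/ directory.

```python
import sys
from pathlib import Path


def wav2lip_command(
    audio_path: str,
    outfile: str,
    project_root: str = ".",
    face: str = "avatar.jpg",
    checkpoint: str = "checkpoints/wav2lip_gan.pth",  # assumed location
) -> list[str]:
    """Build the argv for Wav2Lip's inference script."""
    # Resolve the avatar to an absolute path because the command is meant
    # to run with the Wav2Lip/ directory as its working directory.
    face_path = (Path(project_root) / face).resolve()
    return [
        sys.executable, "inference.py",
        "--checkpoint_path", checkpoint,
        "--face", str(face_path),
        "--audio", audio_path,
        "--outfile", outfile,
    ]
```

Run it with something like `subprocess.run(cmd, cwd="Wav2Lip", check=True)`, passing absolute paths for the audio and output files.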

Running

Start everything at once:

python launch.py

Options:

python launch.py --no-gesture   # voice only (no webcam)
python launch.py --no-voice     # gesture only (no microphone)
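
To show how the two flags above might map to components, here is a hypothetical sketch of the argument handling. It is not the contents of launch.py, and the set of scripts it starts is assumed from the architecture section.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Start LearnMate components")
    parser.add_argument("--no-gesture", action="store_true",
                        help="skip the webcam gesture tracker")
    parser.add_argument("--no-voice", action="store_true",
                        help="skip the microphone listener")
    return parser.parse_args(argv)


def components_to_start(args: argparse.Namespace) -> list[str]:
    """Return the scripts to launch; the backend is always included."""
    components = ["app.py"]
    if not args.no_voice:
        components.append("wake_listener.py")
    if not args.no_gesture:
        components.append("eye_blink_cursor.py")
    return components
```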

First-time HTTPS trust (required once):

Open Chrome → go to https://127.0.0.1:5000
Click Advanced → Proceed to 127.0.0.1 (unsafe)
Done — the extension will now work on all HTTPS sites

Load the Chrome Extension

Open chrome://extensions
Enable Developer mode
Click Load unpacked → select the learnmate_extension/ folder
The LearnMate panel will appear on every webpage

Voice Commands

Say	Action
"Hey LearnMate"	Open / expand the panel
"LearnMate leave"	Minimize the panel
"Explain binary search trees"	Get an AI explanation
"Summarize this"	Get a short bullet-point summary
"Generate a video for this"	Start Wav2Lip video generation
"Scroll down"	Scroll the page
"Go back" / "Go forward"	Browser navigation
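
The phrases in the table could be mapped to panel actions along these lines. This is a simplified sketch with made-up action names; the real listener may match phrases differently (for example with fuzzy matching on the recognized transcript).

```python
def parse_command(transcript: str) -> tuple[str, str]:
    """Map a recognized phrase to an (action, payload) pair."""
    text = transcript.lower().strip().rstrip(".!?")
    wake = "hey learnmate"
    if text.startswith(wake):
        rest = text[len(wake):].lstrip(" ,")
        if not rest:
            return ("open", "")
        text = rest  # a command followed the wake phrase
    if text == "learnmate leave":
        return ("minimize", "")
    if text.startswith("explain "):
        return ("explain", text[len("explain "):])
    if text.startswith("summarize"):
        return ("summarize", "")
    if text.startswith("generate a video"):
        return ("video", "")
    if text == "scroll down":
        return ("scroll", "down")
    if text in ("go back", "go forward"):
        return ("navigate", text.split()[1])
    return ("unknown", text)
```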

Eye Gesture Controls

Gesture	Action
Left eye wink, hold 2s	Explain current topic
Right eye wink, hold 2s	Generate teaching video
Mouth open	Scroll (wider = faster)
Both eyes blink	Left click
Right eye blink	Right click
Left eye hold	Click and drag
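
The 2-second hold gestures need a timer that fires once per hold rather than on every frame. A minimal sketch of that logic follows, assuming some upstream detector (for example one based on MediaPipe landmarks) already reports whether the wink is active in the current frame. The class name is hypothetical.

```python
from typing import Optional


class HoldDetector:
    """Fires once when a gesture has stayed active for `hold_seconds`."""

    def __init__(self, hold_seconds: float = 2.0) -> None:
        self.hold_seconds = hold_seconds
        self._started_at: Optional[float] = None
        self._fired = False

    def update(self, active: bool, now: float) -> bool:
        """Feed one frame; returns True on the frame the hold completes."""
        if not active:
            # Gesture released: reset so the next hold starts fresh.
            self._started_at = None
            self._fired = False
            return False
        if self._started_at is None:
            self._started_at = now
        if not self._fired and now - self._started_at >= self.hold_seconds:
            self._fired = True
            return True
        return False
```

In a capture loop, `now` would typically come from `time.monotonic()`; passing it in keeps the class easy to test.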

Project Structure

LearnMate/
├── app.py                  # Flask backend — all AI pipelines
├── launch.py               # Single launcher for all components
├── voice_input.py          # Always-on voice command loop
├── wake_listener.py        # Wake-word ("Hey LearnMate") listener
├── eye_blink_cursor.py     # Gesture + iris cursor control
├── avatar.jpg              # Avatar face used by Wav2Lip
├── requirements.txt
├── learnmate_extension/
│   ├── manifest.json
│   ├── overlay.js          # Extension UI logic + Flask polling
│   └── overlay.css         # Extension panel styles
└── Wav2Lip/                # Wav2Lip model + inference

Tech Stack

Layer	Technology
Browser extension	Chrome MV3, Vanilla JS
Backend	Python, Flask
LLM	Groq API (llama-3.1-8b-instant)
Text-to-speech	gTTS
Lip-sync	Wav2Lip
Diagram generation	Mermaid CLI (mmdc)
Image enrichment	Wikipedia API
Video compositing	FFmpeg, imageio
Gesture tracking	MediaPipe Face Mesh, OpenCV
Voice recognition	SpeechRecognition (Google Web Speech API)

What's Next

Calibration UI — first-time setup to tune gesture thresholds per user
Conversational follow-up — "explain it simpler", "give me an example" in context
Institutional pilots — disability support centres at universities

Team

Built over 12 weeks by a team of engineers focused on accessible edtech.

License

MIT
