25 commits
6926b81
feat: Phase 3 - demo JSON configs (Dubai RE, Hey Saga, Deepgram, Hey …
Feb 26, 2026
5d60904
docs: mark Phase 3 (Demo JSON Configs) as Complete in STATE.md
Feb 26, 2026
b828748
docs(01-01): complete Phase 1 Backend JSON Config System plan
Feb 26, 2026
14778ba
feat: Phase 2 - frontend redesign with Deepgram design system
Feb 26, 2026
28c5215
feat(04-integration): wire config_id to VoiceAgent and update fronten…
Feb 26, 2026
0a6ef06
feat: Phase 4 - integration wiring config selection to VoiceAgent
Feb 26, 2026
033cf65
fix: create AudioContext during user gesture to unblock Chrome autoplay
Feb 27, 2026
edf279a
fix: force WebSocket transport to fix binary audio on Fly.io
Feb 27, 2026
f355b73
fix: deepgram auto-selected first, builder language/voice model sync
Feb 27, 2026
e4cb401
fix(partial): agentName substitution in system prompt + loop policy fix
Feb 27, 2026
35f9a72
Fix stop/restart: close WS and track thread for clean restarts
Feb 27, 2026
51f0f60
Use {{agentName}} in all config greetings and system prompts
Feb 27, 2026
df6367b
Fix agentName substitution and implement hotword detection via functi…
Feb 27, 2026
f3cbe80
Fix hotword matching: tolerate STT punctuation between words
Feb 27, 2026
883e39a
Add conversational continuity after hotword activation
Feb 27, 2026
a2cbdb2
Add close_hotword_session for LLM-controlled conversation end
Feb 27, 2026
6605e5e
Disable Hey Manny demo, set Dubai voice to Pandora
Mar 2, 2026
feda620
Add ElevenLabs TTS support and Tagalog language for non-English STT
Mar 2, 2026
2639b0d
Set ElevenLabs Tagalog voice ID for bpo-tagalog config
Mar 2, 2026
133126b
Fix ElevenLabs speak provider structure per Deepgram VA API spec
Mar 2, 2026
a45d645
Remove language_code from ElevenLabs provider; let voice auto-detect
Mar 2, 2026
c02590e
Add model_id as URL query param for ElevenLabs reconnections
Mar 2, 2026
8440b35
Add voice_id to ElevenLabs provider; revert URL query param change
Mar 2, 2026
44b70e6
Fix ElevenLabs BYO TTS: use multi-stream-input endpoint, QOL UI impro…
Mar 2, 2026
ea09850
chore: ignore .claude/ and .planning/ directories
Mar 2, 2026
6 changes: 5 additions & 1 deletion .claude/settings.local.json
@@ -1,7 +1,11 @@
 {
 "permissions": {
 "allow": [
-"Bash(git checkout:*)"
+"Bash(git checkout:*)",
+"Bash(flyctl version:*)",
+"Bash(flyctl deploy:*)",
+"Bash(fly auth login:*)",
+"Bash(fly deploy:*)"
 ]
 }
 }
104 changes: 52 additions & 52 deletions .dockerignore
@@ -1,52 +1,52 @@
# flyctl launch added from .gitignore
**/*.py[cod]
**/.cache-*
**/.DS_Store

# C extensions
**/*.so

# Environments
**/.env
**/.venv
**/env
**/venv

# Packages
**/*.egg
**/*.egg-info
**/dist
**/build
**/eggs
**/parts
**/bin
**/var
**/sdist
**/develop-eggs
**/.installed.cfg
**/lib
**/lib64
**/__pycache__

# Installer logs
**/pip-log.txt

# Unit test / coverage reports
**/.coverage
**/.tox
**/nosetests.xml

# Translations
**/*.mo
**/requirements_PA.txt
**/app.db

# Misc
**/mock_data_outputs
**/misc
**/pyproject.toml
**/poetry.lock

# Deepgram docs
**/deepgram-docs
fly.toml
# flyctl launch added from .gitignore
**/*.py[cod]
**/.cache-*
**/.DS_Store
# C extensions
**/*.so
# Environments
**/.env
**/.venv
**/env
**/venv
# Packages
**/*.egg
**/*.egg-info
**/dist
**/build
**/eggs
**/parts
**/bin
**/var
**/sdist
**/develop-eggs
**/.installed.cfg
**/lib
**/lib64
**/__pycache__
# Installer logs
**/pip-log.txt
# Unit test / coverage reports
**/.coverage
**/.tox
**/nosetests.xml
# Translations
**/*.mo
**/requirements_PA.txt
**/app.db
# Misc
**/mock_data_outputs
**/misc
**/pyproject.toml
**/poetry.lock
# Deepgram docs
**/deepgram-docs
fly.toml
3 changes: 2 additions & 1 deletion .gitignore
@@ -50,4 +50,5 @@ poetry.lock
deepgram-docs/

# Claude
.claude/settings.local.json
.claude/
.planning/
60 changes: 60 additions & 0 deletions .planning/REQUIREMENTS.md
@@ -0,0 +1,60 @@
# Flask Voice Agent Demo Redesign — Requirements

**Project:** Flask Voice Agent Demo Redesign
**Created:** 2026-02-26

---

## R-01: JSON Config System (Backend)

- R-01-A: Create `configs/` directory at repo root with one JSON file per demo
- R-01-B: JSON schema per config: `id`, `name`, `company`, `personality`, `language`, `voiceModel`, `voiceName`, `systemPrompt`, `functions` (array), `hotword` (optional), `mode` (`voice_agent` | `agent_assist`), `greeting`
- R-01-C: Replace hardcoded `match/case` in `common/agent_templates.py` with dynamic JSON config loader
- R-01-D: New Flask route: `GET /configs` — returns list of all configs
- R-01-E: New Flask route: `POST /configs` — creates new config, writes JSON file
- R-01-F: New Flask route: `DELETE /configs/<id>` — deletes config file
- R-01-G: Maintain backward compat: `VoiceAgent(industry, voiceModel, voiceName, language, browser_audio)` signature unchanged
- R-01-H: `/industries` route kept or aliased to `/configs` for compatibility
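
A minimal sketch of how the loader in R-01-C might read files from `configs/` and check them against the R-01-B schema. The names `REQUIRED_KEYS`, `validate_config`, and `load_configs` are hypothetical and do not come from the repository; skipping invalid files instead of raising is also an assumption.

```python
import json
from pathlib import Path

# Keys listed in R-01-B; `hotword` is optional, so it is not required here.
REQUIRED_KEYS = {
    "id", "name", "company", "personality", "language", "voiceModel",
    "voiceName", "systemPrompt", "functions", "mode", "greeting",
}
VALID_MODES = {"voice_agent", "agent_assist"}


def validate_config(config) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    if not isinstance(config, dict):
        return ["config must be a JSON object"]
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - config.keys())]
    if "mode" in config and config["mode"] not in VALID_MODES:
        problems.append(f"invalid mode: {config['mode']!r}")
    if "functions" in config and not isinstance(config["functions"], list):
        problems.append("functions must be an array")
    return problems


def load_configs(config_dir: Path) -> dict[str, dict]:
    """Load every valid *.json file in config_dir, keyed by config id.

    Assumption: files that fail validation are skipped rather than raising.
    """
    configs = {}
    for path in sorted(config_dir.glob("*.json")):
        config = json.loads(path.read_text(encoding="utf-8"))
        if not validate_config(config):
            configs[config["id"]] = config
    return configs
```

A `GET /configs` handler could then return `list(load_configs(Path("configs")).values())`, which keeps the route itself free of file-system details.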

## R-02: Frontend Redesign (Deepgram Design System)

- R-02-A: Replace `static/style.css` with Deepgram CDN: `https://unpkg.com/@deepgram/styles/dist/deepgram.css`
- R-02-B: Load Font Awesome: `https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/all.min.css`
- R-02-C: Force dark mode: `:root { color-scheme: dark; }`
- R-02-D: Design tokens: brand green `#13ef95`, brand blue `#149afb`, bg `#0b0b0c`, Inter/Noto Sans fonts
- R-02-E: Use `dg-columns` for 3-panel layout (left sidebar + center conversation + right logs)
- R-02-F: Demo selector: replace popup with `dg-card--selectable` grid loading all JSON configs from `GET /configs`
- R-02-G: Start button: large `dg-btn--primary` with Font Awesome `fa-microphone` icon
- R-02-H: Status indicator: `dg-status` component
- R-02-I: Language/voice selects: `dg-select` inside `dg-form-field`
- R-02-J: Components to use: `dg-btn`, `dg-card dg-card--selectable`, `dg-form-field`, `dg-input`, `dg-select`, `dg-textarea`, `dg-toggle`, `dg-status`, `dg-spinner`, `dg-columns`, `dg-page-heading`, `dg-alert`

## R-03: Builder Form (New Feature)

- R-03-A: "New Demo" button opens slide-in panel or modal
- R-03-B: Form fields: name, company, personality (textarea), system prompt (textarea), language (select), voice model (select from `/tts-models`), functions (toggle group), hotword (optional input)
- R-03-C: Submit POSTs to `/configs`; new card appears in selector immediately (no page reload)
- R-03-D: Edit existing: pre-populate form from existing config

## R-04: Demo JSON Configs

- R-04-A: `hey-manny.json` — Manny Pacquiao persona, Filipino English BPO, voice model `aura-2-arcas-en`
- R-04-B: `dubai-real-estate.json` — Luxury real estate AI concierge, Dubai, English, professional
- R-04-C: `bpo-tagalog.json` — BPO call center agent, Tagalog, language `tl`, voice model `aura-2-luna-en`
- R-04-D: `hey-saga.json` — Smart city concierge, hotword "Hey Saga", Saga persona
- R-04-E: `deepgram.json` — Existing Deepgram tech support demo converted to config format

## R-05: Integration

- R-05-A: Frontend demo selector populates from `GET /configs` at page load
- R-05-B: Selecting a config loads it into the VoiceAgent session on connect
- R-05-C: Builder form creates configs via `POST /configs` and refreshes selector grid live
- R-05-D: All 5 demo configs render in selector and successfully start a voice session
- R-05-E: Existing WebSocket/audio logic in `client.py` remains unchanged

## Constraints

- Keep all Python WebSocket/audio logic in `client.py` untouched
- No external JS frameworks — vanilla JS only for frontend
- Python only for backend — no Node.js
- VoiceAgent class init signature must remain compatible
42 changes: 42 additions & 0 deletions .planning/ROADMAP.md
@@ -0,0 +1,42 @@
# Flask Voice Agent Demo Redesign — Roadmap

**Project:** Flask Voice Agent Demo Redesign
**Working directory:** /coding/flask-agent-function-calling-demo
**Created:** 2026-02-26

---

## Phases

### Phase 1: Backend JSON Config System
**Goal:** Replace hardcoded `match/case` industry templates with a dynamic JSON config loader. Add CRUD routes for configs.
**Parallel stream:** A
**Status:** Planned

### Phase 2: Frontend Redesign (Deepgram Design System)
**Goal:** Replace `static/style.css` and `templates/index.html` with Deepgram design system components, new 3-panel layout, demo selector grid, and builder form.
**Parallel stream:** B
**Status:** Planned

### Phase 3: Demo JSON Configs
**Goal:** Create 5 demo JSON config files in `configs/` directory: hey-manny, dubai-real-estate, bpo-tagalog, hey-saga, deepgram.
**Parallel stream:** C
**Status:** Planned

### Phase 4: Integration
**Goal:** Wire together backend config API, frontend selector/builder, and demo configs. Verify end-to-end voice agent flow works with any config.
**Depends on:** Phase 1, 2, 3
**Status:** Planned

---

## Parallel Execution Strategy

Phases 1, 2, 3 can all run in parallel (no dependencies between them).
Phase 4 depends on all three completing first.

```
Phase 1 (Backend) ─┐
Phase 2 (Frontend) ─┼──► Phase 4 (Integration)
Phase 3 (Configs) ──┘
```
51 changes: 51 additions & 0 deletions .planning/STATE.md
@@ -0,0 +1,51 @@
# Flask Voice Agent Demo Redesign — Project State

**Project:** Flask Voice Agent Demo Redesign
**Working directory:** /coding/flask-agent-function-calling-demo
**Created:** 2026-02-26
**Status:** Planning

---

## Locked Decisions

### Backend
- JSON config loader replaces `match/case` in `common/agent_templates.py`
- `configs/` directory at repo root holds one JSON file per demo
- Config schema: `id`, `name`, `company`, `personality`, `language`, `voiceModel`, `voiceName`, `systemPrompt`, `functions[]`, `hotword?`, `mode`, `greeting`
- CRUD routes: `GET /configs`, `POST /configs`, `DELETE /configs/<id>`
- `VoiceAgent` class signature stays unchanged
- `/industries` aliased or kept for backward compat
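
The file operations behind the CRUD routes could look roughly like the sketch below, written as plain functions so that the Flask handlers stay thin. All function names are hypothetical. The slug-style id pattern is an assumption (the source does not define an id format); it is there so that an id taken from `DELETE /configs/<id>` cannot point outside `configs/`.

```python
import json
import re
from pathlib import Path

# Assumption: ids are lowercase slugs such as "hey-saga" or "dubai-real-estate".
ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def config_path(config_dir: Path, config_id) -> Path:
    """Map a config id to its JSON file, rejecting ids that are not slugs."""
    if not isinstance(config_id, str) or not ID_PATTERN.fullmatch(config_id):
        raise ValueError(f"invalid config id: {config_id!r}")
    return config_dir / f"{config_id}.json"


def list_configs(config_dir: Path) -> list[dict]:
    """Backs GET /configs: return every config stored in config_dir."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(config_dir.glob("*.json"))
    ]


def create_config(config_dir: Path, config: dict) -> Path:
    """Backs POST /configs: write the config to <id>.json and return the path."""
    path = config_path(config_dir, config.get("id"))
    config_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def delete_config(config_dir: Path, config_id: str) -> bool:
    """Backs DELETE /configs/<id>: return False if no such config exists."""
    path = config_path(config_dir, config_id)
    if not path.exists():
        return False
    path.unlink()
    return True
```

A route handler would translate the `ValueError` into a 400 response and a `False` result from `delete_config` into a 404.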

### Frontend
- Deepgram CDN design system replaces custom CSS
- Force dark mode via `:root { color-scheme: dark; }`
- 3-panel layout using `dg-columns` (sidebar | conversation | logs)
- Demo selector = `dg-card--selectable` grid (replaces popup)
- Vanilla JS only — no frameworks
- Builder form as slide-in panel or modal

### Demo Configs
- 5 configs: hey-manny, dubai-real-estate, bpo-tagalog, hey-saga, deepgram

---

## Phase Status

| Phase | Name | Stream | Status |
|-------|------|--------|--------|
| 1 | Backend JSON Config System | A | Complete |
| 2 | Frontend Redesign | B | Complete |
| 3 | Demo JSON Configs | C | Complete |
| 4 | Integration | - | Complete |

---

## Key Files

- `client.py` — Flask server, VoiceAgent class, SocketIO handlers (DO NOT change audio/WS logic)
- `common/agent_templates.py` — AgentTemplates (hardcoded match/case, to be replaced)
- `common/agent_functions.py` — FUNCTION_MAP, FUNCTION_DEFINITIONS
- `common/prompt_templates.py` — PROMPT_TEMPLATE, DEEPGRAM_PROMPT_TEMPLATE
- `templates/index.html` — Main UI template (~500+ lines inline JS)
- `static/style.css` — Custom CSS (to be replaced with DG design system)