PrintChakra is a Windows-first document workflow platform built for scanning, OCR, print configuration, phone-assisted capture, and voice-driven interaction. It combines a Flask backend and a React frontend into a single experience for processing documents from intake to output.
It is designed around practical operations:
- Upload and manage document images and PDFs
- Clean and enhance scans before OCR or printing
- Extract text with OCR pipelines
- Configure print and scan workflows from the browser
- Capture documents from a phone-oriented flow
- Use voice sessions for transcription, orchestration, and spoken responses
- Keep UI state synchronized in real time through Socket.IO

Main feature areas:
- Advanced document cleanup and OCR flow for scanned or photographed pages.
- Browser-based print setup and orchestration for Windows environments.
- Voice session startup, transcription, chat, and speech response.
- A phone-oriented intake flow for documents captured outside the desktop UI.
- Live file browsing, previews, system info, and document actions.
- Built around Windows printer and local device workflows.

printchakra/
├── README.md
├── Document_Processing_Pipeline.ipynb
├── backend/
│ ├── app.py
│ ├── requirements.txt
│ ├── .venv/
│ ├── app/
│ │ ├── api/
│ │ ├── config/
│ │ ├── core/
│ │ ├── features/
│ │ ├── modules/
│ │ ├── sockets/
│ │ ├── utils/
│ │ ├── print_scripts/
│ │ └── .env
│ ├── public/
│ │ └── data/
│ └── logs/
├── frontend/
│ ├── package.json
│ ├── public/
│ └── src/
└── phase-2/
- Backend entry point: backend/app.py
- Backend dependencies: backend/requirements.txt
- Frontend dependencies: frontend/package.json
- Backend environment file used by settings: backend/app/.env
- Notebook pipeline: Document_Processing_Pipeline.ipynb
- Windows 10 or 11
- Python 3.10 recommended
- Node.js 18+
- npm
cd backend
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
If backend/.venv already exists and is working, reuse it.
cd frontend
npm install
PrintChakra now includes a production-oriented Docker setup for the full app:
- Backend container on port 5000
- Frontend container on port 3000
- Persistent backend data mounted from backend/public/data
- Linux-native OCR and PDF runtime packages baked into the backend image
- Optional host Ollama access through host.docker.internal
docker compose up --build
- Frontend: http://localhost:3000
- Backend: http://localhost:5000
- Browser-to-backend routing is controlled by REACT_APP_API_URL at frontend build time.
- The backend image sets POPPLER_PATH=/usr/bin and TESSERACT_CMD=/usr/bin/tesseract.
- Ollama is not bundled; by default Compose points the backend to http://host.docker.internal:11434.
- Windows-native printing is not available inside the default Linux container. Linux printing can work if the host exposes CUPS.
cd backend
.\.venv\Scripts\Activate.ps1
python app.py
cd frontend
npm run dev
Pipecat runs as a separate FastAPI WebSocket server used by the docked AI panel.
cd pipecat-web-voice
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python app.py
- Frontend: usually http://localhost:5173 (Vite), sometimes http://localhost:3000 depending on your setup
- Backend: http://localhost:5000
- Pipecat WS: ws://localhost:8765/ws
- Pipecat health: http://localhost:8765/health
- Backend Pipecat proxy health: http://localhost:5000/pipecat/health
If port 3000 is occupied, the frontend may move to another port such as 3001.
- Backend: GET /pipecat/status returns available: true
- Backend: GET /pipecat/health returns a non-empty websocket_url
- Pipecat: GET http://localhost:8765/health returns healthy
- Dashboard: docked AI panel connects and can play back TTS audio
- Navigation: backend Socket.IO voice_command_detected events can navigate routes while keeping the AI panel open
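The two backend checks in the list above can be scripted. The following is a minimal sketch, not part of the repository: it assumes both endpoints return JSON objects carrying the `available` and `websocket_url` keys named in the checklist, and the helper names `fetch_json`, `evaluate`, and `run_checks` are hypothetical.

```python
import json
from urllib.request import urlopen

BACKEND = "http://localhost:5000"  # backend URL as listed in this README


def fetch_json(url, timeout=3.0):
    """Return the decoded JSON body of a GET request."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def evaluate(status, health):
    """Apply the two backend checks to already-fetched JSON payloads.

    Assumption: the payloads are JSON objects with the keys used below.
    """
    return {
        "status_available": status.get("available") is True,
        "health_has_websocket_url": bool(health.get("websocket_url")),
    }


def run_checks():
    """Fetch both backend endpoints and evaluate them (needs a running stack)."""
    status = fetch_json(f"{BACKEND}/pipecat/status")
    health = fetch_json(f"{BACKEND}/pipecat/health")
    return evaluate(status, health)
```

With the backend and Pipecat running, calling `run_checks()` should return a dictionary whose two values are both `True`. Keeping `evaluate` separate from the network calls makes the rule itself easy to test offline.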
The backend settings currently load environment variables from backend/app/.env.
When running with Docker Compose, container environment variables override local file-based defaults.
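The precedence rule above (process environment wins over file-based defaults) can be illustrated with a short sketch. This is not the project's actual settings loader; `parse_env_file` and `resolve_settings` are hypothetical helpers, and the parser only handles plain `KEY=VALUE` lines.

```python
import os


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_settings(file_text, environ=None):
    """Merge file defaults with process variables; the process value wins."""
    environ = os.environ if environ is None else environ
    merged = parse_env_file(file_text)
    for key in merged:
        if key in environ:
            merged[key] = environ[key]
    return merged


file_text = "FRONTEND_URL=http://localhost:3000\nBACKEND_PUBLIC_URL=http://localhost:5000\n"
# Simulate a container that overrides only FRONTEND_URL.
settings = resolve_settings(file_text, {"FRONTEND_URL": "http://frontend:3000"})
print(settings["FRONTEND_URL"])        # value taken from the simulated container
print(settings["BACKEND_PUBLIC_URL"])  # value taken from the file text
```

Passing `environ` explicitly keeps the example independent of whatever is set in the current shell.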
FRONTEND_URL=http://localhost:3000
BACKEND_PUBLIC_URL=http://localhost:5000
API_CORS_ORIGINS=http://localhost:3000
VOICE_AI_MODEL=smollm2:135m
GROQ_API_KEY=your_key_here
GROQ_LLM_MODEL=llama-3.1-8b-instant
GROQ_STT_MODEL=whisper-large-v3-turbo
GROQ_TTS_ENDPOINT=https://api.groq.com/openai/v1/audio/speech
GROQ_TTS_MODEL=canopylabs/orpheus-v1-english
The backend defaults to HTTP. HTTPS is opt-in.
USE_HTTPS=1
SSL_CERT=certs/cert.pem
SSL_KEY=certs/key.pem
flowchart TD
A[Phone Capture / Dashboard / Voice UI] --> B[React Frontend]
B --> C[Axios + Socket.IO]
C --> D[Flask Backend]
D --> E[Document Processing Modules]
D --> F[OCR + Image Enhancement]
D --> G[Print and Scan Orchestration]
D --> H[Voice Services]
H --> I[Local Whisper / Local TTS / Local LLM]
H --> J[Groq Fallback]
D --> K[Windows Printing + Local File Storage]
PrintChakra uses a local-first voice strategy and can fall back to Groq when local services are unavailable.
Configured fallback areas:
- LLM chat
- Speech-to-text
- Text-to-speech
The /voice/status endpoint reports current readiness for local and fallback providers.
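A local-first strategy with fallback, as described above, can be sketched as an ordered list of providers tried in turn. This is an illustration only, not the backend's implementation: `first_available`, `local_llm`, and `groq_llm` are hypothetical names, and the two provider functions are stand-ins that make no real calls.

```python
def first_available(providers):
    """Return (name, result) from the first provider that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def local_llm():
    """Hypothetical local provider that is currently unavailable."""
    raise ConnectionError("local model not reachable")


def groq_llm():
    """Hypothetical stand-in for the Groq fallback call."""
    return "response from fallback"


# Local is listed first, so it is always tried before the fallback.
provider, reply = first_available([("local", local_llm), ("groq", groq_llm)])
print(provider, reply)  # prints: groq response from fallback
```

The same ordering idea would apply separately to each fallback area (LLM chat, speech-to-text, text-to-speech).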
Runtime file storage is served through backend data directories inside the backend tree.
Canonical backend test outputs are kept in:
Redundant generated output folders outside that canonical path were intentionally cleaned up.
Check:
- Python version is compatible
- The backend virtual environment is activated
- Port 5000 is not occupied by another process
- Dependencies from backend/requirements.txt are installed
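To test the port item in the list above, a quick probe can tell whether something is already listening. This is a minimal sketch using only the standard library; `port_in_use` is a hypothetical helper, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(5000)` before starting the backend should return `False`; if it returns `True`, another process is already bound to the port.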
Check:
- Backend is running on port 5000
- Frontend dev server is running
- CORS points to the correct frontend origin
- Backend is not accidentally running under HTTPS while the frontend expects HTTP
Check:
- Local voice dependencies installed correctly
- Groq settings are present in backend/app/.env if fallback is expected
- /voice/status reports the providers you expect
Check:
- PaddleOCR and image dependencies are installed
- PDF tooling is available for conversion paths
- TESSERACT_CMD points to a valid binary when running in a container
- GPU support is optional and CPU fallback may be slower
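The TESSERACT_CMD item above can be checked with a few lines of Python. This is a sketch under assumptions: `resolve_binary` and `tesseract_from_env` are hypothetical helpers, and falling back to the bare name `tesseract` when the variable is unset is an assumption of this example, not documented backend behavior.

```python
import os
import shutil


def resolve_binary(command):
    """Return the path shutil.which resolves for command, or None."""
    if not command:
        return None
    # shutil.which accepts a bare name (searched on PATH) or an explicit path.
    return shutil.which(command)


def tesseract_from_env(environ=None):
    """Resolve the binary named by TESSERACT_CMD (assumed default: tesseract)."""
    environ = os.environ if environ is None else environ
    return resolve_binary(environ.get("TESSERACT_CMD", "tesseract"))
```

Inside the backend container, `tesseract_from_env()` returning `None` would suggest that the configured path does not point to an executable file.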
Check:
- The default containers are Linux-based and cannot use Windows pywin32 printing
- Linux printing requires host CUPS access and compatible printer visibility
- For Windows printer integration, run the backend locally on Windows instead of inside Docker
The repository includes a standalone notebook for experimenting with the document pipeline: Document_Processing_Pipeline.ipynb
PrintChakra is a document workflow app centered on OCR, print and scan control, voice interaction, and phone-assisted capture. For local development, use Python 3.10, run the backend on port 5000, run the frontend with npm run dev, and keep backend environment values in backend/app/.env.