A high-performance, multi-engine OCR (Optical Character Recognition) system that runs entirely in your browser. No server uploads, no privacy concerns—just fast, secure text extraction.
- 🔒 Privacy First: All processing happens locally on your device. Images never leave your browser.
- ⚡ Multi-Engine Strategy: Choose the best engine for your needs:
  - Tesseract.js: The industry standard for general-purpose OCR.
  - Transformers.js (TrOCR): State-of-the-art AI accuracy using Transformer models.
  - eSearch-OCR (PaddleOCR): High-speed, high-accuracy engine optimized for Chinese/English mixed text.
- 🔋 Performance Optimized: Uses WebAssembly (WASM), Web Workers, and WebGPU acceleration for near-native speeds.
- 📦 Intelligent Caching: Heavy model files are cached in IndexedDB for instant subsequent loads.
- 🎨 Glassmorphism UI: A modern, clean interface with drag-and-drop, URL, and paste support.
| Engine | Best For | Tech Stack | Model Size |
|---|---|---|---|
| Tesseract.js | General use, 100+ languages | WASM | ~4.3 MB (eng/fast) |
| Transformers.js | Highest accuracy, modern AI | WebGPU / ONNX | ~40-150 MB |
| eSearch-OCR | Chinese/English, complex layouts | ONNX Runtime | ~7-10 MB |
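The "Best For" column above can be turned into a simple selection rule. The following is a minimal sketch, not the project's actual logic: `pickEngine`, `EngineNeeds`, the engine ids, and the heuristic itself (for example, only choosing the Transformers.js engine when WebGPU is available) are all assumptions made for illustration.

```typescript
// Hypothetical helper: ids and heuristic are illustrative only and loosely
// follow the "Best For" column of the comparison table.
type EngineId = "tesseract" | "transformers" | "esearch";

interface EngineNeeds {
  languages: string[];     // language codes, e.g. ["en"] or ["zh", "en"]
  preferAccuracy: boolean; // trade download size and speed for accuracy
  hasWebGPU: boolean;      // result of a capability check done elsewhere
}

function pickEngine(needs: EngineNeeds): EngineId {
  const hasChinese = needs.languages.some((l) => l === "zh");
  const onlyChineseOrEnglish = needs.languages.every(
    (l) => l === "zh" || l === "en",
  );

  // Chinese or mixed Chinese/English text: the PaddleOCR-based engine.
  if (hasChinese && onlyChineseOrEnglish) return "esearch";

  // Assumption: the larger Transformer model is only worth it with WebGPU.
  if (needs.preferAccuracy && needs.hasWebGPU) return "transformers";

  // General-purpose fallback with the widest language coverage.
  return "tesseract";
}
```

A caller would feed in the user's language choice and the result of a WebGPU check, then load only the selected engine so the other models are never downloaded.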
- Modern Browser:
  - Basic Support (WASM/Workers): Chrome 92+, Firefox 79+, Safari 15.2+ (required for SharedArrayBuffer)
  - WebGPU Acceleration: Chrome 113+, Firefox 121+, Safari 17+
- Node.js: v18 or higher recommended
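Version numbers are only a rough guide, so checking for the features at runtime is usually more reliable. Below is a minimal sketch of such a check; `detectCapabilities`, `EnvLike`, and `Capabilities` are hypothetical names, and the function takes the global object as a parameter so it can be exercised outside a browser.

```typescript
// Hypothetical feature-detection helper; not the project's actual utility.
interface EnvLike {
  WebAssembly?: unknown;
  Worker?: unknown;
  SharedArrayBuffer?: unknown;
  navigator?: { gpu?: unknown };
}

interface Capabilities {
  wasm: boolean;
  workers: boolean;
  sharedMemory: boolean;
  webgpu: boolean;
}

function detectCapabilities(env: EnvLike): Capabilities {
  return {
    // `WebAssembly` is a namespace object, not a constructor.
    wasm: typeof env.WebAssembly === "object" && env.WebAssembly !== null,
    workers: typeof env.Worker === "function",
    sharedMemory: typeof env.SharedArrayBuffer === "function",
    // Presence of `navigator.gpu` only means the API is exposed; a later
    // `requestAdapter()` call can still resolve to null.
    webgpu: env.navigator !== undefined && env.navigator.gpu !== undefined,
  };
}
```

In a browser the function would be called with `globalThis`. Note that browsers only expose `SharedArrayBuffer` on cross-origin isolated pages, so `sharedMemory` can be `false` even on a supported version if the server does not send the required isolation headers.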
1. Clone the repository: `git clone https://github.com/your-repo/multi-engine-browser-ocr.git`, then `cd multi-engine-browser-ocr`
2. Install dependencies: `npm install`
3. Start the development server: `npm run dev`
Most engines download their models automatically from CDNs (Hugging Face or Tesseract CDN) on their first run and cache them locally.
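The download-once-then-cache behaviour can be sketched as a cache-first loader. This is an illustration under stated assumptions rather than the project's code: `ModelCache`, `Fetcher`, `loadModel`, and `MemoryCache` are hypothetical names, and the in-memory `Map` stands in for an IndexedDB-backed store.

```typescript
// Hypothetical cache abstraction; a real implementation would wrap IndexedDB.
interface ModelCache {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

// Hypothetical downloader; in a browser this would wrap `fetch`.
type Fetcher = (url: string) => Promise<Uint8Array>;

async function loadModel(
  url: string,
  cache: ModelCache,
  fetchBytes: Fetcher,
): Promise<Uint8Array> {
  // 1. Serve from the cache when the model was downloaded before.
  const cached = await cache.get(url);
  if (cached !== undefined) return cached;

  // 2. Otherwise download once and store the bytes for later runs.
  const bytes = await fetchBytes(url);
  await cache.set(url, bytes);
  return bytes;
}

// In-memory stand-in used here so the sketch runs outside a browser.
class MemoryCache implements ModelCache {
  private store = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.store.get(key);
  }

  async set(key: string, value: Uint8Array): Promise<void> {
    this.store.set(key, value);
  }
}
```

Keying the cache by URL keeps the sketch short; a versioned key would be one way to invalidate stale models, but that is a design choice this README does not specify.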
By default, eSearch-OCR fetches models from Hugging Face. If you need to use it offline or host models yourself:
- Download models from eSearch-OCR releases.
- Place `det.onnx`, `rec.onnx`, and `ppocr_keys_v1.txt` into `public/models/esearch/`.
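Because Vite serves the contents of `public/` from the site root, files placed in `public/models/esearch/` are reachable under `/models/esearch/`. The sketch below shows one way to build the three model URLs from a base path; `resolveEsearchModelUrls` and `EsearchModelUrls` are hypothetical names, and how the engine actually receives these URLs is an assumption.

```typescript
// Hypothetical helper: builds URLs for the three eSearch-OCR model files.
interface EsearchModelUrls {
  det: string;  // text detection model
  rec: string;  // text recognition model
  keys: string; // character dictionary
}

function resolveEsearchModelUrls(baseUrl: string): EsearchModelUrls {
  // Normalise the base so it always ends with exactly one separator.
  const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  return {
    det: base + "det.onnx",
    rec: base + "rec.onnx",
    keys: base + "ppocr_keys_v1.txt",
  };
}

// Self-hosted models copied into public/models/esearch/.
const localModels = resolveEsearchModelUrls("/models/esearch");
```

Passing a different base URL to the same helper would point the engine at another host without touching the file names.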
- `src/engines/`: Implementation of different OCR strategies.
- `src/utils/`: Image processing, feature detection, and model caching.
- `src/types/`: Shared TypeScript interfaces.
- `tests/`: Comprehensive test suite using Vitest.
- `docs/`: Technical specifications and decision logs.
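To illustrate how interchangeable OCR strategies might sit behind one shared interface, here is a minimal sketch. `OcrEngine`, `OcrResult`, and `EngineRegistry` are hypothetical names chosen for this example; the real interfaces in `src/types/` may look different.

```typescript
// Hypothetical shared types; the project's actual interfaces may differ.
interface OcrResult {
  text: string;
  confidence: number; // 0..1
}

interface OcrEngine {
  readonly id: string;
  load(): Promise<void>; // download or restore model files
  recognize(image: Uint8Array): Promise<OcrResult>;
}

class EngineRegistry {
  private engines = new Map<string, OcrEngine>();
  private loaded = new Set<string>();

  register(engine: OcrEngine): void {
    this.engines.set(engine.id, engine);
  }

  async run(id: string, image: Uint8Array): Promise<OcrResult> {
    const engine = this.engines.get(id);
    if (engine === undefined) {
      throw new Error(`Unknown engine: ${id}`);
    }
    // Load lazily so models for unused engines are never fetched.
    if (!this.loaded.has(id)) {
      await engine.load();
      this.loaded.add(id);
    }
    return engine.recognize(image);
  }
}
```

With this shape, adding an engine means implementing `OcrEngine` and registering it; the calling code only ever deals with the registry.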
- `npm run dev`: Start Vite development server.
- `npm run build`: Build for production.
- `npm test`: Run all tests once.
- `npm run test:watch`: Run tests in watch mode.
- `npm run lint`: Check for code style issues.
- `npm run format`: Automatically fix formatting.
This application is designed with security as a core principle:
- No Data Collection: Your images are processed entirely in the local browser context. No image data is sent to external servers or APIs; the only network requests are the model downloads.
- Offline Capability: Once the models are cached, the engines can function without an active internet connection.
- Open Source: The entire pipeline is transparent and verifiable.
Contributions are welcome! Please see CONTRIBUTING.md (if available) or simply open a Pull Request.
- Technical Specification: Deep dive into architecture and design.
- Decision Log: Rationale behind technical choices.