Feat/card capture #278
Open
BosuBose132 wants to merge 19 commits into
Combine frame preprocessing, ONNX session execution, and YOLO post-processing. Return a simple detected/confidence result while keeping model geometry internal. Validate missing model inputs and outputs with actionable errors, and add unit coverage for positive detections, empty results, and invalid model output.
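A minimal sketch of the orchestration this commit describes, assuming a structural `SessionLike` interface in place of the real `onnxruntime-web` session. The names `SessionLike`, `TensorLike`, `runCardInferenceSketch`, and `scoreOutput` are hypothetical and are not taken from the PR; the error wording and the default threshold are also assumptions.

```typescript
// Hypothetical minimal shapes; the PR's real types come from onnxruntime-web.
interface TensorLike {
  data: Float32Array;
  dims: readonly number[];
}

interface SessionLike {
  inputNames: readonly string[];
  outputNames: readonly string[];
  run(feeds: Record<string, TensorLike>): Promise<Record<string, TensorLike>>;
}

interface CardInferenceResult {
  detected: boolean;
  confidence: number;
}

// Runs one inference step and hides tensor/box geometry from the caller.
// `scoreOutput` stands in for the YOLO post-processing step and returns the
// best confidence found in the raw output (assumption for this sketch).
async function runCardInferenceSketch(
  session: SessionLike,
  input: TensorLike,
  scoreOutput: (output: TensorLike) => number,
  confidenceThreshold = 0.5, // assumed default, not from the PR
): Promise<CardInferenceResult> {
  const inputName = session.inputNames[0];
  if (!inputName) {
    throw new Error('Card model exposes no inputs; check the ONNX export.');
  }
  const outputName = session.outputNames[0];
  if (!outputName) {
    throw new Error('Card model exposes no outputs; check the ONNX export.');
  }

  const results = await session.run({ [inputName]: input });
  const output = results[outputName];
  if (!output) {
    throw new Error(`Card model did not return the output "${outputName}".`);
  }

  const confidence = scoreOutput(output);
  return { detected: confidence >= confidenceThreshold, confidence };
}
```

Callers only see `detected` and `confidence`, which matches the commit's goal of keeping model geometry internal.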
Load and reuse the browser ONNX session while sampling frames from the existing camera video reference. Prevent overlapping inference calls, require consecutive positive predictions, and tolerate a configurable number of temporary misses before resetting detection. Expose model loading, searching, detected, and error states with explicit start, stop, and reset controls. Add hook tests for model loading, stable detection, missed-frame resets, inference locking, session cleanup, and loading failures.
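The stable-detection rule in this commit can be sketched as a small pure state transition. This is an illustration only: `StabilityState`, `initialStability`, and `nextStability` are hypothetical names, and the choice to keep the positive streak across a tolerated miss is an assumption about the hook's behavior. The config field names mirror the options passed to `useCardDetection` elsewhere in this PR.

```typescript
interface StabilityConfig {
  stableDetectionsRequired: number; // positives needed before reporting a card
  allowedMisses: number; // misses tolerated before the state resets
}

interface StabilityState {
  positives: number;
  misses: number;
  isCardDetected: boolean;
}

const initialStability: StabilityState = {
  positives: 0,
  misses: 0,
  isCardDetected: false,
};

// Folds one inference result into the running state.
function nextStability(
  state: StabilityState,
  detected: boolean,
  config: StabilityConfig,
): StabilityState {
  if (detected) {
    const positives = state.positives + 1;
    return {
      positives,
      misses: 0, // any positive clears the miss counter
      isCardDetected:
        state.isCardDetected || positives >= config.stableDetectionsRequired,
    };
  }

  const misses = state.misses + 1;
  if (misses > config.allowedMisses) {
    return initialStability; // too many misses in a row: reset detection
  }
  // Assumption: a tolerated miss keeps the streak built so far.
  return { ...state, misses };
}
```

Keeping the transition pure makes the missed-frame reset easy to unit test without a camera or a model.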
Compose the existing MIE Modal, Button, Spinner, Alert, Text, icon, and camera primitives into a reusable ID-card capture experience. Require both document-quality readiness and stable semantic ID-card detection before starting the automatic capture countdown. Preserve manual capture as a fallback when the model is loading, unavailable, or uncertain. Support camera switching, permission and model errors, captured-image preview, retake, confirmation, modal cleanup, and File-based capture output.
…pture thresholds for minor handheld movement
…issions, model fallback, confirmation, and cleanup
…rnalize ONNX Runtime Web
…ction pipeline, testing, and maintenance constraints
… Storybook viewport
Pull request overview
Adds a new reusable CardCapture component to @mieweb/ui that performs in-browser ONNX-based ID-card detection (via onnxruntime-web) and integrates it with the existing camera + image-quality checks to enable stable automatic capture with a manual fallback.
Changes:
- Introduces `CardCapture` (component, hooks, inference pipeline, Storybook story) plus unit/component tests.
- Extends `useDocumentDetection` with a configurable `stabilityThreshold` (default preserved) and updates tests.
- Updates packaging/build config to ship `CardCapture` as a tree-shakeable subpath and keeps `onnxruntime-web` external + optional peer dependency; adds Storybook static assets mapping for ONNX runtime WASM files.
Reviewed changes
Copilot reviewed 21 out of 23 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tsup.config.ts | Adds components/CardCapture/index entry and marks onnxruntime-web as external. |
| src/components/DocumentScanner/useDocumentDetection.ts | Adds stabilityThreshold to config/defaults and uses it in stability comparison. |
| src/components/DocumentScanner/useDocumentDetection.test.ts | Adds a stability similarity test case for relaxed thresholds. |
| src/components/CardCapture/useCardDetection.ts | Implements polling-based detection loop, model lifecycle, and stable-detection logic. |
| src/components/CardCapture/useCardDetection.test.ts | Tests model loading, stable detections, misses handling, inference locking, and errors. |
| src/components/CardCapture/runCardInference.ts | Wires preprocessing + session run + postprocessing into a single inference step. |
| src/components/CardCapture/runCardInference.test.ts | Unit tests for inference orchestration and error cases. |
| src/components/CardCapture/preprocessCardFrame.ts | Implements letterbox resize + RGBA→NCHW float tensor conversion. |
| src/components/CardCapture/preprocessCardFrame.test.ts | Unit tests for letterboxing and tensor conversion behavior. |
| src/components/CardCapture/postprocessCardDetections.ts | Parses YOLO output, maps boxes to source coords, and applies NMS. |
| src/components/CardCapture/postprocessCardDetections.test.ts | Unit tests for parsing/layout support, IoU/NMS, and threshold validation. |
| src/components/CardCapture/MAINTAINERS.md | Adds maintainer-focused documentation for model/runtime requirements and tuning. |
| src/components/CardCapture/loadCardModel.ts | Loads ONNX model and configures ORT WASM asset path. |
| src/components/CardCapture/loadCardModel.test.ts | Unit tests for model URL validation/trim and WASM path handling. |
| src/components/CardCapture/index.ts | Exposes CardCapture and useCardDetection via subpath entry. |
| src/components/CardCapture/CardCapture.tsx | Implements modal UI, countdown auto-capture, preview/confirm/retake flow, and cleanup. |
| src/components/CardCapture/CardCapture.test.tsx | Component tests for open/start, manual capture, retake, permissions, auto-capture countdown, and cleanup. |
| src/components/CardCapture/CardCapture.stories.tsx | Adds live Storybook demo and documentation for model-based capture. |
| pnpm-lock.yaml | Locks onnxruntime-web and its transitive dependencies. |
| package.json | Adds onnxruntime-web as optional peer dependency and dev dependency. |
| eslint.config.js | Adds Canvas-related globals to avoid lint false-positives. |
| .storybook/main.ts | Serves onnxruntime-web/dist as static assets for local Storybook inference. |
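
Two steps named in the table, the RGBA→NCHW conversion and IoU-based NMS, can be sketched as follows. This is not the code from `preprocessCardFrame.ts` or `postprocessCardDetections.ts`: the function names, the `Box` shape, and the scaling of pixel values to `[0, 1]` are assumptions made for illustration.

```typescript
// Converts interleaved RGBA bytes (the layout of ImageData.data) into a
// planar [1, 3, height, width] float tensor. Alpha is dropped.
// Assumption: the model expects values scaled to [0, 1].
function rgbaToNchw(
  rgba: Uint8ClampedArray,
  width: number,
  height: number,
): Float32Array {
  const pixels = width * height;
  if (rgba.length !== pixels * 4) {
    throw new Error('RGBA buffer length does not match width * height * 4.');
  }
  const out = new Float32Array(3 * pixels);
  for (let i = 0; i < pixels; i++) {
    out[i] = rgba[i * 4] / 255; // R plane
    out[pixels + i] = rgba[i * 4 + 1] / 255; // G plane
    out[2 * pixels + i] = rgba[i * 4 + 2] / 255; // B plane
  }
  return out;
}

// Hypothetical box shape in source-image coordinates.
interface Box {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
  score: number;
}

// Intersection over union of two axis-aligned boxes.
function iou(a: Box, b: Box): number {
  const w = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const h = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = w * h;
  const union =
    (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
  return union > 0 ? inter / union : 0;
}

// Greedy non-maximum suppression: keep the highest-scoring box, then drop
// any remaining box that overlaps a kept box by more than the threshold.
function nms(boxes: Box[], iouThreshold: number): Box[] {
  const sorted = [...boxes].sort((a, b) => b.score - a.score);
  const kept: Box[] = [];
  for (const box of sorted) {
    if (kept.every((k) => iou(k, box) <= iouThreshold)) {
      kept.push(box);
    }
  }
  return kept;
}
```

For a single-class model such as the one in this PR, NMS across one class is enough; a multi-class model would typically run it per class.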
Files not reviewed (1)
- pnpm-lock.yaml: Generated file
Comments suppressed due to low confidence (1)
src/components/CardCapture/MAINTAINERS.md:265
- The file ends with extra triple-backtick fences (```), leaving unterminated/empty code blocks and breaking Markdown formatting.
…eusable canvas preprocessing, and consistent model references
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Comment on lines +204 to +209

    if (!enabled) {
      setStatus('idle');
      setIsModelReady(false);
      setError(null);
      return;
    }
Comment on lines +292 to +306

    const {
      isModelReady,
      isCardDetected,
      error: cardDetectionError,
      startDetection: startCardDetection,
      stopDetection: stopCardDetection,
      resetDetection: resetCardDetection,
    } = useCardDetection(videoRef, {
      modelUrl,
      wasmPaths,
      confidenceThreshold,
      detectionIntervalMs: 500,
      stableDetectionsRequired: 2,
      allowedMisses: 1,
    });
Comment on lines +132 to +137

    <div
      className={cn(
        'absolute inset-4 rounded-lg border-4 transition-colors duration-300',
        ready ? 'border-success' : 'border-neutral-100/50'
      )}
    >
Summary
This PR adds a reusable `CardCapture` component to `@mieweb/ui`.

The component runs an ONNX ID-card detection model directly in the browser and starts automatic capture only after an ID card is detected consistently and the existing image-quality checks pass.
Problem
The existing camera flow checks focus, brightness, and frame stability, but it cannot confirm that the object in front of the camera is actually an ID card.
This means other clear rectangular objects could pass the general image-quality checks without semantic card validation.
Solution
`CardCapture` combines:

The model is executed through `onnxruntime-web`, and camera frames remain inside the browser during detection.

Demo (a short video of the output): https://youtube.com/shorts/kOTIUUnLONU?feature=share
Implementation
The component:
- … `[1, 3, 640, 640]` tensor format
- … `File`

The visible interface is built using existing MIE UI components and utilities, including:
- `Modal`
- `Button`
- `Spinner`
- `Alert`
- `Text`
- `useCamera`
- `useDocumentDetection`

Features Added
Usage
Testing
Verified that:
- … `File`

Automated tests cover model loading, preprocessing, prediction processing, stable detection, capture flows, permission handling, model fallback, retake, confirmation, and cleanup.
Reviewer Instructions
Open:
Suggested checks:
Current Model Scope
The included MVP model supports a single `id_card` class.

It performs best on ID formats represented in its training data. Some additional formats, such as certain student IDs or employee badges, may require future model calibration or retraining.

The model can be updated independently without changing the public `CardCapture` API.

Breaking Changes
None.
The existing `DocumentScanner` behavior remains unchanged by default.

Validation