AR 3D object detection for iPhone with LiDAR. Detects objects with YOLO, lifts them to oriented 3D bounding boxes using BoxerNet (Meta Research), and displays the boxes in augmented reality.
    iPhone Camera + LiDAR
    │
    ├──► YOLO11n (2D detection) ──► top 3 bounding boxes
    │
    ├──► LiDAR depth ──► median depth per 16×16 patch
    │
    ├──► ARKit ──► camera pose + intrinsics + gravity
    │
    └──► BoxerNet (3D lifting) ──► oriented 3D bounding boxes
                                               │
                                     SceneKit AR rendering
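The "median depth per 16×16 patch" step in the diagram can be sketched in plain Python. This is an illustrative sketch, not the app's Swift code: the function name `patch_median_depth` is hypothetical, and it assumes the depth map has already been resampled to the 960×960 model input, so that 960 / 16 gives the 60×60 depth grid BoxerNet expects. Treating non-positive and NaN readings as missing is also an assumption.

```python
from statistics import median


def patch_median_depth(depth, patch=16):
    """Pool a 2D depth map (list of rows, meters) into per-patch medians.

    Height and width must be multiples of `patch`. Non-positive and NaN
    samples are treated as missing (an assumption) and skipped; a patch
    with no valid samples yields 0.0.
    """
    height, width = len(depth), len(depth[0])
    if height % patch or width % patch:
        raise ValueError("depth dimensions must be multiples of the patch size")

    pooled = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            # Collect the valid samples of this patch (NaN > 0.0 is False).
            tile = [
                depth[y][x]
                for y in range(top, top + patch)
                for x in range(left, left + patch)
                if depth[y][x] > 0.0
            ]
            row.append(median(tile) if tile else 0.0)
        pooled.append(row)
    return pooled


# A constant 960x960 map pools down to a 60x60 grid.
depth_map = [[2.0] * 960 for _ in range(960)]
grid = patch_median_depth(depth_map)
print(len(grid), len(grid[0]))  # 60 60
```

The median is less sensitive than the mean to the depth discontinuities that occur at object edges, which is presumably why it is used here.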
- YOLO11n detects objects in 2D (640×640, 80 COCO classes)
- BoxerNet lifts 2D boxes to 7-DoF 3D boxes (center, size, yaw) using DINOv3 visual features + LiDAR depth
- ARKit supplies camera poses, the gravity vector, and LiDAR depth
- SceneKit renders 3D wireframe boxes anchored in the real world
- iPhone 12 Pro or later (LiDAR required)
- iOS 16.0+
- ~450 MB storage for models
1. Clone

       git clone git@github.com:Barath19/Boxer3D.git
       cd Boxer3D

2. Download models from Hugging Face

       pip install huggingface_hub
       huggingface-cli download Barath/boxer3d --local-dir boxer/

   This places the following in the `boxer/` directory:
   - `BoxerNet.onnx` (~391 MB, float32), exported from the BoxerNet checkpoint
   - `yolo11n.onnx` (~10 MB, float32), exported from Ultralytics YOLO11n

3. Open in Xcode

       open boxer.xcodeproj

   Xcode will automatically resolve the ONNX Runtime SPM dependency.

4. Build & Run on your iPhone (Cmd+R)
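Before building, it can help to confirm that the download step left both model files in place. The following is a small optional check, not part of the repository: the helper name `missing_models` is hypothetical, and the only facts it relies on are the two file names and the `boxer/` directory given in the steps above.

```python
from pathlib import Path

# File names as listed in the download step.
EXPECTED_MODELS = ("BoxerNet.onnx", "yolo11n.onnx")


def missing_models(model_dir="boxer"):
    """Return the expected model files that are absent or empty in model_dir."""
    root = Path(model_dir)
    return [
        name
        for name in EXPECTED_MODELS
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All model files found.")
```

Run it from the repository root so that the relative `boxer` path resolves to the download directory.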
| Model | Size | Input | Output |
|---|---|---|---|
| yolo11n | 10 MB | (1, 3, 640, 640) RGB | (1, 84, 8400) boxes + classes |
| BoxerNet | 391 MB | (1, 3, 960, 960) RGB + (1, 1, 60, 60) depth + (1, M, 4) boxes + (1, 3600, 6) rays | (M, 3) center, (M, 3) size, (M,) yaw, (M,) confidence |
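To make the yolo11n output shape in the table concrete, here is a hedged Python sketch of decoding a `(84, 8400)` tensor (batch dimension removed) into the top 3 boxes. In the standard Ultralytics export, the 84 channels are 4 box values (center x, center y, width, height in input pixels) followed by 80 class scores. The function names, the thresholds, and the use of greedy non-maximum suppression are illustrative assumptions; the app's actual Swift post-processing may differ.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0.0 else 0.0


def top_detections(output, k=3, score_threshold=0.25, iou_threshold=0.45):
    """Decode a channel-major YOLO output into at most k detections.

    output: 84 rows (4 box values + 80 class scores), one column per candidate.
    Thresholds are illustrative defaults, not values taken from the app.
    Returns a list of (score, class_id, (x1, y1, x2, y2)), best first.
    """
    candidates = []
    for i in range(len(output[0])):
        scores = [row[i] for row in output[4:]]
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        if score < score_threshold:
            continue
        cx, cy, w, h = (output[c][i] for c in range(4))
        box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        candidates.append((score, class_id, box))

    # Greedy NMS: keep the best box, drop later boxes that overlap a kept one.
    candidates.sort(key=lambda det: det[0], reverse=True)
    kept = []
    for det in candidates:
        if all(iou(det[2], other[2]) < iou_threshold for other in kept):
            kept.append(det)
            if len(kept) == k:
                break
    return kept
```

Without the suppression step, the three highest-scoring candidates would often be near-duplicates of the same object, since neighboring anchors tend to fire together.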
Both models run with the ONNX Runtime CoreML Execution Provider for GPU (Metal) and Neural Engine acceleration.
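The BoxerNet outputs in the table (center, size, yaw) describe a 7-DoF box, and a wireframe needs its 8 corners and 12 edges. The sketch below shows one way to expand them in Python. It assumes yaw is a rotation about the gravity-aligned Y axis (ARKit's world Y axis points up) and that size is ordered as extents along x, y, z before rotation; BoxerNet's actual axis convention is not stated here and may differ. The names `box_corners` and `EDGES` are hypothetical.

```python
import math


def box_corners(center, size, yaw):
    """Return the 8 corners of a yaw-rotated box as (x, y, z) tuples.

    Assumes yaw rotates about the Y (up) axis and size = (x, y, z) extents.
    Corner index = 4*a + 2*b + c, where a, b, c select the -/+ half-extent
    along x, y, z respectively.
    """
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)

    corners = []
    for dx in (-hx, hx):
        for dy in (-hy, hy):
            for dz in (-hz, hz):
                # Right-handed rotation about Y applied to the local offset.
                rx = dx * cos_yaw + dz * sin_yaw
                rz = -dx * sin_yaw + dz * cos_yaw
                corners.append((cx + rx, cy + dy, cz + rz))
    return corners


# The 12 wireframe edges join corners whose indices differ in exactly one bit.
EDGES = [(i, i | bit) for i in range(8) for bit in (1, 2, 4) if not i & bit]
```

Each pair in `EDGES` indexes into the list returned by `box_corners`, so a renderer can draw one line segment per pair.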
- ONNX Runtime v1.24.2 (via SPM)
- ARKit, SceneKit, SwiftUI (built-in)
- Port BoxerNet to Swift
- Convert BoxerNet.pt to ONNX
- Upload ONNX weights for download
- Optimize for portrait mode
Based on Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D by Daniel DeTone, Tianwei Shen, Fan Zhang, Lingni Ma, Julian Straub, Richard Newcombe, and Jakob Engel (Meta Reality Labs Research).
@article{boxer2026,
title={Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D},
author={Daniel DeTone and Tianwei Shen and Fan Zhang and Lingni Ma
and Julian Straub and Richard Newcombe and Jakob Engel},
year={2026},
}