🔍 For the Chinese version, see README_zh.md
This is a small project I built back in my sophomore year: you take a photo of your meal, run detection/segmentation/depth estimation, and it returns a rough estimate of food weight (grams) and calories. It’s not a “one-click 100% accurate” product, but it’s a complete end-to-end pipeline that’s handy for learning, demos, and further tweaking.
If you find it useful, feel free to star the repo, fork it, and hack around. Issues/PRs are welcome.
- YOLOv8: detects tableware / containers / food targets
- SAM: segments containers and food regions (and produces overlays for visual inspection)
- MiDaS: estimates relative depth (mainly used to infer “height”)
- Classifier + calorie table: maps recognized dishes to an approximate kcal/g
- Output: weight/calories (point + range), confidence, warnings, and debug image paths
Use a “common reference object” as a ruler (by default: a bank card). First segment the container and food to get a reasonable footprint area; then use a depth model to estimate relative height and approximate volume; finally convert volume to weight using a dish-specific density and a calorie table.
In practice, the biggest sources of error are usually not the model itself, but: whether the mask includes bowl/background, whether the reference object and food are on the same plane, and whether the photo is blurry/reflective/dim. That’s why this project also saves debug overlays—so you can quickly see where the bias comes from.
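To make the last step concrete, here is a minimal sketch of the volume → weight → calories conversion (all names and constants below are illustrative placeholders, not the project's actual code, which applies more corrections):

```python
# Illustrative sketch of the volume -> weight -> calories conversion.
# The real pipeline derives footprint/height from SAM masks and MiDaS depth
# and applies additional corrections; these constants are placeholders.

def estimate_calories(footprint_cm2: float, height_cm: float,
                      density_g_per_cm3: float, kcal_per_g: float,
                      fill_factor: float = 0.6):
    """Convert a rough footprint + height into weight (g) and calories (kcal)."""
    # Food rarely fills its full bounding volume, so apply a fill factor.
    volume_cm3 = footprint_cm2 * height_cm * fill_factor
    weight_g = volume_cm3 * density_g_per_cm3
    calories = weight_g * kcal_per_g
    return weight_g, calories

# Example: ~80 cm^2 footprint, ~3 cm tall fried rice at ~0.8 g/cm^3 and ~1.9 kcal/g
weight, kcal = estimate_calories(80.0, 3.0, 0.8, 1.9)
print(f"~{weight:.0f} g, ~{kcal:.0f} kcal")
```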
This repo contains two FastAPI apps (one full pipeline and one lightweight demo):
- Full weight/calories analysis service (recommended)
  - Entrypoint: `model_training/caloriscan_api.py`
  - Port: `8000`
  - Endpoint: `POST /analyze`
  - Notes: returns weight/calories + saves debug images to `model_training/outputs/`
- Lightweight segmentation demo (segmentation only)
  - Entrypoint: `main.py`
  - Port: `8001`
  - Endpoint: `POST /estimate`
  - Notes: a minimal service to quickly validate segmentation/recognition flow

`model_training/main.py` is an older offline inference entry (using `predict.py`), mainly kept for my own past debugging. It's not recommended as the public API entry.
- Python: 3.9 / 3.10 recommended
- GPU: NVIDIA GPU is much faster; CPU-only also works (but slower)
- First run needs internet: MiDaS weights are downloaded from Hugging Face and then cached
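For reference, a DPT/MiDaS-style depth model is typically pulled from Hugging Face like this (a minimal sketch using the `transformers` pipeline; the model id and the project's actual loading code may differ):

```python
# Minimal sketch: relative depth estimation with a DPT model from Hugging Face.
# The exact model id and loading code used by this project may differ.
from transformers import pipeline
from PIL import Image

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")  # downloads & caches weights
result = depth_estimator(Image.open("your_food.jpg"))
depth_map = result["predicted_depth"]  # relative depth, not metric
print(depth_map.shape)
```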
Windows PowerShell:

```powershell
python -m venv venv
.\venv\Scripts\activate
python -m pip install -U pip
pip install -r requirements.txt
```

If you hit issues installing `torch==...+cuXXX` (platform / GPU / driver differences can be painful), a practical approach is:

- Install `torch` / `torchvision` / `torchaudio` using the official PyTorch instructions for your machine (CPU or CUDA build); see the example below
- Install the rest (and if needed, remove/adjust the torch-related lines in `requirements.txt`)
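For example, the CPU-only build installs like this (check https://pytorch.org/get-started/locally/ for the exact command matching your OS, driver, and CUDA version):

```powershell
# CPU-only PyTorch wheels from the official index; pick a CUDA index instead if you have a GPU
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
```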
To keep the repository small, some large model weights are not committed. Place them as follows:
| Model | Purpose | File you need | Where to put it (relative to repo root) |
|---|---|---|---|
| SAM ViT-B | Segmentation | sam_vit_b_01ec64.pth | model_training/models/sam_vit_b_01ec64.pth |
| YOLOv8n | Detection | yolov8n.pt | model_training/yolov8n.pt |
| Cuisine classifier (optional) | Classification | cuisine_classifier_full.pt | model_training/cuisine_classifier_full.pt |
| MiDaS (DPT) | Depth | Auto-download | No manual placement |
Notes:
- If `model_training/models/` doesn't exist, create it.
- If your downloaded filenames differ, rename them to match the table to avoid editing code.
- If `yolov8n.pt` is missing, the first run will download it automatically.
- The cuisine classifier weights are not published by default. If you need them, please contact the author first (see the contact info in "Commercial Use / Commercial License"). If it's missing, the API still runs but returns `unknown` and uses a default calories factor.
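As a quick sanity check that the files landed in the right place, the checkpoints can be loaded roughly like this (a minimal sketch; the service itself wires these up inside `caloriscan_api.py` and may do it differently):

```python
# Minimal sketch: verify the local checkpoints load from the expected paths.
# Path layout follows the table above.
from ultralytics import YOLO
from segment_anything import sam_model_registry

yolo = YOLO("model_training/yolov8n.pt")  # auto-downloads if missing
sam = sam_model_registry["vit_b"](
    checkpoint="model_training/models/sam_vit_b_01ec64.pth"  # must exist locally
)
print("YOLOv8 and SAM checkpoints loaded")
```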
If you don’t want to use my trained classifier, you can train your own and drop the outputs back into model_training/.
- Prepare your dataset in `ImageFolder` format (one folder per class): `model_training/my_dataset/<class_name>/xxx.jpg`
- Run training from the `model_training/` directory:

```
python train_classifier.py
```

Outputs:

- `model_training/cuisine_classifier_full.pt`
- `model_training/cuisine_classifier.pth`
- `model_training/classes.txt`
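If you prefer to script it yourself, the dataset layout above maps directly onto torchvision's `ImageFolder` (a minimal sketch, not the actual `train_classifier.py`; the ResNet-18 backbone here is just an assumption for illustration):

```python
# Minimal sketch of loading the ImageFolder layout; train_classifier.py does more
# (augmentation, validation split, saving classes.txt, etc.).
import torch
from torchvision import datasets, transforms, models

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("model_training/my_dataset", transform=tfm)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, len(dataset.classes))
print(dataset.classes)  # class names come from the folder names
```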
Full analysis service, from the repository root:

```
python model_training\caloriscan_api.py
```

Swagger UI: http://localhost:8000/docs

Lightweight demo, from the repository root:

```
python main.py
```

Swagger UI: http://localhost:8001/docs
`POST /analyze` (multipart form):

- `file`: image file (jpg/png)

curl example:

```
curl -X POST "http://localhost:8000/analyze" ^
  -H "accept: application/json" ^
  -H "Content-Type: multipart/form-data" ^
  -F "file=@your_food.jpg"
```

The response is a list; each item corresponds to one detected dish/container:

- `name`: dish name (Chinese label)
- `weight` / `weight_low` / `weight_high`: weight estimate (g) and range
- `calories` / `calories_low` / `calories_high`: calories estimate and range
- `confidence`: `high` / `medium` / `low`
- `warnings`: quality/model/scale/depth warnings (for diagnosing errors)
- `card`: reference object detection info (not always used for scaling)
- `model_scores`: debug numeric signals (yolo/cls/sam scores, depth stats, area ratio, scaling method, etc.)
- `debug_files`: paths to debug images (overlay/mask)
- `image`: base64 overlay image (handy for frontends)
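The same request from Python, if you prefer scripting over curl (a minimal client sketch assuming the full service is running locally on port 8000):

```python
# Minimal client sketch: post an image to /analyze and print the key fields.
import requests

with open("your_food.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/analyze",
        files={"file": ("your_food.jpg", f, "image/jpeg")},
    )
resp.raise_for_status()

for item in resp.json():  # one entry per detected dish/container
    print(item["name"], item["weight"], "g,", item["calories"], "kcal,", item["confidence"])
```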
Example (real output contains more fields):
```json
[
  {
    "name": "炒饭",
    "weight": 132,
    "calories": 250,
    "confidence": "low",
    "warnings": ["blurry", "card_low_confidence"],
    "debug_files": {
      "container": { "overlay": "...", "mask": "..." },
      "food": { "overlay": "...", "mask": "..." }
    }
  }
]
```

Each /analyze call saves debug images into `model_training/outputs/`.

Typical files:

- `..._food.jpg` / `..._food_mask.png`
- `..._container.jpg` / `..._container_mask.png`
Overlay convention:
- Blue: container/bowl mask
- Green: food mask
- Red: reference object (bank card) outline (if 4 corners are available, it draws a quadrilateral)
- Use good lighting and avoid blur (`blurry` significantly increases uncertainty)
- Keep the scene simple (one main food target, or clear separation between targets)
- If you want to use a bank card / reference object for scaling (see the sketch after this list):
  - Place it on the same plane as the food (same tabletop) for best results
  - A perfect top-down shot is not required; perspective/tilt is supported via 4-corner estimation
  - If the card and food are clearly not on the same plane, it triggers `card_plane_mismatch` and falls back to a default scale (better than a wrong calibration)
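For intuition, the card-based scale boils down to something like this (a minimal sketch assuming a standard ID-1 card, 85.60 × 53.98 mm; the actual calibration, including the 4-corner perspective handling and the fallback logic, lives in the API code):

```python
# Minimal sketch: derive a cm-per-pixel scale from a detected bank card.
# Assumes the card lies flat on the same plane as the food.
import numpy as np

CARD_LONG_CM = 8.56    # ID-1 card long edge
CARD_SHORT_CM = 5.398  # ID-1 card short edge

def cm_per_pixel_from_card(corners: np.ndarray) -> float:
    """corners: (4, 2) pixel coordinates of the card, in order around the quadrilateral."""
    # Side lengths between consecutive corners; opposite sides are averaged.
    sides = np.linalg.norm(np.roll(corners, -1, axis=0) - corners, axis=1)
    pair_a = (sides[0] + sides[2]) / 2
    pair_b = (sides[1] + sides[3]) / 2
    long_px, short_px = max(pair_a, pair_b), min(pair_a, pair_b)
    # Average the two independent estimates to dampen perspective error a bit.
    return (CARD_LONG_CM / long_px + CARD_SHORT_CM / short_px) / 2
```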
- Can't open `/docs` / connection refused
  - Usually the server hasn't fully started, the port is in use, or you opened the wrong port (8000 vs 8001)
- Missing `sam_vit_b_01ec64.pth`
  - Make sure it's under `model_training/models/` and the filename matches exactly
- First run takes a long time
  - MiDaS downloads weights from Hugging Face; it caches them for future runs
- Weight looks too high/too low and you want to locate the cause
  - Check whether the food/container masks in `model_training/outputs/` look reasonable
  - Then check `food_container_ratio`, `depth_height`, `cm_per_pixel_method`, and `warnings` in the API response (see the sketch below)
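To eyeball that ratio yourself from the saved masks, something like this works (a minimal sketch; replace the `xxx` placeholder with the actual output prefix from your run):

```python
# Minimal sketch: recompute the food/container area ratio from saved debug masks.
import cv2

food_mask = cv2.imread("model_training/outputs/xxx_food_mask.png", cv2.IMREAD_GRAYSCALE)
container_mask = cv2.imread("model_training/outputs/xxx_container_mask.png", cv2.IMREAD_GRAYSCALE)
assert food_mask is not None and container_mask is not None, "check the file paths"

food_px = int((food_mask > 0).sum())
container_px = int((container_mask > 0).sum())
print("food_container_ratio ≈", food_px / max(container_px, 1))
```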
Debug images are generated after you call the API and saved to model_training/outputs/ (this folder is not committed by default). Typical files look like:
- `..._food.jpg` / `..._food_mask.png`
- `..._container.jpg` / `..._container_mask.png`
- Want higher accuracy? Data is the most effective path (even a small set of self-collected photos with real weights helps). Contributions are welcome.
- Found a bug? Please open an issue, and ideally attach the overlay + mask images from `outputs/`.
- Want to add features? Fork it and go wild; PRs are also welcome.
This project is released under AGPL-3.0. If you want to use it commercially but integrate/deploy it as closed-source (or you cannot comply with AGPL obligations), please contact the author first for a separate commercial license.
- Email: 1374552774@qq.com
- WeChat: Akuri2133
AGPL-3.0. See LICENSE.