
Hi, authors and community members,
I recently tried running the official evaluation script provided in this project. The command I used is:
python train_net.py --eval_only --resume --num-gpus $n \
    --config-file configs/semantic_sam_only_sa-1b_swinL.yaml \
    COCO.TEST.BATCH_SIZE_TOTAL=$n \
    MODEL.WEIGHTS=/path/to/weights
The evaluation completed successfully, and I obtained the following metrics (see also the attached screenshot):
'noc@0.5': 8.81
'noc@0.8': 13.51
'noc@0.9': 16.86
'miou@iter1': 0.5508
....
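For reference, my rough understanding (which may well be wrong) is that 'noc@X' is the average number of simulated clicks needed before the predicted mask reaches an IoU of X, and 'miou@iter1' is the mean IoU after the first click. Below is a minimal Python sketch of how I imagine the per-instance NoC value is computed, assuming the usual 20-click budget from the interactive-segmentation literature; I have not verified this against the code in this repository:

def noc_at_threshold(per_click_ious, threshold, max_clicks=20):
    # per_click_ious: IoU of the predicted mask after click 1, 2, ... for one instance.
    # Returns the first click count at which the IoU reaches the threshold,
    # or the full click budget if the threshold is never reached.
    for num_clicks, iou in enumerate(per_click_ious[:max_clicks], start=1):
        if iou >= threshold:
            return num_clicks
    return max_clicks

# My guess: the dataset-level 'noc@0.9' would be the mean of
# noc_at_threshold(..., 0.9) over all evaluated instances, and
# 'miou@iter1' the mean of per_click_ious[0].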
However, as I am relatively new to the segmentation field, I am not fully sure how these metrics correspond to the results or tables reported in the paper.
Could anyone kindly help me with:
- A brief explanation of what these metrics mean in practice
- How they align with the numbers reported in the paper (e.g., which specific tables or benchmarks)
- Whether there is an officially recommended evaluation setup to exactly reproduce the reported results
Any suggestions or clarifications would be greatly appreciated. Thanks a lot in advance!