Add Automated QDQ placement example - Part 4.1 #841
willg-nv wants to merge 3 commits into NVIDIA:main
Conversation
Signed-off-by: Will Guo <willg@nvidia.com>
📝 Walkthrough
Adds comprehensive documentation and a batch-size modification utility for the QDQ placement optimization example, along with a minor logging format enhancement. These changes introduce resources for users to set up and run QDQ quantization workflows on ONNX models.
Actionable comments posted: 2
In `@examples/qdq_placement/README.md`:
- Around lines 191-192: Fix the typo in the README sentence about TensorRT remote autotuning: change "autouning" to "autotuning" so the sentence reads clearly, e.g., "TensorRT 10.16 supports remote autotuning; pass remoteAutoTuningConfig to trtexec to benchmark with remote autotuning."
- Around lines 128-129: The downloaded filename in the curl command is misleading: the URL fetches resnet101-v2-7.onnx, but the saved name and subsequent commands use resnet101_Opset17.onnx. Update the saved filename and downstream usage to a consistent, accurate name (e.g., use resnet101-v2-7.onnx in the curl -o flag and in the python3 set_batch_size.py command), or add a one-line clarifying comment above the commands explaining that resnet101_Opset17.onnx is an alias for resnet101-v2-7.onnx so readers know which model variant is being used.
🧹 Nitpick comments (3)
examples/qdq_placement/set_batch_size.py (3)
46-48: Consider validating that the model has inputs.

If the model has no graph inputs, accessing graph.input[0] will raise an IndexError. While unlikely for typical models, adding a guard improves robustness.

🛡️ Proposed defensive check

```diff
 # Get the input tensor
 graph = model.graph
+if not graph.input:
+    raise ValueError(f"Model {model_path} has no graph inputs")
 input_tensor = graph.input[0]
```
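A standalone sketch of that guard pattern (using `SimpleNamespace` as a lightweight stand-in for the ONNX graph object, since the check itself only touches the inputs list; `first_input` is a hypothetical helper name):

```python
from types import SimpleNamespace

def first_input(graph, model_path="model.onnx"):
    """Return the first graph input, failing loudly when none exist."""
    if not graph.input:
        raise ValueError(f"Model {model_path} has no graph inputs")
    return graph.input[0]

# A graph with one input works as before.
g = SimpleNamespace(input=["input_tensor"])
print(first_input(g))  # → input_tensor

# An input-less graph now raises a clear error instead of an IndexError.
try:
    first_input(SimpleNamespace(input=[]), "empty.onnx")
except ValueError as e:
    print(e)  # → Model empty.onnx has no graph inputs
```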
60-64: Output batch dimension assumption may not hold for all models.

This code assumes the first dimension of every output is the batch dimension. While true for ResNet50 and most classification models, some models may have scalar outputs or outputs where batch isn't the first dimension. Consider adding a note in the docstring about this assumption, or making output modification opt-in.
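One way to make the assumption explicit is to centralize the rewrite in a small helper that documents it and skips scalars (a sketch; `set_batch_dim` is a hypothetical name operating on plain dimension lists rather than ONNX tensor protos):

```python
def set_batch_dim(dims, batch_size):
    """Return dims with the first dimension replaced by batch_size.

    Assumes the first dimension is the batch dimension, which holds for
    typical classification models. Scalar tensors (empty dims) are
    returned unchanged, since they have no batch dimension to rewrite.
    """
    if not dims:
        return dims
    return [batch_size] + list(dims[1:])

print(set_batch_dim(["N", 3, 224, 224], 8))  # → [8, 3, 224, 224]
print(set_batch_dim([], 8))                  # → [] (scalar left untouched)
```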
78-84: Use the repository's utility functions for saving and checking the model to handle large files consistently.

The codebase provides save_onnx() and check_model() utilities in modelopt/onnx/utils.py that handle models larger than 2GB by using external data. Replace the standard onnx.save() (line 80) and onnx.checker.check_model() (line 84) with calls to modelopt.onnx.utils.save_onnx() and modelopt.onnx.utils.check_model(). While ResNet50 won't encounter this limitation, using the existing utilities ensures consistency across the codebase and prevents issues when the script is applied to larger models.
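The suggested swap might look roughly like this (a sketch only; the call sites and line numbers follow the comment above, and the exact signatures of the modelopt/onnx/utils.py helpers are assumed to mirror their onnx counterparts):

```diff
 import onnx
+from modelopt.onnx.utils import check_model, save_onnx

-    onnx.save(model, output_path)
+    save_onnx(model, output_path)  # handles >2GB models via external data

-    onnx.checker.check_model(model)
+    check_model(model)
```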
What does this PR do?

Type of change: ?

This change implements a simple example to illustrate the usage of the Automated QDQ placement tool.

Overview: ?
Usage
Testing
Before your PR is "Ready for review"
Additional Information