Integrate Automated QDQ placement tool - part 4.3 #843
willg-nv wants to merge 1 commit into NVIDIA:main
Conversation
📝 Walkthrough

New documentation file added detailing automated Q/DQ placement optimization for ONNX models, including quick start guides, configuration options, advanced usage patterns, CLI usage, troubleshooting, and API references.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Signed-off-by: Will Guo <willg@nvidia.com>
76c395b to 916687c
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/source/guides/9_qdq_placement.rst`:
- Around line 279-281: The conditional checking
autotuner.current_profile_pattern_schemes being None is counter-intuitive as
written; update the example around the if statement that references
autotuner.current_profile_pattern_schemes to either (a) add a one-line
clarifying comment explaining the API semantics (e.g., that the API returns None
when a region is already profiled/complete) or (b) change the condition if it
was reversed after confirming the API behavior; ensure the comment references
autotuner.current_profile_pattern_schemes and the subsequent print(" Already
profiled, skipping") / continue so readers understand why None means "already
profiled."
- Line 516: Replace the inconsistent pattern-cache filename reference
`./bert_base_run/pattern_cache.yaml` with the correct
`autotuner_state_pattern_cache.yaml` in the `--pattern-cache` example so the
flag matches the output structure; update the single occurrence of
`--pattern-cache ./bert_base_run/pattern_cache.yaml` to `--pattern-cache
./bert_base_run/autotuner_state_pattern_cache.yaml`.
🧹 Nitpick comments (1)
docs/source/guides/9_qdq_placement.rst (1)
457-457: Clarify scheme selection guidance.

The guidance states "For models with many small regions, start with fewer schemes. For models with many big regions, start with more schemes." The rationale behind this recommendation isn't immediately clear and could benefit from explanation—specifically, whether "big regions" refers to region size (number of nodes) or computational complexity.
💡 Suggested clarification
-For models with many small regions, start with fewer schemes. For models with many big regions, start with more schemes.
+The optimal scheme count depends on your model's structure:
+
+* **Many small regions** (e.g., 100+ patterns): Use fewer schemes (20-30) per region to keep total optimization time reasonable, as you'll be testing many unique patterns.
+* **Few large regions** (e.g., <20 patterns): Use more schemes (50-100) per region to thoroughly explore each pattern's optimization space.
if autotuner.current_profile_pattern_schemes is None:
    print(" Already profiled, skipping")
    continue
Potentially confusing logic in code example.
The code checks if autotuner.current_profile_pattern_schemes is None: and then prints "Already profiled, skipping". This logic appears counter-intuitive—typically, checking if something is None suggests it doesn't exist or needs initialization, not that it's already complete. Without API documentation context, this example may confuse users about when regions are skipped.
Consider adding a clarifying comment explaining the API behavior, or verify this logic is correct:
# Check if already profiled (API returns None when region is complete)
if autotuner.current_profile_pattern_schemes is None:
print(" Already profiled, skipping")
    continue
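To make the semantics concrete, here is a tiny, self-contained illustration of the "None means already profiled" behavior that the suggested comment documents. The FakeAutotuner class below is a stand-in written only for this sketch, not the real modelopt API, so the actual behavior still needs to be confirmed against the guide.

# Stand-in class (NOT the real modelopt API) illustrating the skip-when-None pattern.
class FakeAutotuner:
    def __init__(self, already_profiled: bool):
        self._already_profiled = already_profiled

    @property
    def current_profile_pattern_schemes(self):
        # Mirrors the behavior as this review understands it: None once the current
        # region's pattern has already been profiled, otherwise the remaining schemes.
        if self._already_profiled:
            return None
        return ["scheme_a", "scheme_b"]

for autotuner in (FakeAutotuner(already_profiled=True), FakeAutotuner(already_profiled=False)):
    if autotuner.current_profile_pattern_schemes is None:
        print(" Already profiled, skipping")
        continue
    print(" Profiling schemes:", autotuner.current_profile_pattern_schemes)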
python -m modelopt.onnx.quantization.autotune \
    --model bert_large.onnx \
    --output ./bert_large_run \
    --pattern-cache ./bert_base_run/pattern_cache.yaml
Inconsistent filename reference.
Line 516 references ./bert_base_run/pattern_cache.yaml, but based on the output structure shown in lines 60 and 150, the pattern cache filename should be autotuner_state_pattern_cache.yaml.
📝 Proposed fix
- --pattern-cache ./bert_base_run/pattern_cache.yaml
+ --pattern-cache ./bert_base_run/autotuner_state_pattern_cache.yaml
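With that fix applied, the full invocation from the hunk above would read as follows (same example paths as the diff; only the cache filename changes):

python -m modelopt.onnx.quantization.autotune \
    --model bert_large.onnx \
    --output ./bert_large_run \
    --pattern-cache ./bert_base_run/autotuner_state_pattern_cache.yaml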
What does this PR do?
This PR uploads the user guide for the Automated QDQ placement tool. The tool automatically searches for QDQ insertion points that yield better performance.
Overview: ?
Usage
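A sketch of what the usage snippet could look like, based on the CLI shown in the review diff above; the module path and flags (--model, --output, --pattern-cache) and the cache filename are taken from that diff and the review comments, while the model name and output directories here are placeholders:

# Optimize Q/DQ placement for an ONNX model (paths are illustrative)
python -m modelopt.onnx.quantization.autotune \
    --model my_model.onnx \
    --output ./my_model_run

# Reuse the pattern cache from an earlier run on a similar model
python -m modelopt.onnx.quantization.autotune \
    --model my_model.onnx \
    --output ./my_model_run_v2 \
    --pattern-cache ./my_model_run/autotuner_state_pattern_cache.yaml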
Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit