
[5868890][ONNX][Autocast] Fix: failure when checking input shape with unknown dimension #859

Merged
gcunhase merged 4 commits into NVIDIA:main from gcunhase:dev/gcunhasergio/5868890_autocast_input_shape_dyn
Feb 9, 2026

@gcunhase gcunhase commented Feb 5, 2026

What does this PR do?

Type of change: Bug fix

Overview: Skip unknown dimensions when comparing input shape in model vs calibration data.

Usage

$ python -m modelopt.onnx.autocast --onnx_path=$MODEL_NAME.onnx --calibration_data=calib_data_10.npz

Testing

See bug 5868890.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

Summary by CodeRabbit

Bug Fixes

  • Enhanced input shape validation to properly handle dynamic tensor dimensions, allowing more flexible dimension checking while maintaining validation accuracy.

Additional info

Regression introduced in #652.

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase requested a review from a team as a code owner February 5, 2026 19:53
@gcunhase gcunhase requested a review from ajrasane February 5, 2026 19:53
coderabbitai bot commented Feb 5, 2026

Walkthrough

The change refines input shape validation in the reference runner to properly handle dynamic dimensions (represented as -1) by only comparing known dimensions between model and data shapes, allowing flexible dimension matching where either side is undefined.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Input Validation Refinement: modelopt/onnx/autocast/referencerunner.py | Modified _validate_inputs to handle unknown dimensions (-1) by converting shapes to NumPy arrays, computing a mask that ignores -1 values, and comparing only known dimensions before raising validation errors. |
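
The masked comparison described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: the function name `known_dims_match` is hypothetical, and the sketch assumes both shapes already have the same rank.

```python
import numpy as np


def known_dims_match(model_shape, data_shape):
    """Return True when every dimension known on both sides agrees.

    A value of -1 marks an unknown (dynamic) dimension and is skipped.
    Both shapes are assumed to have the same rank (hypothetical helper).
    """
    model = np.array(model_shape)
    data = np.array(data_shape)
    # Keep only positions where neither side is unknown.
    mask = (model != -1) & (data != -1)
    return bool(np.array_equal(model[mask], data[mask]))


# Dynamic batch dimension in the model is skipped, so the shapes are compatible.
print(known_dims_match([-1, 3, 224, 224], [10, 3, 224, 224]))
# A known dimension (height) differs, so the shapes are not compatible.
print(known_dims_match([1, 3, 224, 224], [1, 3, 256, 224]))
```

Masking on both sides means a -1 in either the model shape or the calibration data shape is treated as a wildcard, which matches the "either side is undefined" wording of the walkthrough.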

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: handling unknown dimensions in input shape validation for ONNX Autocast. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@gcunhase gcunhase requested a review from galagam February 5, 2026 19:54
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@modelopt/onnx/autocast/referencerunner.py`:
- Around line 92-97: The rank comparison currently builds a mask between
inp_shape_model and inp_shape_data which will raise a NumPy broadcasting error
if their lengths differ; before creating mask, explicitly check
len(inp_shape_model) == len(inp_shape_data) and if not raise the same ValueError
you would for a mismatch so callers see the intended user-facing error; update
the logic around the variables inp_shape_model, inp_shape_data, and mask in
referencerunner.py to perform this length check and early raise before any
elementwise comparison.
- Around line 92-96: The input_shapes extraction currently ignores ONNX
dim_param and leaves dim_value==0 for dynamic dims, so update the handling so
dynamic dimensions are treated as unknown: either (preferred) normalize any
dimension with a non-empty dim_param to -1 when building self.input_shapes (so
self.input_shapes[...] contains -1 for dynamic dims), or modify the mask in
referencerunner.py where inp_shape_model = np.array(self.input_shapes[inp_name]) is
used to also treat entries with dim_value==0 as unknown when the original ONNX
dim_param was set; adjust the mask logic that computes mask = (inp_shape_model
!= -1) & (inp_shape_data != -1) to consider dim_value==0 (and/or stored
dim_param markers) as unknown so validation skips those dimensions correctly.
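
The two review suggestions above (an explicit rank check before any elementwise comparison, and normalizing symbolic dimensions to -1) could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's implementation: `normalize_dims` and `check_input_shape` are hypothetical names, the error messages are invented, and each ONNX dimension is modeled as a plain `(dim_value, dim_param)` tuple so the example runs without the `onnx` package.

```python
def normalize_dims(dims):
    """Map (dim_value, dim_param) pairs to ints, using -1 for dynamic dims.

    Assumption: a non-empty dim_param marks a symbolic (dynamic) dimension,
    in which case dim_value reads as 0 and must not be compared.
    """
    return [-1 if dim_param else dim_value for dim_value, dim_param in dims]


def check_input_shape(name, model_shape, data_shape):
    """Raise ValueError on a rank mismatch or a mismatch in a known dimension."""
    # Check rank first so the caller sees a clear error rather than a
    # broadcasting failure from an elementwise comparison.
    if len(model_shape) != len(data_shape):
        raise ValueError(
            f"Input '{name}': rank mismatch, model {list(model_shape)} "
            f"vs data {list(data_shape)}"
        )
    for model_dim, data_dim in zip(model_shape, data_shape):
        # -1 on either side means unknown, so that position is skipped.
        if model_dim != -1 and data_dim != -1 and model_dim != data_dim:
            raise ValueError(
                f"Input '{name}': shape mismatch, model {list(model_shape)} "
                f"vs data {list(data_shape)}"
            )


# Hypothetical model input with a symbolic batch dimension named "batch".
model_shape = normalize_dims([(0, "batch"), (3, ""), (224, ""), (224, "")])
check_input_shape("input", model_shape, [10, 3, 224, 224])  # passes silently
```

Normalizing at the point where the model shapes are extracted keeps the validation step simple, since it only needs to understand a single marker (-1) for unknown dimensions.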

codecov bot commented Feb 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.73%. Comparing base (452c5a0) to head (8c26e5b).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #859   +/-   ##
=======================================
  Coverage   73.72%   73.73%           
=======================================
  Files         196      196           
  Lines       20457    20463    +6     
=======================================
+ Hits        15082    15088    +6     
  Misses       5375     5375           

Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
@gcunhase gcunhase merged commit 24e3587 into NVIDIA:main Feb 9, 2026
40 of 42 checks passed