support WOQ model input, such as kimi2.5 #1642

Merged
xin3he merged 18 commits into main from xinhe/3-31a
Apr 16, 2026
Conversation

@xin3he
Contributor

@xin3he xin3he commented Mar 31, 2026

Description

Please briefly describe your main changes, the motivation.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: Xin He <xin3.he@intel.com>
Copilot AI review requested due to automatic review settings March 31, 2026 03:39
Contributor

Copilot AI left a comment

Pull request overview

Adds initial support for WOQ (weight-only quantization) model inputs by extending the weight-type detection/conversion framework and adding a CPU test that exercises loading and converting a WOQ model to high precision.

Changes:

  • Introduces ModuleWeightType.WOQ and registers a new WOQHandler for detection/conversion in auto_round/utils/weight_handler.py.
  • Adds a new CPU test (test_w4a16) that loads a WOQ model and asserts WOQ detection + conversion behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| auto_round/utils/weight_handler.py | Adds WOQ enum value and a new WOQHandler intended to detect/convert WOQ quantized layers. |
| test/test_cpu/advanced/test_low_precision_input_model.py | Adds a new test case for a WOQ (w4a16) model path and validates detection/conversion. |
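
For readers unfamiliar with what converting a w4a16 layer to high precision involves, the sketch below shows one common scheme: unpack 4-bit integers from 32-bit words, then apply a per-group scale and zero point. The packing order (eight values per word, lowest nibble first), the asymmetric formula `(q - zero) * scale`, and the function names `unpack_int4` and `dequantize` are assumptions for illustration. The PR page does not show the layout that `WOQHandler` actually handles, and real formats differ in these details.

```python
from typing import List


def unpack_int4(packed: List[int]) -> List[int]:
    """Split each 32-bit word into eight 4-bit values, lowest nibble first."""
    values = []
    for word in packed:
        for i in range(8):
            values.append((word >> (4 * i)) & 0xF)
    return values


def dequantize(
    packed: List[int],
    scales: List[float],
    zeros: List[int],
    group_size: int,
) -> List[float]:
    """Recover floating-point weights from packed 4-bit values.

    Every run of `group_size` consecutive values shares one scale and
    one zero point (group-wise asymmetric quantization).
    """
    quantized = unpack_int4(packed)
    return [
        (q - zeros[i // group_size]) * scales[i // group_size]
        for i, q in enumerate(quantized)
    ]
```

For example, `dequantize([0x76543210], [0.5, 2.0], [0, 4], 4)` unpacks the values 0 through 7 and returns `[0.0, 0.5, 1.0, 1.5, 0.0, 2.0, 4.0, 6.0]`.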

Comment thread auto_round/utils/weight_handler.py
Comment thread auto_round/utils/weight_handler.py Outdated
Comment thread test/test_cpu/advanced/test_low_precision_input_model.py
Comment thread auto_round/utils/weight_handler.py Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@xin3he xin3he marked this pull request as draft April 2, 2026 03:00
xin3he and others added 9 commits April 2, 2026 05:12
Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
…h failure (#1621)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hemes in utils and tests (#1643)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
… during module conversion

Signed-off-by: Xin He <xin3.he@intel.com>
@xin3he xin3he marked this pull request as ready for review April 7, 2026 05:45
@xin3he
Contributor Author

xin3he commented Apr 7, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he xin3he requested review from wenhuach21 and yiliu30 April 7, 2026 05:52
@xin3he
Contributor Author

xin3he commented Apr 9, 2026

Will update to fix CI after this PR is merged.

@xin3he
Contributor Author

xin3he commented Apr 10, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

xin3he added 2 commits April 13, 2026 08:37
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
@xin3he xin3he requested a review from lvliang-intel April 14, 2026 03:01
@xin3he xin3he requested a review from n1ck-guo April 14, 2026 03:01
Comment thread auto_round/utils/weight_handler.py
Contributor

@yiliu30 yiliu30 left a comment

Otherwise, LGTM.

Signed-off-by: Xin He <xin3.he@intel.com>
@xin3he
Contributor Author

xin3he commented Apr 14, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he xin3he requested a review from yiliu30 April 14, 2026 07:01
Comment thread auto_round/utils/weight_handler.py
Comment thread auto_round/utils/weight_handler.py
@xin3he xin3he requested a review from lvliang-intel April 14, 2026 12:39
@xin3he
Contributor Author

xin3he commented Apr 16, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@xin3he
Contributor Author

xin3he commented Apr 16, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@azure-pipelines

Azure Pipelines:
Successfully started running 6 pipeline(s).
1 pipeline(s) require an authorized user to comment /azp run to run.

@xin3he
Contributor Author

xin3he commented Apr 16, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he xin3he merged commit d5a0097 into main Apr 16, 2026
42 checks passed
@xin3he xin3he deleted the xinhe/3-31a branch April 16, 2026 13:15

5 participants