
Feats: Quantize/save/evaluate the Wan-AI/WAN2.2 models in w4a16 format#1678

Open
lvliang-intel wants to merge 18 commits into main from lvl/support_wan2.2

Conversation

@lvliang-intel
Contributor

Description

Quantize, save, and evaluate the Wan-AI/Wan2.2 models in W4A16 format.

Models:
https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers

Target dtypes: w4a16

Type of Change

  • [ ] Bug fix
  • [x] New feature
  • [ ] Documentation update
  • [ ] Performance improvement
  • [ ] Code refactoring
  • [ ] Other (please specify):

Related Issues

#1672

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings April 14, 2026 01:26
Contributor

Copilot AI left a comment


Pull request overview

Adds support to quantize/save/evaluate Wan-AI/Wan2.2 diffusion models in W4A16 by improving diffusion pipeline loading, calibration, and multi-device handling within AutoRound’s diffusion compressor.

Changes:

  • Add a fallback diffusion pipeline loader when AutoPipelineForText2Image cannot resolve a linked pipeline.
  • Extend DiffusionCompressor to better handle Wan-specific block I/O, multi-device dispatch before caching, and calibration inputs (including required image).
  • Make tie_weights() calls conditional to support models that don’t implement it; document Wan2.2 models in the diffusion README.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| auto_round/utils/model.py | Fallback from AutoPipeline to DiffusionPipeline for unsupported/unknown pipeline links. |
| auto_round/utils/device.py | Guard tie_weights() in block-wise dispatch to avoid attribute errors. |
| auto_round/compressors/diffusion/compressor.py | Add Wan block output config, calibration image handling, multi-device dispatch before caching, and a config-saving tweak. |
| auto_round/compressors/diffusion/README.md | Document Wan2.2 models and the calibration dataset. |
| auto_round/compressors/base.py | Skip update_module() for diffusion; guard tie_weights(); adjust multi-device auto-offload logic for diffusion. |
| auto_round/auto_scheme/utils.py | Guard tie_weights() in the device dispatch utility. |
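Three of the files above add the same defensive pattern: call `tie_weights()` only when the model actually implements it, since some diffusion transformers do not. A minimal sketch of that guard; `maybe_tie_weights` and the stub classes are illustrative, not AutoRound's actual helpers:

```python
# Hypothetical sketch of the tie_weights() guard described above: call
# tie_weights() only when the attribute exists and is callable, so models
# without it no longer raise AttributeError during block-wise dispatch.
def maybe_tie_weights(model):
    tie = getattr(model, "tie_weights", None)
    if callable(tie):
        tie()
        return True
    return False


# Stub models demonstrating both cases.
class WithTie:
    def __init__(self):
        self.tied = False

    def tie_weights(self):
        self.tied = True


class WithoutTie:
    pass


maybe_tie_weights(WithTie())     # calls tie_weights()
maybe_tie_weights(WithoutTie())  # silent no-op instead of AttributeError
```

The `callable()` check (rather than `hasattr()` alone) also covers the edge case where `tie_weights` exists but is a non-callable attribute.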

  • Comment thread: auto_round/compressors/diffusion/compressor.py
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
Comment on lines 406 to 411 (calibration call):

```python
pipe_kwargs["image"] = self._get_calibration_image(len(prompts))
try:
    self.pipe(
        prompt=prompts,
        guidance_scale=self.guidance_scale,
        num_inference_steps=self.num_inference_steps,
        generator=(
            None
            if self.generator_seed is None
            else torch.Generator(device=self.pipe.device).manual_seed(self.generator_seed)
        ),
    )
    self.pipe(**pipe_kwargs)
except NotImplementedError:
    pass
except Exception as error:
    ...
```

Config-saving hunk:

```python
val.save_pretrained(sub_module_path)
self.pipe.config.save_pretrained(output_dir)
if hasattr(self.pipe, "save_config"):
    self.pipe.save_config(output_dir)
```
Comment thread auto_round/utils/model.py
Comment on lines +692 to +696:

```python
if "AutoPipeline can't find a pipeline linked" not in str(exc):
    raise
pipe = pipelines.pipeline_utils.DiffusionPipeline.from_pretrained(
    pretrained_model_name_or_path, torch_dtype=torch_dtype
)
```
lvliang-intel and others added 6 commits April 14, 2026 09:33
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

@copilot resolve the merge conflicts in this pull request

Copilot AI and others added 2 commits April 14, 2026 07:54
- Merge f81e20d (Enable NextStepDiffusion and support multi-device tuning for diffusion)
- Resolve import conflicts: use dispatch_model_by_all_available_devices
- Resolve base.py: use getattr(self, 'is_diffusion', False) and callable() check
- Resolve README: combine FLUX, WAN2.2, Z-Image, NextStep model tables
- Resolve compressor.py: use _run_pipeline/_align_device_and_dtype approach from main,
  keep WAN2.2-specific output_configs, _uses_single_hidden_state_input,
  _requires_calibration_image, _get_calibration_image methods
- Fix inspect._empty -> inspect.Parameter.empty
- Fix 'raise error' -> 'raise' to preserve traceback

Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>
Agent-Logs-Url: https://github.com/intel/auto-round/sessions/993bffda-50a4-414a-bbef-3754099c7c21

Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>
Contributor

Copilot AI commented Apr 14, 2026

@copilot resolve the merge conflicts in this pull request

Resolved merge conflicts with origin/main (commit f81e20d - "Enable NextStepDiffusion and support multi-device tuning for diffusion") in commit 72125d2.

Key resolutions:

  • auto_round/auto_scheme/utils.py: Removed duplicate dispatch_model_by_all_available_devices (moved to utils/device.py in main)
  • auto_round/compressors/base.py: Used getattr(self, "is_diffusion", False) pattern and added callable() guard for tie_weights
  • auto_round/compressors/diffusion/README.md: Combined model tables (FLUX, WAN2.2, Z-Image, NextStep)
  • auto_round/compressors/diffusion/compressor.py:
    • Removed cache_inter_data override (superseded by _align_device_and_dtype from main)
    • Preserved WAN2.2-specific additions: WanTransformerBlock output config, _uses_single_hidden_state_input, _requires_calibration_image, _get_calibration_image
    • Integrated WAN2.2 I2V image support into _run_pipeline() via extra_kwargs
    • Fixed inspect._empty → inspect.Parameter.empty and raise error → raise
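
The `inspect.Parameter.empty` fix listed above swaps the private `inspect._empty` sentinel for its public alias. A small sketch of how that sentinel distinguishes required parameters from optional ones, the way a pipeline's `__call__` signature might be probed for an `image` argument; `accepts_kwarg` and `fake_call` are hypothetical names:

```python
import inspect


def accepts_kwarg(fn, name):
    """Return (accepts, required): whether `fn` takes a parameter `name`,
    and whether it has no default. inspect.Parameter.empty is the public
    sentinel meaning "no default value"."""
    params = inspect.signature(fn).parameters
    if name not in params:
        return False, False
    required = params[name].default is inspect.Parameter.empty
    return True, required


def fake_call(prompt, image=None, num_inference_steps=50):
    ...


accepts_kwarg(fake_call, "image")   # (True, False): present, optional
accepts_kwarg(fake_call, "prompt")  # (True, True): present, no default
```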

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
@azure-pipelines

Azure Pipelines:
Successfully started running 6 pipeline(s).
1 pipeline(s) require an authorized user to comment /azp run to run.


@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@azure-pipelines

Azure Pipelines:
Successfully started running 6 pipeline(s).
1 pipeline(s) require an authorized user to comment /azp run to run.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants