
Feats: Quantize/save/evaluate the Wan-AI/WAN2.2 models in w4a16 format#1678

Open
lvliang-intel wants to merge 18 commits into main from lvl/support_wan2.2

Conversation

@lvliang-intel
Contributor

Description

Quantize, save, and evaluate the Wan-AI/Wan2.2 models in W4A16 format.

Models:
https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers

Target dtypes: w4a16

Type of Change

  • [ ] Bug fix
  • [x] New feature
  • [ ] Documentation update
  • [ ] Performance improvement
  • [ ] Code refactoring
  • [ ] Other (please specify):

Related Issues

#1672

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings April 14, 2026 01:26
Contributor

Copilot AI left a comment


Pull request overview

Adds support to quantize/save/evaluate Wan-AI/Wan2.2 diffusion models in W4A16 by improving diffusion pipeline loading, calibration, and multi-device handling within AutoRound’s diffusion compressor.

Changes:

  • Add a fallback diffusion pipeline loader when AutoPipelineForText2Image cannot resolve a linked pipeline.
  • Extend DiffusionCompressor to better handle Wan-specific block I/O, multi-device dispatch before caching, and calibration inputs (including required image).
  • Make tie_weights() calls conditional to support models that don’t implement it; document Wan2.2 models in the diffusion README.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| auto_round/utils/model.py | Fallback from AutoPipeline to DiffusionPipeline for unsupported/unknown pipeline links. |
| auto_round/utils/device.py | Guard tie_weights() in block-wise dispatch to avoid attribute errors. |
| auto_round/compressors/diffusion/compressor.py | Add Wan block output config, calibration image handling, multi-device dispatch before caching, and a config-saving tweak. |
| auto_round/compressors/diffusion/README.md | Document Wan2.2 models and the calibration dataset. |
| auto_round/compressors/base.py | Skip update_module() for diffusion; guard tie_weights(); adjust multi-device auto-offload logic for diffusion. |
| auto_round/auto_scheme/utils.py | Guard tie_weights() in the device dispatch utility. |
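Three of the files above add the same defensive pattern: call `tie_weights()` only when the model actually implements it, since some diffusion transformers do not. A minimal sketch of that guard; `maybe_tie_weights` and the stub classes are illustrative, not AutoRound's actual helpers:

```python
# Hypothetical sketch of the tie_weights() guard described above: call
# tie_weights() only when the attribute exists and is callable, so models
# without it no longer raise AttributeError during block-wise dispatch.
def maybe_tie_weights(model):
    tie = getattr(model, "tie_weights", None)
    if callable(tie):
        tie()
        return True
    return False


# Stub models demonstrating both cases.
class WithTie:
    def __init__(self):
        self.tied = False

    def tie_weights(self):
        self.tied = True


class WithoutTie:
    pass


maybe_tie_weights(WithTie())     # calls tie_weights()
maybe_tie_weights(WithoutTie())  # silent no-op instead of AttributeError
```

The `callable()` check (rather than `hasattr()` alone) also covers the edge case where `tie_weights` exists but is a non-callable attribute.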

  • Comment thread: auto_round/compressors/diffusion/compressor.py
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
  • Comment thread: auto_round/compressors/diffusion/compressor.py (outdated)
Comment on lines 406 to 411 (calibration call):

```python
pipe_kwargs["image"] = self._get_calibration_image(len(prompts))
try:
    self.pipe(
        prompt=prompts,
        guidance_scale=self.guidance_scale,
        num_inference_steps=self.num_inference_steps,
        generator=(
            None
            if self.generator_seed is None
            else torch.Generator(device=self.pipe.device).manual_seed(self.generator_seed)
        ),
    )
    self.pipe(**pipe_kwargs)
except NotImplementedError:
    pass
except Exception as error:
    ...
```

Config-saving hunk:

```python
val.save_pretrained(sub_module_path)
self.pipe.config.save_pretrained(output_dir)
if hasattr(self.pipe, "save_config"):
    self.pipe.save_config(output_dir)
```
Comment thread auto_round/utils/model.py
Comment on lines +692 to +696:

```python
if "AutoPipeline can't find a pipeline linked" not in str(exc):
    raise
pipe = pipelines.pipeline_utils.DiffusionPipeline.from_pretrained(
    pretrained_model_name_or_path, torch_dtype=torch_dtype
)
```
lvliang-intel and others added 6 commits April 14, 2026 09:33
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

@copilot resolve the merge conflicts in this pull request

Copilot AI and others added 2 commits April 14, 2026 07:54
- Merge f81e20d (Enable NextStepDiffusion and support multi-device tuning for diffusion)
- Resolve import conflicts: use dispatch_model_by_all_available_devices
- Resolve base.py: use getattr(self, 'is_diffusion', False) and callable() check
- Resolve README: combine FLUX, WAN2.2, Z-Image, NextStep model tables
- Resolve compressor.py: use _run_pipeline/_align_device_and_dtype approach from main,
  keep WAN2.2-specific output_configs, _uses_single_hidden_state_input,
  _requires_calibration_image, _get_calibration_image methods
- Fix inspect._empty -> inspect.Parameter.empty
- Fix 'raise error' -> 'raise' to preserve traceback

Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>
Agent-Logs-Url: https://github.com/intel/auto-round/sessions/993bffda-50a4-414a-bbef-3754099c7c21

Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>
Contributor

Copilot AI commented Apr 14, 2026

@copilot resolve the merge conflicts in this pull request

Resolved merge conflicts with origin/main (commit f81e20d - "Enable NextStepDiffusion and support multi-device tuning for diffusion") in commit 72125d2.

Key resolutions:

  • auto_round/auto_scheme/utils.py: Removed duplicate dispatch_model_by_all_available_devices (moved to utils/device.py in main)
  • auto_round/compressors/base.py: Used getattr(self, "is_diffusion", False) pattern and added callable() guard for tie_weights
  • auto_round/compressors/diffusion/README.md: Combined model tables (FLUX, WAN2.2, Z-Image, NextStep)
  • auto_round/compressors/diffusion/compressor.py:
    • Removed cache_inter_data override (superseded by _align_device_and_dtype from main)
    • Preserved WAN2.2-specific additions: WanTransformerBlock output config, _uses_single_hidden_state_input, _requires_calibration_image, _get_calibration_image
    • Integrated WAN2.2 I2V image support into _run_pipeline() via extra_kwargs
    • Fixed inspect._empty → inspect.Parameter.empty and raise error → raise
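
The `inspect.Parameter.empty` fix listed above swaps the private `inspect._empty` sentinel for its public alias. A small sketch of how that sentinel distinguishes required parameters from optional ones, the way a pipeline's `__call__` signature might be probed for an `image` argument; `accepts_kwarg` and `fake_call` are hypothetical names:

```python
import inspect


def accepts_kwarg(fn, name):
    """Return (accepts, required): whether `fn` takes a parameter `name`,
    and whether it has no default. inspect.Parameter.empty is the public
    sentinel meaning "no default value"."""
    params = inspect.signature(fn).parameters
    if name not in params:
        return False, False
    required = params[name].default is inspect.Parameter.empty
    return True, required


def fake_call(prompt, image=None, num_inference_steps=50):
    ...


accepts_kwarg(fake_call, "image")   # (True, False): present, optional
accepts_kwarg(fake_call, "prompt")  # (True, True): present, no default
```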

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
@azure-pipelines

Azure Pipelines:
Successfully started running 6 pipeline(s).
1 pipeline(s) require an authorized user to comment /azp run to run.


@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines:
Successfully started running 1 pipeline(s).

@lvliang-intel
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@azure-pipelines

Azure Pipelines:
Successfully started running 6 pipeline(s).
1 pipeline(s) require an authorized user to comment /azp run to run.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants