Feats: Quantize/save/evaluate the Wan-AI/WAN2.2 models in w4a16 format #1678
Open
lvliang-intel wants to merge 18 commits into main
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
for more information, see https://pre-commit.ci
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Pull request overview
Adds support to quantize/save/evaluate Wan-AI/Wan2.2 diffusion models in W4A16 by improving diffusion pipeline loading, calibration, and multi-device handling within AutoRound’s diffusion compressor.
Changes:
- Add a fallback diffusion pipeline loader when `AutoPipelineForText2Image` cannot resolve a linked pipeline.
- Extend `DiffusionCompressor` to better handle Wan-specific block I/O, multi-device dispatch before caching, and calibration inputs (including the required `image`).
- Make `tie_weights()` calls conditional to support models that don't implement it; document Wan2.2 models in the diffusion README.
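The conditional `tie_weights()` call from the last bullet can be sketched roughly as below. This is a minimal illustration rather than the PR's actual code: the helper name `tie_weights_if_supported` and the two dummy classes are hypothetical, and the `getattr` plus `callable()` combination is assumed from the merge notes in this conversation.

```python
def tie_weights_if_supported(model):
    """Call model.tie_weights() only when the model provides it.

    Returns True if the method was called, False if it was skipped.
    (Hypothetical helper; the name is not taken from the PR.)
    """
    tie = getattr(model, "tie_weights", None)
    if callable(tie):
        tie()
        return True
    return False


class WithTie:
    """Stand-in for a model that implements tie_weights()."""

    def __init__(self):
        self.tied = False

    def tie_weights(self):
        self.tied = True


class WithoutTie:
    """Stand-in for a model (e.g. a diffusion transformer) without tie_weights()."""
```

Guarding the call this way avoids an `AttributeError` on modules that never define the method, while leaving behavior unchanged for models that do.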
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| auto_round/utils/model.py | Fallback from AutoPipeline to DiffusionPipeline for unsupported/unknown pipeline links. |
| auto_round/utils/device.py | Guard tie_weights() in block-wise dispatch to avoid attribute errors. |
| auto_round/compressors/diffusion/compressor.py | Add Wan block output config, calibration image handling, multi-device dispatch before caching, and config saving tweak. |
| auto_round/compressors/diffusion/README.md | Document Wan2.2 models and calibration dataset. |
| auto_round/compressors/base.py | Skip update_module() for diffusion; guard tie_weights(); adjust multi-device auto-offload logic for diffusion. |
| auto_round/auto_scheme/utils.py | Guard tie_weights() in device dispatch utility. |
Comment on lines 406 to 411

```python
pipe_kwargs["image"] = self._get_calibration_image(len(prompts))
try:
    self.pipe(
        prompt=prompts,
        guidance_scale=self.guidance_scale,
        num_inference_steps=self.num_inference_steps,
        generator=(
            None
            if self.generator_seed is None
            else torch.Generator(device=self.pipe.device).manual_seed(self.generator_seed)
        ),
    )
    self.pipe(**pipe_kwargs)
except NotImplementedError:
    pass
except Exception as error:
```
```python
val.save_pretrained(sub_module_path)
self.pipe.config.save_pretrained(output_dir)
if hasattr(self.pipe, "save_config"):
    self.pipe.save_config(output_dir)
```
Comment on lines +692 to +696

```python
if "AutoPipeline can't find a pipeline linked" not in str(exc):
    raise
pipe = pipelines.pipeline_utils.DiffusionPipeline.from_pretrained(
    pretrained_model_name_or_path, torch_dtype=torch_dtype
)
```
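The fallback pattern in the snippet above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the PR's implementation: `load_with_fallback` and its loader arguments are hypothetical names, and catching `ValueError` is an assumption, since the snippet only shows a variable named `exc`.

```python
def load_with_fallback(primary_loader, fallback_loader, model_id, **kwargs):
    """Try primary_loader; fall back only for the 'no linked pipeline' error.

    primary_loader and fallback_loader are hypothetical callables that take
    a model id plus keyword arguments and return a loaded pipeline.
    """
    try:
        return primary_loader(model_id, **kwargs)
    except ValueError as exc:  # assumption: the auto loader raises ValueError
        if "AutoPipeline can't find a pipeline linked" not in str(exc):
            raise  # unrelated failure: re-raise with the original traceback
        return fallback_loader(model_id, **kwargs)


def fake_auto_loader(model_id, **kwargs):
    """Stand-in for an auto loader that cannot resolve the pipeline class."""
    raise ValueError("AutoPipeline can't find a pipeline linked to SomePipeline")


def fake_generic_loader(model_id, **kwargs):
    """Stand-in for a generic loader that accepts any pipeline class."""
    return ("generic", model_id, kwargs)


def fake_broken_loader(model_id, **kwargs):
    """Stand-in for a loader failing for an unrelated reason."""
    raise ValueError("checkpoint not found")
```

With diffusers, one would presumably pass `AutoPipelineForText2Image.from_pretrained` as the primary loader and `DiffusionPipeline.from_pretrained` as the fallback. Matching on the message text keeps unrelated errors, such as a missing checkpoint, from being silently retried.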
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Contributor, Author
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
Contributor, Author
@copilot resolve the merge conflicts in this pull request
- Merge f81e20d (Enable NextStepDiffusion and support multi-device tuning for diffusion)
- Resolve import conflicts: use dispatch_model_by_all_available_devices
- Resolve base.py: use getattr(self, 'is_diffusion', False) and callable() check
- Resolve README: combine FLUX, WAN2.2, Z-Image, NextStep model tables
- Resolve compressor.py: use _run_pipeline/_align_device_and_dtype approach from main, keep WAN2.2-specific output_configs, _uses_single_hidden_state_input, _requires_calibration_image, _get_calibration_image methods
- Fix inspect._empty -> inspect.Parameter.empty
- Fix 'raise error' -> 'raise' to preserve traceback

Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>

Agent-Logs-Url: https://github.com/intel/auto-round/sessions/993bffda-50a4-414a-bbef-3754099c7c21
Co-authored-by: lvliang-intel <104267837+lvliang-intel@users.noreply.github.com>
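The `inspect._empty` -> `inspect.Parameter.empty` fix in the commit message above concerns how a missing default is detected on a callable's parameters. A minimal sketch of such a check, which could back something like `_requires_calibration_image`, is shown below. The function `requires_argument` and the two fake pipeline call signatures are hypothetical and only illustrate the public `inspect` API; they are not the PR's code.

```python
import inspect


def requires_argument(func, name):
    """Return True if func declares `name` as a parameter without a default.

    Uses the public sentinel inspect.Parameter.empty rather than the
    private inspect._empty alias.
    """
    param = inspect.signature(func).parameters.get(name)
    if param is None:
        return False
    # *args / **kwargs never carry defaults, so they are not "required".
    if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
        return False
    return param.default is inspect.Parameter.empty


def fake_i2v_call(prompt, image, num_inference_steps=50):
    """Stand-in for an image-to-video pipeline call: image is mandatory."""


def fake_t2v_call(prompt, image=None, num_inference_steps=50):
    """Stand-in for a text-to-video pipeline call: image is optional."""
```

A calibration loop could use a check like this to decide whether to add an `image` entry to the keyword arguments before invoking the pipeline.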
Contributor
Resolved merge conflicts with Key resolutions:
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Azure Pipelines: Successfully started running 6 pipeline(s). 1 pipeline(s) require an authorized user to comment /azp run to run. |
for more information, see https://pre-commit.ci
Contributor, Author
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
Azure Pipelines: Successfully started running 6 pipeline(s). 1 pipeline(s) require an authorized user to comment /azp run to run.
Description
Quantize, save, and evaluate the Wan-AI/Wan2.2 models in w4a16 format.
Models:
https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers
Target dtypes: w4a16
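For readers unfamiliar with the term, w4a16 means weights stored in 4 bits while activations stay in 16-bit floating point. The sketch below shows what 4-bit storage of one weight group looks like, using plain asymmetric round-to-nearest. This is a stand-in technique chosen for brevity: it does not reproduce AutoRound's tuned rounding, and the function names and sample values are hypothetical.

```python
def quantize_group_int4(weights):
    """Asymmetric round-to-nearest quantization of one weight group to 4 bits.

    Returns integer codes in [0, 15] plus the scale and zero point needed
    to map them back to floating point.
    """
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15  # 4 bits give 16 levels: 0..15
    if scale == 0:
        scale = 1.0  # constant group: any nonzero scale works
    zero_point = round(-w_min / scale)
    codes = [max(0, min(15, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Map 4-bit codes back to floats, as done before a 16-bit matmul."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical sample group of weights.
sample = [-0.8, -0.1, 0.0, 0.35, 0.7]
codes, scale, zero_point = quantize_group_int4(sample)
restored = dequantize_group(codes, scale, zero_point)
```

At inference time only the codes, scales, and zero points are stored; the weights are dequantized to 16-bit values on the fly, which is why activation precision is unaffected.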
Type of Change
Related Issues
#1672
Checklist Before Submitting