Description:
I encountered an issue while running the model quantization script for GPTQ with the nvfp format. The error occurs during the calibration forward pass: the traceback shows that the InputCollector object has no attribute attention_type.
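For context, the traceback below shows transformers indexing causal_mask_mapping[decoder_layer.attention_type] inside the Qwen3 forward, so whatever module the GPTQ calibration substitutes for a decoder layer must still expose attention_type. Here is a minimal Python sketch of that failure mode (hypothetical names; the real InputCollector in src/quantization/gptq.py may be structured differently):

import torch.nn as nn

class Collector(nn.Module):
    # Stand-in for InputCollector: wraps a decoder layer but does not
    # forward attribute lookups to the wrapped module.
    def __init__(self, layer):
        super().__init__()
        self.layer = layer

class DecoderLayer(nn.Module):
    def __init__(self):
        super().__init__()
        # transformers sets this per layer from config.layer_types
        self.attention_type = "full_attention"

wrapped = Collector(DecoderLayer())
wrapped.attention_type  # raises AttributeError, same failure as below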
Steps to Reproduce:
- Clone or download the project repository.
- Ensure all dependencies are installed (Torch, transformers, etc.).
- Run the following bash script with the given arguments:
#!/bin/bash
export OMP_NUM_THREADS=8
export CUDA_VISIBLE_DEVICES=4
MODEL="/share/global/models/Qwen3-8B"
SAVE_PATH="quantized_models/Qwen3-8B-MR-GPTQ-NVFP4"
python3 model_quant.py \
--model_name_or_path ${MODEL} \
--format nvfp \
--w_bits 4 \
--a_bits 4 \
--gptq \
--transform_class hadamard \
--hadamard_group_size 128 \
--dataset_name_or_path c4 \
--num_sequences 128 \
--sequence_length 2048 \
--w_observer minmax \
--quantization_order default \
--save_path ${SAVE_PATH} \
--export_quantized_model realquant \
--fuse_global_scale \
--amp \
--dtype bfloat16
Actual Behavior:
The script throws the following error during the forward pass:
Traceback (most recent call last):
File "/share/global/xiaonan.zhang/workspace/FP-Quant_Back/FP-Quant/model_quant.py", line 482, in <module>
main()
File "/share/global/xiaonan.zhang/workspace/FP-Quant_Back/FP-Quant/model_quant.py", line 394, in main
quantized_state_dict, non_quantized_state_dict = gptq_quantization(model, calibration_data, args, device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/share/global/xiaonan.zhang/workspace/FP-Quant_Back/FP-Quant/src/quantization/gptq.py", line 534, in gptq_quantization
model(sample.to(device=device))
File "/usr/local/python3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/transformers/utils/generic.py", line 918, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/transformers/models/qwen3/modeling_qwen3.py", line 480, in forward
outputs: BaseModelOutputWithPast = self.model(
^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/transformers/utils/generic.py", line 1072, in wrapper
outputs = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/transformers/models/qwen3/modeling_qwen3.py", line 412, in forward
attention_mask=causal_mask_mapping[decoder_layer.attention_type],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1962, in __getattr__
raise AttributeError(
AttributeError: 'InputCollector' object has no attribute 'attention_type'
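A possible workaround, sketched below under the assumption that InputCollector registers the wrapped decoder layer as a submodule: delegate unknown attribute lookups to the wrapped layer so plain attributes like attention_type stay visible. This is an untested suggestion, not the project's actual implementation, and the attribute name "module" is a guess.

import torch.nn as nn

class InputCollector(nn.Module):
    def __init__(self, module: nn.Module):
        super().__init__()
        self.module = module   # the wrapped Qwen3 decoder layer (assumed name)
        self.inputs = []       # recorded calibration inputs

    def forward(self, *args, **kwargs):
        # Record the calibration batch; the real collector may also stop
        # the forward pass here instead of running the wrapped layer.
        self.inputs.append((args, kwargs))
        return self.module(*args, **kwargs)

    def __getattr__(self, name):
        # nn.Module.__getattr__ resolves parameters/buffers/submodules;
        # fall back to the wrapped module for plain attributes such as
        # attention_type.
        try:
            return super().__getattr__(name)
        except AttributeError:
            return getattr(super().__getattr__("module"), name)

If the project's collector stores the layer under a different attribute, the fallback in __getattr__ would need to use that name instead.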
Environment:
- Python version: 3.12
- CUDA version: 12.8
- Model: Qwen3-8B
- Torch: 2.8.0+cu12