When running the shell command below, a device-related error is reported; quantization of the 72B model is expected to run on a single GPU card.
CUDA_VISIBLE_DEVICES=$device python3 -m auto_round --mllm \
    --model ${model_dir}/Qwen2-VL-72B-Instruct \
    --group_size 64 \
    --bits 2 \
    --iters 2000 \
    --nsample 1024 \
    --low_gpu_mem_usage \
    --seqlen 2048 \
    --model_dtype "float16" \
    --format 'auto_gptq,auto_round'
@n1ck-guo Following the LLM code path, disable auto device mapping when a single card is used.
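For reference, a minimal sketch of what "disable auto-mapping for a single card" could look like; `resolve_device_map` is a hypothetical helper for illustration, not auto-round's actual loading code:

```python
import torch

def resolve_device_map():
    # Hypothetical helper mirroring the LLM code path: when only one GPU
    # is visible (e.g. CUDA_VISIBLE_DEVICES=0), pin the whole model to it
    # instead of letting accelerate shard it with device_map="auto".
    if torch.cuda.device_count() <= 1:
        return {"": "cuda:0"}  # place every module on the single visible card
    return "auto"              # multi-card runs keep automatic sharding

# e.g. from_pretrained(model_dir, torch_dtype=torch.float16,
#                      device_map=resolve_device_map())
```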
Besides, please add a GPU unit test for 70B-scale models.
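A rough sketch of what such a test could look like, assuming pytest and a `MODEL_DIR` environment variable pointing at the local checkpoint; the reduced `--iters`/`--nsample` values are stand-ins to keep the run tractable:

```python
import os
import shlex
import subprocess

import pytest
import torch

@pytest.mark.skipif(torch.cuda.device_count() < 1, reason="needs a GPU")
def test_qwen2_vl_72b_single_card():
    # Regression test for this issue: the mllm entry point should finish on
    # a single visible card without raising a device-mapping error.
    model_dir = os.environ.get("MODEL_DIR", "/models")
    cmd = (
        f"python3 -m auto_round --mllm --model {model_dir}/Qwen2-VL-72B-Instruct "
        "--group_size 64 --bits 2 --iters 2 --nsample 2 --low_gpu_mem_usage "
        "--seqlen 2048 --model_dtype float16 --format auto_round"
    )
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": "0"}
    result = subprocess.run(shlex.split(cmd), env=env, capture_output=True, text=True)
    assert result.returncode == 0, result.stderr[-2000:]
```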
#395