
The evaluation device_map setting is not optimized for quantized models using the ExLlamaV2 backend. #629

WeiweiZhang1 opened this issue Nov 26, 2024

The evaluation code's device_map setting breaks model loading for quantized models that use the ExLlamaV2 backend, which is the default kernel for Auto-GPTQ and for Transformers GPTQ checkpoints. The ExLlamaV2 kernels are CUDA-only and cannot initialize on CPU, so loading such a checkpoint with device_map="cpu" fails.

(screenshot: error raised when loading the quantized model with device_map="cpu")

Changing device_map="cpu" to device_map="auto" in the loading call resolves the model loading issue.
(screenshot: the quantized model loads successfully with device_map="auto")
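
A minimal sketch of the fix, using the Qwen2-VL loader as an example; the exact `from_pretrained` call in `vlmeval/vlm/qwen2_vl/model.py` may differ, and the GPTQ checkpoint name is just an illustration:

```python
from transformers import Qwen2VLForConditionalGeneration

# Illustrative GPTQ-quantized checkpoint; substitute the model under test.
model_path = 'Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4'

# device_map="cpu" fails here: the ExLlamaV2 kernels used by GPTQ checkpoints
# must initialize their buffers on a CUDA device and cannot run on CPU.
# device_map="auto" lets accelerate place the quantized layers on the GPU.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype='auto',
    device_map='auto',
).eval()
```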

The known problematic model files are:

- `vlmeval/vlm/llama_vision.py`
- `vlmeval/vlm/llava/llava.py`
- `vlmeval/vlm/qwen2_vl/model.py`
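
If CPU loading needs to be preserved for full-precision models, one possible patch is to pick the device map from the checkpoint's config. This is a sketch only; `pick_device_map` is a hypothetical helper, not part of vlmeval:

```python
from transformers import AutoConfig

def pick_device_map(model_path: str) -> str:
    """Return 'auto' for quantized checkpoints so the ExLlamaV2 kernels can
    initialize on GPU, and keep the original 'cpu' mapping otherwise."""
    cfg = AutoConfig.from_pretrained(model_path)
    # GPTQ checkpoints store a quantization_config entry in config.json.
    is_quantized = getattr(cfg, 'quantization_config', None) is not None
    return 'auto' if is_quantized else 'cpu'

# Usage inside the listed files, replacing the hard-coded value:
# model = ...from_pretrained(model_path, device_map=pick_device_map(model_path))
```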
