
DJL 0.30 Sagemaker Endpoint Deployment using vllm of quantized model parameter option.quantization is not working #3545

Open
adi7820 opened this issue Nov 28, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@adi7820
adi7820 commented Nov 28, 2024

Hello,

I'm trying to deploy the Llama 3.2 Vision 4-bit bitsandbytes-quantized model as a SageMaker endpoint, but I've run into an error regarding quantization.

[Screenshot: error traceback showing quantization received as None]

As shown in the screenshot above, the quantization option is being received as 'None', even though I set it in the serving.properties file I created when configuring the SageMaker endpoint:

%%writefile serving.properties
engine=Python
option.model_id=unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit
option.rolling_batch=vllm
option.dtype=bf16
option.max_model_len=8192
option.max_num_seqs=1
option.enforce_eager=True
option.gpu_memory_utilization=0.9
option.quantization=bitsandbytes
option.load_format=bitsandbytes

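For context, this is roughly how the endpoint is created: serving.properties is packaged into a model.tar.gz, uploaded to S3, and deployed with the DJL LMI 0.30 container via the SageMaker Python SDK. The bucket, image URI, instance type, and endpoint name below are placeholders, not the exact values I used:

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Placeholders -- substitute your own LMI image URI and S3 artifact path
image_uri = "763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.30.0-lmi12.0.0-cu124"
model_data = "s3://<my-bucket>/llama-3.2-11b-vision-bnb-4bit/model.tar.gz"  # contains serving.properties

model = Model(
    image_uri=image_uri,
    model_data=model_data,
    role=role,
    sagemaker_session=session,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    endpoint_name="llama32-vision-bnb4bit",
)
```
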
When I traced the error back through the vLLM GitHub repo, I found that the actual cause is that the quantization parameter is not receiving its value.

[Screenshot: vLLM GitHub repository showing the quantization parameter not receiving its value]
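For reference, this is a minimal sketch of how the same options would map to a direct vLLM call (assuming the vLLM version bundled in the 0.30 container supports bitsandbytes). Running it standalone is one way to confirm the model itself loads with quantization applied, which would narrow the problem down to how DJL passes the option through:

```python
from vllm import LLM, SamplingParams

# Same options as serving.properties, passed directly to vLLM.
# If this loads, the quantization value is being dropped between
# serving.properties and the vLLM engine, not rejected by vLLM itself.
llm = LLM(
    model="unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    dtype="bfloat16",
    max_model_len=8192,
    max_num_seqs=1,
    enforce_eager=True,
    gpu_memory_utilization=0.9,
)

# Simple text-only smoke test
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```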

Can you help me find a possible solution, or do I need to wait for another release?

@adi7820 adi7820 added the bug Something isn't working label Nov 28, 2024
@adi7820 adi7820 changed the title DJL 0.30 Sagemaker Endpoint Deployment of quantized model parameter option.quantization is not working DJL 0.30 Sagemaker Endpoint Deployment using vllm of quantized model parameter option.quantization is not working Nov 28, 2024