Information

- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Qwen2-VL examples
Expected behavior
Does Qwen2-VL support batched prompts? When the input is a batch, only the first result is returned correctly; all the remaining results are empty.
```python
# Debug: inspect the batched inputs before generation
print(input_ids.shape)     # batched token ids
print(prompt_table.shape)  # multimodal embedding table
print(prompt_tasks)        # task id string for prompt tuning

outputs = self.model.generate(
    input_ids,
    input_position_ids=None,
    mrope_params=mrope_params,
    sampling_config=None,
    prompt_table=prompt_table,
    prompt_tasks=prompt_tasks,
    max_new_tokens=max_new_tokens,
    end_id=end_id,
    # Fall back to the first special token when no pad token is defined
    pad_id=self.model.tokenizer.pad_token_id
    if self.model.tokenizer.pad_token_id is not None
    else self.model.tokenizer.all_special_ids[0],
    top_k=self.args.top_k,
    top_p=self.args.top_p,
    temperature=self.args.temperature,
    repetition_penalty=self.args.repetition_penalty,
    num_beams=self.args.num_beams,
    output_sequence_lengths=True,
    return_dict=True)
```
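For reference, one thing worth checking when only the first batch element decodes correctly is whether `prompt_tasks` carries one task id per batch sample and whether the prompt table stacks each sample's embedding rows. The sketch below is only an illustration of that layout, using a hypothetical helper (`build_batched_ptuning_inputs` is not a TensorRT-LLM API, and the shapes are assumptions for this example):

```python
import numpy as np

def build_batched_ptuning_inputs(per_sample_embeddings):
    """Illustrative layout for batched prompt-tuning inputs.

    per_sample_embeddings: list of [num_vtokens, hidden] arrays,
    one per batch sample (assumed equal num_vtokens here).
    """
    # One task id per sample, comma-separated, so each request in the
    # batch looks up its own rows in the table.
    prompt_tasks = ",".join(str(i) for i in range(len(per_sample_embeddings)))
    # Stack into a [batch, num_vtokens, hidden] table.
    prompt_table = np.stack(per_sample_embeddings, axis=0)
    return prompt_table, prompt_tasks

# Example: a batch of 3 samples, each with 4 virtual tokens of width 8.
embs = [np.full((4, 8), float(i), dtype=np.float32) for i in range(3)]
table, tasks = build_batched_ptuning_inputs(embs)
print(table.shape)  # (3, 4, 8)
print(tasks)        # 0,1,2
```

If `prompt_tasks` were a single id (e.g. `"0"`) for the whole batch, every sample after the first could end up without its visual embeddings, which would match the observed symptom; this is a hypothesis to verify, not a confirmed diagnosis.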
actual behavior
The batched input_ids differ only along the first (batch) dimension, yet every result after the first is incorrect (empty).
additional notes
none
System Info
x86
TensorRT-LLM 0.16.0
Who can help?
No response