fix inference_step #38
base: master
Conversation
) -> ModelOutput:
    if inference is True:
        input_embeddings = self.image_prefix(images)
        asks = [self.tokenizer.encode('Describe the painting:')] * len(images)
I don't like the hardcoded instruction here, since not all images are paintings. For the purpose of this codepath I would prefer not to use any instruction.
I only wrote it this way because of the example :)
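For reference, a minimal sketch of the instruction-free variant being suggested. This is hypothetical: it assumes `self.image_prefix` and `self.generate` keep the signatures shown in the diff, and the `max_steps` value is illustrative.

```python
# Hypothetical sketch: the same inference branch without a hardcoded instruction.
# `self.image_prefix` and `self.generate` are the methods used in the diff above.
if inference is True:
    input_embeddings = self.image_prefix(images)
    # No `asks` tokens are prepended; generation is conditioned on the
    # image prefix alone.
    return self.generate(embeddings=input_embeddings, max_steps=30)
```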
        )
        return self.generate(
            embeddings=input_embeddings,
            max_steps=6,
I think 6 sampling steps might be a bit short in general.
I would suggest putting the hardcoded `asks`, `max_steps`, `temperature`, and `top_k` into `config.py` as a dictionary, as in the sketch below. Do you think that would be better?
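For instance, something along these lines. This is a sketch only; the dictionary name, keys, and values are illustrative and not part of the project's existing config schema.

```python
# config.py -- hypothetical dictionary of inference defaults, replacing the
# values hardcoded in forward(). Names and values are illustrative.
INFERENCE_DEFAULTS = {
    "ask": None,       # or e.g. 'Describe the painting:' to prepend an instruction
    "max_steps": 30,   # 6 is likely too short, per the comment above
    "temperature": 0.7,
    "top_k": 0,
}
```

`forward()` (or a separate inference helper) could then read these values instead of embedding them in the code.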
Hi, thanks for your efforts to contribute to the open-source MAGMA code. The codepath for inference during training should indeed be fixed, and your help is appreciated :-) In addition to the comments I added, I would in general prefer not to overload the `forward` function of the model too much. Maybe you could try to just change Line 85 in 4d01e51.

Thanks again and let me know what you think.

Best, Constantin
Hi Constantin, thank you for your advice. I was going to do the same. But in practice, I found that deepspeed's … For the above reasons, I had to add the code into …

Best, Changxu
Ok, I think you can just access the model by …

Best, Constantin
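For context: DeepSpeed's engine does expose the wrapped model as its `module` attribute, so an `inference_step` could call `generate` directly without routing `inference=True` through `forward`. A rough sketch under that assumption; everything here except `model_engine.module` (which is real DeepSpeed API) is hypothetical.

```python
import torch

# Hypothetical inference_step that bypasses the engine's forward() entirely.
def inference_step(model_engine, images):
    # DeepSpeedEngine wraps the model; .module is the underlying nn.Module.
    model = model_engine.module
    model.eval()
    with torch.no_grad():
        input_embeddings = model.image_prefix(images)
        captions = model.generate(embeddings=input_embeddings, max_steps=30)
    model.train()
    return captions
```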
`inference_step` passes `inference=True` to `model_engine`. However, the `__forward__` of the Magma model does not accept this parameter, which will cause an error during training. I fixed it by simply copying the inference code from `example_inference.py`.
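A sketch of the failure mode being described, assuming `forward` originally took no `inference` parameter; the argument names are illustrative.

```python
# DeepSpeed's engine passes positional and keyword arguments straight
# through to the wrapped module's forward(), so the unexpected kwarg raises:
outputs = model_engine(images, captions, inference=True)
# TypeError: forward() got an unexpected keyword argument 'inference'
```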