Hello,

I am trying to run the caption generation workflow and was wondering what I have to do if the inputs to the `TextTransformer` model are always padded to a fixed length. How should the attn mask be updated in both the `TextTransformer` and the `MultiModalDecoder`? Currently, the input to the `TextTransformer` grows as the caption is generated, but I'd like to pad the input to a fixed length instead. A rough sketch of the kind of mask I have in mind is at the end of this post.

Thanks.
@lucidrains @gpucce @iejMac
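
For concreteness, here is a minimal sketch of the combined causal + padding mask I mean, assuming PyTorch's additive `attn_mask` convention (`0.0` where attention is allowed, `-inf` where it is blocked). `build_attn_mask` and `num_tokens` are my own hypothetical names, not part of open_clip:

```python
import torch

# Hypothetical helper (not from open_clip): build a combined causal +
# padding mask for fixed-length decoder inputs. seq_len is the padded
# length; num_tokens[b] is the count of real tokens in sequence b.
def build_attn_mask(seq_len: int, num_tokens: torch.Tensor) -> torch.Tensor:
    # Causal part: query position i may attend to key positions j <= i.
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    # Padding part: a key position is valid only if it holds a real token.
    key_is_real = torch.arange(seq_len) < num_tokens.unsqueeze(1)  # (B, L)
    allowed = causal & key_is_real.unsqueeze(1)                    # (B, L, L)
    # Convert to the additive float convention accepted by
    # torch.nn.MultiheadAttention-style attn_mask arguments:
    # 0.0 where attention is allowed, -inf where it is blocked.
    mask = torch.zeros(allowed.shape)
    mask.masked_fill_(~allowed, float("-inf"))
    # Note: pad query rows (i >= num_tokens[b]) still see the real tokens
    # j < num_tokens[b], so no row is fully masked (which would NaN softmax).
    return mask  # (B, L, L); expand to (B * num_heads, L, L) if needed

# Example: batch of 2, padded to length 5, with 3 and 2 real tokens.
m = build_attn_mask(5, torch.tensor([3, 2]))
print(m[0])  # row i blocks keys j > i and keys j >= 3
```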