Sure! Feel free to open a PR and let us know when it's ready for review or if you need help integrating it into the library.
In general, we prioritise reviewing based on opened PRs rather than comments on issues, as we find this prevents issues from becoming stale. You're free to work on something if there are no active linked PRs open.
Model description
A new large language and vision model (LLVM) that uses auxiliary visual information alongside natural language for prediction.
It uses two modules: MoAI-Compressor and MoAI-Mixer. The Compressor condenses the verbalized outputs of external computer vision (CV) models into auxiliary visual information, and the Mixer blends three types of intelligence (visual features, auxiliary features from the external CV models, and language features) into a cohesive whole.
MoAI-7B surpasses both open-source and closed-source LLVMs on vision-language tasks.
Model repo: https://github.com/ByungKwanLee/MoAI
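For anyone skimming the architecture before porting it, here is a minimal, hypothetical PyTorch sketch of the Mixer idea described above (language tokens cross-attending to visual features and to the compressed auxiliary features). The module name, dimensions, and residual blend are assumptions for illustration, not the authors' implementation; see the repo above for the real code.

```python
# Hypothetical sketch of the Mixer idea: blend language, visual, and
# auxiliary (external-CV) features with cross-attention. Names and sizes
# are illustrative assumptions, not the MoAI authors' implementation.
import torch
import torch.nn as nn


class MixerSketch(nn.Module):
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        # One cross-attention block per non-language feature stream.
        self.attn_visual = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_aux = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lang: torch.Tensor, visual: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # Language tokens attend to visual patch features and to the
        # compressed auxiliary features; results are added back residually.
        vis_out, _ = self.attn_visual(lang, visual, visual)
        aux_out, _ = self.attn_aux(lang, aux, aux)
        return self.norm(lang + vis_out + aux_out)


if __name__ == "__main__":
    mixer = MixerSketch()
    lang = torch.randn(1, 32, 512)     # language token features
    visual = torch.randn(1, 196, 512)  # image patch features
    aux = torch.randn(1, 64, 512)      # compressed outputs of external CV models
    print(mixer(lang, visual, aux).shape)  # torch.Size([1, 32, 512])
```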
Open source status
Provide useful links for the implementation
No response