[VLM] Merged multi-modal processor for LLaVA-NeXT #11682

DarkLight1337 · 2025-01-02T10:51:20Z

This PR implements the merged multi-modal processor for LLaVA-NeXT. To avoid redundant code, I have introduced base classes VisionEncoderInfo and BaseLlavaMultiModalProcessor.
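
As a rough illustration of how a shared base class can remove duplicated per-encoder logic, here is a minimal sketch. It is not the code from this PR: `EncoderConfig`, `VisionEncoderInfoSketch`, `ClipLikeInfo`, `SiglipLikeInfo`, and all method names are hypothetical and chosen only for this example. The assumption is that each vision encoder differs mainly in how it counts image tokens, so the common patch-grid arithmetic can live in one place.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical encoder config; field names are assumptions for this sketch."""
    image_size: int  # input image side length in pixels
    patch_size: int  # side length of one square patch in pixels


class VisionEncoderInfoSketch(ABC):
    """Illustrative base class holding logic shared by all encoders."""

    def __init__(self, config: EncoderConfig) -> None:
        self.config = config

    def get_patch_grid_length(self) -> int:
        # Number of patches along one side of the image (shared arithmetic).
        return self.config.image_size // self.config.patch_size

    @abstractmethod
    def get_num_image_tokens(self) -> int:
        """Encoder-specific token count; each subclass supplies its own rule."""


class ClipLikeInfo(VisionEncoderInfoSketch):
    def get_num_image_tokens(self) -> int:
        # One token per patch plus one class token.
        return self.get_patch_grid_length() ** 2 + 1


class SiglipLikeInfo(VisionEncoderInfoSketch):
    def get_num_image_tokens(self) -> int:
        # One token per patch, no class token.
        return self.get_patch_grid_length() ** 2


if __name__ == "__main__":
    clip = ClipLikeInfo(EncoderConfig(image_size=336, patch_size=14))
    siglip = SiglipLikeInfo(EncoderConfig(image_size=384, patch_size=14))
    print(clip.get_num_image_tokens(), siglip.get_num_image_tokens())
```

With this shape, a processor written against the base class would only need to call `get_num_image_tokens()` and would not need to know which encoder it is paired with.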

github-actions · 2025-01-02T10:51:31Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

- Add the ready label to the PR
- Enable auto-merge

🚀

Signed-off-by: DarkLight1337 <[email protected]>

Isotr0py · 2025-01-02T14:38:46Z

Seems that llava-next test is failing...

DarkLight1337 · 2025-01-02T14:40:29Z

Fixed!

Signed-off-by: DarkLight1337 <[email protected]>

Isotr0py

Overall LGTM!

tests/multimodal/test_mapper.py

DarkLight1337 requested a review from Isotr0py January 2, 2025 10:51

DarkLight1337 requested a review from ywang96 as a code owner January 2, 2025 10:51

This was referenced Jan 2, 2025

[RFC]: Multi-modality Support on vLLM #4194

Open

[RFC]: Merge input processor and input mapper for multi-modal models #10114

Open

DarkLight1337 force-pushed the llava-next-mm-processor branch 2 times, most recently from 8035ddc to 67e50fd January 2, 2025 10:57

Implement merged processor for llava-next

9bfbc82

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the llava-next-mm-processor branch from 67e50fd to 9bfbc82 January 2, 2025 11:00

DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jan 2, 2025

Fix mantis

24ec89c

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the llava-next-mm-processor branch from ce0a7dc to 24ec89c January 2, 2025 11:26

Move outdated tests

aa6dfcf

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 added 2 commits January 2, 2025 14:40

Fix embedding inputs

d2d2d46

Signed-off-by: DarkLight1337 <[email protected]>

ignore type error

deb521d

Signed-off-by: DarkLight1337 <[email protected]>

Isotr0py approved these changes Jan 2, 2025

tests/multimodal/test_mapper.py (resolved)

Isotr0py enabled auto-merge (squash) January 2, 2025 16:26

Isotr0py merged commit 8c38ee7 into vllm-project:main Jan 2, 2025
55 checks passed

DarkLight1337 deleted the llava-next-mm-processor branch January 2, 2025 16:40
