Add LLaVA support, modify generate function #820

Open · wants to merge 3 commits into dev

Conversation

zazamrykh

This is a continuation of the #818 topic; I'm now making the PR against the dev branch.
Also, I've decided it would be a good idea for the generate function to work with any input type and any return type. I decided it would be more readable and simpler if generate builds the sequence using embeddings as the input to the forward function, because strings and tokens can be cast to embeddings, but embeddings cannot be cast back to tokens or strings. So I made it possible to pass any input (a string, a list of strings, tokens, or embeddings) and to request any output type, as sketched below. There may be something I didn't get right; for example, I'm not sure whether my implementation handles positional embeddings correctly.
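To make the design concrete, here is a minimal sketch of that idea. The helper and parameter names are mine, not the PR's; it assumes TransformerLens's HookedTransformer API (to_tokens, embed, to_string, forward with start_at_layer=0) and deliberately ignores positional embeddings, which is exactly the open question above.

```python
import torch

def to_embeddings(model, inp):
    """Normalize any supported input (str, list of str, tokens, embeddings)
    into a pair (embeddings, tokens-or-None)."""
    if isinstance(inp, (str, list)):
        tokens = model.to_tokens(inp)               # [batch, pos]
        return model.embed(tokens), tokens
    if inp.dtype in (torch.int32, torch.int64):     # already tokens
        return model.embed(inp), inp
    return inp, None                                # already embeddings; original tokens unknown

def generate_any(model, inp, max_new_tokens=10, return_type="tokens"):
    embeds, tokens = to_embeddings(model, inp)
    for _ in range(max_new_tokens):
        # start_at_layer=0 feeds the residual stream directly; positional
        # embeddings are not added here (see the caveat above).
        logits = model.forward(embeds, start_at_layer=0)
        next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)       # greedy for brevity
        embeds = torch.cat([embeds, model.embed(next_tok)], dim=1)
        if tokens is not None:
            tokens = torch.cat([tokens, next_tok], dim=1)
    if return_type == "str" and tokens is not None:
        return model.to_string(tokens)
    if return_type == "embeds":
        return embeds
    return tokens
```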

I've run into a problem: the LLaVA processor AutoProcessor.from_pretrained(model_id, revision="a272c74") does not have the apply_chat_template method at the transformers version pinned by the project, so the transformers library needs to be updated for this method to work.
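For reference, a hedged sketch of the check I mean; the model id and conversation format below are illustrative, only the revision comes from the call above.

```python
from transformers import AutoProcessor

model_id = "llava-hf/llava-1.5-7b-hf"   # illustrative checkpoint; revision as quoted above
processor = AutoProcessor.from_pretrained(model_id, revision="a272c74")

conversation = [
    {"role": "user",
     "content": [{"type": "image"},
                 {"type": "text", "text": "What is shown in this image?"}]},
]

if hasattr(processor, "apply_chat_template"):
    prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
else:
    # Older transformers releases predate processor-level chat templates;
    # upgrading fixes it: pip install -U transformers
    raise RuntimeError("apply_chat_template is unavailable in this transformers version")
```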

I've written a test covering every combination of input and return type with the gpt-2 model, and it passes. The gpt-2 outputs look quite poor, though maybe that's normal for gpt-2.
[screenshot: sample gpt-2 generation output]
I've also rerun the whole LLaVA.ipynb notebook and tested the other output types of the generate function, and everything worked pretty well; a rough shape of those calls is sketched below.
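For context, the tests roughly exercise combinations like the following. Accepting embeddings as input and the exact return_type names are part of what this PR adds, so they may differ from the final diff.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
prompt = "The capital of France is"

str_out = model.generate(prompt, max_new_tokens=5, return_type="str")       # str in, str out
tokens = model.to_tokens(prompt)
tok_out = model.generate(tokens, max_new_tokens=5, return_type="tokens")    # tokens in, tokens out
embeds = model.embed(tokens)
emb_out = model.generate(embeds, max_new_tokens=5, return_type="embeds")    # embeddings in/out (new in this PR)
```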

  • New feature (non-breaking change which adds functionality)

  • I have commented my code, particularly in hard-to-understand areas

  • I have made corresponding changes to the documentation

  • My changes generate no new warnings

  • I have added tests that prove my fix is effective or that my feature works

@jojje

jojje commented Dec 27, 2024

@zazamrykh I saw the initial struggles with the branch selection (dev-3) and appreciate the effort you're making to get this PR in shape for dev. Will be incredibly useful to have transformerlens work also with llava (vision) models. Just wanted to show some early appreciation for the effort in the hope you'll stick it out and not abandon the effort midway due to "pulling your hair" in frustration, while not knowing if anyone else really cares. I care :)

@zazamrykh
Author

> @zazamrykh I saw the initial struggles with the branch selection (dev-3) and appreciate the effort you're making to get this PR in shape for dev. Will be incredibly useful to have transformerlens work also with llava (vision) models. Just wanted to show some early appreciation for the effort in the hope you'll stick it out and not abandon the effort midway due to "pulling your hair" in frustration, while not knowing if anyone else really cares. I care :)

I'm very glad to hear such words of support :) Changes in one function forced changes in others, so I've ended up changing some functions I hadn't initially planned to touch. Then I broke the generate function; I thought I had fixed it, but now some tests are not passing! I'll try to work out why they're failing. If you want to investigate the VLM now, you can take the last commit of the feature-llava-support branch from the forked repository, but I hope LLaVA support appears in transformerlens soon!

@bryce13950
Collaborator

@zazamrykh I merged dev into your PR earlier today, and took a look at the changes. This is a pretty big change, so it may require a bit of testing. As for the current test failure, it very well might have something to do with a float being rounded down at some point. There are quite a few places where ints are now accepted as params in this change, so if you need some ideas of where to start looking, that may be a good place (see the illustration below).
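For illustration only (not taken from the PR diff), this is the kind of silent truncation I mean: a parameter that now accepts ints and is converted eagerly will round a sampling setting like temperature down to zero.

```python
import torch

def sample_next_token(logits, temperature: int = 1):
    # Hypothetical failure mode: eagerly casting to int truncates 0.7 -> 0,
    # silently turning sampling into greedy decoding (or risking divide-by-zero).
    temperature = int(temperature)
    scaled = logits / max(temperature, 1e-6)
    return scaled.argmax()

logits = torch.randn(50257)
sample_next_token(logits, temperature=0.7)   # effectively runs with temperature=0
```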

Also, I really like splitting out the insides of the generate function. Just make sure to add proper docstrings before this is merged, so that those functions are documented properly in the public docs.

If you need help wrapping anything up, don't hesitate to ask. I took a quick glance just to get an idea of the size of the change, but I have not done a detailed look at the changes. I normally wait until the tests are passing to do that, but I am happy to help resolve any outstanding tasks if need be.
