improve error msg for packed being incompatible #2056

Open
felipemello1 opened this issue Nov 23, 2024 · 1 comment

Comments

@felipemello1
Contributor

felipemello1 commented Nov 23, 2024

When running gemma2 with packed=True, I got the error below. It should be more informative.

NotImplementedError: Block masks are not implemeted yet, use packed=False.

Also, if Gemma is not compatible with packed, we should fix it and/or remove the option from the config.

@Optimox
Contributor

Optimox commented Nov 29, 2024

Hello @felipemello1,

The reason it does not work is a bit involved. Block masks could work with gemma 2 without any problem; in theory, nothing specific prevents them from working. The problem comes from flex attention, which I did not manage to implement for gemma 2, so gemma 2 does not use flex attention at the moment.

However, torchtune automatically checks whether flex attention is available, and when it is, the datasets automatically create block masks specific to flex attention. Together these create the current incompatibility (it does not exist if your version of torch does not include flex attention).
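
To illustrate the mismatch, here is a small standalone sketch (made-up shapes, not torchtune code; it assumes a recent torch build with flex attention and a GPU). The packed dataset builds a flex-attention BlockMask that keeps attention causal and within each document; a layer wired for flex attention can consume it, while a layer that still calls scaled_dot_product_attention cannot:

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

# Packed batch of 128 tokens: three documents of lengths 50, 30 and 48.
# document_ids[i] is the document that token i belongs to.
document_ids = torch.repeat_interleave(
    torch.arange(3, device="cuda"), torch.tensor([50, 30, 48], device="cuda")
)

def document_causal(b, h, q_idx, kv_idx):
    # Causal within a document, no attention across document boundaries.
    return (document_ids[q_idx] == document_ids[kv_idx]) & (q_idx >= kv_idx)

# This is the kind of mask a packed dataset produces when flex attention is available.
block_mask = create_block_mask(document_causal, B=None, H=None, Q_LEN=128, KV_LEN=128, device="cuda")

q = k = v = torch.randn(1, 1, 128, 64, device="cuda")

# An attention layer wired for flex attention consumes the BlockMask directly...
out = flex_attention(q, k, v, block_mask=block_mask)

# ...but a layer that still calls scaled_dot_product_attention cannot, because SDPA
# expects a boolean/float attn_mask tensor, not a BlockMask object. That mismatch is
# what currently surfaces as the NotImplementedError for gemma 2.
# torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=block_mask)  # would fail
```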

Here are a few directions I see to fix this problem:

  • make flex attention usage a parameter in each recipe and disable it for gemma 2 for now (a rough sketch of this idea follows the list): I feel like hiding from users which attention implementation will actually run is not ideal.
  • implement a working flex attention version for gemma 2, but this would probably have to wait for pytorch to solve the issue on their side.
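
To make the first option concrete, here is a very rough sketch (illustrative names only, not an actual torchtune API): the recipe exposes an explicit flag, and block-mask creation follows that flag instead of the automatic "is flex attention importable?" check, so gemma 2 configs could simply keep it off:

```python
import torch

def build_packed_attention_mask(document_ids: torch.Tensor, use_flex_attention: bool):
    """Build an intra-document causal mask for a packed batch (illustrative sketch).

    If use_flex_attention is False (e.g. in gemma 2 recipes), fall back to a dense
    boolean mask that scaled_dot_product_attention can consume directly.
    """
    seq_len = document_ids.numel()

    if use_flex_attention:
        from torch.nn.attention.flex_attention import create_block_mask

        def document_causal(b, h, q_idx, kv_idx):
            return (document_ids[q_idx] == document_ids[kv_idx]) & (q_idx >= kv_idx)

        return create_block_mask(
            document_causal, B=None, H=None, Q_LEN=seq_len, KV_LEN=seq_len,
            device=str(document_ids.device),
        )

    # Dense fallback with the same semantics, materialized as a (seq_len, seq_len) bool mask.
    same_doc = document_ids[:, None] == document_ids[None, :]
    causal = torch.tril(
        torch.ones(seq_len, seq_len, dtype=torch.bool, device=document_ids.device)
    )
    return same_doc & causal
```

The default could stay on for models that already have a flex attention path and be forced off in the gemma 2 configs until the kernel issue is resolved.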

On a related topic, did you successfully fine-tune gemma 2? I have been running the same (custom) pretraining script with both gemma 2 2B and llama3.2 3B; it yields good generations with llama 3.2 and catastrophic ones with gemma 2...
