Tests for Embedding op (ABANDONED) #902

Closed
wants to merge 20 commits into from

Conversation

kmilanovicTT
Contributor

@kmilanovicTT kmilanovicTT commented Dec 12, 2024

No description provided.

pilkicTT and others added 20 commits December 5, 2024 13:38
Part of the changes for the initial version of executing training end-to-end
on device. Follow-up changes will focus on the optimizer.

With this change, the loss module and the model can be compiled
separately and executed on the TT device (both forward and backward
passes). Example:

```python
tt_model = forge.compile(framework_model,
        sample_inputs=[inputs], training=True)

loss_fn = CrossEntropyLoss(name="cross_entropy_loss")
tt_loss = forge.compile(loss_fn, sample_inputs=loss_inputs,
        attach_to=tt_model, training=True)

# The forward pass is executed as before.
#
# The following executes the whole backward pass, from the loss
# outputs down to the model's backward pass.
tt_loss.backward()
```

Note: changes to the API
- the main compile function now accepts a `training` parameter, which
indicates whether to compile the module for training.
- the loss module is no longer an argument to the compile function

For some reason, the gradient inputs in the previous stack were
represented as `InputNodeType::Loss`. I have added a new type of input
`Gradient`. The removal of the `Loss` input type is to be done as a
follow up change since there are some uses of it spread around in the
code base. Issue #829

To tie the gradients from the `loss.backward()` to the
`module.backward()` we need to "attach" the model to the loss module
when compiling the loss. This is done by passing the module to be
attached into the compile function (`attach_to` parameter).
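
The attachment idea can be illustrated with a toy, framework-free sketch. Everything in it (`ToyModel`, `ToyLoss`, and the `attach_to` constructor argument) is hypothetical and is not Forge's implementation; it only shows how a loss's `backward()` can hand its output gradient to the attached model's `backward()`.

```python
class ToyModel:
    """Hypothetical stand-in for a compiled model computing y = w * x."""

    def __init__(self, w):
        self.w = w
        self.x = None
        self.grad_w = None

    def forward(self, x):
        self.x = x  # saved for the backward pass
        return self.w * x

    def backward(self, grad_y):
        # Chain rule: dL/dw = dL/dy * dy/dw = grad_y * x
        self.grad_w = grad_y * self.x
        # Gradient with respect to the input: dL/dx = grad_y * w
        return grad_y * self.w


class ToyLoss:
    """Hypothetical squared-error loss L = (y - t)^2."""

    def __init__(self, attach_to=None):
        # Assumption: the attached module receives this loss's gradient.
        self.attach_to = attach_to
        self.y = None
        self.t = None

    def forward(self, y, t):
        self.y, self.t = y, t
        return (y - t) ** 2

    def backward(self):
        grad_y = 2.0 * (self.y - self.t)  # dL/dy
        if self.attach_to is not None:
            # Tie the loss gradient output to the model gradient input.
            self.attach_to.backward(grad_y)
        return grad_y


model = ToyModel(w=3.0)
loss = ToyLoss(attach_to=model)

y = model.forward(2.0)
value = loss.forward(y, 5.0)
loss.backward()  # also runs model.backward() through the attachment
```

With a single gradient flowing between the two modules, the tie is unambiguous, which is the case this change covers.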

Note: this doesn't work in the general case, when multiple gradients
are passed between modules, because we currently have no mechanism to
know which gradient output to tie to which gradient input.
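
The limitation can be sketched as follows. The helper `tie_gradients` and the tensor names are hypothetical, chosen only to illustrate why the single-gradient case is straightforward and the multi-gradient case is ambiguous; this is not how Forge performs the tie.

```python
def tie_gradients(grad_outputs, grad_inputs):
    """Pair the gradient outputs of one module with the gradient inputs
    of the module attached to it (hypothetical helper).

    With exactly one gradient on each side the pairing is unambiguous.
    With more, nothing says which output belongs to which input, so the
    sketch refuses to guess.
    """
    if len(grad_outputs) == 1 and len(grad_inputs) == 1:
        return {grad_inputs[0]: grad_outputs[0]}
    raise ValueError(
        "ambiguous tie: %d gradient outputs vs %d gradient inputs"
        % (len(grad_outputs), len(grad_inputs))
    )


# Single gradient: the tie is well defined.
single = tie_gradients(["loss_grad_out"], ["model_grad_in"])

# Multiple gradients: no rule exists to choose a pairing.
try:
    tie_gradients(["g_out_0", "g_out_1"], ["g_in_0", "g_in_1"])
    ambiguous = False
except ValueError:
    ambiguous = True
```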

Closes #177
- remove xfail mark for tests that are now passing
- for flatten op tests change verification to pcc only
- pow
- clamp
- log
- log1p
* Add test without gradient accumulation

* Remove unused import and comments

* Fix formatting

* Switch to new forge compile API

* Add bf16 instructions

* Remove num of batches limit and increase batch size
The tt-forge-fe workflows accept a tt-mlir SHA override input, ensuring the specified SHA is used for building Docker images, compiling, and testing, instead of the committed version of tt-mlir.
This will be used as part of integration testing to ensure that integrating a newer version of tt-metal doesn't break downstream projects.
Users can manually trigger the workflow to run with a specific version of tt-mlir, and in CI tt-mlir will trigger this workflow to run with the uplift branch.

Relates to #214
Move FailingRulesConverter to shared utils
Extend FailingRulesConverter with kwargs support
Specify list of params for failing rule
- The package was recently deleted, which gives a 404 error
- Update build-and-test.yml and model-analysis-weekly.yml; an
  apt-get update is needed too.
@kmilanovicTT kmilanovicTT self-assigned this Dec 12, 2024
@kmilanovicTT kmilanovicTT added the Ops Support new op in tt-forge and tt-mlir label Dec 12, 2024
@kmilanovicTT kmilanovicTT reopened this Dec 12, 2024
@kmilanovicTT kmilanovicTT changed the title Tests for Embedding op Tests for Embedding op (OBSOLETE) Dec 12, 2024
@kmilanovicTT
Contributor Author

Abandoned - wrong branches selected

@kmilanovicTT kmilanovicTT changed the title Tests for Embedding op (OBSOLETE) Tests for Embedding op (ABANDONED) Dec 12, 2024
@kmilanovicTT kmilanovicTT added the invalid This doesn't seem right label Dec 12, 2024