[FFE - E2E] Open Llama 3B
No due date
81% complete
The Llama 3B model requires support for the core operations listed below.
Ops currently lowered through tt-forge (up to the point of emitting TTIR), with their status:
- Add - Already supported end to end (e2e)
- Concatenate - Needs support in both Forge and MLIR
- Embedding - Needs support in both Forge and MLIR
- Hslice - Should be removed from the model
- Hstack - Should be removed from the model
- Matmul - Needs support in Forge; MLIR already supports it
- Multiply - Already supported e2e
- Narrow - Needed via the reshape op in both Forge and MLIR
- Pad_tile - Potentially redundant
- Reciprocal - Needs support in both Forge and MLIR
- Reduce_avg - Needs support in Forge; MLIR already supports it
- Sigmoid - Needs support in both Forge and MLIR
- Softmax - Already supported e2e
- Sparse_matmul - Should be removed from the model
- Sqrt - Needs support in both Forge and MLIR
- Squeeze - Needed via the reshape op in both Forge and MLIR
- Tile_broadcast - Potentially redundant
- Transpose - Currently WIP
- Unsqueeze - Needed via the reshape op in both Forge and MLIR
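To illustrate what "via the reshape op" could mean for Squeeze and Unsqueeze, here is a minimal sketch that computes the target shape a reshape would need in order to reproduce each op. The helper names `squeeze_as_reshape` and `unsqueeze_as_reshape` are hypothetical and are not part of the tt-forge or MLIR APIs. The dimension handling is assumed to follow PyTorch conventions (negative indices allowed, squeezing a non-unit dimension is a no-op). Narrow is not covered here.

```python
from typing import List, Sequence


def squeeze_as_reshape(shape: Sequence[int], dim: int) -> List[int]:
    """Return the reshape target shape equivalent to squeeze(dim).

    Hypothetical helper for illustration only.
    """
    rank = len(shape)
    if not -rank <= dim < rank:
        raise IndexError(f"dim {dim} out of range for rank {rank}")
    dim %= rank  # normalize negative indices
    if shape[dim] != 1:
        # Assumption: mirror PyTorch, where squeezing a non-unit dim is a no-op.
        return list(shape)
    return [size for i, size in enumerate(shape) if i != dim]


def unsqueeze_as_reshape(shape: Sequence[int], dim: int) -> List[int]:
    """Return the reshape target shape equivalent to unsqueeze(dim).

    Hypothetical helper for illustration only.
    """
    rank = len(shape)
    if not -(rank + 1) <= dim <= rank:
        raise IndexError(f"dim {dim} out of range for rank {rank}")
    dim %= rank + 1  # normalize negative indices against the output rank
    return list(shape[:dim]) + [1] + list(shape[dim:])


if __name__ == "__main__":
    print(squeeze_as_reshape([1, 32, 1, 100], 2))
    print(unsqueeze_as_reshape([32, 100], 0))
```

Because both ops only add or remove a dimension of size 1, the element count and memory order stay the same, which is why a single reshape can stand in for either of them.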
The following basic Llama 3B building blocks should also be supported:
- Embeddings
- Self-attention
- MLP
- RMS Norm
- LM head
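As a rough illustration of how the building blocks above relate to the op list, the sketch below writes RMS Norm and the gated activation inside the MLP using only elementwise and reduction primitives (Multiply, Add, Reduce_avg, Sqrt, Reciprocal, Sigmoid). It assumes the usual Llama formulation (RMS normalization with a learned weight, SiLU-gated MLP) and works on plain Python lists for a single vector. The function names and the `eps` default are illustrative choices, not taken from tt-forge or from the model configuration.

```python
import math
from typing import List


def reciprocal(x: float) -> float:
    return 1.0 / x


def sigmoid(x: float) -> float:
    # Plain definition; not hardened against overflow for very negative x.
    return 1.0 / (1.0 + math.exp(-x))


def reduce_avg(values: List[float]) -> float:
    return sum(values) / len(values)


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """RMS Norm over one vector: x * 1/sqrt(mean(x^2) + eps) * weight."""
    squared = [v * v for v in x]                    # Multiply
    mean_sq = reduce_avg(squared)                   # Reduce_avg
    inv_rms = reciprocal(math.sqrt(mean_sq + eps))  # Add, Sqrt, Reciprocal
    return [v * inv_rms * w for v, w in zip(x, weight)]  # Multiply


def silu(x: float) -> float:
    """SiLU activation expressed as Multiply and Sigmoid."""
    return x * sigmoid(x)


def gated_activation(gate: List[float], up: List[float]) -> List[float]:
    """Elementwise part of the gated MLP: silu(gate) * up.

    The gate, up, and down projections around it would each be a Matmul.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
    print(gated_activation([0.0, 1.0], [5.0, 2.0]))
```

Read this way, RMS Norm accounts for the Reciprocal, Sqrt, and Reduce_avg entries, while the MLP accounts for Sigmoid, Multiply, and Matmul.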