[FFE - E2E] Open Llama 3B

No due date 81% complete

The Llama 3B model requires support for a set of core operations.

Ops currently lowered through tt-forge (up to the point of emitting TTIR):

  • Add - Already supported e2e
  • Concatenate - Support required in Forge and MLIR
  • Embedding - Support required in Forge and MLIR
  • Hslice - Should be removed from the model
  • Hstack - Should be removed from the model
  • Matmul - Support required in Forge; MLIR already has it
  • Multiply - Already supported e2e
  • Narrow - Required via the reshape op in both Forge and MLIR
  • Pad_tile - Potentially redundant
  • Reciprocal - Support required in Forge and MLIR
  • Reduce_avg - Support required in Forge; MLIR already has it
  • Sigmoid - Support required in Forge and MLIR
  • Softmax - Already supported e2e
  • Sparse_matmul - Should be removed from the model
  • Sqrt - Support required in Forge and MLIR
  • Squeeze - Required via the reshape op in both Forge and MLIR
  • Tile_broadcast - Potentially redundant
  • Transpose - Currently WIP
  • Unsqueeze - Required via the reshape op in both Forge and MLIR
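
As a rough illustration of what "via the reshape op" could mean for Squeeze and Unsqueeze, the sketch below computes the target shape a reshape would need. The helper names `squeeze_shape` and `unsqueeze_shape` are hypothetical and are not tt-forge or MLIR APIs; raising on a non-unit dimension is a choice made for this sketch only. Narrow is left out because how it maps onto reshape is not described here.

```python
def squeeze_shape(shape, dim):
    """Target shape for a reshape that drops the size-1 dimension `dim`.

    Hypothetical helper for illustration; not part of tt-forge.
    """
    n = len(shape)
    if not -n <= dim < n:
        raise IndexError(f"dim {dim} out of range for rank {n}")
    dim %= n  # normalize negative indices
    if shape[dim] != 1:
        raise ValueError(f"cannot squeeze dim {dim} of size {shape[dim]}")
    return shape[:dim] + shape[dim + 1:]


def unsqueeze_shape(shape, dim):
    """Target shape for a reshape that inserts a size-1 dimension at `dim`.

    Hypothetical helper for illustration; not part of tt-forge.
    """
    n = len(shape)
    if not -(n + 1) <= dim <= n:
        raise IndexError(f"dim {dim} out of range for rank {n}")
    if dim < 0:
        dim += n + 1  # -1 means "append as the last dimension"
    return shape[:dim] + (1,) + shape[dim:]


# Example shapes (arbitrary values, chosen only for the demonstration).
squeezed = squeeze_shape((1, 32, 1, 128), 2)
unsqueezed = unsqueeze_shape((32, 128), 0)
```

Because the element count is unchanged in both cases, the data itself does not move; only the shape metadata differs, which is why a single reshape can stand in for both ops.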

The following basic Llama 3B building blocks should also be supported:

  • Embeddings
  • Self-attention
  • MLP
  • RMS Norm
  • LM head
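
To show how two of these blocks decompose into the listed ops, here is a minimal sketch using the standard definitions of RMS Norm and of the SiLU activation used in Llama-style MLPs. The mapping of blocks to ops is an assumption drawn from those textbook definitions, not from the tt-forge lowering itself, and the function names are hypothetical stand-ins that operate on flat Python lists rather than tensors.

```python
import math

# Hypothetical stand-ins for the listed ops; not tt-forge APIs.
def multiply(a, b):
    return [x * y for x, y in zip(a, b)]

def reduce_avg(a):
    return sum(a) / len(a)

def sqrt(x):
    return math.sqrt(x)

def reciprocal(x):
    return 1.0 / x

def sigmoid(a):
    return [1.0 / (1.0 + math.exp(-x)) for x in a]


def rms_norm(x, weight, eps=1e-6):
    """RMS Norm over one vector: x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = reduce_avg(multiply(x, x))        # Multiply, Reduce_avg
    inv_rms = reciprocal(sqrt(mean_sq + eps))   # Add, Sqrt, Reciprocal
    scaled = multiply(x, [inv_rms] * len(x))    # Multiply (broadcast scalar)
    return multiply(scaled, weight)             # Multiply by learned weight


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return multiply(x, sigmoid(x))              # Sigmoid, Multiply


normed = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
activated = silu([0.0, 1.0])
```

Under these assumptions, RMS Norm alone touches Multiply, Reduce_avg, Add, Sqrt, and Reciprocal, and the MLP activation adds Sigmoid, which is consistent with those ops appearing in the list above.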