[Metal Direct] Implement generic runtime arguments #1632

Open
jdesousa-TT opened this issue Dec 18, 2024 · 1 comment

Comments

@jdesousa-TT
Contributor

  • Runtime args are currently set up on the assumption that we generate one kernel per core, so the same runtime args are multicast to every core that a kernel is dispatched to.

  • Runtime args are always uint32 values, but some of them need pre-processing in the MLIR runtime. They fall into three major groups:

    • Runtime args that need pre-processing by the MLIR runtime (special types, e.g. TensorAddr, Semaphores).
    • Runtime args that can be multicast to the whole core range of the kernel (these may not need to be runtime args at all; we could just compile them in).
    • Runtime args that need to be unicast to different cores running the same kernel.
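
To make the three groups above concrete, here is a minimal sketch of how they could be modeled and lowered to plain uint32 values for one core. Every type and function name in it (`TensorAddrArg`, `SemaphoreArg`, `MulticastArg`, `UnicastArg`, `ResolutionContext`, `resolveArg`, `lowerArgsForCore`) is hypothetical and exists only for illustration; none of them are tt-metal or tt-mlir APIs, and the real flatbuffer schema may look quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical model of the three runtime-arg groups; not a tt-metal API.
using CoreCoord = std::pair<uint32_t, uint32_t>;  // (x, y)

// Group 1: special types the runtime must resolve to an address first.
struct TensorAddrArg { uint32_t tensorIndex; };
struct SemaphoreArg { uint32_t semaphoreIndex; };
// Group 2: one value shared by every core in the kernel's core range.
struct MulticastArg { uint32_t value; };
// Group 3: a different value for each core running the same kernel.
struct UnicastArg { std::map<CoreCoord, uint32_t> valuePerCore; };

using RuntimeArg =
    std::variant<TensorAddrArg, SemaphoreArg, MulticastArg, UnicastArg>;

// Addresses known to the runtime at dispatch time (assumed layout).
struct ResolutionContext {
  std::vector<uint32_t> tensorAddrs;
  std::vector<uint32_t> semaphoreAddrs;
};

// Lower a single arg to the raw uint32 that a given core would receive.
uint32_t resolveArg(const RuntimeArg& arg, const CoreCoord& core,
                    const ResolutionContext& ctx) {
  if (const auto* t = std::get_if<TensorAddrArg>(&arg)) {
    return ctx.tensorAddrs.at(t->tensorIndex);
  }
  if (const auto* s = std::get_if<SemaphoreArg>(&arg)) {
    return ctx.semaphoreAddrs.at(s->semaphoreIndex);
  }
  if (const auto* m = std::get_if<MulticastArg>(&arg)) {
    return m->value;
  }
  // Only the unicast alternative is left.
  const auto& perCore = std::get<UnicastArg>(arg).valuePerCore;
  auto it = perCore.find(core);
  assert(it != perCore.end() && "unicast arg has no value for this core");
  return it->second;
}

// Build the full uint32 argument list for one core.
std::vector<uint32_t> lowerArgsForCore(const std::vector<RuntimeArg>& args,
                                       const CoreCoord& core,
                                       const ResolutionContext& ctx) {
  std::vector<uint32_t> out;
  out.reserve(args.size());
  for (const auto& arg : args) {
    out.push_back(resolveArg(arg, core, ctx));
  }
  return out;
}
```

In this sketch only the `UnicastArg` entries make the lowered list differ between cores, so a runtime could keep multicasting whenever a kernel has no unicast args and fall back to per-core dispatch otherwise.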
@jdesousa-TT
Contributor Author

jdesousa-TT commented Dec 18, 2024

@xanderchin and I discussed the possibility of completely generalizing Semaphores and possibly Tensor Addrs.

In theory, a Semaphore is just a piece of memory that has the same address on every core. Since the compiler already handles memory allocation, we think it could allocate space for semaphores at compile time and avoid the runtime use of tt_metal::CreateSemaphore altogether. That would let us either compile semaphore addresses directly into kernels or store them in the flatbuffer as raw uint32 runtime args.

We would then use the NoC Semaphore APIs as usual, with addresses that are either pre-compiled or passed in at runtime, but skip the runtime initialization step in command_queue.cpp.
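
As a rough illustration of the compile-time allocation idea, the sketch below hands out semaphore addresses from a reserved region with a bump allocator and packs them as raw uint32 runtime args. The class `L1SemaphoreAllocator`, the function `packSemaphoreArgs`, the 4-byte semaphore size, and the region base, size, and alignment used in the usage note are all assumptions made for this example; they are not taken from tt-metal, and the real alignment and initialization requirements would need to be confirmed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical compile-time allocator for semaphore slots. Because the same
// region is reserved on every core, each returned address is valid on all of
// them, which is the property a semaphore needs.
class L1SemaphoreAllocator {
 public:
  L1SemaphoreAllocator(uint32_t base, uint32_t size, uint32_t alignment)
      : next_(base), end_(static_cast<uint64_t>(base) + size),
        alignment_(alignment) {
    // Alignment must be a non-zero power of two for the mask trick below.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  }

  // Returns the next aligned address, or nullopt when the region is full.
  std::optional<uint32_t> allocate() {
    const uint64_t mask = static_cast<uint64_t>(alignment_) - 1;
    const uint64_t aligned = (next_ + mask) & ~mask;
    if (aligned + kSemaphoreBytes > end_) {
      return std::nullopt;
    }
    next_ = aligned + kSemaphoreBytes;
    return static_cast<uint32_t>(aligned);
  }

 private:
  static constexpr uint64_t kSemaphoreBytes = 4;  // assumed: one uint32 slot
  uint64_t next_;
  uint64_t end_;
  uint32_t alignment_;
};

// Allocate `count` semaphores and return their addresses as raw uint32
// values, ready to be stored as runtime args or baked into a kernel.
std::vector<uint32_t> packSemaphoreArgs(L1SemaphoreAllocator& allocator,
                                        uint32_t count) {
  std::vector<uint32_t> addrs;
  addrs.reserve(count);
  for (uint32_t i = 0; i < count; ++i) {
    std::optional<uint32_t> addr = allocator.allocate();
    if (!addr) {
      throw std::runtime_error("semaphore region exhausted");
    }
    addrs.push_back(*addr);
  }
  return addrs;
}
```

For example, with a region starting at 0x1000, 64 bytes long, and 16-byte alignment (all made-up numbers), `packSemaphoreArgs(allocator, 2)` would return the addresses 0x1000 and 0x1010. The sketch does not cover initializing the semaphore values, which is the part currently done at runtime and still needs investigation.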

This is pending further investigation, just putting it here as food for thought.
