I'm writing code to execute some useful kernels for GPT-3 and other transformer models. Once the kernels are complete, I'd like to compare PIM and non-PIM execution latency using the PIMBENCHTESTCASES file. However, looking through the code, I can't find any kernel or execution path for the non-PIM case. Can you give me a tip for mocking or simulating a non-PIM execution kernel? Thanks.
Many different non-PIM cases are possible, depending on the tiling method, the order in which the xPU reads the weight data, and how intermediate results are managed. These cases depend on the configuration of the xPU system: for example, the size of the SPM or the effect of on-chip caches can lead to different optimization techniques. For the non-PIM system, we used an oracle approach that assumes sequential reads of the weight data (see genMemTraffic()), which hides the computational overhead. Note, however, that while this approach works for GEMV, it cannot be used for operations with high data locality.
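To make the oracle idea concrete, here is a minimal Python sketch of a sequential-read baseline for a GEMV weight matrix. It is not the simulator's genMemTraffic() implementation. The names `MemConfig`, `sequential_weight_reads`, and `oracle_latency_ns`, as well as the burst size and bandwidth values, are hypothetical and chosen only for illustration. The sketch assumes that compute is fully overlapped with memory traffic and that every weight is read exactly once at peak bandwidth.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class MemConfig:
    """Hypothetical memory parameters (assumed, not taken from the simulator)."""
    burst_bytes: int = 32                # bytes transferred per read burst
    peak_bandwidth_gbps: float = 256.0   # peak bandwidth in GB/s


def sequential_weight_reads(rows: int, cols: int, elem_bytes: int,
                            cfg: MemConfig, base_addr: int = 0) -> Iterator[int]:
    """Yield burst-aligned addresses that stream the weight matrix once, in order."""
    total_bytes = rows * cols * elem_bytes
    num_bursts = -(-total_bytes // cfg.burst_bytes)  # ceiling division
    for i in range(num_bursts):
        yield base_addr + i * cfg.burst_bytes


def oracle_latency_ns(rows: int, cols: int, elem_bytes: int,
                      cfg: MemConfig) -> float:
    """Lower-bound latency: bytes moved divided by peak bandwidth.

    Compute time is assumed to be hidden behind the memory reads, which is
    only reasonable for memory-bound operations with no weight reuse (GEMV).
    """
    num_bursts = sum(1 for _ in sequential_weight_reads(rows, cols, elem_bytes, cfg))
    bytes_moved = num_bursts * cfg.burst_bytes
    # 1 GB/s equals 1 byte/ns, so bytes / (GB/s) gives nanoseconds.
    return bytes_moved / cfg.peak_bandwidth_gbps


if __name__ == "__main__":
    cfg = MemConfig()
    # Example: a 4096 x 4096 FP16 weight matrix.
    print(oracle_latency_ns(4096, 4096, 2, cfg), "ns")
```

For an operation with high data locality (for example, a GEMM in which weights are reused across many inputs), this estimate would not be a fair baseline, because caches or the SPM would remove much of the traffic and compute would no longer be hidden.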