fix bug in tl.store mask for kernel _to_fp8_row_major_t_and_non_t
ghstack-source-id: 270cf799161c7cca6822e7c0f9c511e618321dab
ghstack-comment-id: 2575984684
Pull Request resolved: #1516
danielvegamyhre committed Jan 8, 2025
1 parent 367aea3 commit 878c886
Showing 1 changed file with 2 additions and 2 deletions.
@@ -375,8 +375,8 @@ def _to_fp8_row_major_t_and_non_t(
         block_col_offs[:, None] * row_major_t_out_stride_row
         + block_row_offs[None, :] * row_major_t_out_stride_col
     )
-    mask = (block_row_offs[:, None] < row_major_t_num_rows) & (
-        block_row_offs[None, :] < row_major_t_num_cols
+    mask = (block_col_offs[:, None] < row_major_t_num_rows) & (
+        block_row_offs[None, :] < row_major_t_num_cols
     )
     tl.store(row_major_t_out_ptr + row_major_t_offs, fp8_vals.trans(1, 0), mask=mask)
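
Why the mask has to use the same index layout as the store offsets can be checked on the CPU. The following is a minimal NumPy sketch, not the kernel itself: the function name `emulate_transposed_store`, the single square `block` size, the contiguous output strides, and the plain Python loop over program IDs are all assumptions made for illustration. Only the offset and mask expressions mirror the diff above.

```python
import numpy as np


def emulate_transposed_store(x, block, use_fixed_mask):
    """Emulate the masked, transposed tile store on the CPU (hypothetical helper).

    x is the (num_rows, num_cols) input; the output has shape (num_cols, num_rows).
    """
    num_rows, num_cols = x.shape
    t_num_rows, t_num_cols = num_cols, num_rows  # shape of the transposed output
    out = np.zeros((t_num_rows, t_num_cols), dtype=x.dtype)
    flat = out.reshape(-1)  # flat view standing in for the output pointer
    stride_row, stride_col = t_num_cols, 1  # assumed contiguous row-major output

    for pid_row in range(-(-num_rows // block)):  # ceil division
        for pid_col in range(-(-num_cols // block)):
            block_row_offs = pid_row * block + np.arange(block)
            block_col_offs = pid_col * block + np.arange(block)

            # Masked load of the input tile; out-of-bounds lanes stay zero.
            in_mask = (block_row_offs[:, None] < num_rows) & (
                block_col_offs[None, :] < num_cols
            )
            tile = np.zeros((block, block), dtype=x.dtype)
            r, c = np.nonzero(in_mask)
            tile[r, c] = x[block_row_offs[r], block_col_offs[c]]

            # Same offset expression as in the diff: axis 0 walks output rows
            # (input columns), axis 1 walks output columns (input rows).
            offs = (
                block_col_offs[:, None] * stride_row
                + block_row_offs[None, :] * stride_col
            )
            if use_fixed_mask:
                mask = (block_col_offs[:, None] < t_num_rows) & (
                    block_row_offs[None, :] < t_num_cols
                )
            else:
                # Pre-fix mask: offsets are compared against the wrong bounds.
                mask = (block_row_offs[:, None] < t_num_rows) & (
                    block_col_offs[None, :] < t_num_cols
                )
            flat[offs[mask]] = tile.T[mask]
    return out


x = np.arange(1, 16, dtype=np.float32).reshape(3, 5)
fixed = emulate_transposed_store(x, block=4, use_fixed_mask=True)
buggy = emulate_transposed_store(x, block=4, use_fixed_mask=False)
print(np.array_equal(fixed, x.T))
print(np.array_equal(buggy, x.T))
```

For this 3x5 input with a block size of 4, the pre-fix mask is all false in the second column tile, so the last row of the transposed output is never written. With the corrected mask the emulated output matches `x.T`. For other shapes the pre-fix mask could also select out-of-range offsets, which this sketch does not guard against.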
