This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Commit

[BesTLA] AVX2: Use loaded registers of B. (#151)
parvizmp authored Mar 5, 2024
1 parent 750b356 commit aa4a8ab
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion bestla/bestla/bestla_gemm.h
@@ -2716,7 +2716,7 @@ class AvxvnniN8P4 : protected bestla::xbyak::JitAvxvnni {
vpbroadcastd(vreg_t(AReg), ptr[reg_tmp1]);
add(reg_tmp1, reg_astride);
for (int i = 0; i < NRegs; i++) {
-vpdpbusds_(vreg_t(CReg + mm * NRegs + i), vreg_t(AReg), ptr[reg_matBptr + kk * BKStepSize + i * VecBytes]);
+vpdpbusds_(vreg_t(CReg + mm * NRegs + i), vreg_t(AReg), vreg_t(BReg + i));
}
}
}
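
The effect of the change can be sketched with a portable scalar model. This is only an illustration, not the BesTLA JIT code: it assumes, as the commit title suggests, that the B vectors for the current k-step are already held in registers `BReg + i` before the `mm` loop runs, so the memory operand re-read the same bytes for every row. The tile sizes (`MRegs`, `NRegs`), the `Result` struct, `run_tile`, and `demo` are hypothetical names chosen for this sketch; `dpbusds_lane` models one 32-bit lane of `vpdpbusds` (four unsigned-byte by signed-byte products, accumulated with signed saturation).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical tile shape for illustration; the real kernel picks its own values.
constexpr int MRegs = 2;      // rows of A handled per step
constexpr int NRegs = 2;      // B vectors per step
constexpr int VecBytes = 32;  // one 256-bit AVX2 register
constexpr int Lanes = VecBytes / 4;

// Scalar model of one 32-bit lane of vpdpbusds:
// acc + sum of four (unsigned byte * signed byte) products, saturated to int32.
inline int32_t dpbusds_lane(int32_t acc, const uint8_t* a, const int8_t* b) {
  int64_t sum = acc;
  for (int j = 0; j < 4; ++j) {
    sum += static_cast<int32_t>(a[j]) * static_cast<int32_t>(b[j]);
  }
  const int64_t hi = std::numeric_limits<int32_t>::max();
  const int64_t lo = std::numeric_limits<int32_t>::min();
  if (sum > hi) sum = hi;
  if (sum < lo) sum = lo;
  return static_cast<int32_t>(sum);
}

struct Result {
  std::array<int32_t, MRegs * NRegs * Lanes> c{};  // accumulator "registers"
  int b_loads = 0;                                 // vector reads of B from memory
};

// reuse_b == false models the old line (memory operand inside the mm loop);
// reuse_b == true models the new line (B taken from already loaded registers).
inline Result run_tile(bool reuse_b,
                       const std::array<uint8_t, MRegs * 4>& a,
                       const std::array<int8_t, NRegs * VecBytes>& b) {
  Result r;
  std::array<std::array<int8_t, VecBytes>, NRegs> breg{};
  if (reuse_b) {
    for (int i = 0; i < NRegs; ++i) {  // load each B vector exactly once
      std::copy_n(b.begin() + i * VecBytes, VecBytes, breg[i].begin());
      ++r.b_loads;
    }
  }
  for (int mm = 0; mm < MRegs; ++mm) {
    const uint8_t* areg = &a[mm * 4];  // models vpbroadcastd of 4 bytes of A
    for (int i = 0; i < NRegs; ++i) {
      const int8_t* bsrc;
      if (reuse_b) {
        bsrc = breg[i].data();
      } else {
        bsrc = &b[i * VecBytes];  // re-read from memory for every row
        ++r.b_loads;
      }
      for (int lane = 0; lane < Lanes; ++lane) {
        int32_t& c = r.c[(mm * NRegs + i) * Lanes + lane];
        c = dpbusds_lane(c, areg, bsrc + lane * 4);
      }
    }
  }
  return r;
}

// Usage: both variants compute the same accumulators; only the B reads differ.
inline void demo() {
  std::array<uint8_t, MRegs * 4> a{};
  std::array<int8_t, NRegs * VecBytes> b{};
  for (int k = 0; k < MRegs * 4; ++k) a[k] = static_cast<uint8_t>(200 + k);
  for (int k = 0; k < NRegs * VecBytes; ++k) b[k] = static_cast<int8_t>((k % 7) - 3);
  const Result mem = run_tile(false, a, b);
  const Result reg = run_tile(true, a, b);
  assert(mem.c == reg.c);
  assert(mem.b_loads == MRegs * NRegs);
  assert(reg.b_loads == NRegs);
}
```

In this model the memory-operand variant reads B `MRegs * NRegs` times per step, while the register variant reads it `NRegs` times. Whether that translates into a measurable speedup on real hardware depends on cache behavior and port pressure, which the sketch does not model.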