ARM64-SVE: LeadingSignCount, LeadingZeroCount, PopCount #102548
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
I get some failures in the stress test due to [...]. In the meantime, marking this as ready for review. @dotnet/arm64-contrib @kunalspathak
LGTM. Thanks!
@@ -68,6 +68,8 @@ HARDWARE_INTRINSIC(Sve, FusedMultiplyAddNegated,
HARDWARE_INTRINSIC(Sve, FusedMultiplySubtract, -1, -1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_fmls, INS_sve_fmls}, HW_Category_SIMD, HW_Flag_Scalable|HW_Flag_EmbeddedMaskedOperation|HW_Flag_HasRMWSemantics|HW_Flag_LowMaskedOperation|HW_Flag_FmaIntrinsic|HW_Flag_SpecialCodeGen)
HARDWARE_INTRINSIC(Sve, FusedMultiplySubtractBySelectedScalar, -1, 4, true, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_fmls, INS_sve_fmls}, HW_Category_SIMDByIndexedElement, HW_Flag_Scalable|HW_Flag_HasImmediateOperand|HW_Flag_HasRMWSemantics|HW_Flag_FmaIntrinsic|HW_Flag_LowVectorOperation)
HARDWARE_INTRINSIC(Sve, FusedMultiplySubtractNegated, -1, -1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_fnmls, INS_sve_fnmls}, HW_Category_SIMD, HW_Flag_Scalable|HW_Flag_EmbeddedMaskedOperation|HW_Flag_HasRMWSemantics|HW_Flag_LowMaskedOperation|HW_Flag_FmaIntrinsic|HW_Flag_SpecialCodeGen)
HARDWARE_INTRINSIC(Sve, LeadingSignCount, -1, -1, false, {INS_sve_cls, INS_invalid, INS_sve_cls, INS_invalid, INS_sve_cls, INS_invalid, INS_sve_cls, INS_invalid, INS_invalid, INS_invalid}, HW_Category_SIMD, HW_Flag_Scalable|HW_Flag_BaseTypeFromFirstArg|HW_Flag_EmbeddedMaskedOperation)
These two (LeadingSignCount and LeadingZeroCount) are missing the HW_Flag_LowMaskedOperation flag.
I really did think I had added that! Thanks for fixing.
/ba-g Issue seems to be #100174
* ARM64-SVE: LeadingSignCount + LeadingZeroCount
* Add popcount
* Fix PlatformNotSupported
* Fix summary headers for popcount
* Use SveSimpleVecOpTest for unsigned popcounts
* Add HW_Flag_LowMaskedOperation() to LeadingSignCount() and LeadingZeroCount()
---------
Co-authored-by: Kunal Pathak <[email protected]>
_SveMasklessUnaryOpTestTemplate is the same as _SveUnaryOpTestTemplate, but without the Conditional Select tests.
For all of the APIs that use it, the API method has different Op1 and return types, so it does not fit into a standard conditional select test.
For LeadingSignCount and LeadingZeroCount, the underlying instruction does not have a mask, so it can't be optimised using conditional select.
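
As a rough illustration of what the three APIs in this PR compute per element, here is a scalar C++ sketch for a single 32-bit lane. The helper names (LeadingZeroCountRef, LeadingSignCountRef, PopCountRef) are hypothetical and are not part of the runtime or its test templates; the real intrinsics operate on whole SVE vectors. LeadingSignCountRef takes a signed input and returns an unsigned count, which is meant to mirror the differing Op1 and return types mentioned above.

```cpp
#include <cstdint>

// Hypothetical scalar reference helpers for one 32-bit lane.
// They are constexpr so results can be checked at compile time.

// CLZ: number of zero bits above the highest set bit (32 when value is 0).
constexpr uint32_t LeadingZeroCountRef(uint32_t value)
{
    uint32_t count = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (value & bit) == 0; bit >>= 1)
    {
        count++;
    }
    return count;
}

// CLS: number of consecutive bits directly below the sign bit that equal
// the sign bit. The sign bit itself is not counted, so the maximum is 31.
constexpr uint32_t LeadingSignCountRef(int32_t value)
{
    const uint32_t bits = static_cast<uint32_t>(value);
    const uint32_t sign = bits >> 31;
    uint32_t count = 0;
    for (int i = 30; i >= 0 && ((bits >> i) & 1u) == sign; i--)
    {
        count++;
    }
    return count;
}

// CNT: number of set bits in the lane.
constexpr uint32_t PopCountRef(uint32_t value)
{
    uint32_t count = 0;
    while (value != 0)
    {
        count += value & 1u;
        value >>= 1;
    }
    return count;
}
```

For example, under these definitions LeadingZeroCountRef(1) is 31, LeadingSignCountRef(-1) is 31, and PopCountRef(0xFF) is 8.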