Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CMO extension intrinsics #93

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Liaoshihua
Copy link
Contributor

Add CMO extension intrinsics description.

src/c-api.adoc Outdated
|`+void __riscv_cbo_flush(void *addr);+` |`cbo.flush` |Zicbom
|`+void __riscv_cbo_inval(void *addr);+` |`cbo.inval` |Zicbom
|`+void __riscv_cbo_zero(void *addr);+` |`cbo.zero` |Zicboz
|`+void __riscv_cbo_prefetch(void *addr, const int rw, const int locality);+` |`prefetch.[r][w]` |Zicbop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is __riscv_cbo_prefetch a wrapper around __builtin_prefetch where there are 4 possible localities? RISC-V supports 5 localities, 4 NTL hint prefixes and 1 without any prefix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is __builtin_prefetch's wrapper. The 4 NTL hint maping had been included in Prefetch Intrinsics part on this doc. So, this commit only support a wrapper for __builtin_prefetch on zicbop extension.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document how the different rw/locality map to different instruction sequences?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document how the different rw/locality map to different instruction sequences?

Do you mean to expand to describe these mappings?
Ok, I will send a new version tomorrow.

@yulong18 yulong18 mentioned this pull request Oct 26, 2024
@yulong18
Copy link

I follow the documentation to set the __riscv_cbo_prefetchi parameter type, but I get such an error that is invalid argument to built-in function. So, I think the authour need to redefine the parameter type.

@topperc
Copy link
Contributor

topperc commented Oct 26, 2024

I follow the documentation to set the __riscv_cbo_prefetchi parameter type, but I get such an error that is invalid argument to built-in function. So, I think the authour need to redefine the parameter type.

The rw and locality argument to __builtin_prefetch must be compile time constants. You'll need to use a macro instead of a function.

src/c-api.adoc Outdated
----
void __riscv_cbo_prefetch(void *addr, const int rw, const int locality)
{
__builtin_prefetch(addr, rw, locality);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to explain that locality==0, adds nt.all hint. locality==1, adds ntl.pall hint. locality==2, adds ntl.p1 hint.

src/c-api.adoc Outdated
|`+void __riscv_cbo_flush(void *addr);+` |`cbo.flush` |Zicbom |
|`+void __riscv_cbo_inval(void *addr);+` |`cbo.inval` |Zicbom |
|`+void __riscv_cbo_zero(void *addr);+` |`cbo.zero` |Zicboz |
|`+void __riscv_cbo_prefetch(void *addr, const int rw, const int locality);+` |`prefetch.[r][w]` |Zicbop | `rw`= [0,1], `locality` = [0..4].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

locality is only [0..3] right?

src/c-api.adoc Outdated
[%autowidth]
|===
|*Prototype* |*Instruction* |*Extension* |*Notes*
|`+void __riscv_cbo_clean(void *addr);+` |`cbo.clean` |Zicbom |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch that landed in gcc uses __riscv_cmo_clean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,it uses _riscv_cmo* in gcc

@Liaoshihua Liaoshihua force-pushed the master branch 5 times, most recently from a60f03e to 57ee88f Compare December 19, 2024 16:01
@cmuellner
Copy link
Collaborator

So, we have the situation that the intrinsics landed in GCC about two months ago (d2c8548e0ce51dac6bc51d37236c50f98fca82f0). But we still have valid comments here, which we should take care of.
The good thing is that the recent PR update brought this PR almost into sync with the GCC implementation.

Before going deeper into the details, I'd like to mention that GCC has a generic builtin for prefetching, which also supports locality information in the form of a compile-time constant in the range of zero to three (see https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html). A mapping from these arch-independent locality levels to NTL instructions was done in summer 2024 (bf26413fc4081dfd18b915580b35bdb71481327e): 0 -> ntl.all, 1 -> ntl.pall, 2 -> ntl.p1.

The current GCC implementation of __riscv_cmo_prefetch() just invokes the arch-independet __builtin_prefetch() with the same arguments. This means that GCC does currently not support ntl.s1. Further, GCC does not support emitting a prefetch instruction without a preceding NTL instruction if Zihintntl is enabled. In case Zihintntl is not enabled, the locality information is ignored.

I don't think a 1:1 mapping from GCC arch-independent locality encodings to __riscv_cmo_prefetch() is necessary. Instead, I propose the following encoding:

#define __riscv_ntl_undefined 0
#define __riscv_ntl_p1 1
#define __riscv_ntl_pall 2
#define __riscv_ntl_s1 3
#define __riscv_ntl_all 4

Other compile-time constants for the locality parameter of __riscv_cmo_prefetch() should trigger a compile-time error. The behavior of __riscv_cmo_prefetch() could be the same regardless of whether Zihintntl is enabled or not. I.e., if the locality parameter is not zero, we can always emit the hint instruction.

Let me know what you think.

@Liaoshihua
Copy link
Contributor Author

Liaoshihua commented Dec 20, 2024

I wrote this proposal with the intention of addressing: https://github.com/riscv-non-isa/riscv-c-api-doc/issues/88,

see https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html)

I noticed that GCC has __builtin_prefetch, so __riscv_cmo_prefetch may not be necessary.

#define __riscv_ntl_undefined 0
#define __riscv_ntl_p1 1
#define __riscv_ntl_pall 2
#define __riscv_ntl_s1 3
#define __riscv_ntl_all 4

The following are included in

=== NTLH Intrinsics

So, I would remove prefetch in this PR and only include __riscv_cmo_prefetchi, __riscv_cmo_clean, __riscv_cmo_inval, __riscv_cmo_flush, and __riscv_cmo_zero in riscv_cmo.h.

What are your thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants