Proposal: simd_broadcast intrinsic #2031
cshenton started this conversation in Ideas/Requests
-
An alternative that addresses just the scalar broadcast case would be to extend the existing simd intrinsic operations to accept scalar inputs and broadcast them for you. The following would then be allowed:
import "core:simd"
a := 2.5
b := 3.5
vec := #simd[4]f64{1.0, 2.0, 3.0, 4.0}
x := a * vec // {2.5, 5, 7.5, 10}
y := simd.min(a, vec) // {1, 2, 2.5, 2.5}
z := simd.clamp(vec, a, b) // {2.5, 2.5, 3.0, 3.5}
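For comparison, here is a rough sketch of how the same three results could be written without implicit scalar broadcast, widening each scalar by hand first. The helper `splat_f64x4` is hypothetical (it is not part of `core:simd`), and the sketch assumes `simd.mul`, `simd.min`, `simd.clamp` and `simd.to_array` behave as they do in current `core:simd`. Nothing here makes any claim about the code the compiler generates.

```odin
package main

import "core:fmt"
import "core:simd"

// Hypothetical helper, not part of core:simd: copies one scalar into all four lanes.
splat_f64x4 :: proc(s: f64) -> simd.f64x4 {
	return simd.f64x4{s, s, s, s}
}

main :: proc() {
	a := 2.5
	b := 3.5
	vec := simd.f64x4{1.0, 2.0, 3.0, 4.0}

	// Each scalar operand is widened explicitly before the element-wise operation.
	x := simd.mul(splat_f64x4(a), vec)
	y := simd.min(splat_f64x4(a), vec)
	z := simd.clamp(vec, splat_f64x4(a), splat_f64x4(b))

	// Convert back to fixed arrays so the lanes can be printed.
	fmt.println(simd.to_array(x))
	fmt.println(simd.to_array(y))
	fmt.println(simd.to_array(z))
}
```

The suggestion above would amount to letting the compiler insert the `splat_f64x4` step automatically whenever one operand is a scalar.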
-
Bill has kindly pointed out to me on the Discord a shorthand for scalar broadcast that is already available in the language:
v4f32 :: #simd[4]f32
x := v4f32{0..<4 = 3}
y := v4f32(3)
I'd add that the same works with the vector types from core:simd:
import "core:simd"
y := simd.f32x4(3)
Funnily enough, that largely covers my needs and is preferable to my suggested broadcast syntax.
-
Problem
It would be useful to have a compiler intrinsic for creating #simd vectors from scalar and other vector values in a way that guarantees code gen similar to the equivalent Intel/ARM intrinsics.
For example, initialising a 4-wide single-precision float vector with a scalar float value has dedicated C intrinsics, such as _mm_set1_ps(x) on Intel.
Another example is broadcasting the lower two elements of a 4-wide float vector to an 8-wide float vector.
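To make the two cases concrete, here is a minimal sketch of what they look like in Odin today using only compound literals. Both helper names are hypothetical, and the lane order in the second one (the pair repeated four times) is an assumption about what "broadcasting the lower two elements" is meant to produce. The sketch shows only the values involved. Whether it compiles down to a single broadcast instruction is exactly the guarantee the proposal asks for, and is not claimed here.

```odin
package main

import "core:fmt"
import "core:simd"

// Hypothetical helper: scalar -> 4-wide f32 vector with every lane equal to s.
splat_f32x4 :: proc(s: f32) -> simd.f32x4 {
	return simd.f32x4{s, s, s, s}
}

// Hypothetical helper: repeats lanes 0 and 1 of a 4-wide vector across 8 lanes.
// Assumed lane order: {v0, v1, v0, v1, v0, v1, v0, v1}.
broadcast_lo2_f32x8 :: proc(v: simd.f32x4) -> simd.f32x8 {
	lanes := simd.to_array(v)
	return simd.f32x8{
		lanes[0], lanes[1], lanes[0], lanes[1],
		lanes[0], lanes[1], lanes[0], lanes[1],
	}
}

main :: proc() {
	s := splat_f32x4(3)
	w := broadcast_lo2_f32x8(simd.f32x4{1, 2, 3, 4})

	// Convert back to fixed arrays so the lanes can be printed.
	fmt.println(simd.to_array(s))
	fmt.println(simd.to_array(w))
}
```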
Potential solution
Odin currently has a swizzle intrinsic, but it only supports creating vectors of the same length or shorter, so calls that ask for more output elements than the input has do not compile. This largely makes sense and matches swizzle behaviour in shader languages.
I propose a new intrinsic, simd_broadcast, for creating simd vectors larger than their inputs, which provides guarantees about generating code similar to the Intel/ARM intrinsics in C. To be complementary with swizzle, it should fail to compile unless the output vector is larger than the input.
The intrinsic could be named broadcast, but I'm not proposing that this operation work on or produce regular arrays like swizzle does, so the simd_ prefix (and an alias in core:simd as simd.broadcast) is what I'm proposing to avoid any ambiguity.
Problems with solution
An alternative would be to keep using #simd[4]T{s, s, s, s} and document code gen guarantees for scalar values.
simd_broadcast(x, 0, 0, 0, 0) is more characters than _mm_set1_ps(x); if being more terse than C is a goal, this might need rethinking.