Consider this round-trip unit conversion, using both the nholthaus library and Au.
I expected these to emit basically equivalent code, but was surprised to see a huge difference in the number of instructions. This translates into an actual runtime performance penalty, which is likely avoidable. (That said: I highly doubt that unit conversions should ever occur in the "hot loop" of a well designed program, so this is probably not a meaningful performance penalty.)
Here's a godbolt link using clang 16.0.0.
For nholthaus, we see two things. First, that it's multiplying and dividing by pi and 180, instead of combining them into a single factor pi / 180 at compile time. Second, that it emits a surprisingly large number of instructions that I can't explain (I'm not well versed in assembly):
For Au, we can see that the factors are combined into one (we see ~57.3, and its inverse). And we emit only two instructions:
Here's the godbolt link for gcc 13.2. I didn't use this one first because the other one actually generates comments to show you what values are being used, which is nice.
Anyway, we can see the nholthaus code looks much more reasonable for gcc than for clang, although it still emits more instructions than Au. Here it is:
And here's what we get for Au (still just two instructions):
What's the upshot? I guess it would be nice to consider combining the conversion factors into a single value, computed at compile time. This doc on applying conversion factors may be useful reading here.
I'm also curious why clang emits so much more code than gcc does, but I assume if we switched to a single conversion factor then this would all go away and the point would be moot. (Although it'd be interesting if we found that it didn't!)
Please include the following information in your issue:
Which version of units you are using
The current master.
Which compiler exhibited the problem (including compiler version)
clang 16.0.0 and gcc 13.2