This repository has been archived by the owner on Feb 2, 2024. It is now read-only.
Initial base for AltFP #2
base: main
Changes from 8 commits
Is the intention here to support dot product as a packed-SIMD style operation, or an application of FMA?
Upon reading the rest of the spec, it seems like this is intended to be used in packed-SIMD style ops. The spec should probably also define other operations on multiple packed BF16 operands (at least a note on how to load/store them - probably using standard FLW/FLD?) and how this is intended to work in general; I assume it's useful to think about this for all operations, since just using 16 bits of a 32- or 64-bit register seems a bit wasteful.
Yes, the intention is for dot product operations, as that is what BF16 is usually used for. It was requested that we start with the base and then we can move on to operations. I anticipate that these operations will be most useful in Vector.
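A rough sketch of what such a packed BF16 dot product could look like in software: two BF16 elements per 32-bit word, with products accumulated in FP32. The packing layout and helper names are illustrative assumptions, not something the spec defines.

```python
import struct

def bf16_to_fp32(h: int) -> float:
    """Widen a BF16 encoding to FP32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]

def bf16_dot2(a_packed: int, b_packed: int, acc: float) -> float:
    """acc += a[0]*b[0] + a[1]*b[1], with each BF16 element widened
    to FP32 before multiplying (element 0 in the low halfword)."""
    for shift in (0, 16):
        a = bf16_to_fp32(a_packed >> shift)
        b = bf16_to_fp32(b_packed >> shift)
        acc += a * b
    return acc

# (1.0, 2.0) . (3.0, 4.0) = 11.0
# BF16 encodings: 1.0=0x3F80, 2.0=0x4000, 3.0=0x4040, 4.0=0x4080
assert bf16_dot2(0x40003F80, 0x40804040, 0.0) == 11.0
```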
I think you got these switched around: BF16 is the same as FP32, but with 16 fewer fraction bits: https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus
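To make that relationship concrete, here is a minimal sketch of how BF16 relates to FP32 at the bit level, assuming simple truncation for the narrowing direction (an actual spec would define rounding):

```python
import struct

def fp32_to_bf16_trunc(x: float) -> int:
    """Truncate an FP32 value to BF16 by keeping the top 16 bits.
    BF16 shares FP32's sign bit and 8 exponent bits; only 7 fraction
    bits remain (16 fewer than FP32's 23)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16

def bf16_to_fp32(h: int) -> float:
    """Widening is exact: zero-fill the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", h << 16))[0]

# 1.0 is 0x3F800000 in FP32, so its BF16 encoding is 0x3F80
assert fp32_to_bf16_trunc(1.0) == 0x3F80
assert bf16_to_fp32(0x3F80) == 1.0
```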
You are correct. Somehow I swapped them.
FP32 only has 8 exponent bits, not 9 - as it's written now the sum of bits would be 33.
There is an implied 1 bit in there, so technically you get the effect of 33 bits, but that goes into the fraction 🙂
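The field-width arithmetic from the exchange above can be written out directly; this is just IEEE 754 binary32 bookkeeping, not anything new from the spec:

```python
# FP32 field widths per IEEE 754 binary32
SIGN, EXP, FRAC = 1, 8, 23
assert SIGN + EXP + FRAC == 32   # stored bits sum to 32, not 33

# The implied leading 1 adds a bit of precision to the significand,
# which is where the "effect of 33 bits" comes from
SIGNIFICAND = FRAC + 1
assert SIGNIFICAND == 24

# BF16 keeps the same sign and exponent widths but drops 16 fraction bits
BF16_FRAC = FRAC - 16
assert SIGN + EXP + BF16_FRAC == 16
```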
Thanks for catching the typo. I will fix.
Thanks for catching. I fixed this but hadn't added it before committing. The latest pull request should look better (for this file anyway).
No problem, I am trying to do a full review and listing typos / questions along the way.
Does this mean we intend to have a static rounding mode forced to RNE by default, and only allow static (opcode) or dynamic (CSR) rounding-mode selection on a specific subset of instructions? This seems to contradict the F and D extensions and should be justified here IMHO.
Yes, these instructions would have a static rounding mode that is not overridable. Yes, this is different from '754. However, it is a common simplification (just like flushing subnormals). If someone needs more control over the rounding mode they can run in SP (F).
I agree that we will need to provide a detailed justification in the specification for this simplification.
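For reference, a common software model of FP32-to-BF16 conversion under a fixed RNE mode is the add-half-ulp bit trick below. The NaN-quieting choice is an assumption for illustration; the spec would define its own NaN behaviour.

```python
import struct

def fp32_to_bf16_rne(x: float) -> int:
    """Convert FP32 to BF16 with round-to-nearest-even (the fixed,
    non-overridable mode discussed above), via the standard trick of
    adding 0x7FFF plus the LSB of the would-be result, then truncating."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Pass NaNs through with the quiet bit forced (a common convention,
    # assumed here, not taken from the spec)
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return (bits >> 16) | 0x0040
    lsb = (bits >> 16) & 1
    return (bits + 0x7FFF + lsb) >> 16

assert fp32_to_bf16_rne(1.0) == 0x3F80
# 1 + 1/256 lies exactly halfway between 0x3F80 and 0x3F81:
# ties go to the even neighbour, 0x3F80
assert fp32_to_bf16_rne(1.00390625) == 0x3F80
```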
This formulation may not be future proof, we may want to cite explicitly the basic floating-point extensions here.
Which area are you concerned about: the rounding mode, default exception handling, or both?
Should the need arise, an extension could be added that allows the rounding mode to be changed by the CSR.
The handling of exceptions via the IEEE default is common across RISC-V. Is this what you mean about citing the basic floating-point extensions?
At some point there might be a TG that creates trapped exceptions for FP instructions - but right now only default is supported.
I am concerned that another floating-point extension may be introduced with a different way of managing FP exception flags, making "as all other floating-point instructions ..." misleading. So mentioning explicitly that we intend to manage them as in extensions F and D (and Q) clarifies things if such an extension should appear at some point. I agree that the usefulness of such a remark may be limited.
Does "better" mean "accurate" here?
Yes, it does. I will clarify and elaborate (a little). Also, thanks for catching the typos.
We could even add that one BFloat16 multiply can be done with less logic than one FP16 multiply (mostly due to multiplier area gains).
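A back-of-the-envelope check on that claim, using the significand widths (implied bit included) and the rough rule that array-multiplier area scales with the product of the operand widths; the scaling rule is an approximation, not a synthesis result:

```python
# Significand widths, including the implied leading 1
BF16_SIG = 7 + 1    # BF16: 7 stored fraction bits
FP16_SIG = 10 + 1   # FP16 (binary16): 10 stored fraction bits

# Rough area proxy: product of multiplier operand widths
bf16_area = BF16_SIG * BF16_SIG   # 64 partial-product cells
fp16_area = FP16_SIG * FP16_SIG   # 121 partial-product cells
assert bf16_area < fp16_area
```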