TBox #150

vxst · 2024-05-18T07:24:23Z

Since a float8 type (whichever flavor) has only 256 possible values, we can use a technique similar to the S-box in AES: precompute the conversion mapping once and look it up when performing conversions.

Since conversion is one of the most frequently used float8 operations, this should noticeably improve performance for this library. The table can be used both for conversions between float8 types and for conversions from float8 to float16/32/64.
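To make the idea concrete, here is a minimal sketch in Python rather than the library's C++. It assumes the float8 E4M3FN layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities), and the names `decode_e4m3fn`, `E4M3FN_TO_FLOAT`, and `e4m3fn_to_float` are made up for illustration; it is not the library's code.

```python
import math


def decode_e4m3fn(bits: int) -> float:
    """Decode one float8 E4M3FN bit pattern (0..255) into a Python float."""
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF  # 4 exponent bits, bias 7
    man = bits & 0x7         # 3 mantissa bits
    if exp == 0xF and man == 0x7:
        return math.nan      # E4M3FN has no infinities; S.1111.111 is NaN
    if exp == 0:
        return sign * math.ldexp(man, -9)        # subnormal: (man / 8) * 2**-6
    return sign * math.ldexp(8 + man, exp - 10)  # (1 + man / 8) * 2**(exp - 7)


# Built once up front; every later conversion is a single index operation.
E4M3FN_TO_FLOAT = [decode_e4m3fn(b) for b in range(256)]


def e4m3fn_to_float(bits: int) -> float:
    """Table-based conversion: same results as decode_e4m3fn, no arithmetic."""
    return E4M3FN_TO_FLOAT[bits]
```

The slow decoder runs exactly 256 times, and the fast path never touches the exponent or mantissa logic again.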

I plan to implement it on top of the current ConvertImpl, with a new struct ConvertTable. ConvertTable uses ConvertImpl to compute the mapping, so the behavior will be exactly the same, just much faster. I plan to build the table at the RegisterTwoWayCustomCast stage and use it whenever the source type of the two-way cast is 8 bits or less.
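The plan can be sketched roughly as follows, again in Python rather than C++. The `ConvertTable` class here is a hypothetical stand-in for the proposed struct, and `e5m2_to_float` is a stand-in for the existing direct converter (it relies on float8 E5M2 being the high byte of an IEEE binary16 value); neither reflects the actual ConvertImpl or RegisterTwoWayCustomCast code.

```python
import math
import struct


def e5m2_to_float(bits: int) -> float:
    """Stand-in direct converter: an E5M2 byte is the high byte of a binary16."""
    return struct.unpack("<e", bytes([0, bits]))[0]


class ConvertTable:
    """Hypothetical helper that caches a converter's output for small sources."""

    def __init__(self, convert, src_bits: int):
        self._convert = convert
        # Tabulate only when the source type has 8 bits or fewer;
        # wider sources keep using the direct converter.
        self._table = (
            [convert(b) for b in range(1 << src_bits)] if src_bits <= 8 else None
        )

    def __call__(self, bits: int) -> float:
        if self._table is not None:
            return self._table[bits]
        return self._convert(bits)


def same_value(a: float, b: float) -> bool:
    """Equality that also treats two NaNs as equal."""
    return a == b or (math.isnan(a) and math.isnan(b))


# Built once, at the point where the cast would be registered.
fast_e5m2_to_float = ConvertTable(e5m2_to_float, src_bits=8)
```

Because the table is filled by calling the direct converter, an exhaustive check over all 256 inputs (using `same_value` to handle NaN) is enough to confirm the two paths agree.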

Is there anything I need to pay attention to, or do you have any advice (e.g., regarding naming)? I'm starting to implement it and will make a PR when it's finished.


vxst · 2024-05-18T07:41:38Z

I'd also like your advice on whether to use ConvertTable for bfloat16 to float8 conversion. That table has 65,536 one-byte entries (64 KiB), which is still relatively small, and the cost of building it is negligible compared to loading a library in Python. I'll implement ConvertTable so that it can easily be adapted to a 16-bit source format. Perhaps we can discuss this in the PR, once I can present more data such as the measured performance improvement, to make a better-informed decision.
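The size estimate follows from simple arithmetic: one destination-sized entry per possible source bit pattern. A small sketch, with `table_size_bytes` as a hypothetical helper name:

```python
def table_size_bytes(src_bits: int, dst_bits: int) -> int:
    """Bytes needed to store one dst-sized entry per source bit pattern."""
    entries = 1 << src_bits          # number of distinct source values
    return entries * (dst_bits // 8)


float8_to_float32 = table_size_bytes(8, 32)   # 256 entries * 4 bytes
bfloat16_to_float8 = table_size_bytes(16, 8)  # 65,536 entries * 1 byte
```

By the same formula, the 8-bit-source tables stay at or below 2 KiB each even for a float64 destination, so the bfloat16 table is the only one where memory is a real consideration.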

vxst mentioned this issue May 18, 2024

Refactor FloatPyCast for improved performance using lookup table #151

Open
