lcc: overflow of relational operators #64

kervinck · 2019-05-09T19:57:26Z

The result of SUBW doesn't indicate whether an overflow has occurred. Consequently, vCPU offers only signed comparisons against 0 for branching decisions. As a result, the relational operators a<b, a<=b, a>=b and a>b only work when the difference between a and b fits in 15 bits. Ultimately this idiosyncrasy stems from the lack of a status register in the 8-bit hardware architecture, and from the underlying design idea that software can always compensate for missing hardware features...
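To make the failure mode concrete, here is a minimal Python model of the naive "SUBW, then branch on the sign" comparison. It is not vCPU code, and the helper names `to_signed` and `naive_lt` are made up for this illustration. It wraps the difference to 16 bits and looks only at the sign of the result.

```python
MASK = 0xFFFF

def to_signed(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= MASK
    return x - 0x10000 if x & 0x8000 else x

def naive_lt(a, b):
    """Model 'SUBW then BLT': wrap a - b to 16 bits and test the sign."""
    return to_signed(a - b) < 0

print(naive_lt(100, 200))        # True: small difference, correct
print(naive_lt(-20000, 20000))   # False: the difference overflows, wrong answer
print(naive_lt(20000, -20000))   # True: the difference overflows, wrong answer
```

The last two calls show the overflow: the true difference is -40000 or +40000, which wraps around and flips the sign that the branch sees.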

TinyBASIC_v2.gt1 uses the following sequence to get correct comparisons over the full range:

0461  ee 02                    LDLW  $02                |..|
0463  fc 3a                    XORW  $3a                |.:|
0465  35 53 6a                 BGE   $046c              |5Sj|
0468  ee 02                    LDLW  $02                |..|
046a  90 6e                    BRA   $0470              |.n|
046c  ee 02                    LDLW  $02                |..|
046e  b8 3a                    SUBW  $3a                |.:|
0470  35 56 73                 BLE   $0475              |5Vs|

So prior to the standard subtract, it first checks whether the operands' highest bits are equal. If they are equal, it performs the normal SUBW for the comparison. If they are not equal, the first operand is loaded for BLE/BLT (or the second operand in the case of BGE/BGT). This sequence costs 8 vCPU instructions or 18 bytes here, compared to 3/7 for the naive sequence, which is why Tiny BASIC uses this idiom only selectively.
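The control flow of the listing above can be sketched in Python as follows. This is a model under stated assumptions, not the Tiny BASIC source: `prepared_vac` and `safe_lt` are hypothetical names, and the final branch is modelled as a strict BLT (branch if vAC < 0).

```python
MASK = 0xFFFF

def to_signed(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= MASK
    return x - 0x10000 if x & 0x8000 else x

def prepared_vac(a, b):
    """Value left in vAC just before the final conditional branch."""
    if to_signed(a ^ b) >= 0:       # LDLW a; XORW b; BGE: sign bits are equal
        return to_signed(a - b)     # LDLW a; SUBW b: cannot overflow here
    return to_signed(a)             # sign bits differ: first operand decides

def safe_lt(a, b):
    """Model the prepared value followed by BLT (taken when vAC < 0)."""
    return prepared_vac(a, b) < 0

print(safe_lt(-20000, 20000))   # True
print(safe_lt(20000, -20000))   # False
```

In this model the different-sign path tests the first operand on its own, so a non-strict final branch would treat a first operand of exactly 0 like a negative one. The sketch therefore only checks the strict case.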

For a compliant C compiler we need something similar. For example, 0xffffu currently compares as smaller than 0. And we can't really claim that ints are just 15 bits wide instead.

My plan:

1. First bite the bullet and implement it correctly for <, <=, > and >=. Try hard to get away with at most one new helper function in rt.py ("@prepcmp"?). For example, implement the comparison in only one order and reverse the order of the operands if necessary.
2. Provide macros/intrinsics/builtins that give access to the naive relational operators when the programmer wants them. Something like LT(a,b), GT(a,b), LE(a,b) and GE(a,b).
3. Optimise away the prelude when comparing against a zero constant.
4. Think deeply about further optimisation options. For example, try to deduce the high bit of both sides by static analysis...?
5. Consider some trivial rewrites. E.g. comparisons of an unsigned integer against 0 can always be replaced by one of true/false/NE/EQ.
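The trivial rewrites for an unsigned operand compared against 0 can be checked mechanically. The sketch below is only an illustration in Python: the `ORIGINAL` and `REWRITE` tables and the `rewrites_agree` helper are invented here and are not part of lcc or rt.py.

```python
import operator

# Original comparison 'u OP 0' on an unsigned 16-bit value u.
ORIGINAL = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# Candidate replacement for each operator (illustrative names only).
REWRITE = {
    "<": lambda u: False,     # never true for unsigned u
    ">=": lambda u: True,     # always true for unsigned u
    "<=": lambda u: u == 0,   # EQ
    ">": lambda u: u != 0,    # NE
}

def rewrites_agree():
    """Compare both tables for every operator and every 16-bit unsigned value."""
    return all(
        ORIGINAL[op](u, 0) == REWRITE[op](u)
        for op in ORIGINAL
        for u in range(0x10000)
    )

print(rewrites_agree())   # True
```

None of the four replacements needs a subtraction, so the overflow question does not arise for them.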

Cwiiis · 2019-05-09T20:07:21Z

Re 4, perhaps you could add a compiler built-in that users could mark variables they don't expect full 16-bit comparisons on? That could then aid with the static analysis. [Edit MvK: removed quoted e-mail]

kervinck · 2019-05-09T20:28:46Z

Yes, this concept needs to evolve, hence the issue to track the thought process.

My priorities for LCC, in order:

Correctness
Usability (e.g. error messages, 64K support, essential library functions, ...)
Optimisations
...
Floating point :-)

kervinck · 2019-05-13T20:58:47Z

This has the potential to make a lot of code very inefficient, and I would like to know the impact. For example:

if (putc(…) < 0)… can become nasty, whereas

if (putc(…) == EOF) … has potentially an efficient translation (using ADDI 1 + BNE)
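The "ADDI 1 + BNE" idea can be modelled in a few lines of Python. This sketch assumes that EOF is -1, which the comment above implies but does not state, and `is_eof` is a hypothetical helper name. Adding 1 wraps the bit pattern 0xFFFF to 0, so a plain zero test is enough and no subtraction of two arbitrary operands is involved.

```python
MASK = 0xFFFF
EOF = -1   # assumption: EOF is -1, i.e. the 16-bit pattern 0xFFFF

def is_eof(vac):
    """Model 'ADDI 1' followed by a zero test (BNE is taken when nonzero)."""
    return ((vac + 1) & MASK) == 0

print(is_eof(EOF))      # True
print(is_eof(0x0041))   # False
```

Exactly one 16-bit pattern passes the test, so the equality check stays correct over the full range.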

- No overflow cases included yet

kervinck · 2019-06-30T17:53:44Z

Perhaps we can squeeze in another new vCPU instruction that helps for these: "CMPW $DD".

Addresses: "New vCPU instructions? (#85)" #85
See also: "lcc: overflow of relational operators (#64)" #64
See also this thread: https://forum.gigatron.io/viewtopic.php?f=4&t=136

Mnem.  Encoding   #C  Description
-----  ---------  --  -----------
CALLI  $85 LL HH  28  Goto immediate address and remember vPC (vLR,vPC=vPC+3,$HHLL)
CMPHS  $1f DD     28  Adjust high byte for signed compare (vACH=XXX)
CMPHU  $97 DD     28  Adjust high byte for unsigned compare (vACH=XXX)

Changed cycle times
-------------------
LD     $1A DD     22  (was 18)
INC    $93 DD     22  (was 16)
ANDI   $82 DD     22  (was 16)

Regression test with Mandelbrot: 1144.551 seconds -> 1144.751 seconds

----------------------------------------------------------------
On vCPU instructions for comparisons between two 16-bit operands
----------------------------------------------------------------

vCPU's conditional branching (BCC) always compares vAC against 0, treating vAC as a two's complement 16-bit number. When we need to compare two arbitrary numbers we normally first take their difference with SUBW. However, when this difference is too large, the subtraction overflows and we get the wrong outcome. To get it right over the entire range, an elaborate sequence is needed. TinyBASIC uses this blurb for its relational operators. (It compares stack variable $02 with zero page variable $3a.)

0461  ee 02                    LDLW  $02
0463  fc 3a                    XORW  $3a
0465  35 53 6a                 BGE   $046c
0468  ee 02                    LDLW  $02
046a  90 6e                    BRA   $0470
046c  ee 02                    LDLW  $02
046e  b8 3a                    SUBW  $3a
0470  35 56 73                 BLE   $0475

The CMPHS and CMPHU instructions were introduced to simplify this. They inspect both operands to see if there is an overflow risk. If so, they modify vAC such that their difference gets smaller, while preserving the relation between the two operands. After that, the SUBW instruction can't overflow and we achieve a correct comparison. Use CMPHS for signed comparisons and CMPHU for unsigned.

With these, the sequence above becomes:

0461  ee 02                    LDLW  $02
0463  1f 3b                    CMPHS $3b        Note: high byte of operand
0465  b8 3a                    SUBW  $3a
0467  35 56 73                 BLE   $0475

CMPHS/CMPHU don't make much sense other than in combination with SUBW. These modify vACH, if needed, as given in the following table:

vACH  varH  |     vACH
bit7  bit7  | CMPHS    CMPHU
---------------------------
  0     0   | vACH     vACH     no change needed
  0     1   | varH+1   varH-1   narrowing the range
  1     0   | varH-1   varH+1   narrowing the range
  1     1   | vACH     vACH     no change needed
---------------------------
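The table above can be turned into a small Python model to see why the following SUBW no longer overflows. This is a sketch written from the table alone, not from the ROM source. The function names `cmph`, `lt_signed` and `lt_unsigned` are hypothetical, and the model is passed the whole 16-bit variable, whereas the real instruction is given the address of the variable's high byte.

```python
MASK = 0xFFFF

def to_signed(x):
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= MASK
    return x - 0x10000 if x & 0x8000 else x

def cmph(vac, var, signed):
    """Adjust the high byte of vAC following the CMPHS/CMPHU table."""
    vac &= MASK
    var &= MASK
    vac_hi, var_hi = vac >> 8, var >> 8
    if ((vac_hi ^ var_hi) & 0x80) == 0:
        return vac                          # bit 7 equal: no change needed
    vac_bit7 = (vac_hi & 0x80) != 0
    # With differing top bits, vAC is the larger operand when its bit 7 is
    # clear (signed view) or set (unsigned view).
    vac_is_larger = (not vac_bit7) if signed else vac_bit7
    new_hi = (var_hi + (1 if vac_is_larger else -1)) & 0xFF
    return (new_hi << 8) | (vac & 0xFF)     # low byte of vAC is kept

def lt_signed(a, b):
    """CMPHS, SUBW, then branch if vAC < 0."""
    return to_signed(cmph(a, b, True) - (b & MASK)) < 0

def lt_unsigned(a, b):
    """CMPHU, SUBW, then branch if vAC < 0."""
    return to_signed(cmph(a, b, False) - (b & MASK)) < 0

print(lt_signed(-20000, 20000))   # True
print(lt_unsigned(0xFFFF, 0))     # False
```

After the adjustment the two high bytes differ by at most 1, so the difference stays within -511..511 in the cases where the top bits differed, and its sign matches the true ordering.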

kervinck · 2020-04-05T19:30:17Z

ROM v5 will have CMPHS and CMPHU instructions that are designed to solve this. See also f584a2a

lb3361 · 2022-08-25T13:07:22Z

Suggesting to close this issue, since glcc already does this correctly (on ROM v4 with a code sequence, and on ROM v5a+ with CMPHU/CMPHS).

kervinck added the bug label May 9, 2019

kervinck self-assigned this May 13, 2019

kervinck added a commit that referenced this issue May 16, 2019

lcc: regression testing for relops (#64)

fc92686

- No overflow cases included yet

kervinck mentioned this issue Jun 4, 2019

lcc: compiler hangs on simple statement #77

Open

kervinck added compliancy and removed bug labels Jul 15, 2019
