Page 1 of 2
Code optimization technique
Posted: Wed Aug 26, 2020 12:38 pm
by BillG
Early last year, I reported on some of my experiments with code generation. It is now being put to use in my Pascal compiler for the 6502.
This is a sample statement showing a straightforward translation:
Code: Select all
00039 ; W0 := W0 + 1;
00040
00041 ; ; 0 := v W0 -> 1
00042 ; ; 1 L r 2
00043
00044 ; ; 2 L v W0 -> 3
00045 ; ; 3 + c 1
00046
00047
00048 ; 2 L v W0 -> 3
00049 ; 3 + c 1
0034 18 [2] 00050 clc
0035 A5 0D [3] 00051 lda W0
0037 69 01 [2] 00052 adc #1
0039 AA [2] 00053 tax
003A A5 0E [3] 00054 lda W0+1
003C 69 00 [2] 00055 adc #0
00056 ; 1 L r 2
00057 ; 0 := v W0 -> 1
003E 86 0D [3] 00058 stx W0
0040 85 0E [3] 00059 sta W0+1
and a much better result when optimization is enabled:
Code: Select all
00031 ; W0 := W0 + 1;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W0 -> 3
00037 ; ; 3 + c 1
00038
00039
00040 ; 2 L v W0 -> 3
00041 ; 3 + c 1
0029 E6 0D [5] 00042 inc W0
002B D0 02 (002F) [2/3] 00043 bne 2f
002D E6 0E [5] 00044 inc W0+1
002F 00045 2:
00046 ; 1 L r 2
00047 ; 0 := v W0 -> 1
Surprisingly, an addition of a constant less than 256 can also benefit from a similar transformation:
Code: Select all
00039 ; W0 := W0 + 2;
00040
00041 ; ; 0 := v W0 -> 1
00042 ; ; 1 L r 2
00043
00044 ; ; 2 L v W0 -> 3
00045 ; ; 3 + c 2
00046
00047
00048 ; 2 L v W0 -> 3
00049 ; 3 + c 2
0034 18 [2] 00050 clc
0035 A5 0D [3] 00051 lda W0
0037 69 02 [2] 00052 adc #2
0039 AA [2] 00053 tax
003A A5 0E [3] 00054 lda W0+1
003C 69 00 [2] 00055 adc #0
00056 ; 1 L r 2
00057 ; 0 := v W0 -> 1
003E 86 0D [3] 00058 stx W0
0040 85 0E [3] 00059 sta W0+1
and the better result:
Code: Select all
00031 ; W0 := W0 + 2;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W0 -> 3
00037 ; ; 3 + c 2
00038
00039
00040 ; 2 L v W0 -> 3
00041 ; 3 + c 2
0029 18 [2] 00042 clc
002A A5 0D [3] 00043 lda W0
002C 69 02 [2] 00044 adc #2
002E 85 0D [3] 00045 sta W0
0030 90 02 (0034) [2/3] 00046 bcc 2f
0032 E6 0E [5] 00047 inc W0+1
0034 00048 2:
00049 ; 1 L r 2
00050 ; 0 := v W0 -> 1
The strange comments are diagnostics showing the data structures used by the code generator.
Re: Code optimization technique
Posted: Thu Aug 27, 2020 8:11 pm
by BillG
From the Kill Two Birds with One Stone department, the load of a signed number which has to be done anyway, is also used for sign extension, saving two bytes and three cycles.
This code to add a signed byte to a two-byte number:
Code: Select all
00031 ; W0 := S1 + W2;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v S1 -> 3
00037 ; ; 3 + v W2
00038
00039
00040 ; 2 L v S1 -> 3
00041 ; 3 + v W2
0029 18 [2] 00042 clc
002A A5 16 [3] 00043 lda S1
002C 65 11 [3] 00044 adc W2
002E AA [2] 00045 tax
002F A5 16 [3] 00046 lda S1
0031 09 7F [2] 00047 ora #$7F
0033 30 02 (0037) [2/3] 00048 bmi 2f
0035 A9 00 [2] 00049 lda #0
0037 00050 2:
0037 65 12 [3] 00051 adc W2+1
00052 ; 1 L r 2
00053 ; 0 := v W0 -> 1
0039 86 0D [3] 00054 stx W0
003B 85 0E [3] 00055 sta W0+1
becomes:
Code: Select all
003D 18 [2] 00057 clc
003E A0 00 [2] 00058 ldy #0
0040 A5 16 [3] 00059 lda S1 ; Kill two birds with one stone
0042 10 01 (0045) [2/3] 00060 bpl 2f
0044 88 [2] 00061 dey
0045 00062 2:
0045 65 11 [3] 00063 adc W2
0047 AA [2] 00064 tax
0048 98 [2] 00065 tya
0049 65 12 [3] 00066 adc W2+1
004B 86 0D [3] 00067 stx W0
004D 85 0E [3] 00068 sta W0+1
And this code to add two signed bytes:
Code: Select all
00031 ; W0 := S1 + S2;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v S1 -> 3
00037 ; ; 3 + v S2
00038
00039
00040 ; 2 L v S1 -> 3
00041 ; 3 + v S2
0029 18 [2] 00042 clc
002A A5 16 [3] 00043 lda S1
002C 65 17 [3] 00044 adc S2
002E AA [2] 00045 tax
002F A0 00 [2] 00046 ldy #0
0031 24 16 [3] 00047 bit S1
0033 10 01 (0036) [2/3] 00048 bpl 2f
0035 88 [2] 00049 dey
0036 00050 2:
0036 24 17 [3] 00051 bit S2
0038 10 01 (003B) [2/3] 00052 bpl 2f
003A 88 [2] 00053 dey
003B 00054 2:
003B 98 [2] 00055 tya
003C 69 00 [2] 00056 adc #0
00057 ; 1 L r 2
00058 ; 0 := v W0 -> 1
003E 86 0D [3] 00059 stx W0
0040 85 0E [3] 00060 sta W0+1
becomes:
Code: Select all
0042 18 [2] 00063 clc
0043 A0 00 [2] 00064 ldy #0
0045 A5 16 [3] 00065 lda S1 ; Kill two birds with one stone
0047 10 01 (004A) [2/3] 00066 bpl 2f
0049 88 [2] 00067 dey
004A 00068 2:
004A 65 17 [3] 00069 adc S2
004C AA [2] 00070 tax
004D 24 17 [3] 00071 bit S2
004F 10 01 (0052) [2/3] 00072 bpl 2f
0051 88 [2] 00073 dey
0052 00074 2:
0052 98 [2] 00075 tya
0053 69 00 [2] 00076 adc #0
0055 86 0D [3] 00077 stx W0
0057 85 0E [3] 00078 sta W0+1
Edit: the branch in the first example should be bpl, not bmi.
Re: Code optimization technique
Posted: Sat Aug 29, 2020 4:08 pm
by BillG
More code generation, this time adding to a value already in registers:
Code: Select all
00031 ; B0 := B1 + B2 + 2;
00032
00033 ; ; 0 := v B0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v B1 -> 3
00037 ; ; 3 + v B2 -> 4
00038 ; ; 4 + c 2
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v B1 -> 3
00043 ; 3 + v B2 -> 4
0029 18 [2] 00044 clc
002A A5 1A [3] 00045 lda B1
002C 65 1B [3] 00046 adc B2
00047 ; 4 + c 2
002E 18 [2] 00048 clc
002F 69 02 [2] 00049 adc #2
00050 ; 0 := v B0 -> 1
0031 85 19 [3] 00051 sta B0
It knows when not to bother to add:
Code: Select all
00031 ; B0 := B1 + B2 + 512;
00032
00033 ; ; 0 := v B0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v B1 -> 3
00037 ; ; 3 + v B2 -> 4
00038 ; ; 4 + c 512
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v B1 -> 3
00043 ; 3 + v B2 -> 4
0029 18 [2] 00044 clc
002A A5 1A [3] 00045 lda B1
002C 65 1B [3] 00046 adc B2
00047 ; 4 + c 512
00048 ; 0 := v B0 -> 1
002E 85 19 [3] 00049 sta B0
Likewise for two-byte values:
Code: Select all
00031 ; W0 := W1 + W2 + 513;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W1 -> 3
00037 ; ; 3 + v W2 -> 4
00038 ; ; 4 + c 513
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v W1 -> 3
00043 ; 3 + v W2 -> 4
0029 18 [2] 00044 clc
002A A5 0F [3] 00045 lda W1
002C 65 11 [3] 00046 adc W2
002E AA [2] 00047 tax
002F A5 10 [3] 00048 lda W1+1
0031 65 12 [3] 00049 adc W2+1
00050 ; 4 + c 513
0033 18 [2] 00051 clc
0034 A8 [2] 00052 tay
0035 8A [2] 00053 txa
0036 69 01 [2] 00054 adc #1
0038 AA [2] 00055 tax
0039 98 [2] 00056 tya
003A 69 02 [2] 00057 adc #2
00058 ; 0 := v W0 -> 1
003C 86 0D [3] 00059 stx W0
003E 85 0E [3] 00060 sta W0+1
This is a major win:
Code: Select all
00031 ; W0 := W1 + W2 + 512;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W1 -> 3
00037 ; ; 3 + v W2 -> 4
00038 ; ; 4 + c 512
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v W1 -> 3
00043 ; 3 + v W2 -> 4
0029 18 [2] 00044 clc
002A A5 0F [3] 00045 lda W1
002C 65 11 [3] 00046 adc W2
002E AA [2] 00047 tax
002F A5 10 [3] 00048 lda W1+1
0031 65 12 [3] 00049 adc W2+1
00050 ; 4 + c 512
0033 18 [2] 00051 clc
0034 69 02 [2] 00052 adc #2
00053 ; 0 := v W0 -> 1
0036 86 0D [3] 00054 stx W0
0038 85 0E [3] 00055 sta W0+1
This is ugly, but it has to be done. Can it be improved?
Code: Select all
00031 ; W0 := W1 + W2 + S0;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W1 -> 3
00037 ; ; 3 + v W2 -> 4
00038 ; ; 4 + v S0
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v W1 -> 3
00043 ; 3 + v W2 -> 4
0029 18 [2] 00044 clc
002A A5 0F [3] 00045 lda W1
002C 65 11 [3] 00046 adc W2
002E AA [2] 00047 tax
002F A5 10 [3] 00048 lda W1+1
0031 65 12 [3] 00049 adc W2+1
00050 ; 4 + v S0
0033 18 [2] 00051 clc
0034 85 25 [3] 00052 sta BTp00_
0036 8A [2] 00053 txa
0037 65 15 [3] 00054 adc S0
0039 AA [2] 00055 tax
003A A5 15 [3] 00056 lda S0
003C 09 7F [2] 00057 ora #$7F
003E 30 02 (0042) [2/3] 00058 bmi 2f
0040 A9 00 [2] 00059 lda #0
0042 00060 2:
0042 65 25 [3] 00061 adc BTp00_
00062 ; 0 := v W0 -> 1
0044 86 0D [3] 00063 stx W0
0046 85 0E [3] 00064 sta W0+1
Re: Code optimization technique
Posted: Sat Aug 29, 2020 5:28 pm
by BigEd
If you added a byte, and discovered it's negative, I think you only need to decrement the high byte of the accumulator: it feels like there will be a cheaper way to do that.
Re: Code optimization technique
Posted: Sat Aug 29, 2020 6:45 pm
by dp11
You can move stx up in the code then peep hole can spot tax stx and just do sta.
Re: Code optimization technique
Posted: Sat Aug 29, 2020 8:16 pm
by BillG
If you added a byte, and discovered it's negative, I think you only need to decrement the high byte of the accumulator: it feels like there will be a cheaper way to do that.
Yes, I have the same feeling...
I am going to look into something like this:
Code: Select all
adc S0
tax
bit S0
bpl 2f
dey
2:
tya
adc #0
Re: Code optimization technique
Posted: Sat Aug 29, 2020 8:20 pm
by BillG
You can move stx up in the code then peep hole can spot tax stx and just do sta.
I do not have a peephole optimizer yet. At this point, the code generator does not know that adding S0 is the last operation before storing the number into a variable.
Re: Code optimization technique
Posted: Sat Aug 29, 2020 11:44 pm
by BillG
This is ugly, but it has to be done. Can it be improved?
Yes, indeed, here is an improvement. Any more?
Code: Select all
00031 ; W0 := W1 + W2 + S0;
00032
00033 ; ; 0 := v W0 -> 1
00034 ; ; 1 L r 2
00035
00036 ; ; 2 L v W1 -> 3
00037 ; ; 3 + v W2 -> 4
00038 ; ; 4 + v S0
00039
00040
00041 ; 1 L r 2
00042 ; 2 L v W1 -> 3
00043 ; 3 + v W2 -> 4
0029 18 [2] 00044 clc
002A A5 0F [3] 00045 lda W1
002C 65 11 [3] 00046 adc W2
002E AA [2] 00047 tax
002F A5 10 [3] 00048 lda W1+1
0031 65 12 [3] 00049 adc W2+1
00050 ; 4 + v S0
0033 18 [2] 00051 clc
0034 A8 [2] 00052 tay
0035 8A [2] 00053 txa
0036 65 15 [3] 00054 adc S0
0038 AA [2] 00055 tax
0039 24 15 [3] 00056 bit S0
003B 10 01 (003E) [2/3] 00057 bpl 2f
003D 88 [2] 00058 dey
003E 00059 2:
003E 98 [2] 00060 tya
003F 69 00 [2] 00061 adc #0
00062 ; 0 := v W0 -> 1
0041 86 0D [3] 00063 stx W0
0043 85 0E [3] 00064 sta W0+1
Re: Code optimization technique
Posted: Thu Oct 08, 2020 4:10 pm
by BillG
A new module called the code depessimizer has been added to my ever growing pile of compiler technology. The name is partially out of my amusement with the term "pessimizing compiler" and a belief that a compiler cannot hope to generate the "best code" a human programmer cannot beat, at least on older classic processors. It makes improvements to the tree used when generating assembly language instructions.
An example of it in action:
Code: Select all
00031 * W0 := 1 + W1 + 2 + W2 - 3;
00032
00033 * * 0 := v W0 -> 1
00034 * * 1 L r 2
00035
00036 * * 2 L c 1 -> 3
00037 * * 3 + v W1 -> 4
00038 * * 4 + c 2 -> 5
00039 * * 5 + v W2 -> 6
00040 * * 6 - c 3
00041
00042
00043 * delete * 4 + c 2 -> 5
00044 * delete * 6 - c 3
00045 * merge * 3 + v W1 -> 5
00046
00047 * * 0 := v W0 -> 1
00048 * * 1 L r 2
00049
00050 * * 2 L v W1 -> 5
00051 * * 5 + v W2
00052
00053
00054 * 1 L r 2
00055 * 2 L v W1 -> 5
00056 * 5 + v W2
0029 D6 10 [3] 00057 ldab W1+1
002B DB 12 [3] 00058 addb W2+1
002D 96 0F [3] 00059 ldaa W1
002F 99 11 [3] 00060 adca W2
00061 * 0 := v W0 -> 1
0031 97 0D [4] 00062 staa W0
0033 D7 0E [4] 00063 stab W0+1
Re: Code optimization technique
Posted: Fri Oct 09, 2020 8:56 pm
by BillG
More work on the depessimizer. Now variables can cancel out...
Code: Select all
00320 ; 00013 W0 := W1 + W2 - W1;
00321 * * 0 := v W0 -> 1
00322 * * 1 L r 2
00323
00324 * * 2 L v W1 -> 3
00325 * * 3 + v W2 -> 4
00326 * * 4 - v W1
00327 * delete * 4 - v W1
00328 * change * 2 L c 0 -> 3
00329 * merge * 3 + v W2
00330 * delete * 2 L v W2
00331
00332 * * 0 := v W0 -> 1
00333 * * 1 L v W2
00334
00335
00336 * 1 L v W2
0106 DE 40 [4] 00337 ldx W2
00338 * 0 := v W0 -> 1
0108 DF 38 [5] 00339 stx W0
While somebody is not likely to write code like this, other transformations may give this as a result.
Re: Code optimization technique
Posted: Mon Oct 12, 2020 1:51 am
by BillG
Perhaps this complicated expression will illustrate the process.
Matching additions and subtractions of a variable are found; the second is deleted and the first converted to a constant zero. These constant zeroes are later folded together.
Code: Select all
00320 ; 00013 W0 := 2 - W1 + W2 - W1 + W2 + W1 - W2 - 1;
00321 * * 0 := v W0 -> 1
00322 * * 1 L r 2
00323
00324 * * 2 L c 2 -> 3
00325 * * 3 - v W1 -> 4
00326 * * 4 + v W2 -> 5
00327 * * 5 - v W1 -> 6
00328 * * 6 + v W2 -> 7
00329 * * 7 + v W1 -> 8
00330 * * 8 - v W2 -> 9
00331 * * 9 - c 1
00332 * delete * 7 + v W1 -> 8
00333 * change * 3 - c 0 -> 4
00334 * delete * 8 - v W2 -> 9
00335 * change * 4 + c 0 -> 5
00336 * delete * 3 - c 0 -> 4
00337 * delete * 4 + c 0 -> 5
00338 * delete * 9 - c 1
00339
00340 * * 0 := v W0 -> 1
00341 * * 1 L r 2
00342
00343 * * 2 L c 1 -> 5
00344 * * 5 - v W1 -> 6
00345 * * 6 + v W2
00346
00347
00348 * 1 L r 2
00349 * 2 L c 1 -> 5
00350 * 5 - v W1 -> 6
0106 4F [2] 00351 clra
0107 C6 01 [2] 00352 ldab #1
0109 D0 35 [3] 00353 subb W1+1
010B 92 34 [3] 00354 sbca W1
00355 * 6 + v W2
010D DB 41 [3] 00356 addb W2+1
010F 99 40 [3] 00357 adca W2
00358 * 0 := v W0 -> 1
0111 97 38 [4] 00359 staa W0
0113 D7 39 [4] 00360 stab W0+1
Re: Code optimization technique
Posted: Thu Oct 15, 2020 12:02 pm
by BillG
Some more work has been done on the generation of 6800 code for subtraction. Because subtraction is not commutative, there are many variations to cover for sign extension.
Code: Select all
00320 ; 00013 W0 := S1 - S2;
00321 * * 0 := v W0 -> 1
00322 * * 1 L r 2
00323
00324 * * 2 L v S1 -> 3
00325 * * 3 - v S2
00326 * 1 L r 2
00327 * 2 L v S1 -> 3
00328 * 3 - v S2
0106 86 7F [2] 00329 ldaa #$7F ; Thanks Mike B!
0108 16 [2] 00330 tab
0109 91 36 [3] 00331 cmpa S1
010B 82 7F [2] 00332 sbca #$7F
010D D1 3A [3] 00333 cmpb S2
010F 89 00 [2] 00334 adca #0
0111 D6 36 [3] 00335 ldab S1
0113 D0 3A [3] 00336 subb S2
0115 82 00 [2] 00337 sbca #0
00338 * 0 := v W0 -> 1
0117 97 38 [4] 00339 staa W0
0119 D7 39 [4] 00340 stab W0+1
Code: Select all
00341 ; 00014 W0 := W1 - S2;
00342 * * 0 := v W0 -> 1
00343 * * 1 L r 2
00344
00345 * * 2 L v W1 -> 3
00346 * * 3 - v S2
00347 * 1 L r 2
00348 * 2 L v W1 -> 3
00349 * 3 - v S2
011B 96 34 [3] 00350 ldaa W1
011D C6 7F [2] 00351 ldab #$7F
011F D1 3A [3] 00352 cmpb S2
0121 89 00 [2] 00353 adca #0
0123 D6 35 [3] 00354 ldab W1+1
0125 D0 3A [3] 00355 subb S2
0127 82 00 [2] 00356 sbca #0
00357 * 0 := v W0 -> 1
0129 97 38 [4] 00358 staa W0
012B D7 39 [4] 00359 stab W0+1
Code: Select all
00360 ; 00015 W0 := S1 - W2;
00361 * * 0 := v W0 -> 1
00362 * * 1 L r 2
00363
00364 * * 2 L v S1 -> 3
00365 * * 3 - v W2
00366 * 1 L r 2
00367 * 2 L v S1 -> 3
00368 * 3 - v W2
012D 86 7F [2] 00369 ldaa #$7F ; Thanks Mike B!
012F D6 36 [3] 00370 ldab S1
0131 11 [2] 00371 cba
0132 82 7F [2] 00372 sbca #$7F
0134 D0 41 [3] 00373 subb W2+1
0136 92 40 [3] 00374 sbca W2
00375 * 0 := v W0 -> 1
0138 97 38 [4] 00376 staa W0
013A D7 39 [4] 00377 stab W0+1
Code: Select all
00378 ; 00016 W0 := 2 + W1 - S2;
00379 * * 0 := v W0 -> 1
00380 * * 1 L r 2
00381
00382 * * 2 L c 2 -> 3
00383 * * 3 + v W1 -> 4
00384 * * 4 - v S2
00385 * 1 L r 2
00386 * 2 L c 2 -> 3
00387 * 3 + v W1 -> 4
013C 96 34 [3] 00388 ldaa W1
013E D6 35 [3] 00389 ldab W1+1
0140 CB 02 [2] 00390 addb #2
0142 89 00 [2] 00391 adca #0
00392 * 4 - v S2
0144 D7 00 [4] 00393 stab Scratch
0146 C6 7F [2] 00394 ldab #$7F ; Thanks Mike B!
0148 91 3A [3] 00395 cmpa S2
014A 89 00 [2] 00396 adca #0
014C D6 00 [3] 00397 ldab Scratch
014E D0 3A [3] 00398 subb S2
0150 82 00 [2] 00399 sbca #0
00400 * 0 := v W0 -> 1
0152 97 38 [4] 00401 staa W0
0154 D7 39 [4] 00402 stab W0+1
This is where a third byte-sized register would have been useful. The depessimizer will have to try to get rid of the need for the scratch variable by reordering operations. Unfortunately, the TST instruction does not have a direct mode variation and the BIT instruction is not like that on the 6502 which sets the sign flag based only on the contents of a memory location.
Re: Code optimization technique
Posted: Sun Oct 18, 2020 7:07 pm
by BillG
Borland Turbo Pascal permits AND, OR and XOR operations for many compatible types.
This is the beginning of code generation for OR and XOR.
For operations with two variables, OR and XOR are very similar.
Of particular interest is the first assignment. Unlike addition and subtraction, ORing and XORing extended signs never accumulate or cancel. It is sufficient to determine the result of operating on the two bytes and sign extend that.
Code: Select all
00320 ; 00013 W0 := S1 or S2;
0106 D6 36 [3] 00321 ldab S1
0108 DA 3A [3] 00322 orab S2
010A 86 7F [2] 00323 ldaa #$7F ; Thanks Mike B!
010C 11 [2] 00324 cba
010D 82 7F [2] 00325 sbca #$7F
010F 97 38 [4] 00326 staa W0
0111 D7 39 [4] 00327 stab W0+1
00328 ; 00014 W0 := W1 or S2;
0113 D6 3A [3] 00329 ldab S2
0115 86 7F [2] 00330 ldaa #$7F ; Thanks Mike B!
0117 11 [2] 00331 cba
0118 82 7F [2] 00332 sbca #$7F
011A DA 35 [3] 00333 orab W1+1
011C 9A 34 [3] 00334 oraa W1
011E 97 38 [4] 00335 staa W0
0120 D7 39 [4] 00336 stab W0+1
00337 ; 00015 W0 := S1 or W2;
0122 D6 36 [3] 00338 ldab S1
0124 86 7F [2] 00339 ldaa #$7F ; Thanks Mike B!
0126 11 [2] 00340 cba
0127 82 7F [2] 00341 sbca #$7F
0129 DA 41 [3] 00342 orab W2+1
012B 9A 40 [3] 00343 oraa W2
012D 97 38 [4] 00344 staa W0
012F D7 39 [4] 00345 stab W0+1
00346 ; 00016 W0 := B1 or B2;
0131 D6 18 [3] 00347 ldab B1
0133 DA 1A [3] 00348 orab B2
0135 4F [2] 00349 clra
0136 97 38 [4] 00350 staa W0
0138 D7 39 [4] 00351 stab W0+1
00352 ; 00017 W0 := W1 or B2;
013A D6 35 [3] 00353 ldab W1+1
013C DA 1A [3] 00354 orab B2
013E 96 34 [3] 00355 ldaa W1
0140 97 38 [4] 00356 staa W0
0142 D7 39 [4] 00357 stab W0+1
00358 ; 00018 W0 := B1 or W2;
0144 D6 18 [3] 00359 ldab B1
0146 DA 41 [3] 00360 orab W2+1
0148 96 40 [3] 00361 ldaa W2
014A 97 38 [4] 00362 staa W0
014C D7 39 [4] 00363 stab W0+1
00364 ; 00019 W0 := W1 or W2;
014E D6 35 [3] 00365 ldab W1+1
0150 DA 41 [3] 00366 orab W2+1
0152 96 34 [3] 00367 ldaa W1
0154 9A 40 [3] 00368 oraa W2
0156 97 38 [4] 00369 staa W0
0158 D7 39 [4] 00370 stab W0+1
00371 ; 00020 B0 := B1 or B2;
015A 96 18 [3] 00372 ldaa B1
015C 9A 1A [3] 00373 oraa B2
015E 97 16 [4] 00374 staa B0
00375 ; 00021 W0 := S1 xor S2;
0160 D6 36 [3] 00376 ldab S1
0162 D8 3A [3] 00377 eorb S2
0164 86 7F [2] 00378 ldaa #$7F ; Thanks Mike B!
0166 11 [2] 00379 cba
0167 82 7F [2] 00380 sbca #$7F
0169 97 38 [4] 00381 staa W0
016B D7 39 [4] 00382 stab W0+1
The depessimizer needed a little bit of adjustment.
B1 or B3 - B1 is not necessarily the same as B3.
Code: Select all
00320 ; 00013 B0 := B1 or B3 - B1;
00321 * * 0 := v B0 -> 1
00322 * * 1 L r 2
00323
00324 * * 2 L v B1 -> 3
00325 * * 3 | v B3 -> 4
00326 * * 4 - v B1
00327 * 1 L r 2
00328 * 2 L v B1 -> 3
00329 * 3 | v B3 -> 4
0106 96 18 [3] 00330 ldaa B1
0108 9A 1C [3] 00331 oraa B3
00332 * 4 - v B1
010A 90 18 [3] 00333 suba B1
00334 * 0 := v B0 -> 1
010C 97 16 [4] 00335 staa B0
However, variables in subexpressions on either side of an OR operation are allowed to cancel out:
Code: Select all
00337 ; 00014 B0 := B2 + B1 - B2 or B3 + B1 + B2 - B1;
00338 * * 0 := v B0 -> 1
00339 * * 1 L r 2
00340
00341 * * 2 L v B2 -> 3
00342 * * 3 + v B1 -> 4
00343 * * 4 - v B2 -> 5
00344 * * 5 | v B3 -> 6
00345 * * 6 + v B1 -> 7
00346 * * 7 + v B2 -> 8
00347 * * 8 - v B1
00348 * delete * 4 - v B2 -> 5
00349 * change * 2 L c 0 -> 3
00350 * delete * 8 - v B1
00351 * change * 6 + c 0 -> 7
00352 * delete * 6 + c 0 -> 7
00353 * merge * 3 + v B1 -> 5
00354
00355 * * 0 := v B0 -> 1
00356 * * 1 L r 2
00357
00358 * * 2 L v B1 -> 5
00359 * * 5 | v B3 -> 7
00360 * * 7 + v B2
00361
00362
00363 * 1 L r 2
00364 * 2 L v B1 -> 5
00365 * 5 | v B3 -> 7
0110 96 18 [3] 00366 ldaa B1
0112 9A 1C [3] 00367 oraa B3
00368 * 7 + v B2
0114 DB 1A [3] 00369 addb B2
00370 * 0 := v B0 -> 1
0116 97 16 [4] 00371 staa B0
Next up is generating code to OR and XOR a variable with a constant. This will present many opportunities to save a few cycles or bytes here and there.
Edit: Found and fixed a bug in B0 := B1 or B3 - B1 example
Re: Code optimization technique
Posted: Mon Oct 19, 2020 2:36 am
by Chromatix
While you're thinking about Boolean logic, one of my side projects generated a set of operators covering all of the non-trivial bitwise operations involving two arguments:
Code: Select all
0 0 0 1 1 0 1 1 Equiv. C
--------------------------------------------------
| OR 0 1 1 1 A | B
& AND 0 0 0 1 A & B
^ XOR 0 1 1 0 A ^ B
~| NOR 1 0 0 0 ~(A | B)
~& NAND 1 1 1 0 ~(A & B)
~^ EQUIV 1 0 0 1 ~(A ^ B)
&~ ANDC 0 0 1 0 A & ~B
|~ ORC 1 0 1 1 A | ~B
~&~ NANDC 1 1 0 1 ~(A & ~B)
~|~ NORC 0 1 0 0 ~(A | ~B)
There are six further possible outcomes which result in constant outputs (0,1) or which depend on only one of the arguments (A, ~A, B, ~B).
I mention this because it's probably more efficient to combine expressions resembling those on the right into a single composite operation than to execute the components sequentially.
Re: Code optimization technique
Posted: Mon Oct 19, 2020 3:16 am
by BillG
While you're thinking about Boolean logic, one of my side projects generated a set of operators covering all of the non-trivial bitwise operations involving two arguments:
Thanks. That table is handy.