Here are 2 quick benchmarks.
\ using built-in DO
: x cc@ 10000 0 do i drop loop cc@ d- d. ;
: x2 cc@ 10000 0 do i drop 2 +loop cc@ d- d. ;
\ using replacement DO
: y cc@ 10000 0 do i drop loop cc@ d- d. ;
: y2 cc@ 10000 0 do i drop 2 +loop cc@ d- d. ;
x runs in 1130236 cycles (about 113 cycles per loop)
y runs in 1110245 cycles (about 111 cycles per loop, about 2% faster)
x2 runs in 565236 cycles (about 113 cycles per loop)
y2 runs in 745245 cycles (about 149 cycles per loop, about 32% slower)
Code:
Tali Forth 2 kernel for 65816s (5 Jan 2020)
Tali Forth 2 for the 65c02
Version 1.0 24. Jan 2020
Copyright 2014-2020 Scot W. Stevenson
Tali Forth 2 comes with absolutely NO WARRANTY
Type 'bye' to exit
A=0002 X=0076 Y=0000 S=01F9 EnvMXdIzc D=0000 B=00 4A90FB02 00e014 lsr a
.@f_tali_emul.txt
.g
\ 65816s emulator stuff for Tali ok
ok
hex ok
6 nc-limit ! ok
true strip-underflow ! ok
ok
: ic@ [ ( -- d ) \ get emulator instruction count ok
f602 , \ f6 cop \ get emulator instruction count in YA ok
caca , caca , \ dex dex dex dex ok
0295 , eb c, 0395 , \ 2 sta.dx xba 3 sta.dx ok
0094 , 0174 , \ 0 sty.dx 1 stz.dx ok
] ; ok
ok
: cc@ [ ( -- d ) \ get emulator cycle count ok
f502 , \ f5 cop ok
caca , caca , \ dex dex dex dex ok
0295 , eb c, 0395 , \ 2 sta.dx xba 3 sta.dx ok
0094 , 0174 , \ 0 sty.dx 1 stz.dx ok
] ; ok
ok
decimal ok
eof
ok
\ using built-in DO ok
: x cc@ 10000 0 do i drop loop cc@ d- d. ; ok
: x2 cc@ 10000 0 do i drop 2 +loop cc@ d- d. ; ok
ok
ok
A=0002 X=0076 Y=0000 S=01F9 EnvMXdIzc D=0000 B=00 4A90FB02 00e014 lsr a
.@talido2.txt
.g
\ Replacement DO words for Tali that generate smaller inline code ok
ok
decimal ok
5 nc-limit ! \ inline code up to 5 bytes ok
true strip-underflow ! \ remove underflow checking ok
ok
\ Labels for unlabelled Tali internal variables ok
0 constant User0 ok
User0 36 + constant tmp1 \ temporary storage ok
User0 38 + constant tmp2 \ temporary storage ok
User0 40 + constant tmp3 \ temporary storage (especially for print) ok
ok
hex ok
ok
: Branch, ( addr opcode -- ) \ compile a short or long 6502 branch compiled
over 1- 1- here - dup ff7f u> if compiled
swap c, c, drop \ opcode & displacement compiled
else compiled
drop 20 xor c, 3 c, \ opcode with reversed sense to branch around jmp compiled
4c c, , \ jmp compiled
then compiled
; ok
ok
: Do_Run [ ( limit start -- ) ( R: -- limit start ) \ DO runtime ok
7a68 , \ pla ply \ pop rts addr ok
1a c, 01d0 , c8 c, \ inc.a 1 bne iny ok
85 c, tmp2 c, 84 c, tmp2 1+ c, \ tmp2 sta.z tmp2 1+ sty.z ok
38 c, 00a9 , 02f5 , a8 c, \ sec 0 lda.# 2 sbc.zx tay \ push limit ok
80a9 , 03f5 , 85 c, tmp3 c, 5a48 , \ 80 lda.# 3 sbc.zx tmp3 sta.z pha phy ok
1898 , 0075 , a8 c, \ tya clc 0 adc.zx tay \ push index ok
01b5 , 65 c, tmp3 c, 5a48 , \ 1 lda.zx tmp3 adc.z pha phy ok
e8e8 , e8e8 , \ inx inx inx inx \ discard params ok
6c c, tmp2 , \ tmp2 jmp.i ok
] ; ok
ok
variable Do_Leave \ current LEAVE chain anchor ok
ok
: Do compiled
\ https://forth-standard.org/standard/core/DO compiled
Do_Leave @ \ save old leave ptr compiled
0 Do_Leave ! \ init compiled
['] Do_Run compile, compiled
here \ save do_addr to patch LOOP later compiled
; immediate compile-only redefined do ok
ok
: ?Do compiled
abort \ incomplete compiled
; immediate compile-only redefined ?do ok
ok
: Leave compiled
4c c, here Do_Leave @ , Do_Leave ! ; immediate compile-only redefined leave ok
ok
: +Loop_Run [ ( n -- ) \ +LOOP runtime ok
bada , a88a , fa c, \ phx tsx txa tay plx \ Y=S-1 ok
b918 , 104 , 0075 , 99 c, 104 , \ clc 104 lda.y 0 adc.zx 104 sta.y \ add to index ok
b9 c, 105 , 0175 , 99 c, 105 , \ 105 lda.y 1 adc.zx 105 sta.y ok
e8e8 , \ inx inx \ drop n ok
] ; never-native ok
ok
: Loop_Run [ \ LOOP runtime ok
bada , \ phx tsx ok
bd18 , 104 , 0169 , 9d c, 104 , \ clc 104 lda.x 1 adc.# 104 sta.x \ add to index ok
bd c, 105 , 0069 , 9d c, 105 , \ 105 lda.x 0 adc.# 105 sta.x ok
fa c, \ plx ok
] ; never-native ok
ok
: LoopEnd ( OldLeave do_addr -- ) compiled
50 Branch, \ bvc do_addr compiled
Do_Leave @ begin dup while \ patch leave chain compiled
dup @ swap here swap ! repeat drop compiled
Do_Leave ! \ restore OldLeave compiled
6868 , 6868 , \ pla pla pla pla discard loop variables compiled
; ok
: Loop compiled
['] Loop_Run compile, LoopEnd ; IMMEDIATE COMPILE-ONLY redefined loop ok
: +Loop compiled
['] +Loop_Run compile, LoopEnd ; IMMEDIATE COMPILE-ONLY redefined +loop ok
ok
: I [ ( -- n ) ok
bada , \ phx tsx ok
bd38 , 104 , fd c, 106 , a8 c, \ sec 104 lda.x 106 sbc.x tay ok
bd c, 105 , fd c, 107 , \ 105 lda.x 107 sbc.x ok
fa c, \ plx ok
caca , 0195 , 0094 , \ dex dex 1 sta.zx 0 sty.zx ok
] ; never-native redefined i ok
ok
: J [ ( -- n ) ok
bada , \ phx tsx ok
bd38 , 108 , fd c, 10a , a8 c, \ sec 108 lda.x 10a sbc.x tay ok
bd c, 109 , fd c, 10b , \ 109 lda.x 10b sbc.x ok
fa c, \ plx ok
caca , 0195 , 0094 , \ dex dex 1 sta.zx 0 sty.zx ok
] ; never-native redefined j ok
ok
decimal ok
here ' User0 - . \ display size of this part 598 ok
eof
\ using replacement DO ok
: y cc@ 10000 0 do i drop loop cc@ d- d. ; ok
: y2 cc@ 10000 0 do i drop 2 +loop cc@ d- d. ; ok
ok
x -1130236 ok
y -1110245 ok
ok
x2 -565236 ok
y2 -745245 ok
ok
see x
nt: 836 xt: 83F
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 120
083F 20 26 08 20 88 93 10 27 CA CA 74 00 74 01 A9 08 &. ...' ..t.t...
084F 48 A9 AD 48 38 A9 00 F5 02 95 02 A9 80 F5 03 95 H..H8... ........
085F 03 48 B5 02 48 18 B5 00 75 02 95 00 B5 01 75 03 .H..H... u.....u.
086F 48 B5 00 48 E8 E8 E8 E8 CA CA 86 2A BA 38 BD 01 H..H.... ...*.8..
087F 01 FD 03 01 A8 BD 02 01 FD 04 01 A6 2A 95 01 94 ........ ....*...
088F 00 E8 E8 20 8A 97 18 68 75 00 A8 B8 68 75 01 48 ... ...h u...hu.H
089F 98 48 E8 E8 70 03 4C 77 08 68 68 68 68 68 68 20 .H..p.Lw .hhhhhh
08AF 26 08 20 2E 8A 20 D4 8C &. .. ..
83F 826 jsr cc@
842 9388 jsr 10000
845 27 bpl
847 dex 0
848 dex
849 0 stz.zx
84B 1 stz.zx
84D 8 lda.# do
84F pha
850 AD lda.#
852 pha
853 sec
854 0 lda.#
856 2 sbc.zx
858 2 sta.zx
85A 80 lda.#
85C 3 sbc.zx
85E 3 sta.zx
860 pha
861 2 lda.zx
863 pha
864 clc
865 0 lda.zx
867 2 adc.zx
869 0 sta.zx
86B 1 lda.zx
86D 3 adc.zx
86F pha
870 0 lda.zx
872 pha
873 inx
874 inx
875 inx
876 inx
877 dex i
878 dex
879 2A stx.z
87B tsx
87C sec
87D 101 lda.x
880 103 sbc.x
883 tay
884 102 lda.x
887 104 sbc.x
88A 2A ldx.z
88C 1 sta.zx
88E 0 sty.zx
890 inx drop
891 inx
892 978A jsr loop
895 clc
896 pla
897 0 adc.zx
899 tay
89A clv
89B pla
89C 1 adc.zx
89E pha
89F tya
8A0 pha
8A1 inx
8A2 inx
8A3 3 bvs
8A5 877 jmp
8A8 pla
8A9 pla
8AA pla
8AB pla
8AC pla
8AD pla
8AE 826 jsr cc@
8B1 8A2E jsr d-
8B4 8CD4 jsr d.
ok
see y
nt: B9E xt: BA7
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 37
0BA7 20 26 08 20 88 93 10 27 20 23 A7 20 ED 09 20 65 &. ...' #. .. e
0BB7 0B E8 E8 20 C3 0A 50 F6 68 68 68 68 20 26 08 20 ... ..P. hhhh &.
0BC7 2E 8A 20 D4 8C .. ..
BA7 826 jsr cc@
BAA 9388 jsr 10000
BAD 27 bpl
BAF A723 jsr 0
BB2 9ED jsr do
BB5 B65 jsr i
BB8 inx drop
BB9 inx
BBA AC3 jsr loop
BBD F6 bvc
BBF pla
BC0 pla
BC1 pla
BC2 pla
BC3 826 jsr cc@
BC6 8A2E jsr d-
BC9 8CD4 jsr d.
ok
see x2
nt: 8B8 xt: 8C2
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 120
08C2 20 26 08 20 88 93 10 27 CA CA 74 00 74 01 A9 09 &. ...' ..t.t...
08D2 48 A9 30 48 38 A9 00 F5 02 95 02 A9 80 F5 03 95 H.0H8... ........
08E2 03 48 B5 02 48 18 B5 00 75 02 95 00 B5 01 75 03 .H..H... u.....u.
08F2 48 B5 00 48 E8 E8 E8 E8 CA CA 86 2A BA 38 BD 01 H..H.... ...*.8..
0902 01 FD 03 01 A8 BD 02 01 FD 04 01 A6 2A 95 01 94 ........ ....*...
0912 00 E8 E8 20 23 A3 18 68 75 00 A8 B8 68 75 01 48 ... #..h u...hu.H
0922 98 48 E8 E8 70 03 4C FA 08 68 68 68 68 68 68 20 .H..p.L. .hhhhhh
0932 26 08 20 2E 8A 20 D4 8C &. .. ..
8C2 826 jsr cc@
8C5 9388 jsr 10000
8C8 27 bpl
8CA dex 0
8CB dex
8CC 0 stz.zx
8CE 1 stz.zx
8D0 9 lda.# do
8D2 pha
8D3 30 lda.#
8D5 pha
8D6 sec
8D7 0 lda.#
8D9 2 sbc.zx
8DB 2 sta.zx
8DD 80 lda.#
8DF 3 sbc.zx
8E1 3 sta.zx
8E3 pha
8E4 2 lda.zx
8E6 pha
8E7 clc
8E8 0 lda.zx
8EA 2 adc.zx
8EC 0 sta.zx
8EE 1 lda.zx
8F0 3 adc.zx
8F2 pha
8F3 0 lda.zx
8F5 pha
8F6 inx
8F7 inx
8F8 inx
8F9 inx
8FA dex i
8FB dex
8FC 2A stx.z
8FE tsx
8FF sec
900 101 lda.x
903 103 sbc.x
906 tay
907 102 lda.x
90A 104 sbc.x
90D 2A ldx.z
90F 1 sta.zx
911 0 sty.zx
913 inx drop
914 inx
915 A323 jsr 2
918 clc +loop
919 pla
91A 0 adc.zx
91C tay
91D clv
91E pla
91F 1 adc.zx
921 pha
922 tya
923 pha
924 inx
925 inx
926 3 bvs
928 8FA jmp
92B pla
92C pla
92D pla
92E pla
92F pla
930 pla
931 826 jsr cc@
934 8A2E jsr d-
937 8CD4 jsr d.
ok
see y2
nt: BCD xt: BD7
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 40
0BD7 20 26 08 20 88 93 10 27 20 23 A7 20 ED 09 20 65 &. ...' #. .. e
0BE7 0B E8 E8 20 23 A3 20 9A 0A 50 F3 68 68 68 68 20 ... #. . .P.hhhh
0BF7 26 08 20 2E 8A 20 D4 8C &. .. ..
BD7 826 jsr cc@
BDA 9388 jsr 10000
BDD 27 bpl
BDF A723 jsr 0
BE2 9ED jsr do
BE5 B65 jsr i
BE8 inx drop
BE9 inx
BEA A323 jsr 2
BED A9A jsr +loop
BF0 F3 bvc
BF2 pla
BF3 pla
BF4 pla
BF5 pla
BF6 826 jsr cc@
BF9 8A2E jsr d-
BFC 8CD4 jsr d.
ok