Re: Faster loops
Posted: Sat May 01, 2021 1:03 am
by JimBoyd
To say it's superior at all times and in all circumstances is rather a sweeping statement.

No way do I endorse talk like that. Perhaps it's a case of the Vocal Minority making the most noise. I hope no-one makes the mistake of supposing all Forth adherents are afflicted with such an unreasoning excess of zeal.
I certainly never made such a statement. I merely took exception to the perceived implication that Forth was not good for anything.
Re: Faster loops
Posted: Sat May 01, 2021 12:37 pm
by Druzyek
To say it's superior at all times and in all circumstances is rather a sweeping statement.

No way do I endorse talk like that. Perhaps it's a case of the Vocal Minority making the most noise. I hope no-one makes the mistake of supposing all Forth adherents are afflicted with such an unreasoning excess of zeal.
I certainly never made such a statement. I merely took exception to the perceived implication that Forth was not good for anything.
Right, I don't remember you saying anything like that. On the other hand, I don't remember anyone (including myself) ever saying here that Forth is not good for anything. As I pointed out in the thread on it, the article makes clear at the very beginning that I was doing the comparison so I could build a graphing calculator with hundreds of kilobytes of RAM. I don't think Forth is a good choice for that project, but that doesn't imply that Forth is not good for anything. The other calculator I'm working on for the 6507 contest does use Forth for the user interface, since I'm trying to squeeze the firmware into an 8k ROM and only have 2k of RAM. Obviously, in that case I do think Forth is good for something.
Re: Faster loops
Posted: Mon May 10, 2021 10:30 pm
by JimBoyd
Rather than replace Tali2's do-loop words, these words are meant to complement them. They use the Y register similar to how one might in assembly to count down to zero, so only the upper limit is given, and it's limited to 8 bits (but you can get 256 loops by specifying 0 as the starting value). Because these words use the Y register, I just made word names with a "Y" prefix. After writing this, I realized that it doesn't really matter which register is used, and A could have been used just as well (and sometimes is used, because Tali2 has a PUSH-A macro to get A on the top of the Forth data stack).
' testingyyij cycle_test CYCLES: 5420625 ok
' testingdodoij cycle_test CYCLES: 11053940 ok
I implemented YLOOPS in Fleet Forth, an ITC Forth. Since it is for a Commodore 64, it uses an NMOS version of the 6502, the 6510. There are no PLY and PHY instructions.
In my first attempt to code (YLOOP) , I had to clear the Y index register to zero because the words BRANCH and ?BRANCH need it cleared ( NEXT leaves zero in the Y index register).
Code:
CODE (YLOOP)
   PLA  TAY  DEY
   0= NOT IF
      TYA  PHA  0 # LDY
      ' BRANCH @ JMP
   THEN
   ' ?BRANCH @ 8 + JMP  END-CODE
I think this version of (YLOOP) for the 6510 is better.
Code:
CODE (YLOOP)
   PLA  SEC  1 # SBC
   0= NOT IF
      PHA
      ' BRANCH @ JMP
   THEN
   ' ?BRANCH @ 8 + JMP  END-CODE
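For anyone who doesn't read 6502 assembler, the control flow of this second (YLOOP) can be modelled in a few lines of Python. This is only a rough sketch of the behaviour described above, not Fleet Forth code: the function name `yloop_iterations` is made up for illustration, a Python list stands in for the hardware stack, and `& 0xFF` stands in for the 8-bit wrap of the accumulator.

```python
def yloop_iterations(start):
    """Count how many times the body of  start YDO ... YLOOP  would run.

    Hypothetical model: the list stands in for the 6502 hardware stack.
    """
    stack = [start & 0xFF]              # (YDO): push the low byte of the count
    iterations = 0
    while True:
        iterations += 1                 # the loop body would run here
        a = (stack.pop() - 1) & 0xFF    # PLA  SEC  1 # SBC
        if a == 0:                      # count exhausted: leave the loop,
            break                       # the count is already off the stack
        stack.append(a)                 # PHA, then branch back to the loop start
    return iterations


print(yloop_iterations(5))   # five passes through the body
print(yloop_iterations(0))   # 0 wraps to 255 on the first decrement
```

Because the decrement happens before the test, a starting value of 0 wraps around and gives the 256 passes mentioned in the quoted post.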
Here is all of it.
Code:
CODE (YDO)  ( B -- )
   0 ,X LDA  PHA  POP JMP  END-CODE
: YDO
   COMPILE (YDO)
   <MARK NEGATE ; IMMEDIATE
CODE (YLOOP)
   PLA  SEC  1 # SBC
   0= NOT IF
      PHA
      ' BRANCH @ JMP
   THEN
   ' ?BRANCH @ 8 + JMP  END-CODE
: YLOOP
   COMPILE (YLOOP)
   NEGATE <RESOLVE ; IMMEDIATE
CODE YI  ( -- B )
   PLA  PHA
   APUSH JMP  END-CODE
CODE YJ  ( -- B )
   XSAVE STX  TSX
   $102 ,X LDA
   XSAVE LDX
   APUSH JMP  END-CODE
CODE YK  ( -- B )
   XSAVE STX  TSX
   $103 ,X LDA
   XSAVE LDX
   APUSH JMP  END-CODE
: TESTING
   5 YDO
      3 YDO
         CR ." YI=" YI . ." YJ=" YJ .
      YLOOP
   YLOOP ;
: TESTING2
   2 YDO
      5 YDO
         3 YDO
            CR ." YI=" YI . ." YJ=" YJ . ." YK=" YK .
         YLOOP
      YLOOP
   YLOOP ;
: TESTINGYYIJ
   255 YDO  255 YDO  YI YJ 2DROP  YLOOP YLOOP ;
: TESTINGDODOIJ
   255 0 DO  255 0 DO  I J 2DROP  LOOP LOOP ;
I included YK because Fleet Forth has K in the system loader. I don't think triply nested DO LOOPs are a good idea, but they may be helpful during the prototyping phase to flesh out an idea.
I noticed that in your results testingyyij took slightly less than half the time of testingdodoij .
Since Fleet Forth is an ITC Forth, my results were that TESTINGYYIJ took about 83-84 percent of the time of TESTINGDODOIJ .
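As a quick check of the two figures, the ratio for the quoted Tali Forth 2 cycle counts can be worked out directly. The variable names below are only for illustration. No Fleet Forth cycle counts are given here, so the 83-84 percent figure is taken as stated, using 0.835 as a midpoint.

```python
# Cycle counts quoted from the Tali Forth 2 test run above.
tali_yloop_cycles = 5420625     # testingyyij
tali_do_cycles = 11053940       # testingdodoij

tali_ratio = tali_yloop_cycles / tali_do_cycles
print(f"Tali Forth 2: YLOOP version takes {tali_ratio:.1%} of the DO LOOP time")

# Fleet Forth: only the percentage is reported, so assume a midpoint of 83.5%.
fleet_ratio = 0.835
print(f"Speed-up: Tali about {1 / tali_ratio:.2f}x, Fleet Forth about {1 / fleet_ratio:.2f}x")
```

The smaller gain in Fleet Forth fits the ITC explanation above: every word in the loop body still goes through NEXT, so the loop words themselves are a smaller share of the total time.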
Re: Faster loops
Posted: Sat Jan 21, 2023 12:56 am
by JimBoyd
Didn't someone find out TaliForth 2 was pushing a 1 on the stack then falling through to +LOOP? It might be worth looking at 8 and 16-bit versions that don't do this if you're looking at new DO...LOOP versions.
If that's true, it might be partially my fault, since ISTR mentioning an idea like that to Scot a few years back, without considering the potential performance hit.
[edit: yep, guilty!]
I presented an idea for DO LOOPs in an STC Forth where LOOP does NOT fall through to +LOOP .
leepivonka shows how to enhance the performance a couple of posts down.
Now that I think about it, leepivonka has posted about improving TaliForth's DO LOOPs elsewhere on this forum.