6502.org • View topic - Branching with flags

View unanswered posts | View active topics

Board index » 6502.org Users Forum » Forth

All times are UTC

Branching with flags

Page 1 of 1

[ 7 posts ]

Previous topic | Next topic

Author

Message

IamRob

Post subject: Branching with flags

Posted: Thu Feb 10, 2022 9:43 am

Joined: Sun Apr 26, 2020 3:08 am
Posts: 357

By putting the TAY in this code, the branching for the comparisons become extremely easy.

Code:

BUMP   INC NEXT+1
   INC NEXT+1

NEXT   LDY #0000   ; IP pointer
   STY W+1

   INC NEXT+1
   INC NEXT+1

   TAY
W   JMP ($0000)

Did any one else get these extremely compact comparisons?

Code:

* : 0> ( n -- flag )
ZGRT   BMI CLRFLAG
        BPL ZNEQ

* : <> ( n1 n2 -- flag )
NEQ     CMP $00,X
      PHP
      INX
      INX
      PLP

* : 0<> ( n -- flag )
ZNEQ   BNE SETFLAG
   BEQ CLRFLAG

* : = ( n1 n2 -- flag )
EQU     CMP $00,X
       PHP
       INX
       INX
       PLP

* : 0= ( n -- flag )
ZEQU   BEQ SETFLAG

CLRFLAG   LDA #0000
   BEQ NEXT

* : 0< ( n -- flag ) 
ZLESS   BPL CLRFLAG

SETFLAG   LDA #$FFFF
   BNE NEXT

Top

GARTHWILSON

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 10:02 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California

That is quite compact. So part of the strategy is that your headers are not kept inline in the dictionary, but separate, so program flow can go from one dictionary definition right into the next in many cases. How are you handling that? Will they be inline for things compiled on the fly, as opposed to what you provide in the kernel?

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

Top

barrym95838

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 10:26 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1928
Location: Sacramento, CA, USA

Just to provide a bit of context, you're coding for an '802/816, E|M|X=0, ITC, TOS in A, right?

TAY is an efficient way to test TOS, but how often do its two cycles get "wasted" by putting it in NEXT instead of inside every word that uses it? It's easy to see that you'll save a small amount of space and waste a small number of cycles, and I usually default toward saving space unless instructed otherwise, so I'm on board.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)

Top

GARTHWILSON

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 10:52 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California

Quote:

Code:

   INC NEXT+1
   INC NEXT+1

A 16-bit INC DP takes 7 cycles (assuming DP is page-aligned); so you have 14 cycles above. If you do this instead:

Code:

    LDA  NEXT+1
    INA
    INA
    STA  NEXT+1

you'll save a couple of cycles, bringing your NEXT down to 26 cycles (which is what mine is).

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

Top

IamRob

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 2:27 pm

Joined: Sun Apr 26, 2020 3:08 am
Posts: 357

GARTHWILSON wrote:

Can't you just be happy with the compactness without getting ahead of ourselves?

But in reality, looking ahead is one of the things that makes code efficient at the current position, so I have already looked into this.

I am working on changing the header around for a much faster search as well as the ability to create self-running applications without the headers. Since once the code is compiled, the headers can be discarded.

The headers can be stored in either the upper part of the 64k of memory or, a different bank and built downwards. The header looks like this in memory:
<hi-memory> zero-zero CFA name SB LFA CFA name SB LFA CFA name SB LFA ..etc <lo-memory>

where SB is the Status Byte & name Length, LFA is the Link Field Address pointing to the LFA of the next name and CFA is the Code Field Address pointing into the compact code. Where to store each header is calculated very easily, since WORD already stores the length byte at HERE when doing the search. When adding a Header to memory then is just subtracting: lo-memory minus word length/Status byte minus 2 bytes for CFA minus 2 bytes for LFA.. If we preserve the previous LFA before the subtraction, then the new LFA just points to the previous LFA. No back-filling the LFA like the old method.

LATEST actually points to the LFA at the bottom of memory and does the search up to hi-memory. Having the LFA at the beginning of each word also greatly speeds up the search as the routine no longer needs to do a TRAVERSE through the file name to get the next LFA. The search routine as it points to each LFA, just takes a quick look at the SB (at LFA+2), and if the word lengths are not the same or characters do not match, it is just a matter of doing:

Code:

      LDA (N)
      STA N

to get the next LFA. There is no traversing through the filename.

AND, this method still allows the CONTEXT and CURRENT Vocabularies to be used.

Top

IamRob

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 2:31 pm

Joined: Sun Apr 26, 2020 3:08 am
Posts: 357

GARTHWILSON wrote:

Quote:

Code:

   INC NEXT+1
   INC NEXT+1

A 16-bit INC DP takes 7 cycles (assuming DP is page-aligned); so you have 14 cycles above. If you do this instead:

Code:

    LDA  NEXT+1
    INA
    INA
    STA  NEXT+1

you'll save a couple of cycles, bringing your NEXT down to 26 cycles (which is what mine is).

The Accumulator holds the stack value, so can't be used.

Also since the Acc is used in just about every word definition, you will have to do a LDA $00,X, which is what? (7 cycles)

The only way to know for sure is to count the number of LDA $00,X in your system.

Last edited by IamRob on Thu Feb 10, 2022 3:24 pm, edited 2 times in total.

Top

IamRob

Post subject: Re: Branching with flags

Posted: Thu Feb 10, 2022 2:48 pm

Joined: Sun Apr 26, 2020 3:08 am
Posts: 357

barrym95838 wrote:

I am only half done and already am at the break-even point where cycles saved equals cycles used by the TAY. There are times when doing this has saved 10 cycles within some primitive definitions. How often are these comparisons used I just listed? Quite a bit. Just the use of one of these comparisons in a word definition equals the 2 extra cycles for each word in a 10 word definition.

Meaning if a word definition has 10 words, the use of the TAY would use 20 cycles. Just one use of any comparison in a word definition will save 20 cycles, and we are already at the break-even point for using the TAY, with just one word. If any other words in the definition also save a few cycles, then we are way ahead.

The second advantage of using the TAY is, I have already saved over a page of code that I would have needed by preserving or reloading the Accumulator first in some of the primitives, or by needing a branch flag as in the comparisons listed above.

The answer is, YES, it saves a lot of cycles to use the TAY at a cost of 2 cycles per each word compiled, as well as greatly reduces the code size. Win-Win.

I should re-write this to say:

There are fewer cycles wasted than are gained in each of the word definitions that are called. You have to remember that the NEXT routine is not just used in compiling, it is also used for the run-time. You may only need to compile once, but you may end up running some routines over and over again, before re-compiling.

Therefore the 2 cycles lost at compile-time doesn't really add up, because they are only wasted once.

Then, just concentrating on the 2 cycles sometimes wasted at run-time, the alternative would be to put the TAY in each of the word definitions that require it.

Most of the words that require it are the ones that are more often used, so the percentage of waste drops dramatically. Therefore it comes down to about a 10-15% increase in speed by having the TAY in each word definition that requires it compared to saving 1-byte in each of the words that require it. So far, about half of the words would require it. So a forth having 250 words would save 125 bytes.

But wait there is more. 125 bytes is nothing to sneeze at as those byte savings can also mean the difference of being able to use BRA NEXT instead of JMP NEXT. One can fit a lot of code in the 127 bytes within a branch range.

It is extremely hard to calculate the savings when a lot of words can be affected indirectly as well. But the fact is that 1-byte in NEXT allows one to pack a lot of code within range to allow for branching.

Top

Page 1 of 1

[ 7 posts ]

Board index » 6502.org Users Forum » Forth

All times are UTC

Who is online

Users browsing this forum: No registered users and 2 guests

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum