
All times are UTC




PostPosted: Sat Aug 23, 2014 4:49 am 
Joined: Fri Nov 09, 2012 6:52 am
Posts: 16
That code should work. Be aware, however, that if the macro is used inside a different scope than the destination label, the parameter 'address' might not resolve to what you expect. I think this may be the problem.


PostPosted: Sat Aug 23, 2014 5:20 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
Druzyek wrote:
I tried something along these lines but it doesn't work if address is defined after the macro. What do you do?
.macro BEQL address
.local label
BNE label
JMP address
label
.endmacro

Generally speaking (depending on the assembler you're using), the labels used as dummy parameters in the macro declaration should be local to the macro to avoid collisions with global labels. For example, this is from my 65C816 string library:

Code:
strpad   .macro .s1,.s2,.l,.j,.pc ;copy, pad & justify string
         pea .pc
         pea .j
         per .l
         per .s2
         per .s1
         jsr strpad
         .endm

In the above, the scope of the dummy labels in the macro declaration, .s1, .s2, etc., is the macro itself. That is, they are known only to the macro, even though the macro invocation itself may be made with global labels, e.g.:

Code:
         strpad padbuf,strgbuf,padlen,.pl,.fc

In the above, .pl and .fc are local to the subroutine in which this particular macro invocation is made. The other parameters are global. When the assembler expands the macro it will effectively replace the dummy parameters in the declaration with the ones passed in the invocation, i.e., padbuf,strgbuf,padlen,.pl,.fc. Hence the macro invocation effectively becomes:

Code:
         pea .fc
         pea .pl
         per padlen
         per strgbuf
         per padbuf
         jsr strpad

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sat Mar 21, 2020 1:20 pm 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
More than five years after my original post, I think I have macros that do exactly what I want. (This is just an illustration; the following could be written more simply.)
Code:
FUNC DrawPixel
   BEGIN_ARGS
      BYTE xpos, ypos, color
   BEGIN_VARS
      WORD gfx_ptr
   END_VARS

   CALL CalcXY, xpos, ypos
   MOV.W ret_val, gfx_ptr
   MOV.B color, (gfx_ptr,X)
END_FUNC

BYTE and WORD reserve space on the X-based stack in zero page and create symbols for the variables. BEGIN_ARGS marks the variables that follow as incoming arguments, and BEGIN_VARS marks them as local variables. END_VARS does the actual allocation: it pushes X, then adjusts the stack pointer with either a series of DEXs or TXA and SBC, depending on how much memory is needed. END_FUNC restores the stack pointer with PLX.
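As a rough illustration of that bookkeeping, here is a small Python model of how declarations might be turned into X-relative offsets. The function name `layout_frame` and the list-of-pairs format are hypothetical, and it assumes offsets are handed out in declaration order starting at 0; the real macros may work differently.

```python
# Hypothetical model of the BYTE/WORD bookkeeping; not the actual macro code.
SIZES = {"BYTE": 1, "WORD": 2}

def layout_frame(decls):
    """Assign X-relative offsets in declaration order.

    decls is a list of (type_name, symbol) pairs.
    Returns (offsets, frame_size), where frame_size is the number of
    bytes END_VARS would have to reserve on the zero-page stack.
    """
    offsets = {}
    next_offset = 0
    for type_name, symbol in decls:
        offsets[symbol] = next_offset
        next_offset += SIZES[type_name]
    return offsets, next_offset

# Declarations from the DrawPixel example: three byte arguments, one word local.
decls = [("BYTE", "xpos"), ("BYTE", "ypos"), ("BYTE", "color"), ("WORD", "gfx_ptr")]
offsets, frame_size = layout_frame(decls)
print(offsets)
print(frame_size)
```

With a frame this small, a run of DEXs is enough; a larger frame is where the TXA and SBC variant would pay off.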

CALL copies the values of xpos and ypos to the memory allocated with BEGIN_ARGS in CalcXY. This is the neat part. Each function defined with FUNC adds information to a string that stores the number and types of the arguments the function takes. With this info, the CALL macro knows that xpos and ypos are X-based arguments and generates "LDA xpos,X" when it copies each one to the argument memory of CalcXY. It also knows whether those arguments are bytes or words, so if CalcXY expects its X and Y arguments to be words but xpos is a byte, the macro zeroes the top byte of the argument. On the other hand, if X and Y are bytes but xpos and ypos are words, the macro recognizes this and copies only the low byte of each word to the incoming argument.
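The widening and narrowing rule can be sketched in Python like this. `copy_argument` is a made-up name for illustration; it only models which bytes end up in the callee's argument slot, not the instructions the macro emits.

```python
def copy_argument(value, src_type, dst_type):
    """Return the bytes (low byte first) stored into the callee's argument slot.

    src_type is the type of the caller's variable, dst_type the type the
    callee declared; both are "BYTE" or "WORD".
    """
    src_bits = 8 if src_type == "BYTE" else 16
    value &= (1 << src_bits) - 1          # only the source's own bytes exist
    if dst_type == "BYTE":
        return [value & 0xFF]             # WORD -> BYTE: keep the low byte only
    return [value & 0xFF, value >> 8]     # BYTE -> WORD: high byte becomes zero

print(copy_argument(0x7F, "BYTE", "WORD"))
print(copy_argument(0x1234, "WORD", "BYTE"))
```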

MOV.W and MOV.B also know the types of their arguments and generate code accordingly. ret_val is a zero page address where all functions can return a value. It's up to the caller to copy that value if needed (here it could be used directly as a pointer, but I copy it to gfx_ptr as an example).

In addition to BYTE and WORD, there are also ZPBYTE and ZPWORD, which copy some memory out of zero page to the hardware stack so that those addresses are free to use, for cases where the extra overhead of doing so makes sense. Here's my function for clearing the screen:

Code:
FUNC clrscr
      BEGIN_ARGS
         BYTE color
      BEGIN_VARS
         ZPWORD gfx_ptr
         BYTE rows
      END_VARS
      
      MOV.B #128,rows
      MOV.W #SCREEN_ADDRESS,gfx_ptr
      LDA color,X
      LDY #0
      .loop_outer:
         .loop_inner:
            STA (gfx_ptr),Y
            DEY
         BNE .loop_inner
         INC gfx_ptr+1 ;ie gfx_ptr+=256; which happens to be screen width
         DEC rows,X
      BNE .loop_outer
   END_FUNC

   CALL clrscr, #COLOR_BLUE

Edit: Another thing is a macro called "halt" that optionally prints the value of some variables then does BRK. Like the other macros, it knows whether to print a byte or word and also gives the name of the variable.


PostPosted: Sat Mar 21, 2020 7:32 pm 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
Druzyek, I might not be understanding you correctly (I've been trying to figure this out, looking at the last page), but from what you say about pushing parameters onto a ZP stack, it looks like you're not taking full advantage of what such a stack can do. A function should not have to receive parameters and then push them onto a stack. It should be able to receive them already on the stack, left there by the previous function(s), and work with them in place without the overhead of copying them from one place to another. The exception would be if the routine needs local variables that are used internally and are not part of the input or output.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Sat Mar 21, 2020 11:00 pm 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
Hi Garth, I'm not sure I see what you mean there. When you have parameters on a stack, whether in Forth or in assembly that treats the stack the same way, you still have to copy them eventually to do anything useful with them. They get copied from the stack to the stack with something like DUP or OVER, and in my example they are also copied from the stack to the stack. Here is a slightly modified example (the screen is 256x128 with one byte per pixel, so the X,Y to address calculation is just SCREEN_ADDRESS+(y<<8)+x):
Code:
FUNC DrawPixel
   BEGIN_ARGS
      BYTE xpos, ypos, color
   BEGIN_VARS
      WORD gfx_ptr
   END_VARS

   MOV.W #SCREEN_ADDRESS, gfx_ptr
   LDA xpos,X
   STA gfx_ptr,X ;256x128 screen so low byte is x coord
   LDA gfx_ptr+1,X
   CLC
   ADC ypos,X
   STA gfx_ptr+1,X ;high byte is #>SCREEN_ADDRESS+ypos
   MOV.B color,(gfx_ptr,X)
END_FUNC

CALL DrawPixel, #20, #30, #COLOR_BLUE

This expands into something like this:
Code:
001:   FUNC DrawPixel
                    DrawPixel:
002:      BEGIN_ARGS
003:         BYTE xpos, ypos, color
                    xpos set 0
                    ypos set 1
                    color set 2
004:      BEGIN_VARS
005:         WORD gfx_ptr
                    gfx_ptr set 3
006:      END_VARS
                    PHX ;save stack pointer
                    DEX ;room on stack for xpos
                    DEX ;room on stack for ypos
                    DEX ;room on stack for color
                    DEX ;room on stack for gfx_ptr (low byte)
                    DEX ;room on stack for gfx_ptr (high byte)
007:      MOV.W #SCREEN_ADDRESS, gfx_ptr
                    LDA #<SCREEN_ADDRESS
                    STA 3,X ;3=gfx_ptr
                    LDA #>SCREEN_ADDRESS
                    STA 4,X ;4=gfx_ptr+1
008:      LDA xpos,X
                    LDA 0,X
009:      STA gfx_ptr,X
                    STA 3,X
010:      LDA gfx_ptr+1,X
                    LDA 4,X
011:      CLC
012:      ADC ypos,X
                    ADC 1,X
013:      STA gfx_ptr+1,X
                    STA 4,X
014:      MOV.B color,(gfx_ptr,X)
                    LDA 2,X ;2=color
                    STA (3,X)
015:   END_FUNC
                    PLX ;restore stack
                    RTS

016:   CALL DrawPixel, #20, #30, #COLOR_BLUE
                    LDA #20
                    STA -5,X
                    LDA #30
                    STA -4,X
                    LDA #COLOR_BLUE
                    STA -3,X
                    ;-1,X and -2,X left for gfx_ptr
                    JSR DrawPixel

In Forth it would be like this:
Code:
: DrawPixel ( x y color -- )
   -ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel

The code to copy the literals to the stack at 016 knows to copy the first value 5 bytes below the stack pointer since it knows that DrawPixel will adjust the stack pointer down by 5 at 006. This means the value of #20 will be at 0,X after the adjustment. 0,X in DrawPixel is xpos, which is where we want #20 to be.
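Here is a small Python model of that offset arithmetic, in case it helps. The starting stack pointer (0x80) and the value of COLOR_BLUE are made up for the sketch; only the 5-byte frame and the -5/-4/-3 offsets come from the listing above.

```python
# Model of the X-indexed zero-page stack; all concrete values are assumptions.
zero_page = bytearray(256)
FRAME_SIZE = 5            # 3 argument bytes + 2 bytes for gfx_ptr
COLOR_BLUE = 0x03         # hypothetical color value

def zp_x(offset, x_reg):
    """Effective address of `offset,X`; zp,X indexing wraps inside zero page."""
    return (x_reg + offset) & 0xFF

# Caller side (line 016): store the arguments below the current stack pointer.
caller_x = 0x80           # hypothetical stack pointer before the call
for i, value in enumerate([20, 30, COLOR_BLUE]):
    zero_page[zp_x(i - FRAME_SIZE, caller_x)] = value   # STA -5,X / -4,X / -3,X

# Callee side (line 006): PHX, then five DEX.
callee_x = (caller_x - FRAME_SIZE) & 0xFF

xpos = zero_page[zp_x(0, callee_x)]    # LDA 0,X
ypos = zero_page[zp_x(1, callee_x)]    # LDA 1,X
color = zero_page[zp_x(2, callee_x)]   # LDA 2,X
print(xpos, ypos, color)
```

The bytes the caller wrote at -5,X, -4,X and -3,X are exactly the ones the callee sees at 0,X, 1,X and 2,X once X has been moved down by the frame size.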

Note that the arguments don't have to be immediates. If DrawPixel were called with arguments that were variables inside a function defined with BEGIN_VARS, then the LDA / STA pairs at 016 would be copying from zp,X to zp,X like you have with DUP and OVER.

I haven't figured out how many cycles the Forth version would take but just the overhead of DEX / INX would make it less efficient. I think the difference is much more noticeable in cases where you need to juggle many variables in something stack-based and those variables are accessed many times in a function/word. For example, keeping track of five counters like the following is unwieldy in Forth, whereas you get a big speedup by making room on the stack for them then not touching the stack pointer while you loop through possibly thousands of characters in the string you're searching.
Code:
   FUNC CountPunctuation       ;void CountPunctuation(char *str_ptr) {
      BEGIN_ARGS
         WORD str_ptr
      BEGIN_VARS
         BYTE commas           ;   char commas;
         BYTE periods          ;   char periods;
         BYTE semicolons       ;   char semicolons;
         BYTE exclamations     ;   char exclamations;
         BYTE questions        ;   char questions;
      END_VARS
      
      STZ commas,X             ;   commas=0;
      STZ periods,X            ;   periods=0;
      STZ semicolons,X         ;   semicolons=0;
      STZ exclamations,X       ;   exclamations=0;
      STZ questions,X          ;   questions=0;
      
      while_loop:              ;   while(*str_ptr) {
         LDA (str_ptr,X)       ;      A=*str_ptr;
         BEQ .done
         INC str_ptr,X         ;      str_ptr++;
         BNE .no_carry
            INC str_ptr+1,X
         .no_carry:
         CMP #','              ;      if (A==',')
         BNE .not_comma
            INC commas,X       ;         commas++;
            BRA while_loop
         .not_comma:
         CMP #'.'              ;      else if (A=='.')
         BNE .not_period
            INC periods,X      ;         periods++;
            BRA while_loop
         .not_period:
         CMP #';'              ;      else if (A==';')
         BNE .not_semicolon
            INC semicolons,X   ;         semicolons++;
            BRA while_loop
         .not_semicolon:
         CMP #'!'              ;      else if (A=='!')
         BNE .not_exclamation
            INC exclamations,X ;         exclamations++;
            BRA while_loop
         .not_exclamation:
         CMP #'?'              ;      else if (A=='?')
         BNE .not_question
            INC questions,X    ;         questions++;
            BRA while_loop
         .not_question:
         BRA while_loop        ;   }
      .done:
      
      CALL PrintPunctuation, commas, periods, semicolons, exclamations, questions
                               ;PrintPunctuation(commas,periods,semicolons,exclamations,questions);
   END_FUNC                    ;}

   CALL CountPunctuation, test_str
   JMP *
   test_str:
   FCB "Hi! Three commas, a period, a question mark, and two exclamations. Wow! Right?",0
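For reference, the same logic as a Python sketch, which is handy for checking what the assembly version should report for the test string. The function name and the dictionary of counters are just for illustration.

```python
def count_punctuation(text):
    """Count , . ; ! ? up to the first NUL, like the while(*str_ptr) loop."""
    counts = {",": 0, ".": 0, ";": 0, "!": 0, "?": 0}
    for ch in text:
        if ch == "\0":        # BEQ .done on the zero terminator
            break
        if ch in counts:      # the CMP / BNE chain
            counts[ch] += 1
    return counts

test_str = ("Hi! Three commas, a period, a question mark, "
            "and two exclamations. Wow! Right?\0")
result = count_punctuation(test_str)
print(result)
```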


PostPosted: Sun Mar 22, 2020 12:06 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
If you're pulling the coordinates and color out of the blue, then yes, you'd put them on the data stack before calling the function; but in real life, these will probably have been derived in a previous function that left them on the data stack, so there's no extra transferring to do.

Druzyek wrote:
In Forth it would be like this:
Code:
: DrawPixel ( x y color -- )
   -ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel

How about:
Code:
: DrawPixel ( x y color -- )
   -ROT  >< +  SCREEN_ADDRESS +  C!  ;

>< is a Forth word that swaps the bytes of a 16-bit cell much more quickly than 8 LSHIFT can move the low byte to the high byte; so for example 00A3 becomes A300. The 65816 even has an instruction for it, so >< becomes:
Code:
        LDA  0,X
        XBA
        STA  0,X

Also, doing the first + sooner keeps the stack shallower.
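A quick Python check of the two address calculations, as a sketch. SCREEN_ADDRESS is given a made-up value here, and the function names are only for illustration; cells are modeled as 16-bit values.

```python
SCREEN_ADDRESS = 0x4000   # assumed value, not from the thread

def byte_swap(cell):
    """The >< word: exchange the two bytes of a 16-bit cell."""
    cell &= 0xFFFF
    return ((cell << 8) | (cell >> 8)) & 0xFFFF

def address_with_shift(x, y):
    """-ROT 8 LSHIFT SCREEN_ADDRESS + +"""
    return (x + (SCREEN_ADDRESS + ((y << 8) & 0xFFFF))) & 0xFFFF

def address_with_swap(x, y):
    """-ROT >< + SCREEN_ADDRESS +"""
    return ((x + byte_swap(y)) + SCREEN_ADDRESS) & 0xFFFF

print(hex(byte_swap(0x00A3)))
print(hex(address_with_swap(20, 30)))
```

The two versions agree as long as y fits in one byte, which it does for a 256x128 screen. If the high byte of the cell were not zero, >< would bring it down into the low byte, whereas 8 LSHIFT would discard it.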

If you ever have interrupts serviced in Forth (or Forth-like assembly language, using the data stack), you'll want to avoid doing things like -4,X, since that area could get overwritten by an ISR cutting in at unpredictable times. Also, there may be some differences with the 65816, especially if DP is not starting at 0000.

PostPosted: Sun Mar 22, 2020 1:51 am 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
GARTHWILSON wrote:
If you're pulling the coordinates and color out of the blue, then yes, you'd put them on the data stack before calling the function; but in real life, these will probably have been derived in a previous function that left them on the data stack, so there's no extra transferring to do.
Actually, there is extra transferring to do unless you consume the arguments, so the overhead is not less than the way I'm doing it. Consider this:
Code:
: HorizLine ( x y color length -- )
  0 do 3dup DrawPixel rot 1+ -rot loop 3drop ;
The 3dup here is the same as 016 in my example. I think the only advantage for the stack based version happens when the original parameters don't need to be preserved and can be consumed AND the overhead of rearranging the stack for the consuming call is less than the overhead it takes to make the copy.

Quote:
Druzyek wrote:
In Forth it would be like this:
Code:
: DrawPixel ( x y color -- )
   -ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel

How about:
Code:
: DrawPixel ( x y color -- )
   -ROT  >< +  SCREEN_ADDRESS +  C!  ;

>< is a Forth word that swaps the bytes of a 16-bit cell much more quickly than 8 LSHIFT can move the low byte to the high byte; so for example 00A3 becomes A300. The 65816 even has an instruction for it, so >< becomes:
Code:
        LDA  0,X
        XBA
        STA  0,X

Also, doing the first + sooner keeps the stack shallower.
Neat! That is a handy one.


PostPosted: Sun Mar 22, 2020 3:03 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
I thought the Forth stuff was just a parallel, an illustration, since the topic title is "Functions in assembly." When you're doing it in assembly (including writing custom primitives for Forth), you can use things on the ZP data stack without DUPing or DROPping them. You could make DrawPixel not consume the arguments, so you can call it over and over without the continual overhead. You could also write a primitive that does your "rot 1+ -rot" with nothing but an INC 5,X, a single assembly-language instruction. If it were a Forth primitive, I might write it something like
Code:
CODE  INC_3OS     ( a b c -- a+1 b c )
   INC  5,X
   JMP  NEXT

or, if it were STC Forth, just inline the single assembly-language instruction. This is for the '816 with A in 16-bit mode, so the '02 will take a little more if there's a need to go far enough that the low byte would roll over and you have to increment the high byte too. I take it that that would not be necessary though for drawing a horizontal line like you show. Forth lets you form the language to what you want it to be (in far more ways than this).
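To show that the single increment does the same job as "rot 1+ -rot", here is a sketch using a Python list as the data stack (top of stack at the end of the list). The helper names are made up, and cells are treated as 16-bit values.

```python
def rot(stack):
    """( a b c -- b c a )"""
    a, b, c = stack[-3:]
    stack[-3:] = [b, c, a]

def minus_rot(stack):
    """( a b c -- c a b )"""
    a, b, c = stack[-3:]
    stack[-3:] = [c, a, b]

def one_plus(stack):
    """( n -- n+1 ), wrapping at 16 bits"""
    stack[-1] = (stack[-1] + 1) & 0xFFFF

def inc_3os(stack):
    """( a b c -- a+1 b c ): bump the third cell in place, no shuffling"""
    stack[-3] = (stack[-3] + 1) & 0xFFFF

shuffled = [20, 30, 3]        # x y color
rot(shuffled)
one_plus(shuffled)
minus_rot(shuffled)

in_place = [20, 30, 3]
inc_3os(in_place)
print(shuffled, in_place)
```

Both lists end up identical, but the in-place version touches one cell instead of moving three cells twice.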

PostPosted: Sun Mar 22, 2020 4:11 am 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
GARTHWILSON wrote:
I thought the Forth stuff was just a parallel, an illustration, since the topic title is "Functions in assembly." When you're doing it in assembly (including writing custom primitives for Forth), you can use things on the ZP data stack without DUPing or DROPping them. You could make DrawPixel not consume the arguments, so you can call it over and over without the continual overhead.
Right, Forth is just an illustration. The way I'm proposing seems a little more efficient than either Forth or "Functions in assembly" using a Forth-style stack. Maybe I don't see what you mean about there not being continual overhead. Isn't the overhead similar whether you consume the arguments or not? For example:
Code:
: DrawPixel ( x y color -- )
   -ROT  >< +  SCREEN_ADDRESS +  C! ;
: HorizLine ( x y color length -- )
  0 do 3dup DrawPixel INC_3OS loop 3drop ;
vs
Code:
: DrawPixel ( x y color -- x+1 y color)
   3dup -ROT  >< +  SCREEN_ADDRESS +  C! INC_3OS ;
: HorizLine ( x y color length -- )
  0 do DrawPixel loop 3drop ;

How do you reduce the overhead (in assembly)?


PostPosted: Sun Mar 22, 2020 4:58 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
Hmmm... several possibilities. See if this one would work. It's for '02 (not '816). Data stack cells are assumed to be two bytes each, even if you're only using the low byte. It does use self-modifying code, where an STA-absolute instruction's operand is a variable that's written to before you get there. :D There's the "SCREEN_ADR-1" because the byte does not get stored when Y gets down to 0. If it's a problem (like you really do need to be able to do 256 dots, from Y=255 down to 0, inclusive), another instruction could be added to take care of it.
Code:
HorizLine: ( x y color length -- )   ; I assume X is where it starts, and you keep the length short enough to not overrun the end.
     CLC
     LDA  7,X               ; Get X val, 8-bit val, ignoring high byte of data stack cell,
     ADC  #<(SCREEN_ADR-1)  ; and add it to the screen array ADL.
     STA  1$+1              ; Low byte first.

     LDA  5,X               ; Get Y val
     ADC  #>(SCREEN_ADR-1)  ; and add it (plus any carry from the low byte) to the screen array ADH.
     STA  1$+2              ; Store high byte.

     LDA  3,X               ; Get color in A.
     LDY  1,X               ; Use Y for the looping control, so we can leave X as the data stack pointer.
 1$: STA  $1234,Y           ; SMC!  The operand got fixed above.  Store the color in the pixel byte of the array.
     DEY
     BNE  1$

     TXA                    ; Remove stack items.
     CLC
     ADC  #8
     TAX

     RTS
 ;-------------

It doesn't give a separate DrawPixel function, but overall it's quite a bit shorter than the DrawPixel-HorizLine combination, even without including Forth headers, and of course it's much faster. It could be made a Forth primitive with very little modification too. Actually, for STC Forth, it may not require any modification at all.
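The effect of the patched operand and the count-down loop can be modeled in Python as a sketch. SCREEN_ADR is given an assumed value, memory is a flat 64K array, and the function takes its inputs as plain arguments instead of reading them from the data stack; it follows the comments in the routine (x added to the low address byte, y to the high byte).

```python
SCREEN_ADR = 0x4000   # assumed value, not from the thread

def horiz_line(memory, x, y, color, length):
    """Model of the self-modifying loop: patch the operand, then STA base,Y."""
    # Low byte of the operand: ADL of (SCREEN_ADR-1) plus x, carry kept.
    low = ((SCREEN_ADR - 1) & 0xFF) + (x & 0xFF)
    carry = low >> 8
    # High byte of the operand: ADH of (SCREEN_ADR-1) plus y plus carry.
    high = (((SCREEN_ADR - 1) >> 8) + (y & 0xFF) + carry) & 0xFF
    base = (high << 8) | (low & 0xFF)

    index = length & 0xFF                 # LDY with the length
    while True:
        memory[(base + index) & 0xFFFF] = color   # STA base,Y
        index = (index - 1) & 0xFF                # DEY
        if index == 0:                            # BNE falls through at zero
            break
    return base

memory = bytearray(0x10000)
base = horiz_line(memory, x=20, y=30, color=7, length=4)
row = SCREEN_ADR + (30 << 8)
print(hex(base), list(memory[row + 19:row + 25]))
```

Because the store happens before DEY and the loop exits when Y reaches 0, the byte at base+0 is never written, which is why the operand is built from SCREEN_ADR-1. The sketch also shows a consequence of that loop shape: a length of 0 would make 256 stores, not zero.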
