Functions in assembly
Re: Functions in assembly
That code should work. Be aware, however, that if the macro is used in a different scope than the destination label, the parameter 'address' might not resolve to what you expect. I think this may be the problem.
- BigDumbDinosaur
- Posts: 9427
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Functions in assembly
Druzyek wrote:
I tried something along these lines but it doesn't work if address is defined after the macro. What do you do?
.macro BEQL address
.local label
BNE label
JMP address
label
.endmacro
Code: Select all
strpad .macro .s1,.s2,.l,.j,.pc ;copy, pad & justify string
pea .pc
pea .j
per .l
per .s2
per .s1
jsr strpad
.endm
Code: Select all
strpad padbuf,strgbuf,padlen,.pl,.fc
Code: Select all
pea .fc
pea .pl
per padlen
per strgbuf
per padbuf
jsr strpad
.endm
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: Functions in assembly
After 5+ years since my original post, I think I have macros to do exactly what I want. (Just an illustration. The following could be written more simply.)
BYTE and WORD reserve space on the X-based stack in zero page and create symbols for the variables. BEGIN_ARGS marks the variables that follow as incoming arguments and BEGIN_VARS marks them as local variables. END_VARS does the actual allocation: it pushes X, then adjusts the stack pointer with either a series of DEXs or TXA and SBC, depending on how much memory is needed. END_FUNC restores the stack pointer with PLX.
Edit: Another thing is a macro called "halt" that optionally prints the value of some variables then does BRK. Like the other macros, it knows whether to print a byte or word and also gives the name of the variable.
Code: Select all
FUNC DrawPixel
BEGIN_ARGS
BYTE xpos, ypos, color
BEGIN_VARS
WORD gfx_ptr
END_VARS
CALL CalcXY, xpos, ypos
MOV.W ret_val, gfx_ptr
MOV.B color, (gfx_ptr,X)
END_FUNC
CALL copies the value of xpos and ypos to the memory allocated with BEGIN_ARGS in CalcXY. This is the neat part. Each function defined with FUNC adds information to a string that stores the number and type of arguments the function takes. With this info, the CALL macro knows that xpos and ypos are X-based arguments and generates "LDA xpos,X" when it copies it to the argument memory of CalcXY. It also knows whether those arguments are bytes or words, so if CalcXY expects its X and Y arguments to be words but xpos is a byte, the macro will zero the top byte of the argument. On the other hand, if X and Y are bytes but xpos and ypos were words, the macro would recognize this and only copy the lower byte of the word to the incoming argument.
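As a rough illustration of the size handling described above, here is a small Python model of the decision a CALL-style macro has to make for each argument. It is only a sketch: the function name copy_arg, the numeric destination offsets, and the exact instruction sequences are assumptions made for illustration, not the actual macro.

```python
# Hypothetical model of a CALL-style macro choosing instructions from
# argument sizes. Names and emitted text are illustrative assumptions.

def copy_arg(src, src_size, dst, dst_size):
    """Return 65C02 lines copying X-indexed symbol `src` to X-indexed offset `dst`.

    Sizes are in bytes: 1 for BYTE, 2 for WORD.
    """
    # The low byte is copied in every case.
    lines = [f"LDA {src},X", f"STA {dst},X"]
    if dst_size == 2:
        if src_size == 2:
            # word -> word: copy the high byte as well
            lines += [f"LDA {src}+1,X", f"STA {dst + 1},X"]
        else:
            # byte -> word: zero the top byte of the argument
            lines.append(f"STZ {dst + 1},X")
    # word -> byte: nothing more to emit, only the low byte is passed
    return lines
```

For example, copy_arg("xpos", 1, -5, 2) gives LDA xpos,X / STA -5,X / STZ -4,X, while copy_arg("xpos", 2, -5, 1) gives only the first two lines.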
MOV.W and MOV.B also know the type of their arguments and generate code depending on that. ret_val is a zero page address where all functions can return a value. It's up to the caller to copy that value if needed (here ret_val could be used directly as a pointer, but I copy it to gfx_ptr as an example).
In addition to BYTE and WORD there is also ZPBYTE and ZPWORD that copy some memory out of zero page to the hardware stack so the addresses will be free to use when the extra overhead of doing so makes sense. Here's my function for clearing the screen:
Code: Select all
FUNC clrscr
BEGIN_ARGS
BYTE color
BEGIN_VARS
ZPWORD gfx_ptr
BYTE rows
END_VARS
MOV.B #128,rows
MOV.W #SCREEN_ADDRESS,gfx_ptr
LDA color,X
LDY #0
.loop_outer:
.loop_inner:
STA (gfx_ptr),Y
DEY
BNE .loop_inner
INC gfx_ptr+1 ;ie gfx_ptr+=256; which happens to be screen width
DEC rows,X
BNE .loop_outer
END_FUNC
CALL clrscr, #COLOR_BLUE
- GARTHWILSON
- Forum Moderator
- Posts: 8774
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Functions in assembly
Druzyek, I might not be understanding you correctly (I've been trying to figure this out, looking at the last page), but in talking about pushing parameters on a ZP stack, it looks like you're not taking full advantage of what such a stack can do. A function should not have to receive parameters and push them onto a stack. It should be able to receive them already on the stack, left there by the previous function(s), and work with them there without the overhead of copying them from one place to another. The exception would be if the routine needs local variables that are used internally and are not part of the input or output.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: Functions in assembly
Hi Garth, I'm not sure I see what you mean there. When you have parameters on a stack in Forth or assembly treating the stack in the same way, you will still have to copy them eventually to do anything useful with them. They get copied from the stack to the stack with something like DUP or OVER, and in my example they are also copied from the stack to the stack. Here is a slightly modified example (The screen is 256x128 with one byte per pixel so the X,Y to address calculation is just SCREEN_ADDRESS+(y<<8)+x):
Code: Select all
FUNC DrawPixel
BEGIN_ARGS
BYTE xpos, ypos, color
BEGIN_VARS
WORD gfx_ptr
END_VARS
MOV.W #SCREEN_ADDRESS, gfx_ptr
LDA xpos,X
STA gfx_ptr,X ;256x128 screen so low byte is x coord
LDA gfx_ptr+1,X
CLC
ADC ypos,X
STA gfx_ptr+1,X ;high byte is #>SCREEN_ADDRESS+ypos
MOV.B color,(gfx_ptr,X)
END_FUNC
CALL DrawPixel, #20, #30, #COLOR_BLUE
This disassembles into something like this:
Code: Select all
001: FUNC DrawPixel
DrawPixel:
002: BEGIN_ARGS
003: BYTE xpos, ypos, color
xpos set 0
ypos set 1
color set 2
004: BEGIN_VARS
005: WORD gfx_ptr
gfx_ptr set 3
006: END_VARS
PHX ;save stack pointer
DEX ;room on stack for xpos
DEX ;room on stack for ypos
DEX ;room on stack for color
DEX ;room on stack for gfx_ptr (low byte)
DEX ;room on stack for gfx_ptr (high byte)
007: MOV.W #SCREEN_ADDRESS, gfx_ptr
LDA #<SCREEN_ADDRESS
STA 3,X ;3=gfx_ptr
LDA #>SCREEN_ADDRESS
STA 4,X ;4=gfx_ptr+1
008: LDA xpos,X
LDA 0,X
009: STA gfx_ptr,X
STA 3,X
010: LDA gfx_ptr+1,X
LDA 4,X
011: CLC
012: ADC ypos,X
ADC 1,X
013: STA gfx_ptr+1,X
STA 4,X
014: MOV.B color,(gfx_ptr,X)
LDA 2,X ;2=color
STA (3,X)
015: END_FUNC
PLX ;restore stack
RTS
016: CALL DrawPixel, #20, #30, #COLOR_BLUE
LDA #20
STA -5,X
LDA #30
STA -4,X
LDA #COLOR_BLUE
STA -3,X
;-1,X and -2,X left for gfx_ptr
JSR DrawPixel
In Forth it would be like this:
Code: Select all
: DrawPixel ( x y color -- )
-ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel
The code to copy the literals to the stack at 016 knows to copy the first value 5 bytes below the stack pointer since it knows that DrawPixel will adjust the stack pointer down by 5 at 006. This means the value of #20 will be at 0,X after the adjustment. 0,X in DrawPixel is xpos, which is where we want #20 to be.
Note that the arguments don't have to be immediates. If DrawPixel were called with arguments that were variables inside a function defined with BEGIN_VARS, then the LDA / STA pairs at 016 would be copying from zp,X to zp,X like you have with DUP and OVER.
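The offset bookkeeping described above (xpos ends up at 0,X once the callee has moved X down by 5, so the caller writes it at -5,X) can be checked with a few lines of Python. This is a sketch only: layout and caller_offset are hypothetical helper names, and the declaration list simply mirrors the DrawPixel example.

```python
# Hypothetical model of the symbol offsets that BYTE/WORD assign and of the
# caller-side offsets CALL has to use before the callee adjusts X.

def layout(decls):
    """decls: list of (name, size_in_bytes) in declaration order.

    Returns (offsets, frame_size), where offsets maps each name to its
    X-relative offset inside the callee.
    """
    offsets = {}
    next_off = 0
    for name, size in decls:
        offsets[name] = next_off
        next_off += size
    return offsets, next_off

def caller_offset(offsets, frame_size, name):
    """Offset of `name` relative to X in the caller, before the DEXs run."""
    return offsets[name] - frame_size

# DrawPixel: three BYTE arguments followed by one WORD local.
draw_pixel = [("xpos", 1), ("ypos", 1), ("color", 1), ("gfx_ptr", 2)]
offsets, frame_size = layout(draw_pixel)
```

With this declaration list the frame is 5 bytes, and the caller-side offsets for xpos, ypos and color come out as -5, -4 and -3, matching the STA operands at 016.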
I haven't figured out how many cycles the Forth version would take but just the overhead of DEX / INX would make it less efficient. I think the difference is much more noticeable in cases where you need to juggle many variables in something stack-based and those variables are accessed many times in a function/word. For example, keeping track of five counters like the following is unwieldy in Forth, whereas you get a big speedup by making room on the stack for them then not touching the stack pointer while you loop through possibly thousands of characters in the string you're searching.
Code: Select all
FUNC CountPunctuation ;void CountPunctuation(char *str_ptr) {
BEGIN_ARGS
WORD str_ptr
BEGIN_VARS
BYTE commas ; char commas;
BYTE periods ; char periods;
BYTE semicolons ; char semicolons;
BYTE exclamations ; char exclamations;
BYTE questions ; char questions;
END_VARS
STZ commas,X ; commas=0;
STZ periods,X ; periods=0;
STZ semicolons,X ; semicolons=0;
STZ exclamations,X ; exclamations=0;
STZ questions,X ; questions=0;
while_loop: ; while(*str_ptr) {
LDA (str_ptr,X) ; A=*str_ptr;
BEQ .done
INC str_ptr,X ; str_ptr++;
BNE .no_carry
INC str_ptr+1,X
.no_carry:
CMP #',' ; if (A==',')
BNE .not_comma
INC commas,X ; commas++;
BRA while_loop
.not_comma:
CMP #'.' ; else if (A=='.')
BNE .not_period
INC periods,X ; periods++;
BRA while_loop
.not_period:
CMP #';' ; else if (A==';')
BNE .not_semicolon
INC semicolons,X ; semicolons++;
BRA while_loop
.not_semicolon:
CMP #'!' ; else if (A=='!')
BNE .not_exclamation
INC exclamations,X ; exclamations++;
BRA while_loop
.not_exclamation:
CMP #'?' ; else if (A=='?')
BNE .not_question
INC questions,X ; questions++;
BRA while_loop
.not_question:
BRA while_loop ; }
.done:
CALL PrintPunctuation, commas, periods, semicolons, exclamations, questions
;PrintPunctuation(commas,periods,semicolons,exclamations,questions);
END_FUNC ;}
CALL CountPunctuation, test_str
JMP *
test_str:
FCB "Hi! Three commas, a period, a question mark, and two exclamations. Wow! Right?",0
- GARTHWILSON
Re: Functions in assembly
If you're pulling the coordinates and color out of the blue, then yes, you'd put them on the data stack before calling the function; but in real life, these will probably have been derived in a previous function that left them on the data stack, so there's no extra transferring to do.
Druzyek wrote:
In Forth it would be like this:
Code: Select all
: DrawPixel ( x y color -- )
-ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel
How about:
Code: Select all
: DrawPixel ( x y color -- )
-ROT >< + SCREEN_ADDRESS + C! ;
>< is a Forth word that swaps the bytes of a 16-bit cell much more quickly than 8 LSHIFT can move the low byte to the high byte; so for example 00A3 becomes A300. The 65816 even has an instruction for it, so >< becomes:
Code: Select all
LDA 0,X
XBA
STA 0,X
Also, doing the first + sooner keeps the stack shallower.
If you ever have interrupts serviced in Forth (or Forth-like assembly language, using the data stack), you'll want to avoid doing things like -4,X, since that area could get overwritten by an ISR cutting in at unpredictable times. Also, there may be some differences with the 65816, especially if DP is not starting at 0000.
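To make the >< versus 8 LSHIFT comparison concrete, here is a small Python model of both operations on a 16-bit cell, plus the pixel address formula SCREEN_ADDRESS+(y<<8)+x from earlier in the thread. It is a sketch under assumptions: the value of SCREEN_ADDRESS is made up, and the function names are only for illustration.

```python
SCREEN_ADDRESS = 0x4000  # assumed base address; the thread does not give the real one

def lshift8(cell):
    """8 LSHIFT on a 16-bit cell: the old high byte is shifted out and lost."""
    return (cell << 8) & 0xFFFF

def swap_bytes(cell):
    """>< on a 16-bit cell (0..0xFFFF): exchange the high and low bytes."""
    return ((cell << 8) | (cell >> 8)) & 0xFFFF

def pixel_address(x, y):
    """Address of pixel (x, y) on the 256x128, one-byte-per-pixel screen."""
    return (SCREEN_ADDRESS + swap_bytes(y) + x) & 0xFFFF
```

The two operations agree whenever the high byte of the cell is zero, which is the case for a y coordinate in the range 0..127. They differ once the high byte holds something: swap_bytes(0x12A3) is 0xA312, while lshift8(0x12A3) is 0xA300.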
Re: Functions in assembly
GARTHWILSON wrote:
If you're pulling the coordinates and color out of the blue, then yes, you'd put them on the data stack before calling the function; but in real life, these will probably have been derived in a previous function that left them on the data stack, so there's no extra transferring to do.
Code: Select all
: HorizLine ( x y color length -- )
0 do 3dup DrawPixel rot 1+ -rot loop 3drop ;
Quote:
Druzyek wrote:
In Forth it would be like this:
Code: Select all
: DrawPixel ( x y color -- )
-ROT 8 LSHIFT SCREEN_ADDRESS + + c! ;
20 30 COLOR_BLUE DrawPixel
Code: Select all
: DrawPixel ( x y color -- )
-ROT >< + SCREEN_ADDRESS + C! ;
Code: Select all
LDA 0,X
XBA
STA 0,X
- GARTHWILSON
Re: Functions in assembly
I thought the Forth stuff was just a parallel, an illustration, since the topic title is "Functions in assembly." When you're doing it in assembly (including writing custom primitives for Forth), you can use things on the ZP data stack without DUPing or DROPping them. You could make DrawPixel not consume the arguments, so you can call it over and over without the continual overhead. You could also write a primitive that does your "rot 1+ -rot" with nothing but an INC 5,X, a single assembly-language instruction. If it were a Forth primitive, I might write it something like
Code: Select all
CODE INC_3OS ( a b c -- a+1 b c )
INC 5,X
JMP NEXT
or, if it were STC Forth, just inline the single assembly-language instruction. This is for the '816 with A in 16-bit mode, so the '02 will take a little more if there's a need to go far enough that the low byte would roll over and you have to increment the high byte too. I take it that that would not be necessary though for drawing a horizontal line like you show. Forth lets you form the language to what you want it to be (in far more ways than this).
Re: Functions in assembly
GARTHWILSON wrote:
I thought the Forth stuff was just a parallel, an illustration, since the topic title is "Functions in assembly." When you're doing it in assembly (including writing custom primitives for Forth), you can use things on the ZP data stack without DUPing or DROPping them. You could make DrawPixel not consume the arguments, so you can call it over and over without the continual overhead.
Code: Select all
: DrawPixel ( x y color -- )
-ROT >< + SCREEN_ADDRESS + C! ;
: HorizLine ( x y color length -- )
0 do 3dup DrawPixel INC_3OS loop 3drop ;
Code: Select all
: DrawPixel ( x y color -- x+1 y color)
3dup -ROT >< + SCREEN_ADDRESS + C! INC_3OS ;
: HorizLine ( x y color length -- )
0 do DrawPixel loop 3drop ;
- GARTHWILSON
Re: Functions in assembly
Hmmm... several possibilities. See if this one would work. It's for '02 (not '816). Data stack cells are assumed to be two bytes each, even if you're only using the low byte. It does use self-modifying code, where an STA-absolute instruction's operand is a variable that's written to before you get there.
There's the "SCREEN_ADR-1" because the byte does not get stored when Y gets down to 0. If it's a problem (like you really do need to be able to do 256 dots, from Y=255 down to 0, inclusive), another instruction could be added to take care of it.
Code: Select all
HorizLine: ( x y color length -- ) ; I assume X is where it starts, and you keep the length short enough to not overrun the end.
CLC
LDA 7,X ; Get X val, 8-bit val, ignoring high byte of data stack cell,
ADC #<(SCREEN_ADR-1) ; and add it to the screen array ADL.
STA 1$ + 1 ; Low byte first.
LDA 5,X ; Get Y val
ADC #>(SCREEN_ADR-1) ; and add it to the screen array ADH.
STA 1$+2 ; Store high byte.
LDA 3,X ; Get color in A.
LDY 1,X ; Use Y for the looping control, so we can leave X as the data stack pointer.
1$: STA $1234,Y ; SMC! The operand got fixed above. Store the color in the pixel byte of the array.
DEY
BNE 1$
TXA ; Remove stack items.
CLC
ADC #8
TAX
RTS
;-------------
It doesn't give a separate DrawPixel function, but overall it's quite a bit shorter than the DrawPixel-HorizLine combination, even without including Forth headers, and of course it's much faster. It could be made a Forth primitive with very little modification too. Actually, for STC Forth, it may not require any modification at all.
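For anyone wanting to check the "SCREEN_ADR-1" trick off-target, here is a rough Python model of the loop. It models the intended address arithmetic (x added to the low byte of the patched operand, y to the high byte) and the STA/DEY/BNE sequence. The SCREEN_ADR value, the dict standing in for memory, and the function name are all assumptions for illustration.

```python
SCREEN_ADR = 0x4000  # assumed base address; not taken from the thread

def horiz_line(mem, x, y, color, length):
    """Model of the self-modifying HorizLine loop; `mem` maps address -> byte."""
    # Operand patched into "STA $1234,Y": (SCREEN_ADR-1) plus x in the low
    # byte and y in the high byte, with the carry propagating between them.
    operand = (SCREEN_ADR - 1 + x + (y << 8)) & 0xFFFF
    yreg = length & 0xFF                          # LDY 1,X
    while True:
        mem[(operand + yreg) & 0xFFFF] = color    # STA operand,Y
        yreg = (yreg - 1) & 0xFF                  # DEY
        if yreg == 0:                             # BNE falls through at zero
            break

# Draw 3 pixels starting at (20, 30).
screen = {}
horiz_line(screen, 20, 30, 7, 3)
```

Because the store never happens with Y equal to 0, the -1 in the operand is what makes the last store land on pixel x itself: in the example the three bytes written are SCREEN_ADR+30*256+20 through SCREEN_ADR+30*256+22.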