Functions in assembly

Druzyek · Post by **Druzyek** » Sat Jul 05, 2014 6:05 am

Quote:

DrawLine needs X1, Y1, X2, Y2 and color. Five parameters. BUT three of them are the same as DrawPixel. So we can re-use those same locations:

I see how doing clever things like that can help you a lot. When I start a program I always try to ask myself what might possibly change, like if I suddenly needed to draw trapezoids instead of rectangles. That would be a fairly trivial change to make in a C program but it looks like I would just have to do a major rewrite if I did it in assembly.

Quote:

If a subroutine calls only level-0 ("leaf") subroutines, it is a level-1 subroutine.

Right, the problem I see is that once a level 2 subroutine calls a level 1 subroutine, that level 1 subroutine cannot call any subroutine that might be used at level 2. I suppose that function could go ahead and save a few things when it is called since it doesn't know which level it is being called from. It sounds like you just have to do a lot of mental bookkeeping to keep stuff like that straight.
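To make that bookkeeping a bit more concrete, here is a minimal C sketch that works out each subroutine's level from a call table. The table contents, the `sub_level` name, and the rule "one more than the highest level called" are assumptions for illustration; the quote itself only defines levels 0 and 1.

```c
#include <assert.h>

enum { NUM_SUBS = 3, MAIN = 0, DRAW_LINE = 1, DRAW_PIXEL = 2 };

/* calls[i][j] != 0 means subroutine i calls subroutine j (hypothetical table). */
const int calls[NUM_SUBS][NUM_SUBS] = {
    /* main      */ {0, 1, 0},
    /* DrawLine  */ {0, 0, 1},
    /* DrawPixel */ {0, 0, 0},
};

/* Level 0 is a leaf; otherwise the level is one more than the highest
   level among the subroutines called. Assumes there is no recursion,
   otherwise this function would never terminate. */
int sub_level(int sub) {
    assert(sub >= 0 && sub < NUM_SUBS);
    int level = 0;
    for (int callee = 0; callee < NUM_SUBS; callee++) {
        if (calls[sub][callee]) {
            int candidate = sub_level(callee) + 1;
            if (candidate > level)
                level = candidate;
        }
    }
    return level;
}
```

With this table, DrawPixel comes out as level 0, DrawLine as level 1, and main as level 2, so each level could be given its own block of zero-page locations without overlap.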

Quote:

A little bit, but you also save some by not having to step over return addresses to handle data.

I see how convenient that would be. It seems like it would make up for what you lose.

joe7 · Post by **joe7** » Thu Jul 10, 2014 7:42 pm

It's pretty easy to make your own stack; there is a little overhead, but it's not bad. The bonus is that you can make a 256-position stack with 16-, 24-, or 32-bit entries pretty easily, if your application needs larger numbers.

In the Java compiler (java_grinder) I used a 16-bit stack with the low bytes at 0xC000 and the high bytes at 0xC100. A zero-page location is the "stack pointer", and is loaded into the Y register for indexing into the stack. That leaves the processor stack available for jsr/rts and nothing else.

Macros from java_grinder:

Code: Select all

#define PUSH_LO \
  fprintf(out, "; PUSH_LO\n"); \
  fprintf(out, "  ldy SP\n"); \
  fprintf(out, "  sta stack_lo,y\n")

#define PUSH_HI \
  fprintf(out, "; PUSH_HI\n"); \
  fprintf(out, "  ldy SP\n"); \
  fprintf(out, "  sta stack_hi,y\n"); \
  fprintf(out, "  dec SP\n")

#define POP_HI \
  fprintf(out, "; POP_HI\n"); \
  fprintf(out, "  inc SP\n"); \
  fprintf(out, "  ldy SP\n"); \
  fprintf(out, "  lda stack_hi,y\n")

#define POP_LO \
  fprintf(out, "; POP_LO\n"); \
  fprintf(out, "  ldy SP\n"); \
  fprintf(out, "  lda stack_lo,y\n")

There is not that much overhead, even with loading the "stack pointer" from memory. Many variations of this idea are possible.
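As a rough illustration of how the four macros fit together, the C model below mimics the emitted 6502 code with two 256-byte arrays and one 8-bit index. The `SplitStack` type, the function names, and the starting index of 0xFF are assumptions for the sketch, not taken from java_grinder.

```c
#include <stdint.h>

/* Model of the split 16-bit stack: low bytes in one 256-byte page,
   high bytes in another, one 8-bit index shared by both. */
typedef struct {
    uint8_t lo[256];   /* stands in for stack_lo (0xC000 in the post) */
    uint8_t hi[256];   /* stands in for stack_hi (0xC100 in the post) */
    uint8_t sp;        /* the zero-page "stack pointer" */
} SplitStack;

/* Assumption: start at the top of the page and grow downward. */
void split_init(SplitStack *s) {
    s->sp = 0xFF;
}

/* PUSH_LO then PUSH_HI: both stores use the same index, then SP drops. */
void split_push16(SplitStack *s, uint16_t v) {
    s->lo[s->sp] = (uint8_t)(v & 0xFF);   /* PUSH_LO: sta stack_lo,y */
    s->hi[s->sp] = (uint8_t)(v >> 8);     /* PUSH_HI: sta stack_hi,y */
    s->sp--;                              /* PUSH_HI: dec SP (8-bit wrap) */
}

/* POP_HI then POP_LO: SP rises first, then both loads use that index. */
uint16_t split_pop16(SplitStack *s) {
    s->sp++;                              /* POP_HI: inc SP */
    uint8_t hi = s->hi[s->sp];            /* POP_HI: lda stack_hi,y */
    uint8_t lo = s->lo[s->sp];            /* POP_LO: lda stack_lo,y */
    return (uint16_t)((hi << 8) | lo);
}
```

Because PUSH_HI is the macro that decrements SP and POP_HI is the one that increments it, the low byte has to be pushed first and the high byte popped first; the model keeps that order.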

Druzyek · Post by **Druzyek** » Sun Aug 17, 2014 7:47 pm

I've been trying different ways of organizing the zero page and passing parameters using macros. What do you think of a system like this? Any suggestions for improvement?

;ZERO PAGE
;========
;S0-S15: one byte registers for general use inside subroutines
;A0-A7: one byte registers for passing one byte arguments
;D0-D3: two byte registers for passing two byte arguments

.proc main
;Copy x1, x2, y1, y2 to A0-A3
SetArg x1, x2, y1, y2
;Copy #<SCREEN_ADDRESS to D0
;Copy #>SCREEN_ADDRESS to D0+1
;Copy color to D1, color+1 to D1+1
SetDbl #SCREEN_ADDRESS, color
JSR DrawLine
.endproc

.proc DrawLine
;Push A, X, and Y
;Assign aliases but don't push values
;x1, x2, y1, y2 refer to S0-S3
;x_pos, y_pos refer to S4-S5
;scr_ptr refers to S6 (S7 skipped)
;color refers to S8 (S9 skipped)
;Push a 0 on to the stack
AliasOnly x1, x2, y1, y2, x_pos, y_pos, scr_ptr, ,color,
;Copy A0-A3 to x1, x2, y1, y2 (S0-S3)
GetArg x1, x2, y1, y2
;Copy D0 to scr_ptr, D0+1 to scr_ptr+1
;Copy D1 to color, D1+1 to color+1
GetDbl scr_ptr, color

;Do calculations

;Copy x_pos(S4) to A0, y_pos(S5) to A1
SetArg x_pos, y_pos
;Copy scr_ptr(2 bytes) to D0, color(2 bytes) to D1
;(same order that DrawPixel reads them with GetDblRef)
SetDbl scr_ptr, color
JSR DrawPixel
;Pop number of pushed arguments from AliasOnly(0 here)
;Pop Y, X, and A
;RTS
Ret
.endproc

.proc DrawPixel
;Push A, X, and Y
;Push S registers as needed and assign aliases
;Push S0 and S1 to stack
;calculated_address refers to S0 (S1 skipped)
;Push S2 to stack
;counter refers to S2
;Push a 3 on to the stack because 3 bytes pushed
AliasPush calculated_address,,counter
;x_pos refers to A0, y_pos refers to A1
;Skip copying to S registers and just use A registers
;directly since DrawPixel doesn't pass anything.
GetArgRef x_pos, y_pos
;scr_ptr refers to D0, color refers to D1
GetDblRef scr_ptr, color
;Pop number of pushed arguments from AliasPush(3 here) to S0-S2
;Pop Y, X, and A
;RTS
Ret
.endproc

teamtempest · Post by **teamtempest** » Tue Aug 19, 2014 2:06 pm

I guess my first reaction is that while the concept of aliasing a block of memory with names convenient to whatever you happen to be doing at the moment is in principle a fine idea, as implemented your macros - I assume they are macros - don't seem to be saving you much work. First it appears the aliasing has to be done separately for each procedure, and second it seems they have to be heavily commented to remind yourself what they do.

Unless the commenting is for our benefit, not yours. But in that case it might be better to show us the definitions of the macros, which could be as heavily commented as anyone could like.

The breakdown into subtasks seems reasonable, with DrawPixel at the bottom and DrawLine built on top of that. It's not quite clear where the actual work gets done, though.

Druzyek · Post by **Druzyek** » Tue Aug 19, 2014 5:54 pm

Quote:

First it appears the aliasing has to be done separately for each procedure

Yes, each procedure would have its own variable names. That isn't what you would expect?

Quote:

It's not quite clear where the actual work gets done, though.

I thought you would get the point that this is just an example but I went ahead and added some ellipses (...) to show where other work would be done. Does that make it clearer for you now?

Quote:

Unless the commenting is for our benefit, not yours.

The commenting is for you. Maybe this is more readable:

.proc main
;Passing arguments
SetArg x1, x2, y1, y2
SetDbl #SCREEN_ADDRESS, color
JSR DrawLine
...
.endproc

.proc DrawLine
;Declaring variables
AliasOnly x1, x2, y1, y2, x_pos, y_pos, scr_ptr, ,color,
;Fetching arguments
GetArg x1, x2, y1, y2
GetDbl scr_ptr, color
...
;Passing arguments
SetArg x_pos, y_pos
SetDbl scr_ptr, color
JSR DrawPixel
...
Ret
.endproc

.proc DrawPixel
;Push registers and declare variables
AliasPush calculated_address,,counter
;Get references to passed arguments
GetArgRef x_pos, y_pos
GetDblRef scr_ptr, color
...
Ret
.endproc

Quote:

But in that case it might be better to show us the definitions of the macros

They are fairly trivial, which is why I just commented the code, but here you go if you would like to see:

;Push the S registers used in subroutines and
;assign aliases(identifiers) to them
.macro AliasPush v0,v1,v2,v3,v4,v5,v6,v7
PHA
PHX
PHY
;X counts how many bytes were pushed
LDX #$00
.if (.paramcount>0)
.ifnblank v0
v0=S0
.endif
LDA S0
PHA
;Increment X because one byte pushed
INX
.endif

...

.if (.paramcount>7)
.ifnblank v7
v7=S7
.endif
LDA S7
PHA
INX
.endif
;Push the count of bytes pushed so Ret macro can pop them
PHX
.endmacro

;Assign aliases to S registers used in subroutines
.macro AliasOnly v0,v1,v2,v3,v4,v5,v6,v7
PHA
PHX
PHY
.ifnblank v0
v0=S0
.endif

...

.ifnblank v7
v7=S7
.endif
;Push a zero to the stack so that the Ret macro does not try
;to pop any registers, since we didn't push any
LDA #$00
PHA
.endmacro

;Restore S register pushed by AliasPush
;Also restores A, X, and Y
.macro Ret
.local Loop
.local Done
;Get the count of registers to pop. Y will be counter.
PLY
Loop:
;Stop looping when no registers left
BEQ Done
DEY
;Pop register
PLA
STA S0,Y
CPY #$00
BNE Loop
Done:
PLY
PLX
PLA
RTS
.endmacro

;Copy values into A registers used for passing 1-byte arguments
.macro SetArg v0,v1,v2,v3,v4,v5,v6,v7
;Save a copy of A
PHA
;Transfer v0 to A0 (argument register)
.ifnblank v0
LDA v0
STA A0
.endif

...

.ifnblank v7
LDA v7
STA A7
.endif
;Restore A to what it was before macro
PLA
.endmacro

;Copy values into D registers used for passing 2-byte arguments
.macro SetDbl v0,v1,v2,v3
;Save a copy of A
PHA
.ifnblank v0
;If v0 is an immediate
.if (.match(.left(1,{v0}),#))
;Store the low byte of v0 in D0(2 byte argument register)
LDA #<(.right(.tcount({v0})-1,{v0}))
STA D0
;Store the high byte of v0 in D0+1
LDA #>(.right(.tcount({v0})-1,{v0}))
STA D0+1
;else, v0 is not an immediate
.else
LDA v0
STA D0
LDA v0+1
STA D0+1
.endif
.endif

...

.ifnblank v3
.if (.match(.left(1,{v3}),#))
LDA #<(.right(.tcount({v3})-1,{v3}))
STA D3
LDA #>(.right(.tcount({v3})-1,{v3}))
STA D3+1
.else
LDA v3
STA D3
LDA v3+1
STA D3+1
.endif
.endif
;Restore A to what it was before macro
PLA
.endmacro

;Assign variable names to the argument registers directly.
;This is faster than pushing S registers and copying A registers
.macro GetArgRef v0,v1,v2,v3,v4,v5,v6,v7
.ifnblank v0
v0=A0
.endif

...

.ifnblank v7
v7=A7
.endif
.endmacro

;Copy argument registers to pre-existing variables
.macro GetArg v0,v1,v2,v3,v4,v5,v6,v7
.ifnblank v0
LDA A0
STA v0
.endif

...

.ifnblank v7
LDA A7
STA v7
.endif
.endmacro

;Assign variable names to the argument registers directly.
;This is faster than pushing S registers and copying D registers
.macro GetDblRef v0,v1,v2,v3
.ifnblank v0
v0=D0
.endif

...

.ifnblank v3
v3=D3
.endif
.endmacro

;Copy argument registers to pre-existing variables
.macro GetDbl v0,v1,v2,v3
.ifnblank v0
LDA D0
STA v0
LDA D0+1
STA v0+1
.endif

...

.ifnblank v3
LDA D3
STA v3
LDA D3+1
STA v3+1
.endif
.endmacro

These ellipses also stand for code that I left out on purpose. I hope that is not confusing for you.

In general, I was hoping to hear what you guys think of this as a concept to organize memory. Do you do something similar with macros or do you have a better system?

EDIT: Formatting

GARTHWILSON · Post by **GARTHWILSON** » Tue Aug 19, 2014 10:25 pm

Although I have not done anything in graphics, my own experience started similarly, as I wanted to have certain ZP general variables, plus ZP spaces for passing parameters. I would get too many to fit in ZP and of course had to use non-ZP ones for some, but then had to move them to ZP for certain address modes. Running out of space, I would try to double-up on some, giving two or more different variable names to the same address, being careful to make sure the two uses were not needed at the same time. It led to bugs, especially in multitasking, because you forget and then change something, and one routine ends up stepping on data that another one is not finished with yet. I would tend to get tied up in knots, producing confusing code that was very hard to maintain.

This problem was eliminated by having the separate page-1 return stack and ZP data stack in Forth, as mentioned earlier. I can still seamlessly integrate anything I want to in assembly where maximum performance is needed, but as they say, "A little assembly goes a long way," meaning that getting nearly maximum overall performance for the application usually does not require doing very much of your application in optimized assembly.

I was going to try to present a pattern to follow, but I'll have to think about it more. Meanwhile, you'll probably learn the most by just doing it, while always reaching for higher levels of abstraction. For example, even in your code above, you can condense it a lot with macros which for example replace

Code: Select all

        LDA  v0
        STA  D0
        LDA  v0+1
        STA  D0+1

with

Code: Select all

        COPY2  v0, to, D0

which assembles exactly the same thing but makes the source code a lot shorter and more intuitive.

I have an article on using macros to do program structures, at http://wilsonminesco.com/StructureMacros/index.html. (It also mentions two other forum members' assemblers that they offer which have the capability already integrated).

teamtempest · Post by **teamtempest** » Tue Aug 19, 2014 11:47 pm

Quote:

In general, I was hoping to hear what you guys think of this as a concept to organize memory. Do you do something similar with macros or do you have a better system?

Mmm, generally I myself tend to save and restore memory only as a last resort, not a first resort. Your way does have the advantage of removing a lot of mental overhead - once the macros are properly set up, you don't have to think about it - at the expense of additional time and space taken by the expanded code. It's a trade-off programmers have to make for themselves.

Just as a comment on one of your macros, trivial as it may be:

Code: Select all

;Restore S register pushed by AliasPush
;Also restores A, X, and Y
.macro Ret
.local Loop
.local Done
;Get the count of registers to pop. Y will be counter.
PLY
Loop:
;Stop looping when no registers left
BEQ Done
DEY
;Pop register
PLA
STA S0,Y
CPY #$00
BNE Loop
Done:
PLY
PLX
PLA
RTS
.endmacro

You might consider re-writing it something like this:

Code: Select all

;Restore S register pushed by AliasPush
;Also restores A, X, and Y
.macro Ret
.local Loop
.local Done
;Get the count of registers to pop. X will be counter.
PLX
;anything to do?
BEQ Done
Loop:
;Pop register
PLA
;Adjust index (and set Z-flag)
DEX
;zero page X-indexing is slightly shorter and faster than absolute Y-indexing
STA S0,X
;'STA' does not change Z-flag, so no additional test is necessary
BNE Loop
Done:
PLY
PLX
PLA
RTS
.endmacro
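For anyone who wants to check the countdown logic off-target, here is a small C model of the save/restore protocol: the AliasPush side pushes S0 upward and then the count, and the Ret side follows the rewritten loop above. `Machine`, `save_s`, and `restore_s` are made-up names, and the A/X/Y pushes are left out to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_S = 8 };

typedef struct {
    uint8_t s[NUM_S];     /* zero-page scratch registers S0..S7 */
    uint8_t stack[256];   /* page-1 hardware stack */
    uint8_t sp;           /* hardware stack pointer, grows downward */
} Machine;

/* PHA: store at SP, then decrement SP. */
void pha(Machine *m, uint8_t v) {
    m->stack[m->sp--] = v;
}

/* PLA: increment SP, then load from SP. */
uint8_t pla(Machine *m) {
    return m->stack[++m->sp];
}

/* AliasPush side: save S0..S(count-1) in ascending order, then the count. */
void save_s(Machine *m, uint8_t count) {
    assert(count <= NUM_S);
    for (uint8_t i = 0; i < count; i++)
        pha(m, m->s[i]);      /* LDA Sn / PHA */
    pha(m, count);            /* PHX: the count goes on last */
}

/* Ret side: PLX / BEQ Done / Loop: PLA / DEX / STA S0,X / BNE Loop */
void restore_s(Machine *m) {
    uint8_t x = pla(m);       /* PLX: the count comes off first */
    while (x != 0) {          /* BEQ Done on entry, BNE Loop afterwards */
        uint8_t a = pla(m);   /* PLA */
        x--;                  /* DEX sets the Z flag the branch tests */
        m->s[x] = a;          /* STA S0,X leaves the flags alone */
    }
}
```

Since the last register pushed is the first one popped, decrementing the index before the store puts each byte back where it came from, and a count of zero skips the loop entirely.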

Movax12 · Post by **Movax12** » Thu Aug 21, 2014 6:24 pm

An example of what I have working in ca65 (the macro language is quite powerful once you really get the hang of it..):
Needs a bit of polishing, but works well as is.

In the 'header' file (which, as in C, would have to be included in both the library and the main source):

Code: Select all

declarefunc writeVRAMbufferwithAddr, { sourceaddrLow x, sourceAddrHi y, dataLength a, destAddress .addr }

In the library/implementation:
The macro here will do a basic check that the parameters take up the same space.

Code: Select all

func writeVRAMbufferwithAddr , { sourceaddrLow x, sourceAddrHi y, dataLength a, destAddress .addr }

    ; destAddress is the address in PPU memory space (intended to be nametable)
    
    locals
        sourceaddress   .addr
        saveDataLen      .byte
    endlocals

    sta local::saveDataLen
    stx local::sourceaddress
    sty local::sourceaddress + 1
    
    tay
    
    ; length sent covers only the data; need to add dest address (2) and length (1)

    mb dataLength := dataLength + vramBufferIndex + #3

    ; write FF to end of buffer now since we are at the end of the buffer
    tax

    mb  vramBuffer[ x ] := #$FF    
    stx vramBufferIndex ; set index
    
    dey
    
    repeat ; count back from end of string
        dex
        mb vramBuffer[ x ] := (local::sourceaddress)[ y ] 
    until dey == negative
    
    lda local::saveDataLen 
    
    mb  vramBuffer[ x - 1 ] := a
    mb  vramBuffer[ x - 2 ] := param::destAddress[ 0 ]
    mb  vramBuffer[ x - 3 ] := param::destAddress[ 1 ]
    
    return
    
endfunc

In use, something like:

Code: Select all

call writeVRAMbufferwithAddr, { sourceaddrLow: #<string_label , sourceAddrHi: #>string_label, dataLength: #( .strlen(_string_) + 3 ), destAddress: #screenLoc}

Everything uses user defined zeropage space for parameters and local variables.

barrym95838 · Post by **barrym95838** » Fri Aug 22, 2014 2:22 am

Maybe it's just a personal limitation of mine, or a personal matter of taste, but I'm happier without all the bells and whistles. If an assembler can store labels, do a bit of operand math, and resolve forward references, I'm satisfied. If I want to do something fancy, I use a fancier language, or comment the standard assembly language as liberally as I feel is necessary.

Mike

BigDumbDinosaur · Post by **BigDumbDinosaur** » Fri Aug 22, 2014 4:43 am

barrym95838 wrote:

Maybe it's just a personal limitation of mine, or a personal matter of taste, but I'm happier without all the bells and whistles. If an assembler can store labels, do a bit of operand math, and resolve forward references, I'm satisfied. If I want to do something fancy, I use a fancier language, or comment the standard assembly language as liberally as I feel is necessary.

Dunno what you consider to be bells and whistles, but a strong macro language can automate repetitive coding tasks and in the case of function calls that take a lot of parameters, help to insulate you from the uncouthness of the raw assembly language required to make the call. Also, macros take some of the tedium out of writing lengthy assembly language programs. I couldn't imagine not using macros.

barrym95838 · Post by **barrym95838** » Fri Aug 22, 2014 5:37 am

BigDumbDinosaur wrote:

... a strong macro language can automate repetitive coding tasks and in the case of function calls that take a lot of parameters, help to insulate you from the uncouthness of the raw assembly language required to make the call.

That's just it, big guy. I don't want to be insulated when I'm writing assembly, with the few exceptions that I mentioned. To me, that's the difference between an assembler and a compiler. I know that the line between them can be a bit blurry, but I still like to keep them separate in my work. If I need a compiled or interpreted language to express my ideas more easily, that's what I employ. When I want that "bare-metal" feel, I whip out the assembler and get to work.

Quote:

Also, macros take some of the tedium out of writing lengthy assembly language programs. I couldn't imagine not using macros.

A good source editor, with a usable select, copy, cut, and paste, helps me quite a bit, since I'm not a particularly strong typist. Of course, I've never written any source larger than about 4KB, so my experience with lengthy programs is a bit limited.

Like I said, my opinion is probably in the minority here, but when I want low-level, that's exactly what I do, machine instruction for machine instruction. Productivity might be an issue, but only if it was a job rather than a hobby.

Mike

[Edit: I should have said that I've never written anything that assembles to larger than about 4KB in binary form ... ]

GARTHWILSON · Post by **GARTHWILSON** » Fri Aug 22, 2014 6:17 am

Being hooked on macros myself, I can never pass up the opportunity to put in a plug for them. Going to a higher-level language brings a performance hit. Macros don't (assuming they're done right). Macros don't take away any of your control either—they just make it so you don't have to tell it every single time how to do something. They let you write code that's just as efficient, but do it faster and with fewer bugs, and end up with something that's easier to maintain later. Speaking of bugs, one landed on the monitor right near the word as I typed it.

Druzyek · Post by **Druzyek** » Fri Aug 22, 2014 3:11 pm

I also think macros are heaven sent. You said you are good with copy and paste. Macros are basically a more advanced form of that.

teamtempest, thanks for the heads up about X vs Y registers. It looks like your macros even have data types. That's really impressive.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Fri Aug 22, 2014 5:25 pm

barrym95838 wrote:

I don't want to be insulated when I'm writing assembly, with the few exceptions that I mentioned.

When I use the word "insulated," I'm referring to avoiding having to spell out every little detail multiple times in a large program.

GARTHWILSON wrote:

[Macros] let you write code that's just as efficient, but do it faster and with fewer bugs, and end up with something that's easier to maintain later.

Let me give you an example of what Garth means.

One of the functions in my 65C816 string manipulation library is one that pads and justifies a character string. It takes five parameters, which have to be pushed to the stack in a certain order. The macro that may be used to call the function is invoked as follows:

Code: Select all

         strpad string1,string2,l,j,fc
         bcs error

where strpad is the macro name (and the name of the subroutine that the macro calls). In lieu of the macro, I would have to write the following for each call to strpad:

Code: Select all

         pea #fc            ;pad character
         pea #j             ;justification type
         per l              ;pointer to desired length
         per s2ptr          ;STRING2's pointer (source)
         per s1ptr          ;STRING1's pointer (destination)
         jsr strpad         ;execute function
         bcs error

The above is what the macro expands to when assembled. Needless to say, I use the macro whenever I can. It's still assembly language, but doesn't demand as much of the programmer, and doesn't bloat the program.

Druzyek · Post by **Druzyek** » Sat Aug 23, 2014 2:47 am

I tried something along these lines but it doesn't work if address is defined after the macro. What do you do?

.macro BEQL address
.local label
BNE label
JMP address
label:
.endmacro
