6502.org • View topic - 65C816 PROGRAMMING TIPS & TRICKS

All times are UTC

BigDumbDinosaur

Post subject: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Mon Feb 16, 2015 6:19 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA

65C816 PROGRAMMING TIPS & TRICKS

Although not as well documented as the 6502 and 65C02, quite a bit of information is available on the 65C816, especially in the Eyes and Lichty programming manual that is available for download from WDC’s website. If you don’t have a copy of this manual it is highly recommended that you obtain one, even if you are primarily interested in programming the 65(c)02.

Over the period of time in which I’ve been working with my POC units, I’ve come to realize that most of the available 65C816 information doesn’t really cover programming “tricks of the trade” that use the 65C816’s capabilities to the fullest, especially in the realm of operating system programming. All too often, it seems that tutorials and code samples have been adapted from eight bit material and tend to treat the 65C816 as little more than an overgrown 65C02—which it isn’t (even the Eyes and Lichty manual does this to some extent). This is really a disservice to the ’816, as it is far more powerful and flexible than its eight-bit brethren.

This lamentable state of affairs led me to write a tutorial on 65C816 native mode interrupt processing, in the spirit of Garth Wilson’s lucid 6502 interrupt article, and also led me to suggest to Garth that a tips-and-tricks sticky topic for the 65C816 be created here. Garth agreed that it would be a good idea, while noting that it won’t initially be sticky. However, if enough material collects to make the topic worthwhile and it gets enough views, that could change. So I’ll start it off by posting the first of what I hope will be many tips and tricks.

NOTE: We want to keep this topic specific to programming the 65C816 while in its native mode. Please don’t muddy the waters with off-topic posts, such as SNES and/or hardware minutiae. Comparisons to the 65C02, of course, can be of value, especially in illustrating how a 65C02 algorithm can be simplified and/or made faster when reworked to be specific to the 65C816’s native mode operation.

Anyhow, here goes with the first of what I hope will be many tips and tricks.

Code:

W65C816S NATIVE MODE STATUS REGISTER DEFINITIONS

   nvmxdizc
   ||||||||
   |||||||+———> 1 = carry set/generated
   ||||||+————> 1 = result zero
   |||||+—————> 1 = IRQ disabled
   ||||+——————> 0 = binary arithmetic
   ||||         1 = decimal arithmetic
   |||+———————> 0 = 16 bit .X & .Y
   |||          1 = 8 bit .X & .Y
   ||+————————> 0 = 16 bit accumulator & memory
   ||           1 = 8 bit accumulator & memory
   |+—————————> 1 = sign overflow
   +——————————> 1 = result negative

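REP clears, and SEP sets, every status register bit that corresponds to a 1 in the operand. REP #%11111111 clears all bits, SEP #%11111111 sets all bits and SEP #%00000000 changes nothing.

The masks that matter for register sizes are %00100000 (m), %00010000 (x) and %00110000 (both). REP and SEP can change several bits in one instruction, e.g., REP #%00101001 clears m, d and c. They also overlap the single-bit instructions: SEP #%00001000 has the same effect as SED. SEP #%00010000 sets the index registers to 8 bits.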

Code:

;   Register Size Macros
;
longa    .macro                ;16 bit accumulator & memory
         rep #%00100000
         .endm
;
longr    .macro                ;16 bit all registers
         rep #%00110000
         .endm
;
longx    .macro                ;16 bit index registers
         rep #%00010000
         .endm
;
shorta   .macro                ;8 bit accumulator & memory
         sep #%00100000
         .endm
;
shortr   .macro                ;8 bit all registers
         sep #%00110000
         .endm
;
shortx   .macro                ;8 bit index registers
         sep #%00010000
         .endm
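
As a rough illustration of what these macros do to the status register, here is a small Python model of REP and SEP. It is only a sketch: the names rep, sep, SR_M and SR_X are mine, and nothing but the bit arithmetic on the status register is modelled.

```python
# Hypothetical model of the 65C816 REP and SEP instructions (native mode).
# Only the effect on the 8 bit status register is modelled.

SR_M = 0b00100000   # m: accumulator/memory width (1 = 8 bit)
SR_X = 0b00010000   # x: index register width (1 = 8 bit)

def rep(sr, mask):
    """REP #mask: clear every status bit whose mask bit is 1."""
    return sr & ~mask & 0xFF

def sep(sr, mask):
    """SEP #mask: set every status bit whose mask bit is 1."""
    return (sr | mask) & 0xFF

sr = 0b00110100                 # m = 1, x = 1, IRQs disabled
sr = rep(sr, SR_M | SR_X)       # same as the longr macro
print(f"{sr:08b}")              # 00000100
sr = sep(sr, SR_X)              # same as the shortx macro
print(f"{sr:08b}")              # 00010100
```

Bits that are 0 in the mask are left alone, which is why one macro can change the accumulator width without disturbing the index width.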

With these macros defined, longa can be written in place of REP #%00100000.

Code:

         rep #%00110000      ;16 bit .A, .X & .Y
         ldx #$1234          ;load .X w/16 bit constant
         sep #%00010000      ;8 bit .X & .Y
         txa                 ;make a copy of .X

         ...do some other stuff...

         rep #%00010000      ;16 bit .X & .Y
         tax                 ;restore .X...so we think!

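When the above is executed, .X will contain $0034, not $1234. SEP #%00010000 forces the MSB of .X and .Y to $00, so TXA copies $0034 rather than $1234, and the lost MSB does not come back when the index registers are returned to 16 bits.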
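
The same sequence can be traced with a few lines of Python. This is a sketch under one assumption, which is documented CPU behaviour: setting the x bit forces the MSB of the index registers to $00. The helper name set_index_width is made up for the illustration.

```python
# Hypothetical model: what happens to .X when the x bit goes from 0 to 1.
# Setting x forces the MSB of .X (and .Y) to $00; clearing x again does
# not bring the old MSB back.

def set_index_width(x_reg, eight_bit):
    """Return .X after the x bit is written (True = 8 bit indexes)."""
    return x_reg & 0x00FF if eight_bit else x_reg

x_reg = 0x1234                              # ldx #$1234 with x = 0
x_reg = set_index_width(x_reg, True)        # sep #%00010000
c_reg = x_reg                               # txa (m = 0, so 16 bits move)
x_reg = set_index_width(x_reg, False)       # rep #%00010000
x_reg = c_reg                               # tax
print(f"${x_reg:04X}")                      # $0034
```

If the full 16 bit value is needed later, it has to be copied or pushed before the index registers are narrowed.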

* * *
————————————————————
EDIT: Updated macro examples to reflect improvements made to the Kowalski assembler by Daryl Rictor (8Bit).

_________________
x86? We ain't got no x86. We don't NEED no stinking x86!

Last edited by BigDumbDinosaur on Mon Nov 20, 2023 11:14 pm, edited 2 times in total.

Martin_H

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Tue Feb 17, 2015 12:58 am

Joined: Wed Jan 08, 2014 3:31 pm
Posts: 578

Thanks for posting this. I never worked with the 65816, and someday I plan to build a retro computer with one to remedy that. So these sorts of posts keep me curious.

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Tue Feb 17, 2015 5:42 pm

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

In some cases it can be useful to simulate a JSL [$XXXX] instruction (jump to subroutine long, indirect), where $XXXX is a location in bank 0 that holds the long address of a subroutine in the usual order ($XXXX and $XXXX+1 hold the address, $XXXX+2 holds the bank):

Code:

; JSL [$XXXX] simulation
PHK
PEA #RETADDR-1
JML [$XXXX]

RETADDR:
....

Of course, the subroutine must return with an RTL instruction.
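
To see why RETADDR-1 is pushed and why the bank goes on the stack first, the stack traffic can be modelled in Python. This is only a sketch: the stack pointer, bank and RETADDR values are made-up examples, and push and pull are hypothetical helpers.

```python
# Hypothetical model of the stack traffic in the JSL [$XXXX] simulation.
# The addresses and the starting stack pointer are made-up example values.

stack = {}            # address -> byte, standing in for bank 0 RAM
sp = 0x01FF           # example stack pointer

def push(byte):
    global sp
    stack[sp] = byte & 0xFF
    sp = (sp - 1) & 0xFFFF

def pull():
    global sp
    sp = (sp + 1) & 0xFFFF
    return stack[sp]

pbr = 0x02            # bank the caller is executing in (example)
retaddr = 0x8010      # address of the RETADDR label (example)

push(pbr)                         # PHK
push((retaddr - 1) >> 8)          # PEA RETADDR-1 pushes the MSB first...
push((retaddr - 1) & 0xFF)        # ...then the LSB

# RTL pulls PCL, PCH and PBR, then adds 1 to the program counter.
pc = pull() | (pull() << 8)
pc = (pc + 1) & 0xFFFF
bank = pull()
print(f"${bank:02X}:{pc:04X}")    # $02:8010
```

The three pushes leave exactly the frame that a real JSL would have left, so the called routine cannot tell the difference.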

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family

BigDumbDinosaur

Post subject: 65C816 PROGRAMMING TIPS & TRICKS: Tres Acumuladores

Posted: Tue Feb 17, 2015 6:35 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA

The 6502 and 65C02 have one eight bit accumulator. The 65C816 has three.

What, you say? It's true...sort of. The 65C816 can have two 8 bit accumulators and a 16 bit accumulator: three accumulators in toto! :lol:

Like the 65C02, the 65C816 has an A-accumulator, which I will refer to as .A to avoid having to type a lot. The '816 also has a B-accumulator, aka .B, a design feature that Bill Mensch, Jr. evidently borrowed from the Motorola 6800. Both accumulators are 8 bits wide, just like the lone accumulator in the 65C02. How these accumulators appear to the rest of the system depends on the state of the status register's m bit.

When m is 1, which is the default state after power-on or reset, an instruction such as LDA $1234 will load an 8 bit value from $1234 into .A, just like the 65C02—the load will have no effect on .B. Setting m to 0 will cause the 65C816 to gang .B to .A, creating a new 16-bit wide accumulator that is referred to as .C. In this case, LDA $1234 will cause the byte at $1234—the least significant byte or LSB—to be loaded into .A and the byte at $1235—the most significant byte or MSB—to be loaded into .B, resulting in a 16 bit load. An extra clock cycle is used to load the MSB, which is characteristic of all 16 bit loads and stores.

Similarly, when the accumulator is set to 16 bits, instructions that act directly on it, such as ASL or DEC, act on all 16 bits. For example, consider the following:

Code:

         rep #%00100000        ;16 bit accumulator & memory
         lda #%0000000010000000 ;$0080
         asl a                 ;left shift a bit
         sta $1234             ;save result

When the above is executed the value $00 will be stored at $1234 and $01 will be stored at $1235, as the ASL A instruction shifted all 16 bits.

Also, consider the following:

Code:

         rep #%00100000        ;16 bit accumulator & memory
         lda #$0000            ;16 bit load
         dec a                 ;decrement accumulator

When the above is executed .C will contain $FFFF.

You may well be wondering what happens if the m bit is changed to 1 after a 16 bit load. The accumulator reverts to being an 8 bit register, now referred to as .A, but .B is not affected and retains whatever was in the MSB of the value in .C. For example:

Code:

         rep #%00100000        ;16 bit accumulator & memory
         lda #$BBAA            ;load .C with $BBAA
         sep #%00100000        ;8 bit accumulator & memory

When the above is executed, .A will contain $AA and despite the accumulator having been changed to 8 bits, .B will contain $BB. Hence your program can perform 16 bit loads, but act on the data 8 bits at a time, as is often required during I/O operations.
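
The relationship between .A, .B and .C can be sketched in Python. The class below is a hypothetical model written for this illustration, not anything from an existing library; it covers only immediate loads.

```python
# Hypothetical model of the 65C816 accumulator: .C is .B (MSB) and .A (LSB).

class Accumulator:
    def __init__(self):
        self.c = 0x0000          # full 16 bit .C
        self.m = 1               # 1 = 8 bit accumulator, 0 = 16 bit

    @property
    def a(self):
        return self.c & 0x00FF

    @property
    def b(self):
        return self.c >> 8

    def lda(self, value):
        """Immediate load: 16 bits when m = 0, else .A only (.B untouched)."""
        if self.m == 0:
            self.c = value & 0xFFFF
        else:
            self.c = (self.c & 0xFF00) | (value & 0x00FF)

acc = Accumulator()
acc.m = 0                # rep #%00100000
acc.lda(0xBBAA)          # lda #$BBAA
acc.m = 1                # sep #%00100000
acc.lda(0x55)            # 8 bit load: only .A changes
print(f"A=${acc.a:02X} B=${acc.b:02X}")   # A=$55 B=$BB
```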

Even though .B is "hidden" when m is 1, it is still accessible by using the XBA (eXchange B with A) instruction. Viewed logically, XBA makes an internal copy of .B, writes .A into .B and then writes the internal copy of .B into .A. As with all other instructions that load the accumulator, XBA will affect the N and Z bits in the status register. Hence the value that was in .B when XBA is executed is what will set or clear N and Z.

XBA can be used whether the accumulator is 8 or 16 bits wide. In the latter case, XBA can be used to reverse the endianness of a word stored in memory, e.g.:

Code:

         rep #%00100000        ;16 bit accumulator & memory
         lda #$1234            ;random 16 bit number
         sta $5678             ;store it
         ...do other stuff
         lda $5678             ;load $1234 into .C
         xba                   ;swap bytes &...
         sta $5678             ;save

When the above is executed, $5678 will contain $12 and $5679 will contain $34.
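
A minimal Python sketch of XBA, including the rule that N and Z follow the byte that ends up in .A. The function name xba and its return shape are my own choices for the illustration.

```python
def xba(c):
    """Swap .B and .A; return (new .C, N flag, Z flag).

    N and Z follow the new .A, i.e. the byte that was in .B.
    """
    new_c = ((c << 8) | (c >> 8)) & 0xFFFF
    new_a = new_c & 0x00FF
    return new_c, bool(new_a & 0x80), new_a == 0

c, n, z = xba(0x1234)
print(f"${c:04X} N={n} Z={z}")        # $3412 N=False Z=False
low, high = c & 0xFF, c >> 8          # bytes stored at $5678 and $5679
print(f"${low:02X} ${high:02X}")      # $12 $34
```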

This 16 bit business also applies to read-modify-write (R-M-W) instructions, such as INC $1234, or ROL $89AB, when the m bit is 0. For example, consider this contrived code:

Code:

         rep #%00100000        ;16 bit accumulator & memory
         lda #$7fff
         sta $1234
         inc $1234

When the above is executed, $1234 will contain $00 and $1235 will contain $80, since the INCrement instruction acted on a word, not a byte. Needless to say, be careful if you use a R-M-W instruction on a chip register, as you may inadvertently change an adjacent register if m is 0.
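
The chip register hazard is easy to demonstrate with a small Python model of INC on memory. This is a sketch only: inc_memory is a hypothetical helper and the "chip registers" are just two dictionary entries.

```python
def inc_memory(mem, addr, m):
    """INC addr: 8 bit when m = 1, little-endian 16 bit when m = 0."""
    if m == 1:
        mem[addr] = (mem[addr] + 1) & 0xFF
    else:
        word = mem[addr] | (mem[addr + 1] << 8)
        word = (word + 1) & 0xFFFF
        mem[addr] = word & 0xFF
        mem[addr + 1] = word >> 8

mem = {0x1234: 0xFF, 0x1235: 0x7F}    # $7FFF stored little-endian
inc_memory(mem, 0x1234, m=0)
print(f"${mem[0x1234]:02X} ${mem[0x1235]:02X}")   # $00 $80

regs = {0x1234: 0xFF, 0x1235: 0x10}   # pretend these are two chip registers
inc_memory(regs, 0x1234, m=0)
print(f"${regs[0x1235]:02X}")         # $11: the neighbour was changed too
```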

Similarly, using BIT on memory when m is 0 causes bits 14 and 15 to correspond to the V and N status register bits, respectively, not bits 6 and 7, as would be the case when m is 1. Also, BIT immediate takes a 16 bit operand, e.g., BIT #%0011000000000000.
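
The flag mapping for BIT on a memory operand can be written out as a short Python sketch. The function bit is hypothetical and returns the N, V and Z flags as a tuple.

```python
def bit(acc, operand, m):
    """Flags from BIT on a memory operand, returned as (N, V, Z)."""
    width = 8 if m == 1 else 16
    mask = (1 << width) - 1
    n = bool(operand & (1 << (width - 1)))    # bit 7 or bit 15
    v = bool(operand & (1 << (width - 2)))    # bit 6 or bit 14
    z = (acc & operand & mask) == 0
    return n, v, z

print(bit(0x0000, 0x40, m=1))      # (False, True, True)
print(bit(0x0000, 0x0040, m=0))    # (False, False, True)
print(bit(0x4000, 0x4000, m=0))    # (False, True, False)
```

The middle call shows the trap: a flag byte that sets V in 8 bit mode no longer does so once m is 0.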

There are a few instructions that will ignore the m bit and generate a 16 bit transfer in all cases. They are:

TCD: copies .C to DP (the 16 bit direct page register).
TCS: copies .C to SP (the 16 bit stack pointer).
TDC: copies DP to .C.
TSC: copies SP to .C.

TDC and TSC can be sneaky, as they will overwrite .B even though m is 1. :shock:

Needless to say, you can easily introduce a bug if you forget about this behavior.

_________________
x86? We ain't got no x86. We don't NEED no stinking x86!

Last edited by BigDumbDinosaur on Tue Dec 04, 2018 6:25 pm, edited 3 times in total.

BigDumbDinosaur

Post subject: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Tue Feb 17, 2015 6:46 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA

granati wrote:

Code:

; JSL [$XXXX] simulation
PHK
PEA #RETADDR-1
JML [$XXXX]

RETADDR:
....

Of course, the subroutine must return with an RTL instruction.

For the benefit of those that are just learning the 65C816 assembly language, JML [$XXXX], which is also written as JMP [$XXXX] in assemblers that support both syntactical styles, is like JMP ($XXXX), except, as noted by granati, the location $XXXX contains a 24 bit address, not a 16 bit address. JML is a mnemonic for JuMp Long, meaning jump to code in a different bank than the one in which the instruction is located.

The PEA #RETADDR-1 instruction pushes the return address minus one to the stack, because RTL adds one to the address it pulls. Since the JML [$XXXX] instruction can take the '816 to another bank, the return bank also has to be pushed, which is what the PHK instruction does.

granati's trick also illustrates some of the 65C816's abilities in using the stack. Future tips and tricks will probably discuss the ease with which the '816 can perform stack acrobatics.

_________________
x86? We ain't got no x86. We don't NEED no stinking x86!

scotws

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 9:04 am

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany

Fantastic stuff, guys. Thank you!

theGSman

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 9:38 am

Joined: Mon Jan 26, 2015 6:19 am
Posts: 85

Having 16-bit X and Y registers makes for much more powerful addressing modes.

Assuming that the pointer at $64:$65 and the Y register hold the same address (and we are confining ourselves to a single 64K bank), we can replace code such as

Code:

LDY #5
LDA ($64),Y

with the much simpler

Code:

LDA 5,Y

Of course, the indirect-indexed addressing modes can still be used with a 16-bit Y register, which makes them very powerful indeed. Re-entrancy and position-independent code (PIC) are much easier to achieve with the 65C816.
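
The equivalence of the two forms can be checked with a little address arithmetic. The Python below is a sketch that assumes the direct page register is $0000 and uses a made-up pointer value of $2000; both function names are hypothetical.

```python
def ea_indirect_indexed(mem, dp_addr, y, dbr=0):
    """Effective address of LDA (dp),Y: pointer fetched from direct page."""
    pointer = mem[dp_addr] | (mem[dp_addr + 1] << 8)
    return ((dbr << 16) + pointer + y) & 0xFFFFFF

def ea_absolute_indexed(base, y, dbr=0):
    """Effective address of LDA base,Y with a 16 bit .Y."""
    return ((dbr << 16) + base + y) & 0xFFFFFF

mem = {0x64: 0x00, 0x65: 0x20}           # $64:$65 holds the pointer $2000
a = ea_indirect_indexed(mem, 0x64, 5)    # LDY #5 / LDA ($64),Y
b = ea_absolute_indexed(5, 0x2000)       # LDA 5,Y with .Y = $2000
print(f"${a:06X} ${b:06X}")              # $002005 $002005
```

In the second form the roles are swapped: the constant is the "base" and the register carries the pointer.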

barrym95838

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 10:03 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA

Nice point, theGSman. A 16-bit-indexed LDA 5,Y is one of the 68xx qualities that I find very useful and endearing, but the 65xx is still my first love, and I'm glad that the 65c816 makes it possible. I personally find the '816 mode bits to be rather irksome, but it's clear that it was the most compact path to backward compatibility at the machine-code level.

Mike B.

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 10:50 am

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

theGSman wrote:

Code:

LDY #5
LDA ($64),Y

with the much simpler

Code:

LDA 5,Y

Of course, the effective address need not be confined to the current data bank, because absolute indexed addressing can carry into the next bank without changing the data bank register (and the X register can be used too, of course). Only with direct page addressing is the effective address confined to bank 0.
The power of 16 bit indexed addressing can be exploited when manipulating parameters passed on the stack (or local variables on the stack): the stack frame pointer can be passed to nested subroutines for easy access to parameters/locals, as in high-level languages.

Code:

; assuming cpu is in 16 bit mode
sec
tsc
sbc #SIZE ; create a local variable of size SIZE bytes
tcs
tax
inx
; now X register point to beginning of local variable on the stack
; and this pointer can be passed to any soubroutine
; of course data bank register must point to bank 0 !
jsr somewhere
...
tsc ; restore stack pointer
clc
adc #SIZE ; clean locals
tcs
...
rts

somewhere:
....
lda !0,x ; access locals 
...
rts

Another powerful way to access parameters/locals on stack is the direct page addressing:

Code:

; assuming cpu is in 16 bit mode
sec
tsc
sbc #SIZE ; create a local variable of size SIZE bytes
tcs
inc a
phd ; save current direct page register
tcd  ; now direct page register point to base of locals
....
jsr somewhere
...
pld ; restore direct page register
...

somewhere:
; here not need that data bank register point to bank 0
lda <0 ; access locals with direct page addressing 
...
rts 

These methods are useful when stack relative addressing (i.e. LDA $XX,S) is not practical, for example when a local is a string or an array. Manipulating the stack in this way is very powerful for reentrancy.
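
The pointer arithmetic behind both methods is the same and can be traced in Python. This is only a sketch; SIZE and the starting stack pointer are example values I picked.

```python
SIZE = 16                      # bytes of local storage (example value)
sp = 0x01F0                    # stack pointer on entry (example value)

sp_locals = (sp - SIZE) & 0xFFFF      # sec / tsc / sbc #SIZE / tcs
base = (sp_locals + 1) & 0xFFFF       # tax / inx (or inc a before tcd)

# The stack pointer always names the next free byte, so the locals occupy
# base .. base+SIZE-1 and anything pushed later lands below them.
first, last = base, base + SIZE - 1
return_addr_bytes = (sp_locals, sp_locals - 1)   # written by jsr somewhere

print(f"locals ${first:04X}-${last:04X}")        # locals $01E1-$01F0
sp_restored = (sp_locals + SIZE) & 0xFFFF        # tsc / clc / adc #SIZE / tcs
print(f"${sp_restored:04X}")                     # $01F0
```

The +1 is what makes the pointer usable as an ordinary base address: without it, offset 0 would name the free byte at the stack pointer rather than the first local.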

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 11:23 am

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

Passing parameters on the stack, as in C-style functions, is easy:

Code:

; assuming CPU is in 16 bit mode
pea #SOMEWHAT ; parameter N
pei ($XX) ; parameter N-1
...
...
pha ; parameter 1
; SIZE = number of bytes pushed on stack
jsr somewhere
tsc
clc
adc #SIZE ; clean stack
tcs
...

somewhere:
; at stack offset +1 and +2 we have the return address of calling routine
; so parameters begin at stack offset +3
lda 3,s ; access parameter 1

If the subroutine is located in another bank and is called with a JSL instruction, parameters begin at stack offset +4.
Cleaning up the stack in the called subroutine is more complex.
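
The offset rule generalises to "one, plus the size of the return address, plus whatever the subroutine itself has pushed". A hypothetical Python helper makes that explicit; the function name and its parameters are mine.

```python
def first_param_offset(call, saved_bytes=0):
    """Stack relative offset of the last-pushed parameter in the callee.

    call        -- "jsr" (2 byte return address) or "jsl" (3 bytes)
    saved_bytes -- bytes the callee pushed itself (php, phb, ...)
    """
    return_size = {"jsr": 2, "jsl": 3}[call]
    return 1 + return_size + saved_bytes

print(first_param_offset("jsr"))        # 3
print(first_param_offset("jsl"))        # 4
print(first_param_offset("jsl", 2))     # 6, after php and phb
```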

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 1:39 pm

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

A possible way to clean up the stack in the called subroutine:

Code:

; assuming the calling routine pushes M bytes on the stack before
; making a JSL (inter-bank) call to the subroutine
; the called subroutine must save the P and DBR registers on the stack
; 
; stack frame (S is the stack pointer register)
;   ---------
;   |  PRM  |   S+06+M-1
;   ---------
;       ...
;   ---------
;   |  PRM  |   S+06
;   ---------
;   |  PBR  |   S+05
;   ---------
;   |  PCH  |   S+04
;   ---------   
;   |  PCL  |   S+03
;   ---------
;   |   P   |   S+02
;   ---------
;   |  DBR  |   S+01
;   ---------
;
; PBR is the program bank register saved by the JSL instruction
; PCL & PCH are the return address
; PRM is the beginning of the parameters
; parameters can be accessed at stack offset +06

subroutine:
   php   ; need to save the 8/16 bit status of registers
   phb   ; need to save current data bank register
   ...
   lda   $06,s   
   ...
; for stack cleanup we move the K=5 bytes at S+01..S+05 up by M bytes,
; so that the last one (PBR) lands at S+06+M-1
   rep   #$31          ; A,X,Y 16 bit + clear carry
   tsc         ; C = stack pointer
   adc   #K           ; 5 in this case
   tax         ; source pointer for data move  
   adc   #M      ; add params bytes count
   tay         ; dest pointer for data move
   lda   #K-1      ; move bytes count-1
   mvp   #0, #0   ; move previous
   tya         ; new stack pointer
   tcs
   plb
   plp
   rtl

We need to take into account the count M of bytes pushed on the stack as parameters, and the count K of bytes pushed on the stack after the parameters. This epilogue code is a little complex but works fine. The MVP instruction leaves the data bank register modified (=$00), so the old DBR needs to be saved.
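
To convince yourself that the epilogue leaves the stack in the right state, the move can be simulated in Python. This is a sketch, not a cycle-accurate model: the mvp function below only handles bank 0, the stack pointer and M are example values, and names stand in for the byte values on the stack.

```python
def mvp(mem, src_end, dst_end, count_minus_1):
    """MVP within bank 0: copy from the end address downwards."""
    x, y, c = src_end, dst_end, count_minus_1
    while True:
        mem[y] = mem[x]
        x, y, c = (x - 1) & 0xFFFF, (y - 1) & 0xFFFF, (c - 1) & 0xFFFF
        if c == 0xFFFF:
            return x, y

K, M = 5, 4                       # bytes pushed after the parameters, parameter bytes
sp = 0x01E0                       # stack pointer inside the subroutine
frame = ["DBR", "P", "PCL", "PCH", "PBR", "prm", "prm", "prm", "prm"]
mem = {sp + 1 + i: name for i, name in enumerate(frame)}

src = sp + K                      # tsc / adc #K / tax
dst = src + M                     # adc #M / tay
_, new_sp = mvp(mem, src, dst, K - 1)     # lda #K-1 / mvp #0,#0 / tya / tcs

pulled = [mem[new_sp + 1 + i] for i in range(K)]
print(pulled)                     # ['DBR', 'P', 'PCL', 'PCH', 'PBR']
print(f"${new_sp + K:04X}")       # $01E9: SP after plb, plp and rtl
```

Copying from the top down is what makes the overlapping move safe here, since the destination sits above the source.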

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 6:26 pm

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

As an example of applying the above tips, here is a possible implementation of the function strlen() for a high-level language.
The LONGA and LONGI directives are needed so that the assembler assembles 16 bit immediate constants.

Code:

; implementation of the C-like function strlen(char *ptr)
; the function returns in the C register the length of the
; zero-terminated string pointed to by ptr (24 bit long pointer)
; the function saves the Y register (16 bit) - X is not used
; the accumulator C is also pushed, to hold the result
; the long pointer ptr is passed on the stack
;
; stack frame at beginning of function
;   ---------
;   | ptr+2 |   S+06
;   ---------
;   | ptr+1 |   S+05
;   ---------
;   |  ptr  |   S+04
;   ---------
;   |  PBR  |   S+03
;   ---------
;   |  PCH  |   S+02
;   ---------   
;   |  PCL  |   S+01
;   ---------

strlen:
   php      ; save status for registers size
   phb      ; save current data bank register
   rep   #$30   ; A,X,Y -> 16 bit
   pha      ; reserve space to hold the result
   phy      ; save Y (16 bit)
   sep   #$20   ; A -> 8 bit, X,Y -> 16 bit
   
; stack frame after pushing P, DBR, and A,Y register (16 bit)
;   ---------
;   | ptr+2 |   S+0C
;   ---------
;   | ptr+1 |   S+0B
;   ---------
;   |  ptr  |   S+0A
;   ---------
;   |  PBR  |   S+09
;   ---------
;   |  PCH  |   S+08
;   ---------   
;   |  PCL  |   S+07
;   ---------
;   |   P   |   S+06
;   ---------
;   |  DBR  |   S+05
;   ---------
;   |   B   |   S+04
;   ---------
;   |   A   |   S+03
;   ---------
;   |  YH   |   S+02
;   ---------
;   |  YL   |   S+01
;   ---------
; 
; note that M = 3 (count of parameters bytes) and N = 9 
; N is the count of total bytes pushed on stack after parameters 
; equates for access stack parameter:

M   .SET   3   ; parameters bytes count
N   .SET   9   ; bytes pushed on stack after parameters
ptr   .SET   $0A   ; is the offset of long pointer
creg   .SET   $03   ; for store result

; now we can start
   .LONGI
   lda   ptr+2,s   ; the bank that holds the string
   pha
   plb      ; set the right data bank register
   ldy   #0   ; string index
_loop:   lda   (ptr,s),y   ; access to byte of string
   beq   _end   ; end of string
   iny
   bne   _loop
   dey      ; max. length = $FFFF
_end:   rep   #$31   ; A,X,Y 16 bit - clear carry
   tya
   sta   creg,s  ; save result
   
; epilogue code
   .LONGA
   tsc
   adc   #N
   tax      ; source address of move
   adc   #M
   tay      ; destination address of move
   lda   #N-1   ; count of bytes to move - 1
   mvp   #0, #0   ; move previous
   tya      ; the new stack pointer
   tcs
   ply      ; restore registers and status
   pla      ; this contains the result
   plb
   plp
   rtl
   

; to call the function (assuming the CPU is in 8 bit mode)
; the starting long address of the string is bank:address
   lda   #bank
   pha
   pea   #address
   jsl   strlen
   ...
   
; now the C register contains the length of the string
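
The counting loop, including the cap at $FFFF, can be mirrored in Python as a cross-check. This is a sketch: strlen_816 is a hypothetical name, memory is a plain dictionary keyed by 24 bit address, and unmapped addresses are assumed to read as $FF.

```python
def strlen_816(memory, bank, address):
    """Model of the strlen loop: .Y counts until a zero byte or $FFFF."""
    y = 0
    while True:
        ea = ((bank << 16) + address + y) & 0xFFFFFF   # lda (ptr,s),y
        if memory.get(ea, 0xFF) == 0:                  # beq _end
            return y
        y = (y + 1) & 0xFFFF                           # iny
        if y == 0:                                     # bne _loop fell through
            return 0xFFFF                              # dey

text = b"6502.org\x00"
memory = {0x012000 + i: byte for i, byte in enumerate(text)}
print(strlen_816(memory, 0x01, 0x2000))     # 8
print(hex(strlen_816({}, 0x00, 0x0000)))    # 0xffff
```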

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family

BigDumbDinosaur

Post subject: 65C816 PROGRAMMING TIPS & TRICKS: Stack Tricks

Posted: Wed Feb 18, 2015 6:42 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA

granati's posts about using the 65C816 hardware stack as a "scratchpad" are somewhat more advanced than what has been previously written. However, they help to illustrate just how much of a step up the 65C816 is over the 65C02.

I don't want to get too far ahead of the path I have been following in this topic, but I will post an excerpt from one of the functions in my string processing library to show you how relatively complex stack management is possible with the '816 without having to wear down the end of your fingers whilst typing. This excerpt includes stack frame definitions, preamble code required to create stack workspace, and postamble code to clean up the stack when processing has been completed. Much of it is automatic.

Code:

;================================================================================
;
;strsub: COPY SUBSTRING INTO STRING: strsub STRING1,STRING2,I,N
;
;   ————————————————————————————————————————————————————————————————————————
;   This function copies N characters from STRING1 to STRING2, starting at
;   index I & overwriting STRING2.
;   ————————————————————————————————————————————————————————————————————————

   ...some text edited out...

;   Calling syntax: PER NPTR          ;N's pointer
;                   PER IPTR          ;I's pointer
;                   PER S2PTR         ;STRING2's pointer
;                   PER S1PTR         ;STRING1's pointer
;                   JSR STRSUB        ;execute function

   ...some text edited out...

;—————————————————————————————————————————————————————————
;EPHEMERAL DEFINITIONS
;
.s_byte  =1                    ;size of a byte
.s_word  =2                    ;size of a word
;
;
;   65C816 register sizes...
;
.s_mpudb =.s_byte              ;data bank
.s_mpudp =.s_word              ;direct page
.s_mpupb =.s_byte              ;program bank
.s_mpupc =.s_word              ;program counter
.s_mpusp =.s_word              ;stack pointer
.s_mpusr =.s_byte              ;status
;
;
;   status register bits...
;
.sr_car  =@00000001            ;C — carry
.sr_bdm  =@00001000            ;D — decimal
.sr_irq  =@00000100            ;I — IRQ
.sr_neg  =@10000000            ;N — result negative
.sr_ovl  =@01000000            ;V — sign overflow
.sr_zer  =@00000010            ;Z — result zero
.sr_amw  =@00100000            ;m — accumulator/memory size
.sr_ixw  =@00010000            ;x — index sizes
;
;
;   stack definitions...
;
.sfbase  .= 1                  ;base stack index
.sfidx   .= .sfbase            ;workspace index
;
;—————————> workspace stack frame start <—————————
.nchar   =.sfidx               ;chars to shift
.sfidx   .= .sfidx+.s_word
.s1len   =.sfidx               ;STRING1's length
.sfidx   .= .sfidx+.s_word
;—————————> workspace stack frame end <—————————
;
.s_wsf   =.sfidx-.sfbase       ;stack workspace size
.w_wsf   =.s_wsf/.s_word       ;stack workspace words
.sfbase  .= .sfidx
;
;—————————> register stack frame start <—————————
.reg_c   =.sfidx               ;.C
.sfidx   .= .sfidx+.s_word
.reg_x   =.sfidx               ;.X
.sfidx   .= .sfidx+.s_word
.reg_y   =.sfidx               ;.Y
.sfidx   .= .sfidx+.s_word
.reg_db  =.sfidx               ;DB
.sfidx   .= .sfidx+.s_mpudb
.reg_sr  =.sfidx               ;SR
.sfidx   .= .sfidx+.s_mpusr
.reg_pc  =.sfidx               ;PC
.sfidx   .= .sfidx+.s_mpupc
;—————————> register stack frame end <—————————
;
.s_rsf   =.sfidx-.sfbase       ;register stack frame size
.sfbase  .= .sfidx
;
;—————————> parameter stack frame start <—————————
.s1ptr    =.sfidx              ;STRING1's pointer
.sfidx   .= .sfidx+.s_word
.s2ptr    =.sfidx              ;STRING2's pointer
.sfidx   .= .sfidx+.s_word
.iptr    =.sfidx               ;I's pointer
.sfidx   .= .sfidx+.s_word
.nptr    =.sfidx               ;N's pointer
.sfidx   .= .sfidx+.s_word
;—————————> parameter stack frame end <—————————
;
.s_psf   =.sfidx-.sfbase       ;parameter stack frame size
;
;
;   error flags...
;
.er_bol  =.sr_zer              ;bank span
.er_idx  =.sr_ovl              ;index range
.er_stl  =.sr_neg              ;string length
.er_bits =.er_bol|.er_idx|.er_stl ;mask
;—————————————————————————————————————————————————————————

The above excerpt shows how local constants are created and the stack frames are defined. As each function in the string library is capable of being used without reference to any other string library function, all definitions have to be self-contained. Incidentally, the .= operator in the Kowalski simulator's assembler defines a symbol whose value can be changed with a subsequent .= assignment, this being a very useful feature that all assemblers should have.
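
The running-index idea behind .sfidx can be re-created in a few lines of Python, which is a handy way to check the resulting offsets. This is a sketch: the layout function is hypothetical, and the field lists are copied from the definitions above with the leading dots dropped.

```python
# Hypothetical re-creation of the running-index idea behind .sfidx.

def layout(start, fields):
    """Assign consecutive stack offsets; return (offsets, next index)."""
    offsets, idx = {}, start
    for name, size in fields:
        offsets[name] = idx
        idx += size
    return offsets, idx

BYTE, WORD = 1, 2
work, idx = layout(1, [("nchar", WORD), ("s1len", WORD)])
s_wsf = idx - 1
regs, idx2 = layout(idx, [("reg_c", WORD), ("reg_x", WORD), ("reg_y", WORD),
                          ("reg_db", BYTE), ("reg_sr", BYTE), ("reg_pc", WORD)])
s_rsf = idx2 - idx
params, idx3 = layout(idx2, [("s1ptr", WORD), ("s2ptr", WORD),
                             ("iptr", WORD), ("nptr", WORD)])
s_psf = idx3 - idx2
print(s_wsf, s_rsf, s_psf)                  # 4 10 8
print(regs["reg_pc"], params["s1ptr"])      # 13 15
```

Because every offset is derived from the one before it, adding or removing a workspace variable shifts all later offsets automatically.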

The next excerpt is the code that defines the local workspace on the stack:

Code:

         rep #.er_bits|.sr_car ;initialize MPU status &...
         php                   ;save machine state
         longr                 ;16 bit registers
         phb
         phy
         phx
         pha
         .if .def(.s_wsf)      ;if workspace is defined...
             sec
             tsc               ;get current stack pointer, ...
             sbcw .s_wsf       ;create workspace &...
             tcs               ;set new stack pointer
         .endif

The variable .s_wsf was set in the stack frame definitions and since this function uses local workspace, .s_wsf will be non-zero, causing the code bounded by .if and .endif to be assembled. The instruction sbcw .s_wsf uses a macro (sbcw) to synthesize 16 bit immediate mode subtraction—the Kowalski assembler doesn't know anything about the 65C816.

Following the above, the body of the function does its work. When ready to exit, the following code is executed to clean up the stack and restore state prior to returning. The stack cleanup removes the local workspace and then realigns the stack to discard the call parameters stack frame that was created before invoking the function.

Code:

.done    longr                 ;common exit point
         .if .s_wsf || .s_psf  ;clean up stack as necessary
             clc
             tsc
             .if .s_wsf        ;if workspace was defined...
                 adcw .s_wsf   ;discard it by...
                 tcs           ;adjusting stack pointer
             .endif
             .if .s_psf        ;if a call parameter frame...
                 adcw .s_rsf   ;was defined...
                 tax
                 adcw .s_psf   ;remove it &...
                 tay
                 ldaw .s_rsf-1 ;replace it with...
                 mvp 0,0       ;the register frame &...
                 tya           ;
                 tcs           ;return address
             .endif
         .endif
         pla                   ;restore MPU state &...
         plx
         ply
         plb
         plp
         rts                   ;return to caller

The instructions adcw and ldaw again are macros that synthesize 16 bit immediate mode, due to the assembler not being 16 bit capable.

In the above code, the stack pointer is adjusted upward to discard the local workspace and then the MVP (block copy positive) instruction, which is unique to the '816, is used to copy the register stack frame up by the number of bytes that were pushed prior to calling the function. This sequence discards the user parameter stack frame and correctly positions the register frame so that when the registers are pulled from the stack, the return address will be the last thing on the stack and a normal return will occur.

As I frequently employ this structure in code that I write, I have a function skeleton that I use to avoid all that typing. Here it is:

Code:

;===============================================================================
;
;<funcname>: <SUBROUTINE TITLE>
;
;   ————————————————————————————————————————————————————————————————————————
;   Calling syntax: 
;
;   Exit registers: .A: 
;                   .B: 
;                   .X: 
;                   .Y: 
;                   DB: 
;                   DP: 
;                   PB: 
;                   SR: NVmxDIZC
;                       ||||||||
;                       |||||||+———> 
;                       ||||||+————> 
;                       |||||+—————> 
;                       ||||+——————> 
;                       |||+———————> 
;                       ||+————————> 
;                       |+—————————> 
;                       +——————————> 
;
;   Notes: 
;
;   Examples: 
;   ————————————————————————————————————————————————————————————————————————
;
funcname ;*** this line intentionally has no code ***;
;
;—————————————————————————————————————————————————————————
;LOCAL DECLARATIONS
;
.s_byte  =1                    ;size of a byte
.s_word  =2                    ;size of a word
.s_dword =4                    ;size of a double word
;
;
;   65C816 register sizes...
;
.s_mpudb =.s_byte              ;data bank
.s_mpudp =.s_word              ;direct page
.s_mpupb =.s_byte              ;program bank
.s_mpupc =.s_word              ;program counter
.s_mpusp =.s_word              ;stack pointer
.s_mpusr =.s_byte              ;status
;
;
;   status register bits...
;
.sr_car  =@00000001            ;C — carry
.sr_zer  =@00000010            ;Z — result zero
.sr_irq  =@00000100            ;I — IRQ
.sr_bdm  =@00001000            ;D — decimal
.sr_amw  =@00100000            ;m — accumulator/memory size
.sr_ixw  =@00010000            ;x — index sizes
.sr_ovl  =@01000000            ;V — sign overflow
.sr_neg  =@10000000            ;N — result negative
;
;
;   stack definitions...
;
.sfbase  .= 1                  ;base stack index
.sfidx   .= .sfbase            ;workspace index
;
;—————————> workspace stack frame start <—————————
;
;   * * * enter workspace definitions here * * *
;
;——————————> workspace stack frame end <——————————
;
.s_wsf   =.sfidx-.sfbase       ;workspace stack frame size
.sfbase  .= .sfidx
;
;———————> MPU register stack frame start <————————
.reg_c   =.sfidx               ;.C
.sfidx   .= .sfidx+.s_word
.reg_x   =.sfidx               ;.X
.sfidx   .= .sfidx+.s_word
.reg_y   =.sfidx               ;.Y
.sfidx   .= .sfidx+.s_word
.reg_db  =.sfidx               ;DB
.sfidx   .= .sfidx+.s_mpudb
.reg_dp  =.sfidx               ;DP
.sfidx   .= .sfidx+.s_mpudp
.reg_sr  =.sfidx               ;SR
.sfidx   .= .sfidx+.s_mpusr
.reg_pc  =.sfidx               ;PC
.sfidx   .= .sfidx+.s_mpupc
;————————> MPU register stack frame end <—————————
;
.s_rsf   =.sfidx-.sfbase       ;register stack frame size
.sfbase  .= .sfidx
;
;—————————> parameter stack frame start <—————————
;
;* * * enter call parameter definitions here * * *
;
;——————————> parameter stack frame end <——————————
;
.s_psf   =.sfidx-.sfbase       ;parameter stack frame size
;—————————————————————————————————————————————————————————
;
         php                   ;save MPU state
         phd                   ;DP
         phb                   ;DB
         longr
         phy
         phx
         pha
         cld                   ;ensure binary mode
         .if .s_wsf            ;create workspace if defined
             sec
             tsc
             sbcw .s_wsf
             tcs
         .endif
;
.main    ;*** code body goes here ***;
;
.done    longr                 ;common exit point
         .if .s_wsf || .s_psf  ;clean up stack as necessary
             cld               ;ensure binary mode
             clc
             tsc
             .if .s_wsf        ;if workspace was defined...
                 adcw .s_wsf   ;discard it by...
                 tcs           ;adjusting stack pointer
             .endif
             .if .s_psf        ;if a call parameter frame...
                 adcw .s_rsf   ;was defined...
                 tax
                 adcw .s_psf   ;remove it &...
                 tay
                 ldaw .s_rsf-1 ;replace it with...
                 mvp 0,0       ;the register frame &...
                 tya           ;
                 tcs           ;return address
             .endif
         .endif
         pla                   ;restore MPU state
         plx
         ply
         plb
         pld
         plp
         rts                   ;return to caller
;
   .end
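
As a hedged illustration of filling in the template (this is not part of the original listing), suppose a function takes one 32-bit pointer as its only parameter and needs one 16-bit work variable. The names .count and .strptr are hypothetical, and the stack-relative operand form (,s) is standard WDC notation that is assumed here to be accepted by the assembler in use. LDAW appears to be the same 16-bit immediate load macro used at .done. The definitions follow the same pattern as the register frame:

Code:

;   between the workspace stack frame markers...
;
.count   =.sfidx               ;16-bit counter (hypothetical name)
.sfidx   .= .sfidx+.s_word
;
;   between the parameter stack frame markers...
;
.strptr  =.sfidx               ;32-bit string pointer (hypothetical name)
.sfidx   .= .sfidx+.s_dword
;
;   resulting values: .count = 1, .s_wsf = 2, .reg_c = 3, .reg_pc = 13,
;   .s_rsf = 12, .strptr = 15, .s_psf = 4
;
;   in the code body, with registers at 16 bits (LONGR)...
;
.main    ldaw 0
         sta .count,s          ;clear the counter in the workspace
         lda .strptr,s         ;LSW of pointer (16-bit address)

With these definitions both .s_wsf and .s_psf are non-zero, so the conditional code at .done discards the workspace and the caller's parameter frame before returning. The offsets are only valid while the stack pointer stays where the entry code left it, so anything pushed in the body has to be pulled again before a stack-relative access.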

_________________
x86? We ain't got no x86. We don't NEED no stinking x86!

BigDumbDinosaur

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 6:55 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA

granati wrote:

; to call the function (assuming the CPU is in 8-bit mode)
; the starting long address of the string is bank:address

Code:

   lda   #bank
   pha
   pea   #address
   jsl   strlen
   ...

; now the C register contains the length of the string

I would push the target data bank with PEA and just fetch the LSB from the stack to load DB. PEA #BANK is no faster than LDA #BANK followed by PHA, but has the advantage of not "touching" the accumulator or requiring the m bit to be set if the accumulator is currently 16 bits wide. Also, pushing all parameters as words instead of bytes tends to simplify stack management within the function.
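
A rough sketch of that approach, reusing the bank, address and strlen names from the quoted example. The offset .bank is a hypothetical stack-relative definition for the bank word inside the function, the ,s operand form is standard WDC notation, and the function is assumed to have saved the caller's DB on entry. Some assemblers expect the PEA operand without the #.

Code:

;   caller: both parameters pushed as words, accumulator untouched
;
         pea #bank             ;bank in LSB, $00 in MSB
         pea #address          ;16-bit address
         jsl strlen
;
;   inside the function: load DB from the LSB of the bank word
;
         sep #%00100000        ;8-bit accumulator
         lda .bank,s           ;LSB of bank word (hypothetical offset)
         pha
         plb                   ;DB = bank holding the string

Since the address word is pushed last, it lands just below the bank word, so the two together read as a little-endian 32-bit pointer on the stack.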

I have a pseudo-instruction called PEL that pushes a double word (DWORD) to take care of this sort of thing.

Code:

;   PEL Pseudo-Instruction
;   ————————————————————————————————————————————————————————————————————
;   PEL is an analog of PEA that pushes a  32-bit  little-endian operand
;   to the stack.  The operand is always resolved to 32 bits, regardless
;   of actual value.
;   ————————————————————————————————————————————————————————————————————
;
pel      .macro .op            ;PEL <operand>
         .byte $f4
         .word .op >> 16       ;MSW
         .byte $f4
         .word .op & $ffff     ;LSW
         .endm

PEL $12789A would take care of pushing the string address with bank in a single pseudo-instruction, which assembles to two PEA instructions.
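
To make the expansion concrete, here is a sketch of what the macro generates for that operand and how the stack should look afterward. S stands for the stack pointer after both pushes. The layout follows from PEA pushing the high byte of its operand first.

Code:

         pel $12789a           ;assembles to the equivalent of...
;
         pea #$0012            ;MSW: $f4 $12 $00
         pea #$789a            ;LSW: $f4 $9a $78
;
;   stack after the pushes...
;
;       S+1  $9a   address LSB
;       S+2  $78   address MSB
;       S+3  $12   bank
;       S+4  $00   padding (MSB of the 32-bit operand)

The four bytes sit in little-endian order, so the function can treat them as a single 32-bit pointer.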

Last edited by BigDumbDinosaur on Mon Nov 20, 2023 11:26 pm, edited 1 time in total.

granati

Post subject: Re: 65C816 PROGRAMMING TIPS & TRICKS

Posted: Wed Feb 18, 2015 7:58 pm

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy

BigDumbDinosaur

Of course, I posted just a few simple examples for educational use, aimed at beginners, to show the power of the 65816 in stack management and direct page addressing. Obviously, the best way is to use your set of macros.
As for whether it is better to always push 16-bit values, I think so, even though the 65816 doesn't suffer from word alignment problems. For example, the PEI instruction is also useful for pushing 16-bit variables, especially when the parameters are not constants. Still, variable parameters are sometimes pushed from registers (for example, local variables that live on the stack). In any case, the fact that the stack can also be addressed with direct page addressing makes the PEI instruction very powerful.
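
A small sketch of both points, under stated assumptions: strptr is a hypothetical 32-bit pointer kept in direct page with its most significant byte zero, local is a hypothetical stack offset, and strlen is the function from the earlier example. PEI always pushes 16 bits and leaves the registers alone, whatever the m and x bits are set to.

Code:

;   pushing a 32-bit pointer variable held in direct page
;
strptr   =$10                  ;hypothetical direct page location
;
         pei (strptr+2)        ;MSW: bank in LSB, $00 in MSB
         pei (strptr)          ;LSW: 16-bit address
         jsl strlen
;
;   pushing a word that lives on the stack, using DP = SP
;
local    =3                    ;hypothetical: word just above the saved DP
;
         phd                   ;save caller's DP (occupies SP+1 & SP+2)
         tsc
         tcd                   ;DP = SP, so direct page offset n maps to SP+n
         pei (local)           ;push the word at SP+3 as a parameter

In the second fragment DP stays fixed while SP moves with each push, so the direct page offsets remain valid across several PEIs. The caller's DP has to be restored with PLD once the pushed parameter has been removed.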

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family
