This thread is to document the conversion of that C code to a generic 65c02 build, so the details don't get lost. I'll be posting snippets of code and eventually a complete listing here; please feel free to jump in if I'm doing anything silly or looking lost. I believe that once I have a basic structure it should drop together pretty easily.
The code won't be the fastest, but I'm aiming for simplicity; it can be optimised later.
The assembler I am using is as65 from http://www.kingswood-consulting.co.uk/assemblers/ and I invoke it as 'as65 tiny.asm -h0 -ltiny.lst -s2 -x' to produce a listing, which I find handy for checking what code is actually being generated. as65 has an automatic internal optimiser which, for example, will replace 'jmp' with 'bra' if the target is in range, and does a few other things.
API
Because I want to copy the C program, I feel it will be easiest to use a C-style API for calling functions. It makes heavy use of the hardware stack, but I don't think too heavy, so I don't have a separate stack to manage (at the moment; that's always subject to change).
- Functions can take one or more eight or sixteen bit arguments, and return zero or one eight or sixteen bit results in A or Y:A.
- Function arguments are counted from the left:
Code:
result = divide (first, second) ; divide first by second
- Sixteen bit arguments are passed in Y:A and on the stack if necessary; if there is more than one argument they are pushed on the stack from the left, with the final argument being placed in Y:A.
- All functions must preserve the X register; it is used as a stack frame pointer for the calling routine. Other registers may be thumped as required.
- Local variables are pushed as required to the stack, generally as sixteen bit words.
- The X register is used as a frame pointer; constant offsets are provided for convenient reference
Code:
c16c : libadd16: ; add TOS to y:a
c16c : da phx ; save caller's stack frame
c16d : ba tsx ; set local frame pointer
c16e : 8556 sta maths1 ; we hold the high byte in y
c170 : 18 clc
c171 : 7d0401 adc CALLER_LO,x
c174 : 8556 sta maths1
c176 : 98 tya
c177 : 7d0501 adc CALLER_HI,x
c17a : a8 tay ; high result back in y
c17b : a556 lda maths1 ; collect low result
c17d : fa plx ; clean the stack
c17e : 60 rts
c17f : libsub16: ; subtract y:a from TOS
c17f : da phx
c180 : ba tsx
c181 : 5a phy
c182 : 48 pha ; build stack frame
c183 : 38 sec
c184 : bd0401 lda CALLER_LO,x ; we need to subtract LOCAL from CAL
c187 : f5ff sbc LOCAL_LO,x
c189 : 95ff sta LOCAL_LO,x
c18b : bd0501 lda CALLER_HI,x
c18e : fd0001 sbc LOCAL_HI,x
c191 : 9d0001 sta LOCAL_HI,x
c194 : 68 pla ; load y:a and restore stack
c195 : 7a ply
c196 : fa plx
c197 : 60 rts
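As a cross-check on what the two routines are meant to compute, here is a minimal Python sketch of the same byte-at-a-time arithmetic. The names add16 and sub16 are mine, not part of the listing, and the sketch assumes binary (not decimal) mode and inputs in the range 0..0xFFFF.

```python
def add16(stacked, ya):
    """Add two 16-bit values one byte at a time, low byte first."""
    lo = (ya & 0xFF) + (stacked & 0xFF)        # clc / adc CALLER_LO,x
    carry = lo >> 8                            # carry out of the low byte
    hi = (ya >> 8) + (stacked >> 8) + carry    # tya / adc CALLER_HI,x
    return ((hi & 0xFF) << 8) | (lo & 0xFF)    # Y:A, wrapping at 16 bits


def sub16(stacked, ya):
    """Subtract ya from stacked one byte at a time, low byte first."""
    # SBC computes A + (M ^ 0xFF) + C, so sec means "no borrow yet".
    lo = (stacked & 0xFF) + ((ya & 0xFF) ^ 0xFF) + 1
    carry = lo >> 8                            # carry clear means borrow
    hi = (stacked >> 8) + ((ya >> 8) ^ 0xFF) + carry
    return ((hi & 0xFF) << 8) | (lo & 0xFF)
```

For example, add16(0x12FF, 0x0001) carries from the low byte into the high byte, and sub16(0x0000, 0x0001) wraps round to 0xFFFF.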
Code:
second argument (2)
return address (2)
x register (1)
local argument (2)
stack pointer here --->
Code:
00ff = LOCAL_LO equ 0xff
0100 = LOCAL_HI equ 0x100
0104 = CALLER_LO equ 0x104
0105 = CALLER_HI equ 0x105
The constants LOCAL_LO, LOCAL_HI, CALLER_LO and CALLER_HI provide an X offset to the argument. Note that the function can see (and directly alter) variables in the caller's stack; I strongly caution against allowing it to do so! Further local variables can be accessed at 0xfe,x, 0xfd,x and so on. I wish WDC had included a stack-pointer-relative addressing mode!
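To check the offsets, here is a rough Python model of the stack on entry to libsub16. It is a sketch only: the Stack class and the helper names are made up for illustration, the return address and saved X are placeholder values, and it assumes the caller pushes the high byte of the stacked argument first (which is what CALLER_LO sitting below CALLER_HI implies). All reads use a full 16-bit address sum, i.e. absolute,X behaviour.

```python
CALLER_LO, CALLER_HI = 0x104, 0x105
LOCAL_LO, LOCAL_HI = 0xFF, 0x100


class Stack:
    """Page-one hardware stack: store at 0x100+S, then decrement S."""

    def __init__(self, s=0xFF):
        self.mem = bytearray(0x10000)
        self.s = s

    def push(self, value):
        self.mem[0x100 + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF


def enter_libsub16(stacked, ya, s=0xFF):
    """Return (memory, X) as they would stand after phx/tsx/phy/pha."""
    st = Stack(s)
    st.push(stacked >> 8)  # caller pushes argument high byte (assumed order)
    st.push(stacked)       # caller pushes argument low byte
    st.push(0xC0)          # jsr: return address high (placeholder)
    st.push(0x02)          # jsr: return address low (placeholder)
    st.push(0x00)          # phx: caller's frame pointer (placeholder)
    x = st.s               # tsx
    st.push(ya >> 8)       # phy
    st.push(ya)            # pha
    return st.mem, x


def read16(mem, lo_offset, hi_offset, x):
    """Read a 16-bit value using absolute,X style addressing."""
    return mem[lo_offset + x] | (mem[hi_offset + x] << 8)
```

Starting from S = 0xff, the model gives X = 0xfa, with CALLER_LO,x landing on 0x01fe and LOCAL_LO,x on 0x01f9, provided the sum is not truncated to eight bits.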
Which brings me to the first question: does a zero page X-indexed access wrap around within page zero, or does it go where I expect it to? I note that the optimiser has substituted a zero page instruction in the examples above...
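For reference, the two candidate behaviours can be written down as below. The function names are mine. The standard 6502 and 65C02 documentation describes zero page,X as discarding the carry out of the eight-bit sum, so the sketch assumes zp_x is what the hardware does for the f5/95 opcodes in the listing.

```python
def zp_x(base, x):
    """Zero page,X: the base+X sum is truncated to 8 bits (stays in page zero)."""
    return (base + x) & 0xFF


def abs_x(base, x):
    """Absolute,X: the base+X sum is a full 16-bit address."""
    return (base + x) & 0xFFFF
```

With X = 0xfa, zp_x(0xFF, 0xFA) gives 0x00f9 while abs_x(0x00FF, 0xFA) gives 0x01f9. If that assumption holds, 'sbc LOCAL_LO,x' assembled as f5 ff reads page zero rather than the stack, and it would need the absolute,X encoding (fd ff 00) to reach the local variable.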
Neil