I think "best practices" can be fairly broad with assembly language. Because the 65(C)02 has some "pre-defined" memory areas (Page Zero, the Stack, and the NMI/Reset/IRQ(BRK) vectors), there are certain constraints imposed that you need to plan around.
For any assembly code I do for the 65C02, I start by defining a memory map for the space I (think I) need (RAM and ROM), loosely define page zero usage and determine how much stack space the code might require. Any I/O devices also need to be accounted for: register space, buffers, etc.
As for coding it all up, I tend to have most constants, variables, buffers, I/O devices, etc. defined in a separate file that is included in the actual assembly. I also comment what those defines are. Here's a copy of the page zero locations used for my C02 BIOS (PGZERO_ST is $A0, as defined in the Monitor listing further down):
Code:
; - BIOS variables, pointers, flags located at top of Page Zero
BIOS_PG0 .EQU PGZERO_ST+48 ;Start of BIOS page 0 use ($D0-$FF, 48 bytes)
;
; - BRK handler routine
PCL .EQU BIOS_PG0+00 ;Program Counter Low index
PCH .EQU BIOS_PG0+01 ;Program Counter High index
PREG .EQU BIOS_PG0+02 ;Temp Status Reg
SREG .EQU BIOS_PG0+03 ;Temp Stack ptr
YREG .EQU BIOS_PG0+04 ;Temp Y Reg
XREG .EQU BIOS_PG0+05 ;Temp X Reg
AREG .EQU BIOS_PG0+06 ;Temp A Reg
;
; - 28L92 IRQ handler pointers and status
ICNT_A .EQU BIOS_PG0+07 ;Input buffer count
IHEAD_A .EQU BIOS_PG0+08 ;Input buffer head pointer
ITAIL_A .EQU BIOS_PG0+09 ;Input buffer tail pointer
OCNT_A .EQU BIOS_PG0+10 ;Output buffer count
OHEAD_A .EQU BIOS_PG0+11 ;Output buffer head pointer
OTAIL_A .EQU BIOS_PG0+12 ;Output buffer tail pointer
;
ICNT_B .EQU BIOS_PG0+13 ;Input buffer count
IHEAD_B .EQU BIOS_PG0+14 ;Input buffer head pointer
ITAIL_B .EQU BIOS_PG0+15 ;Input buffer tail pointer
OCNT_B .EQU BIOS_PG0+16 ;Output buffer count
OHEAD_B .EQU BIOS_PG0+17 ;Output buffer head pointer
OTAIL_B .EQU BIOS_PG0+18 ;Output buffer tail pointer
UART_IRT .EQU BIOS_PG0+19 ;SC28L92 Interrupt Status byte
;
; - Real-Time Clock variables
; These are repurposed for adding a Realtime clock chip (DS1501/DS1511)
; The Ticks, Seconds, Minutes and Hours remain the same in function.
; The 16-bit Days variable is replaced however.
; - The DAY_DATE is a new variable. To minimize Page Zero usage, it has two functions
; Bits 0-4 represent the days of the Month 1-31
; Bits 5-7 represent the Day of the Week, 1-7 (Saturday=1)
; The Months are handled by the upper 4 bits of the MONTH_CENTURY variable
; The Century is handled by the Year (0-255) and the lower 4 bits of the MONTH_CENTURY variable
;
TICKS .EQU BIOS_PG0+20 ;Number of timer countdowns = 1 second (100)
SECS .EQU BIOS_PG0+21 ;Seconds: 0-59
MINS .EQU BIOS_PG0+22 ;Minutes: 0-59
HOURS .EQU BIOS_PG0+23 ;Hours: 0-23
DAY_DATE .EQU BIOS_PG0+24 ;Day: (bits 5-7) Date: (bits 0-4)
MONTH_CENTURY .EQU BIOS_PG0+25 ;Month: (bits 4-7) Century: (bits 0-3)
YEAR .EQU BIOS_PG0+26 ;Year 0-255 plus 4 bits as noted above
RTC_TEMP .EQU BIOS_PG0+27 ;Temp work byte for updating shared variables
;
; - Delay Timer variables
MSDELAY .EQU BIOS_PG0+28 ;Timer delay countdown byte (255 > 0)
SETMS .EQU BIOS_PG0+29 ;Set timeout for delay routines - BIOS use only
DELLO .EQU BIOS_PG0+30 ;Delay value BIOS use only
DELHI .EQU BIOS_PG0+31 ;Delay value BIOS use only
;
; - Count variables for 10ms benchmark timing
MS10_CNT .EQU BIOS_PG0+32 ;10ms Count variable
SECL_CNT .EQU BIOS_PG0+33 ;Seconds Low byte count
SECH_CNT .EQU BIOS_PG0+34 ;Seconds High byte count
;
; - Address and pointers for IDE Interface
LBA_ADDR_LOW .EQU BIOS_PG0+35 ;LBA Transfer Address low byte
LBA_ADDR_HIGH .EQU BIOS_PG0+36 ;LBA Transfer Address high byte
;
LBA_XFER_CNT .EQU BIOS_PG0+37 ;LBA Transfer Count 1-xx (check RAM space!)
LBA_LOW_BYTE .EQU BIOS_PG0+38 ;LBA Block number bits 0-7
LBA_HIGH_BYTE .EQU BIOS_PG0+39 ;LBA Block number bits 8-15
LBA_EXT_BYTE .EQU BIOS_PG0+40 ;LBA Block number bits 16-23
;
BIOS_XFERL .EQU BIOS_PG0+41 ;BIOS Move Routine low byte
BIOS_XFERH .EQU BIOS_PG0+42 ;BIOS Move Routine high byte
BIOS_XFERC .EQU BIOS_PG0+43 ;BIOS Block Count moved (needs to be set)
;
IDE_STATUS_RAM .EQU BIOS_PG0+44 ;IDE RAM-Based Status
;
SPARE_B0 .EQU BIOS_PG0+45 ;Spare byte 0
SPARE_B1 .EQU BIOS_PG0+46 ;Spare byte 1
;
; - Timer/Counter Match flag for Delay/Benchmark
MATCH .EQU BIOS_PG0+47 ;Bit7 used for Delay, Bit6 used for Benchmark
;Bits 3,2,1 used for IDE Interrupt Handler
;
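To illustrate the packed DAY_DATE byte described in the comments above, here is a minimal sketch of how the two fields could be pulled apart. GET_DAY_DATE is a hypothetical helper name and is not part of the actual C02 BIOS; it only assumes the bit layout given in the listing (Date in bits 0-4, Day of the Week in bits 5-7).
Code:
;
; GET_DAY_DATE - hypothetical helper, not part of the actual C02 BIOS
; Returns: A = Date (1-31), X = Day of the Week (1-7)
GET_DAY_DATE LDA DAY_DATE ;Get packed Day/Date byte (read once)
 PHA ;Save a copy on the stack
 LSR A ;Shift Day of the Week (bits 5-7)
 LSR A ; down into bits 0-2
 LSR A
 LSR A
 LSR A
 TAX ;X = Day of the Week
 PLA ;Get the saved copy back
 AND #$1F ;Mask off bits 5-7, A = Date
 RTS ;Return to caller
;
Reading the byte once and keeping a copy on the stack means both fields come from the same snapshot, even if an interrupt updates the clock variables in between.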
In the Monitor code, some variables are named TEMP1, TEMP2, etc., but their comments list which routines use them, since in some cases they are shared to conserve page zero space. You as the programmer have to make sure that routines don't clobber things that are in use by other routines, which really isn't difficult. The Monitor's page zero definitions are shown below:
Code:
;
PGZERO_ST .EQU $A0 ;Start of Monitor Page 0 use ($A0-$CF, 48 bytes)
;
BUFF_PG0 .EQU PGZERO_ST+00 ;Default Page zero location for Monitor buffers
;
INBUFF .EQU BUFF_PG0+00 ;Input Buffer - 4 bytes ($A0-$A3)
DATABUFF .EQU BUFF_PG0+04 ;Data Buffer - 6 bytes ($A4-$A9)
;
; - 16-bit variables:
HEXDATAH .EQU PGZERO_ST+10 ;Hexadecimal input
HEXDATAL .EQU PGZERO_ST+11
BINVALL .EQU PGZERO_ST+12 ;Binary Value for HEX2ASC
BINVALH .EQU PGZERO_ST+13
COMLO .EQU PGZERO_ST+14 ;User command address
COMHI .EQU PGZERO_ST+15
INDEXL .EQU PGZERO_ST+16 ;Index for address - multiple routines
INDEXH .EQU PGZERO_ST+17
TEMP1L .EQU PGZERO_ST+18 ;Index for word temp value used by Memdump
TEMP1H .EQU PGZERO_ST+19
TEMP2L .EQU PGZERO_ST+20 ;Index for Text entry
TEMP2H .EQU PGZERO_ST+21
PROMPTL .EQU PGZERO_ST+22 ;Prompt string address
PROMPTH .EQU PGZERO_ST+23
SRCL .EQU PGZERO_ST+24 ;Source address for memory operations
SRCH .EQU PGZERO_ST+25
TGTL .EQU PGZERO_ST+26 ;Target address for memory operations
TGTH .EQU PGZERO_ST+27
LENL .EQU PGZERO_ST+28 ;Length address for memory operations
LENH .EQU PGZERO_ST+29
;
; - 8-bit variables and constants:
BUFIDX .EQU PGZERO_ST+30 ;Buffer index
BUFLEN .EQU PGZERO_ST+31 ;Buffer length
IDX .EQU PGZERO_ST+32 ;Temp Indexing
IDY .EQU PGZERO_ST+33 ;Temp Indexing
TEMP1 .EQU PGZERO_ST+34 ;Temp - Code Conversion routines
TEMP2 .EQU PGZERO_ST+35 ;Temp - Memory/EEPROM/SREC routines - Disassembler
TEMP3 .EQU PGZERO_ST+36 ;Temp - EEPROM/SREC routines
CMDFLAG .EQU PGZERO_ST+37 ;Command Flag, bit specific, used by many routines
OPXMDM .EQU PGZERO_ST+38 ;Saved Opcode/Xmodem Flag variable
;
; - Xmodem transfer variables
CRCHI .EQU PGZERO_ST+39 ;CRC hi byte (two byte variable)
CRCLO .EQU PGZERO_ST+40 ;CRC lo byte - Operand in Disassembler
CRCCNT .EQU PGZERO_ST+41 ;CRC retry count - Operand in Disassembler
PTRL .EQU PGZERO_ST+42 ;Data pointer lo byte - Mnemonic in Disassembler
PTRH .EQU PGZERO_ST+43 ;Data pointer hi byte - Mnemonic in Disassembler
BLKNO .EQU PGZERO_ST+44 ;Block number
;
; - Macro Loop Counter variables
LPCNTL .EQU PGZERO_ST+45 ;Loop Count low byte
LPCNTH .EQU PGZERO_ST+46 ;Loop Count high byte
;
; - Spare Monitor byte for future use
SPARE_M0 .EQU PGZERO_ST+47 ;Spare Monitor page zero byte
;
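As a rough sketch of why the 16-bit pointers live in page zero, and of how a routine header can document what it touches, the following uses the TGTL/TGTH and LENL locations from the listing above. FILL_MEM and its behavior are hypothetical, written for illustration only, and are not taken from the Monitor source.
Code:
;
; FILL_MEM - hypothetical routine, not part of the actual Monitor
; Uses: TGTL/TGTH (target address), LENL (byte count 1-255, 0 = 256 bytes)
; Clobbers: A, Y
FILL_MEM LDY #$00 ;Start at offset zero
 LDA #$00 ;Fill value
FILL_LP STA (TGTL),Y ;Store via the page zero pointer
 INY ;Next offset
 CPY LENL ;All bytes done?
 BNE FILL_LP ;No, loop back
 RTS ;Return to caller
;
The (TGTL),Y addressing mode only works with a pointer held in page zero, low byte first, which is why those pairs are worth the space. Listing "Uses" and "Clobbers" in the header is one way to keep track of which shared locations a routine depends on.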
For overall memory mapping on the 65C02, I allocate ROM from the top down (including a JMP table to access core routines) and buffers, soft vectors and such from the bottom up. I also put all hardware devices on page $FE in my hardware designs. This tends to give me a larger contiguous free RAM space for any other code (user code for DOS/65, etc.). For Page Zero, I allocate the space used by the BIOS, the Monitor and other programs from the top down. This leaves the largest contiguous free Page Zero space, starting at $0000, for user programs.
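The layout described above could be captured in a handful of equates along these lines. This is only a sketch: all of the names are hypothetical, and the values marked (assumed) are placeholders for illustration, not the addresses used in the actual C02 BIOS and Monitor. It expects PGZERO_ST to be defined as in the Monitor listing.
Code:
;
; - Fixed by the 65C02 itself
PAGE_ZERO .EQU $0000 ;Page Zero ($0000-$00FF)
STACK_PG .EQU $0100 ;Hardware stack ($0100-$01FF)
NMI_VEC .EQU $FFFA ;NMI vector
RES_VEC .EQU $FFFC ;Reset vector
IRQ_VEC .EQU $FFFE ;IRQ/BRK vector
;
; - Design choices from the description above
IO_PAGE .EQU $FE00 ;All hardware devices on page $FE
USER_PG0_END .EQU PGZERO_ST-1 ;Free Page Zero is $00-$9F (160 bytes)
;
; - Placeholder values, for illustration only
SOFT_VEC .EQU $0300 ;(assumed) Soft vectors and buffers, bottom up
ROM_START .EQU $E000 ;(assumed) Start of ROM, allocated top down
;
With PGZERO_ST at $A0, the Monitor takes $A0-$CF and the BIOS takes $D0-$FF, so everything from $00 through $9F stays free for user programs.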
Not sure if this helps much, but I also tend to heavily comment code, just so I can go back to it years later and get a quick read on what it all does. This is not the same as commenting each code instruction, but if you were to look at the C02 Pocket BIOS and Monitor source code, you'll get an idea of what I'm referring to.