Aaendi wrote:
Nintendo's collision and sprite building routines spend more time pushing and pulling registers and passing arguments than they do actually calculating collision detection and building sprites.
What you are describing is characteristic of the object output of a C compiler. When a function (subroutine) call is made in C, space is allocated on the stack for the variables that have been declared within that function. Hence significant stack activity is to be expected. In pure assembly language, there are more ways to structure functions, so the amount of stack activity will vary according to the programmer's preferences.
Until you have experience with MPUs that handle the stack in an adroit fashion, it may be difficult to see the advantages of using the stack as a scratchpad the way C does. The 65C816 makes this a relatively painless task, as the 16-bit stack pointer can be copied to and from the (16-bit) accumulator, making it easy to perform arithmetic on the stack pointer and thus allocate temporary stack workspace. This feature can be augmented by temporarily pointing direct (zero) page at the stack workspace, viz:
Code:
;stack workspace example
;
rep #%00100000 ;16 bit .A
cld ;ensure binary mode
sec
tsc ;copy stack pointer (SP) to .A
sbc #size_of_ws ;make room for workspace
tcs ;set new SP
;
; ——————————————————————————————————————————————————————————
; At this point, SIZE_OF_WS bytes have been allocated on the
; stack, the 1st byte being at SP+1. We can make that work-
; space become a fugacious direct page by setting the direct
; page pointer (DP) to SP+1.
; ——————————————————————————————————————————————————————————
;
phd ;save current DP, SP = SP-2
inc a ;.A = .A+1, actually SP+3 at this point
tcd ;copy to DP
;
; —————————————————————————————————————————————————————————————
; Now, DP points to the 1st byte of the workspace. An instruc-
; tion such as LDA $00 would actually fetch from SP+3, instead
; of from the physical zero page.
; —————————————————————————————————————————————————————————————
;
...program continues...
;
; ——————————————————————————————————————————————————————————————————
; When processing has been completed the workspace can be discarded.
; ——————————————————————————————————————————————————————————————————
;
pld ;restore previous DP value
rep #%00100000 ;16 bit .A
clc
tsc ;current SP to .A
adc #size_of_ws ;get rid of workspace
tcs ;set new SP
;
...program continues...
Note that in the above example, the addition and subtraction operations are working on 16-bit operands.
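To illustrate what the temporary direct page buys you, here is a minimal sketch of code that might occupy the first "...program continues..." part of the example, while DP points at the workspace. The names COUNT and PTR, their offsets and the address $2000 are made up for illustration, SIZE_OF_WS is assumed to be at least 4, and the equate syntax will depend on your assembler.
Code:
;using the stack workspace through direct page (illustrative only)
;
count =$00 ;hypothetical 16 bit local at workspace offset 0
ptr =$02 ;hypothetical 16 bit pointer at workspace offset 2
;
rep #%00110000 ;16 bit .A, .X & .Y
stz count ;clear local counter, actually writes SP+3 & SP+4
lda #$2000 ;arbitrary address, used only as an example
sta ptr ;set up pointer in workspace
ldy #0 ;index
lda (ptr),y ;indirect fetch through a pointer living on the stack
inc count ;read-modify-write on a stack variable
Instructions such as STZ and INC have no stack-relative form, which is part of what makes borrowing direct page for the workspace attractive.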
If the above code is being run on a system with a formal operating environment, caution must be exercised in handling direct page. For example, an interrupt handler would have to be careful to preserve DP, change it to point to the ISR's direct page, and then restore DP before returning to the foreground.
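As a rough sketch of that sequence, a native-mode interrupt handler might bracket its work as shown here. ISR_DP is a hypothetical symbol for wherever the ISR's direct page lives. A real handler would likely preserve other state as well, such as the data bank register.
Code:
;interrupt handler skeleton, native mode (illustrative only)
;
rep #%00110000 ;16 bit .A, .X & .Y
pha ;save foreground registers
phx
phy
phd ;save foreground DP, which may point at stack workspace
lda #isr_dp ;ISR_DP: hypothetical address of the ISR's direct page
tcd ;DP now points at the ISR's direct page
;
...interrupt processing...
;
rep #%00110000 ;16 bit registers, in case processing changed widths
pld ;restore foreground DP
ply ;restore foreground registers
plx
pla
rti ;status register is restored from the stack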
Similarly, if the function that pointed direct page to stack workspace calls another function, that called function must be aware of the state of DP so it doesn't accidentally "step" on the calling function's stack space by writing to what it thinks is the real direct page.
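One way a called function can protect itself, sketched here, is to save the caller's DP and load its own before touching direct page. REAL_DP is a hypothetical symbol for the direct page the function expects (e.g., $0000 for the physical zero page), and some assemblers may want the PEA operand written differently.
Code:
;called function that doesn't trust the incoming DP (illustrative only)
;
phd ;save caller's DP, which may point at stack workspace
pea real_dp ;REAL_DP: hypothetical address of the expected direct page
pld ;DP = REAL_DP, .A is undisturbed
;
...function body uses direct page normally...
;
pld ;restore caller's DP
rts
PEA and PLD always move 16 bits, so this sequence works regardless of the current register widths.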