cbmeeks wrote:
sark02 wrote:
The 6502 doesn't have the grunt to drive a pixel array like this
1982 maybe. I wonder how many pixels a 14 (or even 20) MHz 65C02 could push? Even a mid-range 8 MHz 65C02 with a "simple" CPLD blitter would make for an excellent game machine IMHO.
LOL, maybe... but it would seem like a force-fit. Here's the Defender code for drawing a 10 wide by 8 high sprite - for example the regular aliens in the game:
Code:
; Draw a 10x8 object
; In:
; D -> dest
; Y -> struct obj {
; u8 cols, rows;
; u16 even_pixdata_ptr, odd_pixdata_ptr;
; u16 draw_func_ptr, erase_func_ptr;
; }
; CC.C = 0=even sprite, 1=odd sprite
; Clobber:
; D, Y, CC
__D193: 34 18 | DRAW_10X8: PSHS X,DP ; save X, DP
__D195: 10 DF 77 | STS $77 ; save S
__D198: 24 02 | BCC L_D19C ; branch if even
__D19A: 31 22 | LEAY $2,Y ; offset for odd data
__D19C: 10 EE 22 | L_D19C: LDS $2,Y ; S -> pix_data
__D19F: CB 08 | ADDB #$08 ; D -> bottom row
__D1A1: 1F 03 | TFR D,U ; U -> bottom row
__D1A3: 35 3F | PULS CC,A,B,DP,X,Y ; fetch 2x8 pix
__D1A5: 36 3F | PSHU Y,X,DP,B,A,CC ; store 2x8 pix @ 0,0
__D1A7: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D1AB: 20 9C | BRA L_D149 ; continue 8x8
...
__D149: 35 3F | L_D149: PULS CC,A,B,DP,X,Y ; fetch 2x8 pix
__D14B: 36 3F | PSHU Y,X,DP,B,A,CC ; store 2x8 pix @ 0,0
__D14D: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D151: 35 3F | PULS CC,A,B,DP,X,Y ; fetch col
__D153: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 2,0
__D155: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D159: 35 3F | L_D159: PULS CC,A,B,DP,X,Y ; fetch 2x8
__D15B: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 4,0
__D15D: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D161: 35 3F | PULS CC,A,B,DP,X,Y ; fetch 2x8
__D163: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 6,0
__D165: 10 FE A0 77 | LDS $A077 ; restore S
__D169: 35 98 | PULS DP,X,PC ; (PUL? PC=RTS)
To me that's just beautifully efficient code. Each PULS/PSHU pair fetches 8 sprite pattern bytes from the ROM (via the S stack) and writes them to the video RAM (via the U stack). That's two instructions. Note how U is incremented by 256+8 bytes (LEAU $0108,U) to move to the next column. Each column stride is 256 bytes, but the PSHU has decremented U by 8 bytes, so 256+8 is needed.
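To make the pointer arithmetic concrete, here is a minimal Python sketch of the copy loop. It is only a toy model, not the Defender code itself: the flat `bytearray` memory, the function name `draw_columns`, and the example addresses are all made up for illustration, and it assumes the low byte of the destination does not wrap when 8 is added (ADDB only touches B).

```python
# Toy model of the PULS / PSHU / LEAU column copy from the listing above.
# Assumptions (not from the original code): flat 64 KiB bytearray memory,
# hypothetical addresses, and no wrap of the low address byte on ADDB #$08.
COL_STRIDE = 256  # bytes between adjacent screen columns, per the post
COL_BYTES = 8     # CC, A, B, DP, X, Y = 1+1+1+1+2+2 bytes per PULS/PSHU


def draw_columns(mem, src, dest, ncols):
    """Copy ncols columns of 8 bytes each from src to screen address dest."""
    s = src               # S -> pix_data        (LDS $2,Y)
    u = dest + COL_BYTES  # U -> bottom row      (ADDB #$08 / TFR D,U)
    for col in range(ncols):
        regs = mem[s:s + COL_BYTES]  # PULS: reads ascending addresses
        s += COL_BYTES               # ...and leaves S advanced by 8
        u -= COL_BYTES               # PSHU: U drops by 8, CC ends up lowest
        mem[u:u + COL_BYTES] = regs  # so byte order is preserved
        if col != ncols - 1:         # the listing skips LEAU after the last column
            u += COL_STRIDE + COL_BYTES  # LEAU $0108,U
    return s, u


mem = bytearray(0x10000)
mem[0x1000:0x1028] = bytes(range(1, 41))  # 5 columns x 8 bytes of fake pattern data
s, u = draw_columns(mem, src=0x1000, dest=0x4020, ncols=5)
print(hex(u))  # 0x4420: four column steps of 256 from the start column
```

Each pass through the loop lands exactly 256 bytes to the right of the previous one, which is why the LEAU offset has to be $0108 rather than $0100.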
The 6502 would not only need 1 LDA/STA per byte, but what addressing mode would you even use? (ZP),Y, with two ZP pointers - one for the source and one for the destination? That would be 8 x { LDA, STA, INY }, with each load/store being indirect Y - all those redundant reads... to replace 2 6809 instructions.
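As a rough back-of-envelope check on that, here is a small Python tally of cycles per 8-byte column. The cycle counts are the standard datasheet figures as I remember them (PULS/PSHU at 5 plus 1 per byte, LEAU with a 16-bit offset at 4+4, LDA (zp),Y at 5 with no page crossing, STA (zp),Y at 6, INY at 2), so treat them as assumptions. The 6502 figure also leaves out pointer setup and the step to the next column, which makes it optimistic.

```python
# Rough per-column cycle estimate for copying 8 bytes.
# Cycle counts are assumed from standard timing tables, not measured.
CYCLES_6502 = {"LDA (zp),Y": 5, "STA (zp),Y": 6, "INY": 2}  # no page-cross penalty
CYCLES_6809 = {"PULS 8 bytes": 5 + 8, "PSHU 8 bytes": 5 + 8, "LEAU $0108,U": 4 + 4}


def cycles_6502_column(nbytes=8):
    """8 x { LDA, STA, INY }, ignoring pointer setup and column advance."""
    return nbytes * sum(CYCLES_6502.values())


def cycles_6809_column():
    """One PULS/PSHU pair plus the LEAU that moves U to the next column."""
    return sum(CYCLES_6809.values())


print(cycles_6502_column(), cycles_6809_column())  # 104 34
```

On these assumptions the 6502 loop costs roughly three times as many cycles per column, which is the sort of gap the faster clock would have to cover before doing anything else.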
You _could_ do it with a fast enough 6502... but I'm not sure I'd want to.