Breadboard 6502 Bad Apple!! With the World's Worst Video Card
Posted: Sat Sep 21, 2024 5:20 pm
Thanks everyone here at 6502.org for helping both directly and with your past posts!
Without you all I would NOT have been able to do this:
https://www.youtube.com/watch?v=0glEfLZCwmc
I'm working on a proper writeup with details but please take a look:
https://github.com/NormalLuser/Ben-Eate ... ewBeep.asm
The SD bootup routine is not speed dependent so it probably could be optimized for space.
The decoder however has been optimized for speed alone.
This decoder/setup can stream a full byte off the bit-banged SD card in 47 cycles, but it can also fetch a single bit in just 4 cycles. That means I can read 1 bit from the SD card with an LDA and branch on it with a BEQ/BNE, so I only have to shift the bits I actually need and never shift the 1 or 2 control bits I branch on. Below is the start of the decoder, which is also the entire reason most of the decoder lives in the top of ZP and the bottom of the stack.
I really like the Run Length Encoding line draw routine. It feels silly to unroll it this way, but it is a lot faster! It has been a strange and fun challenge to be forced to code like this and squeeze every last cycle out of a CPU/program!
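A rough cycle count shows why the unrolled routine wins. The sketch below is only an estimate: it uses the standard 65C02 timings for `sta (zp),y`, `iny`, `dex` and a taken `bne`, and it assumes a plain DEX/BNE counting loop as the alternative, which may not be the exact loop that was measured. The function names are made up for illustration.

```python
# Standard 65C02 instruction timings (cycles).
STA_IND_Y = 6   # sta (Screen),y
INY = 2         # iny
DEX = 2         # dex (assumed loop counter)
BNE_TAKEN = 3   # bne taken, no page crossing (assumed loop branch)


def unrolled_cycles_per_pixel():
    """One sta/iny pair per pixel, no loop overhead."""
    return STA_IND_Y + INY


def looped_cycles_per_pixel():
    """Same store, plus the counter and branch of an assumed DEX/BNE loop."""
    return STA_IND_Y + INY + DEX + BNE_TAKEN


if __name__ == "__main__":
    unrolled = unrolled_cycles_per_pixel()
    looped = looped_cycles_per_pixel()
    print(f"unrolled: {unrolled} cycles/pixel")
    print(f"looped:   {looped} cycles/pixel")
    print(f"ratio:    {looped / unrolled:.3f}")
```

These bare figures leave out the per-run setup (reading the count, the table lookup and the jump), which is presumably why the averages quoted in the listing come out slightly higher than the raw per-pixel numbers.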
Code:
FrameLoop:
dec Block_Counter ; must count every 512 bytes read from SD card
beq BLOCK ; 256 roll-over go to BLOCK routine
;************************** >>>ENTRY POINT<<< *******************************
FrameLoopstart: ; jmp here to start and for SD card BLOCK return
; Load Control Bit 1. SD outputs bit 8 first, bit 1 last
lda VIA_PORTA ; Read bit 8 of byte from SD card
bne SkipRun ; If 1 Skip Pixels
; Load Control Bit 2
lda VIA_PORTA ; Else check bit 7
beq TriPixel ; If 0 TriPixel
; Save 3 cycles per line: can fall through to RLE now that the decoder is in zero page
;jmp RLE ; Else 1. New RLE self-modifies, so it needs to be in RAM
; The whole reason all of this code was moved to ZP and the stack is to save just a few cycles
; on the RLE routine below!
RLE: ; Self Modify Run Length Draw Function
; Load 6 bits for a repeat value of up to 64
lda VIA_PORTA
asl
ora VIA_PORTA
asl
ora VIA_PORTA
asl
ora VIA_PORTA
asl
ora VIA_PORTA
asl
ora VIA_PORTA
; I could Bit shift and ADC then subtract from the length of the
; routine instead of using a table, but that would take more cycles
tax ; RLE count to x
lda RLEArray,x ; Load jmp location, faster than x3
sta RLEJump+1 ; Store low byte location, high byte is unchanged
lda PlotColor ; Last pixel color used in TriPixel
RLEJump: ; Self modify that code!
jmp RLERender ; Jump to RLE count in RLE routine
.org RLERenderstart
RLERender: ; Unrolled Line draw/Run Length Routine
; This is 8.2 cycles a pixel vs 13.3 for a loop
; More than 60% faster. Nice!
sta (Screen),y; Draw it! 64
iny ; Next pixel
sta (Screen),y; Draw it! 63
iny ; Next pixel
sta (Screen),y; Draw it! 62
iny ; Next pixel
sta (Screen),y; Draw it! 61
.....
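To make the bit-level format easier to follow, here is a small Python model of the front end of the decoder above. It is only a sketch under assumptions: `BitReader`, `read_bits` and `decode_command` are hypothetical names, each `read_bit()` call stands in for one `lda VIA_PORTA`, and the data bit is assumed to arrive as a plain 0 or 1. The SkipRun and TriPixel payloads and the mapping from the 6-bit field to a pixel count (done through `RLEArray` in the listing) are not modelled, since they are not shown here.

```python
class BitReader:
    """Hands out bits MSB-first, the order the SD card sends them."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # index of the next bit to hand out

    def read_bit(self):
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bit


def read_bits(reader, count):
    """Mirror of the lda / asl / ora chain: shift left, then OR in the next bit."""
    value = reader.read_bit()                     # lda VIA_PORTA
    for _ in range(count - 1):
        value = (value << 1) | reader.read_bit()  # asl, then ora VIA_PORTA
    return value


def decode_command(reader):
    """Branch on up to two control bits without ever shifting them."""
    if reader.read_bit():        # bne SkipRun
        return ("SkipRun", None)
    if reader.read_bit() == 0:   # beq TriPixel
        return ("TriPixel", None)
    return ("RLE", read_bits(reader, 6))  # raw 6-bit field, not a pixel count


if __name__ == "__main__":
    print(decode_command(BitReader(bytes([0b01000101]))))  # ('RLE', 5)
    print(decode_command(BitReader(bytes([0b10000000]))))  # ('SkipRun', None)
    print(decode_command(BitReader(bytes([0b00111111]))))  # ('TriPixel', None)
```

The point of the layout is visible in `decode_command`: the control bits are consumed by a read and a branch only, and shifting is paid for only on the six payload bits.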
Please, any suggestions for speed improvements with or without hardware changes are welcome!
Now that I have this running on a 'stock' Ben Eater setup it is time to start expanding my system a little.
I'm already torn between hooking up a CF card in IDE mode to get 8-bit reads, and doing something with a PLD and maybe a shift register to get 8-bit reads out of the SD card clocked at the VGA 10 MHz?
CF is on the way out, but is probably the easiest/best documented, and SD cards are here to stay and cheap... Ideas?
Thanks again, George Foot, for the SD read setup!
And the same goes for everyone at 6502.org for the wonderful resources of this place and all your posts!