Concept & Design of 3.3V Parallel 16-bit VGA Boards
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I was away for a while, but I just got back and see that you solved it. Excellent.
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Thanks, Here's a pic:
Next on the agenda: Work on ellipses. Then Bresenham Lines. Then cube. Then rotate cube.
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
For the largest circle, with radius 239 placed @(319,239), it took 0.1 ms.
Since it was easy and quick I did a test for a filled circle, by plotting circles of decreasing radii. Took 152 ms.
K, I think I'm done for today. Whew. Time for a brew! Maybe I'll get some ideas how to optimize Daryl's code using other accumulators. Many thanks to him!
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Glad to see it's working. I was thinking there must be an issue with the flags... but was going to re-look over the code today. Won't need to now!!!!
Daryl
Please visit my website -> https://sbc.rictor.org/
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Daryl's Bresenham 65C02 circle plotting algorithm, which covers all 8 octants by computing only the first 45 degrees, involves a lot of this kind of 6502 code, repeatedly:
Code: Select all
; TGI_SETPIXEL(XC+X, YC+Y)
CLC
LDA XC
ADC X1
STA XP
PHA ; XC+X
CLC
LDA YC
ADC Y1
STA YP
JSR PLTPXL
more specifically, this kind of procedure:
Code: Select all
LDA YC
ADC Y1
STA YP
which would become in the 65O16.b core:
Code: Select all
ADCBopC Y1
The B accumulator would be assigned to YC, a static value. The instruction then adds the value Y1, read from RAM, and stores the result in the C accumulator. In this case, the plot routine would plot the C accumulator for Y.
It would be nice if the Y1 value didn't have to be read in from RAM, but the .b core does not have the Acc+Acc function.
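For anyone following along without the 65C02 source, here is a minimal Python sketch of the midpoint (Bresenham) circle with 8-way symmetry. It only illustrates the structure being optimized and is not Daryl's code: the function name `circle_points`, and returning a list of points instead of calling a plot routine, are assumptions made for this example. The variables `f`, `ddf_x` and `ddf_y` follow the usual formulation of the algorithm.

```python
def circle_points(xc, yc, radius):
    """Return the pixels of a midpoint circle centred on (xc, yc), in plot order."""
    if radius == 0:
        return [(xc, yc)]          # degenerate case: just the centre pixel

    x, y = 0, radius
    f = 1 - radius                 # decision variable
    ddf_x = 1
    ddf_y = -2 * radius

    # The four axis points are plotted once, before the loop.
    pts = [(xc, yc + y), (xc, yc - y), (xc + y, yc), (xc - y, yc)]

    # Walk the first 45 degrees only; mirror every step into all 8 octants.
    while x < y:
        if f >= 0:                 # stepped outside the ideal circle: move y in
            y -= 1
            ddf_y += 2
            f += ddf_y
        x += 1
        ddf_x += 2
        f += ddf_x
        pts += [
            (xc + x, yc + y), (xc - x, yc + y),
            (xc - x, yc - y), (xc + x, yc - y),
            (xc + y, yc + x), (xc - y, yc + x),
            (xc - y, yc - x), (xc + y, yc - x),
        ]
    return pts
```

Every group of eight points is just the centre plus or minus the same two small values, which is why adding a RAM operand to a fixed accumulator and landing the result in another accumulator removes so many loads and stores.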
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
ElEctric_EyE wrote:
...It would be nice if the Y1 value didn't have to be read in from RAM, but the .b core does not have the Acc+Acc function.
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Ugh, found a typo in one of my macros (THQ) which held me up for a couple hours!
So for the first optimization, I used accumulators for XC, YC, XP, and YP (first listing below). To plot a circle, all that is then needed is something like the second, shorter listing.
edit: Over 4 million cycles were saved in this situation: 4,044,455 cycles, to be exact.
Code: Select all
;BRESENHAM CIRCLE COURTESY OF DARYL RICTOR @ HTTP://SBC.RICTOR.ORG/
; XP = O ACC XC = G ACC
; YP = Q ACC YC = H ACC
CIRCLE LDY #0 ;Y IS USED IN PLTPXL ROUTINE, SET TO ZERO
; INT X = 0
STY X1
LDA RA
BNE _C1
TGO ;XP = XC (COPY G TO O)
THQ ;YP = YC (COPY H TO Q)
JMP PLTPXL
_C1
; INT Y = RADIUS
STA Y1
; INT F = 1 - RADIUS
SEC
LDA #1
STA FX ; int ddF_X = 1
SBC RA
STA FF
; INT ddF_Y = -2 * RADIUS
LDA RA
ASL
EOR #$FFFF
STA FY
INC FY ;could INC Acc after EOR #$FFFF
; TGI_SETPIXEL(XC, YC+Y)
TGO ;XP = XC (COPY G TO O)
CLC
ADCHopQzp Y1 ;YC + Y1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC, YC-Y)
SEC
SBCHopQzp Y1 ;YC - Y1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC+Y, YC)
CLC
ADCGopOzp Y1 ;XC + Y1 => XP
THQ ;YP = YC (COPY H TO Q)
JSR PLTPXL
; TGI_SETPIXEL(XC-Y,YC)
SEC
SBCGopOzp Y1 ;XC - Y1 => XP
JSR PLTPXL
_CLOOP
; WHILE (X<Y) CALCULATE NEXT PLOT STEP
SEC
LDA X1
SBC Y1
BCC _C4 ;X<Y
RTS
_C4 LDA FF ; ** added back with change
BMI _C6 ; ** added back (branch if bit 15 set)
DEC Y1
CLC
LDA FY
ADC #$02
STA FY
CLC
ADC FF
STA FF
_C6 INC X1
CLC
LDA FX
ADC #$02
STA FX
CLC
ADC FF
STA FF
; TGI_SETPIXEL(XC+X, YC+Y)
CLC
ADCGopOzp X1 ;XC + X1 => XP
PHO ;XC + X1
CLC
ADCHopQzp Y1 ;YC + Y1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC-X, YC+Y)
SEC
SBCGopOzp X1 ;XC - X1 => XP
JSR PLTPXL
; TGI_SETPIXEL(XC-X, YC-Y)
SEC
SBCHopQzp Y1 ;YC - Y1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC+X, YC-Y)
PLO ;XC + X1
JSR PLTPXL
; TGI_SETPIXEL(XC+Y, YC+X)
CLC
ADCGopOzp Y1 ;XC + Y1 => XP
PHO ;XC + Y1
CLC
ADCHopQzp X1 ;YC + X1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC-Y,YC+X)
SEC
SBCGopOzp Y1 ;XC - Y1 => XP
JSR PLTPXL
; TGI_SETPIXEL(XC-Y, YC-X)
SEC
SBCHopQzp X1 ;YC - X1 => YP
JSR PLTPXL
; TGI_SETPIXEL(XC+Y,YC-X)
PLO ;XC + Y1
JSR PLTPXL
JMP _CLOOP
;*************************************************
; SAVE 2 CYCLES BY DOING LDY #$00 BEFORE CALLING PLTPXL
PLTPXL STOzp SCRLO ;-STO SCRLO
STQzp SCRHI ;-STQ SCRHI
STBiy SCRLO ;-STB(SCRLO),Y. B ACCUMULATOR IS ALWAYS PIXEL COLOR
RTS
Code: Select all
LDWi $00EF ;-LDW #$00EF. MAX RADIUS = 239
LDGi $013F ;-LDG #319. G ACC IS XCENTER
LDHi $00EF ;-LDH #239. H ACC IS YCENTER
PLC STWzp RA ;-STW RA. RA IS RADIUS
JSR CIRCLE
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Removing all the JSR PLTPXLs and inserting the routine itself inline saved another 4,350,798 cycles for 239 circles!
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
A useful feature in your hardware may be to add a programmable offset in your address calculation. This would allow you to set the origin of the circle at the beginning of the plot routine, and then the rest of the routine can just plot around (0,0) without having to keep track of the offsets.
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I like that idea!
I'll see if I can implement it soon after I'm done optimizing the circle, and keep it if it doesn't slow the project down. I would think just 2 addressable ports added to the right-hand side of the equations should work:
Code: Select all
always @* begin //optimize the videoRAM address for plotting (X,Y) in the (LSB,MSB) for indirect indexed
cpuABopt [20:19] <= 0; //bank bits
cpuABopt [18:10] <= cpuAB [24:16]; //Y[8:0]
cpuABopt [9:0] <= cpuAB [9:0]; //X[9:0]
end
I'm focusing on giving FX and FY their own accumulator in the _C4 and _C6 loops, not quite there yet...
Down to 45,752 cycles for 1 circle radius 239.
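To make the address remap and the suggested programmable offset concrete, here is a small Python model. The bit fields come from the Verilog above (X in bits 9:0, Y taken from bits 24:16, bank bits zero). Everything else is an assumption for illustration: the names `video_address`, `x_origin` and `y_origin` are hypothetical, and so is the guess that the two ports would simply be added to the X and Y fields and wrap at the field width.

```python
X_BITS = 10   # cpuABopt[9:0]   comes from cpuAB[9:0]
Y_BITS = 9    # cpuABopt[18:10] comes from cpuAB[24:16]
X_MASK = (1 << X_BITS) - 1
Y_MASK = (1 << Y_BITS) - 1


def video_address(cpu_ab, x_origin=0, y_origin=0):
    """Model the videoRAM address for a 32-bit pointer with Y in the
    high word and X in the low word.

    x_origin / y_origin are the hypothetical offset ports; with both
    left at 0 this is the plain remap from the Verilog block.
    """
    x = ((cpu_ab & X_MASK) + x_origin) & X_MASK
    y = (((cpu_ab >> 16) & Y_MASK) + y_origin) & Y_MASK
    return (y << X_BITS) | x      # bank bits [20:19] stay 0


def pointer(x, y):
    """Build the 32-bit indirect pointer from 16-bit X and Y words."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)
```

With the origin ports set to the circle centre, the plot routine could in principle write points relative to (0,0), including negative offsets stored as 16-bit two's complement, e.g. `video_address(pointer(-5, -3), 319, 239)` lands on pixel (314, 236).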
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I'm done optimizing.
The software uses 6 accumulators and 4 zero page variables. Got it down to 42,381 cycles for a circle with a radius of 239 on 640x480 pixels.
- Attachments
-
- circ16_65O16b.asm
- (6.38 KiB) Downloaded 183 times
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
You can do a bit more optimizing:
Code: Select all
STQzp SCRHI ;-STQ SCRHI
STBiy SCRLO ;-STB(SCRLO),Y.
; TGI_SETPIXEL(XC-Y,YC+X)
SEC
SBCGopOzp Y1 ;XC - Y1 => XP
;PLOT
STOzp SCRLO ;-STO SCRLO
STQzp SCRHI ;-STQ SCRHI
The last STQzp SCRHI is not necessary, because the value is still the same as before. The same happens in a few other places.
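A quick way to see where "the same happens" is to walk the eight plots of one loop pass in the order used by the circle listing earlier in the thread and count how often the X or Y pointer word is unchanged from the previous plot. The sketch below does that symbolically. The names `PLOT_ORDER` and `redundant_stores` are made up for this example, and it assumes the plot code is inlined so that each plot directly follows the previous one.

```python
# Offsets of the eight plots in one pass of the loop, in listing order.
# "+X" means centre + X1, "-Y" means centre - Y1, and so on.
PLOT_ORDER = [
    ("+X", "+Y"), ("-X", "+Y"), ("-X", "-Y"), ("+X", "-Y"),
    ("+Y", "+X"), ("-Y", "+X"), ("-Y", "-X"), ("+Y", "-X"),
]


def redundant_stores(order):
    """Count pointer-word stores that rewrite an unchanged value.

    Returns (low_word_repeats, high_word_repeats): how many times the
    X word (SCRLO) and the Y word (SCRHI) equal the previous plot's.
    """
    lo = hi = 0
    for (prev_x, prev_y), (x, y) in zip(order, order[1:]):
        if x == prev_x:
            lo += 1
        if y == prev_y:
            hi += 1
    return lo, hi


lo_repeats, hi_repeats = redundant_stores(PLOT_ORDER)
print(lo_repeats, hi_repeats)
```

Under those assumptions the count comes out as 2 repeated SCRLO stores and 4 repeated SCRHI stores per pass through the loop.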
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
That's going to be a huge gain!
Checking...
-
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Down to 37,641 cycles!
That's roughly 300 more 239-pixel-radius circles per second at 100MHz. Awesome! Not that I really care about plotting that many, but it is a standard of comparison nonetheless.
LOL @6.66Kb download size
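For the comparison, the circles-per-second figure is just the clock rate divided by the cycle count. A short worked example, assuming a 100 MHz clock, circles plotted back to back, and no other overhead (the helper name `circles_per_second` is only for this sketch):

```python
CLOCK_HZ = 100_000_000   # assumed 100 MHz core clock


def circles_per_second(cycles_per_circle, clock_hz=CLOCK_HZ):
    """Circles per second if the CPU did nothing but plot circles."""
    return clock_hz / cycles_per_circle


before = circles_per_second(42_381)   # cycle count before dropping the redundant stores
after = circles_per_second(37_641)    # cycle count after
gain = after - before

print(f"{before:.0f} -> {after:.0f} circles/s, gain {gain:.0f}")
```

That works out to about 2,360 versus 2,657 radius-239 circles per second, a gain of roughly 297.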
- Attachments
-
- circ16_65O16b.b.asm
- (6.66 KiB) Downloaded 172 times