6502.org Forum

All times are UTC




Posted: Sun Mar 31, 2013 7:08 pm
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
I was away for a while, but I just got back and see that you solved it. Excellent.


Posted: Sun Mar 31, 2013 7:13 pm
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Thanks, Here's a pic:

Next on the agenda: Work on ellipses. Then Bresenham Lines. Then cube. Then rotate cube. :D


Attachments:
File comment: Circle @(200,200) radius 150. Resolution:640x480
P1000970.JPG [ 56.9 KiB | Viewed 707 times ]

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502
Posted: Sun Mar 31, 2013 7:16 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
Hurrah!


Posted: Sun Mar 31, 2013 7:59 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
For the largest circle, radius 239 centered at (319,239), it took 0.1 ms.

Since it was easy and quick, I did a test for a filled circle by plotting circles of decreasing radius. It took 152 ms.
K, I think I'm done for today. Whew. Time for a brew! Maybe I'll get some ideas how to optimize Daryl's code using other accumulators. Many thanks to him!


Attachments:
File comment: Concentric circles for fill.
P1000973.JPG [ 80.9 KiB | Viewed 702 times ]

Posted: Mon Apr 01, 2013 12:48 am

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1683
Location: Sacramento, CA
Glad to see it's working. I was thinking there must be an issue with the flags... but was going to re-look over the code today. Won't need to now!!!!

Daryl

_________________
Please visit my website -> https://sbc.rictor.org/


Posted: Tue Apr 02, 2013 1:45 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Daryl's Bresenham 65C02 circle plotting algorithm, which covers all 8 octants by computing only the first 45 degrees, involves a lot of this kind of 6502 code, repeated:
Code:
; TGI_SETPIXEL(XC+X, YC+Y)
                  CLC
                  LDA XC
                  ADC X1
                  STA XP
                  PHA      ; XC+X
                  CLC
                  LDA YC
                  ADC Y1
                  STA YP
                  JSR PLTPXL

more specifically, this kind of procedure:
Code:
                  LDA YC
                  ADC Y1
                  STA YP

which would become in the 65O16.b core:
Code:
ADCBopC Y1

The B accumulator would hold YC, a static value. The instruction adds the value Y1, read from RAM, and stores the result in the C accumulator. In this case, the plot routine would use the C accumulator for Y.
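
As a rough illustration of the three-operand form described above, here is a minimal Python sketch. The `Core` class and its `adc_op` method are hypothetical names made up for this example, not part of the 65Org16 tools; the only assumptions are 16-bit accumulators and a 6502-style carry flag.

```python
from collections import defaultdict

MASK16 = 0xFFFF  # assumption: accumulators are 16 bits wide


class Core:
    """Toy model of a multi-accumulator core (hypothetical, illustration only)."""

    def __init__(self):
        self.acc = defaultdict(int)  # accumulator name -> 16-bit value
        self.ram = defaultdict(int)  # label -> 16-bit value
        self.carry = 0

    def adc_op(self, src, dst, operand):
        """dst = src + ram[operand] + carry; src is left untouched."""
        total = self.acc[src] + self.ram[operand] + self.carry
        self.carry = 1 if total > MASK16 else 0
        self.acc[dst] = total & MASK16


core = Core()
core.acc["B"] = 239          # YC stays in B for the whole circle
core.ram["Y1"] = 10
core.carry = 0               # CLC
core.adc_op("B", "C", "Y1")  # models: ADCBopC Y1
print(core.acc["C"])
```

In this model one call stands in for the LDA / ADC / STA triple, and B still holds YC for the next plot.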

It would be nice if the Y1 value didn't have to be read in from RAM, but the .b core does not have the Acc+Acc function.

Posted: Tue Apr 02, 2013 11:58 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
ElEctric_EyE wrote:
...It would be nice if the Y1 value didn't have to be read in from RAM, but the .b core does not have the Acc+Acc function.

Not sure what I was thinking here. It does have this function, and I've already made the macros to define the opcodes!

Posted: Tue Apr 02, 2013 3:05 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Ugh, found a typo in one of my macros (THQ) which held me up for a couple hours!

So for the first optimization, I used accumulators for XC, YC, XP, and YP:
Code:
                                 ;BRESENHAM CIRCLE COURTESY OF DARYL RICTOR @ HTTP://SBC.RICTOR.ORG/
                                 ; XP = O ACC     XC = G ACC       
                                 ; YP = Q ACC     YC = H ACC     
CIRCLE            LDY #0         ;Y IS USED IN PLTPXL ROUTINE, SET TO ZERO
                 
; INT X = 0                 
                  STY X1
                 
                  LDA RA       
                  BNE _C1
                  TGO            ;XP = XC (G => O)
                  THQ            ;YP = YC (H => Q)
                  JMP PLTPXL
                 
_C1               
; INT Y = RADIUS
                  STA Y1         

; INT F = 1 - RADIUS
                  SEC   
                  LDA #1
                  STA FX     ; int ddF_X = 1
                  SBC RA
                  STA FF
                                                     
; INT ddF_Y = -2 * RADIUS 
                  LDA RA
                  ASL
                  EOR #$FFFF
                  STA FY   
                  INC FY            ;could INC Acc after EOR #$FFFF
               
; TGI_SETPIXEL(XC, YC+Y)
                  TGO            ;XP = XC (G => O)
                  CLC
                  ADCHopQzp Y1   ;YC + Y1 => YP
                  JSR PLTPXL
           
; TGI_SETPIXEL(XC, YC-Y)
                  SEC
                  SBCHopQzp Y1   ;YC - Y1 => YP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC+Y, YC)
                  CLC
                  ADCGopOzp Y1   ;XC + Y1 => XP
                  THQ            ;YP = YC (H => Q)
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC-Y,YC)
                  SEC
                  SBCGopOzp Y1   ;XC - Y1 => XP
                  JSR PLTPXL
                 
_CLOOP
; WHILE (X<Y) CALCULATE NEXT PLOT STEP
                  SEC
                  LDA X1
                  SBC Y1
                  BCC _C4       ;X<Y
                  RTS
                 
_C4               LDA FF   ; ** added back with change
                  BMI _C6   ; ** added back (branch if bit 15 set)
                 
                  DEC Y1
                  CLC
                  LDA FY
                  ADC #$02
                  STA FY
                  CLC
                  ADC FF
                  STA FF     
                 
_C6               INC X1
                  CLC
                  LDA FX
                  ADC #$02
                  STA FX
                  CLC
                  ADC FF
                  STA FF     
                 
; TGI_SETPIXEL(XC+X, YC+Y)
                  CLC
                  ADCGopOzp X1   ;XC + X1 => XP
                  PHO               ;XC + X1
                  CLC
                  ADCHopQzp Y1   ;YC + Y1 => YP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC-X, YC+Y)
                  SEC
                  SBCGopOzp X1   ;XC - X1 => XP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC-X, YC-Y)
                  SEC
                  SBCHopQzp Y1   ;YC - Y1 => YP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC+X, YC-Y)
                  PLO               ;XC + X1
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC+Y, YC+X)
                  CLC
                  ADCGopOzp Y1   ;XC + Y1 => XP
                  PHO               ;XC + Y1
                  CLC
                  ADCHopQzp X1   ;YC + X1 => YP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC-Y,YC+X)
                  SEC
                  SBCGopOzp Y1   ;XC - Y1 => XP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC-Y, YC-X)
                  SEC
                  SBCHopQzp X1   ;YC - X1 => YP
                  JSR PLTPXL
                 
; TGI_SETPIXEL(XC+Y,YC-X)
                  PLO               ;XC + Y1
                  JSR PLTPXL
                  JMP _CLOOP

;*************************************************
                                       ; SAVE 2 CYCLES BY DOING LDY #$00 BEFORE CALLING PLTPXL
PLTPXL            STOzp SCRLO          ;-STO SCRLO
                  STQzp SCRHI          ;-STQ SCRHI
                  STBiy SCRLO          ;-STB(SCRLO),Y. B ACCUMULATOR IS ALWAYS PIXEL COLOR
                  RTS


So to plot a circle, all that is needed is something like:
Code:
                  LDWi $00EF               ;-LDW #$00EF. MAX RADIUS = 239
                  LDGi $013F               ;-LDG #319. G ACC IS XCENTER
                  LDHi $00EF               ;-LDH #239. H ACC IS YCENTER
PLC               STWzp RA                 ;-STW RA. RA IS RADIUS
                  JSR CIRCLE


edit: So over 4 million cycles were saved in this situation: 4,044,455 cycles, to be exact.
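
For anyone following along without the 65Org16 macros, the listing above can be read against this Python sketch of the same midpoint (Bresenham) circle steps. Variable names follow the assembly comments (`f` for FF, `ddf_x` for FX, `ddf_y` for FY). The `plot` callback is an assumption standing in for PLTPXL, and plain Python integers stand in for the 16-bit two's-complement values.

```python
def circle(xc, yc, radius, plot):
    """Midpoint circle, mirroring the steps of the assembly listing."""
    if radius == 0:               # LDA RA / BNE _C1: just one pixel
        plot(xc, yc)
        return
    x, y = 0, radius              # X1, Y1
    f = 1 - radius                # FF
    ddf_x = 1                     # FX
    ddf_y = -2 * radius           # FY
    # Four points on the axes
    plot(xc, yc + y)
    plot(xc, yc - y)
    plot(xc + y, yc)
    plot(xc - y, yc)
    while x < y:                  # _CLOOP
        if f >= 0:                # _C4: step Y inward
            y -= 1
            ddf_y += 2
            f += ddf_y
        x += 1                    # _C6: always step X
        ddf_x += 2
        f += ddf_x
        # Eight symmetric points, in the same order as the listing
        plot(xc + x, yc + y)
        plot(xc - x, yc + y)
        plot(xc - x, yc - y)
        plot(xc + x, yc - y)
        plot(xc + y, yc + x)
        plot(xc - y, yc + x)
        plot(xc - y, yc - x)
        plot(xc + y, yc - x)


pixels = set()
circle(319, 239, 239, lambda px, py: pixels.add((px, py)))
```

Collecting the pixels in a set makes it easy to compare against whatever the hardware draws, since some points are plotted more than once.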


Attachments:
P1000978.JPG [ 80.41 KiB | Viewed 641 times ]

Posted: Tue Apr 02, 2013 3:23 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Removing all the JSR PLTPXL calls and inlining the routine itself saved another 4,350,798 cycles for 239 circles!

Posted: Tue Apr 02, 2013 3:39 pm

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
A useful feature in your hardware may be to add a programmable offset in your address calculation. This would allow you to set the origin of the circle at the beginning of the plot routine, and then the rest of the routine can just plot around (0,0) without having to keep track of the offsets.
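
A minimal Python sketch of how such an offset could look from the software side. Everything here is hypothetical: the `OffsetPlotter` class, `set_origin`, the 16-bit wraparound and the clipping to 640x480 are assumptions made for illustration, not a description of the actual hardware.

```python
class OffsetPlotter:
    """Toy model of a plotter with a programmable origin (hypothetical)."""

    def __init__(self, width=640, height=480):
        self.width, self.height = width, height
        self.x0 = self.y0 = 0
        self.pixels = {}

    def set_origin(self, x0, y0):
        # Written once, before the circle routine starts
        self.x0, self.y0 = x0, y0

    def plot(self, dx, dy, color=1):
        # Offsets are added with 16-bit wraparound, so $FFFD acts as -3
        x = (self.x0 + dx) & 0xFFFF
        y = (self.y0 + dy) & 0xFFFF
        if x < self.width and y < self.height:  # assumed clipping
            self.pixels[(x, y)] = color


plotter = OffsetPlotter()
plotter.set_origin(319, 239)
plotter.plot(-3, 10)       # circle code only deals with (X, Y) around (0, 0)
plotter.plot(0xFFFD, 10)   # same pixel, written as a 16-bit two's-complement X
```

With this arrangement the circle routine would only need X, Y and their negations, with no XC/YC additions per pixel.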


Posted: Tue Apr 02, 2013 6:28 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I like that idea!
I'll see if I can implement it soon after I'm done optimizing the circle, and keep it if it doesn't slow the project down. I would think just 2 addressable ports added to the right-hand side of the equations should work:
Code:
always @* begin                           //optimize the videoRAM address for plotting (X,Y) in the (LSB,MSB) for indirect indexed
   cpuABopt [20:19] = 0;                   //bank bits (blocking assignments in a combinational block)
   cpuABopt [18:10] = cpuAB [24:16];       //Y[8:0]
   cpuABopt [9:0] = cpuAB [9:0];           //X[9:0]
end
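
To make the bit slicing easier to check by hand, here is a small Python model of the mapping above. It assumes, as in the PLTPXL routine, that the indirect indexed address carries X in its low word and Y in its high word; the function name `video_address` is made up for this sketch.

```python
def video_address(cpu_ab):
    """Pack a 32-bit CPU address (Y in high word, X in low word) into cpuABopt."""
    x = cpu_ab & 0x3FF           # cpuAB[9:0]   -> cpuABopt[9:0]
    y = (cpu_ab >> 16) & 0x1FF   # cpuAB[24:16] -> cpuABopt[18:10]
    return (y << 10) | x         # bank bits [20:19] stay 0


# Centre of the 640x480 screen: SCRLO = 319, SCRHI = 239
cpu_ab = (239 << 16) | 319
print(hex(video_address(cpu_ab)))
```

Each row takes 1024 addresses in this layout, so the row number is simply the address shifted right by 10.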


I'm focusing on giving FX and FY their own accumulators in the _C4 and _C6 sections, not quite there yet...

Down to 45,752 cycles for 1 circle of radius 239.

Posted: Tue Apr 02, 2013 7:50 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I'm done optimizing.
The software uses 6 accumulators and 4 zero page variables. Got it down to 42,381 cycles for a circle of radius 239 at 640x480 pixels.


Attachments:
circ16_65O16b.asm [6.38 KiB]
Downloaded 74 times

Posted: Tue Apr 02, 2013 7:54 pm

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
You can do a bit more optimizing :)
Code:
                 STQzp SCRHI          ;-STQ SCRHI
                 STBiy SCRLO          ;-STB(SCRLO),Y.
                 
; TGI_SETPIXEL(XC-Y,YC+X)
                  SEC
                  SBCGopOzp Y1         ;XC - Y1 => XP
;PLOT                 
                  STOzp SCRLO          ;-STO SCRLO
                  STQzp SCRHI          ;-STQ SCRHI


The last STQzp SCRHI is not necessary, because the value is still the same as before. The same happens in a few other places.
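
One way to find the other places is to walk the eight plots of one loop pass and flag every store that writes a value the pointer already holds. This Python sketch does that, assuming the plot order of the listing posted earlier in the thread. The helper names are made up, and it counts stores only; it says nothing about cycles.

```python
def loop_plots(xc, yc, x, y):
    """The eight (XP, YP) pairs of one loop pass, in the order of the listing."""
    return [(xc + x, yc + y), (xc - x, yc + y), (xc - x, yc - y), (xc + x, yc - y),
            (xc + y, yc + x), (xc - y, yc + x), (xc - y, yc - x), (xc + y, yc - x)]


def count_redundant_stores(plots):
    """Count SCRLO/SCRHI stores that rewrite the value already in place."""
    last_lo = last_hi = None
    redundant = 0
    for xp, yp in plots:
        if xp == last_lo:   # STO SCRLO would change nothing
            redundant += 1
        if yp == last_hi:   # STQ SCRHI would change nothing
            redundant += 1
        last_lo, last_hi = xp, yp
    return redundant


redundant = count_redundant_stores(loop_plots(319, 239, 3, 10))
print(redundant, "of 16 stores are redundant in this pass")
```

In this model only X changes between the first and second plot, only Y between the second and third, and so on, which is why both SCRHI and SCRLO stores show up as candidates.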


Posted: Tue Apr 02, 2013 8:04 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
That's going to be a huge gain! :shock:
Checking...

Posted: Tue Apr 02, 2013 8:28 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Down to 37,641 cycles!
That's roughly another 300 circles of 239-pixel radius per second at 100MHz. Awesome! Not that I really care about plotting that many, but it is a standard of comparison nonetheless.
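
The arithmetic behind that comparison, as a quick Python check. It uses the two cycle counts reported in this thread and assumes the CPU spends every cycle of a 100MHz clock drawing circles, which is a simplification.

```python
CLOCK_HZ = 100_000_000  # 100MHz, as stated above


def circles_per_second(cycles_per_circle):
    """Upper bound on circles per second if nothing else uses the CPU."""
    return CLOCK_HZ / cycles_per_circle


before = circles_per_second(42_381)  # previous version
after = circles_per_second(37_641)   # with the redundant stores removed
gain = after - before
print(f"{before:.0f} -> {after:.0f} circles/s, gain {gain:.0f}")
```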

LOL @6.66Kb download size :lol:


Attachments:
circ16_65O16b.b.asm [6.66 KiB]
Downloaded 59 times
