
All times are UTC




Posted: Tue Apr 02, 2013 9:40 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Here are the speed comparisons for the concentric circle fill (239 circles): the original code on top, the version using 4 65Org16.b accumulators in the middle, and at the bottom the final optimization, which uses 6 accumulators plus some help from Arlet. The cycle count is shown at the lower left of each pic.


Attachments:
File comment: Original 65c02 code ported to 65Org16.b
P1000973.JPG [ 80.9 KiB | Viewed 884 times ]
File comment: Code utilizing 4 65Org16.b accumulators
P1000978.JPG [ 80.41 KiB | Viewed 884 times ]
File comment: Optimized 65Org16.b code utilizing 6 accumulators
P1000980.JPG [ 44.09 KiB | Viewed 884 times ]

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

Posted: Wed Apr 03, 2013 12:14 am

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Pretty cool indeed.

_________________
Michael A.


Posted: Wed Apr 03, 2013 1:15 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Arlet wrote:
A useful feature in your hardware may be to add a programmable offset in your address calculation. This would allow you to set the origin of the circle at the beginning of the plot routine, and then the rest of the routine can just plot around (0,0) without having to keep track of the offsets.

I implemented 2 write-only offset registers:
Code:
// Two write-only offset registers, selected by cpuAB[0]
always @(posedge clk) begin
   if (osetCS && cpuWE && !cpuAB[0])
      osetREGL <= cpuDO;                        //X offset
   if (osetCS && cpuWE && cpuAB[0])
      osetREGH <= cpuDO;                        //Y offset
end

// Optimize the videoRAM address for plotting (X,Y) in the (LSB,MSB) for indirect indexed.
// Combinational block, so blocking assignments are used.
always @* begin
   cpuABopt[20:19] = 0;                         //bank bits
   cpuABopt[18:10] = cpuAB[24:16] + osetREGH;   //Y[8:0]
   cpuABopt[9:0]   = cpuAB[9:0] + osetREGL;     //X[9:0]
end


It's not working correctly yet, but I am currently resetting the pixel counters used for the SyncRAM address in the HVSync module at 639 for X and 479 for Y. When I try resetting X at 1023 and Y at 511, it fails the speed constraint. I'm running SmartXplorer now. The timing score was 162, so it should fit... Then I'll try plotting again at (0,0) with an offset of (320,240).
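To make the address arithmetic above easier to check by hand, here is a small Python model of the cpuABopt calculation. It is only a sketch of the Verilog as posted: the function names `cpu_address` and `video_address` are made up for illustration, and it assumes the sums are simply truncated to 10 bits for X and 9 bits for Y, as the bit slices suggest.

```python
X_MASK = 0x3FF   # X[9:0]
Y_MASK = 0x1FF   # Y[8:0]

def cpu_address(x, y):
    """Pack (X, Y) the way the plot routine presents it: X in bits 9:0, Y in bits 24:16."""
    return ((y & Y_MASK) << 16) | (x & X_MASK)

def video_address(cpu_ab, oset_l=0, oset_h=0):
    """Model of cpuABopt: add the offsets, truncate, and pack Y above X (bank bits = 0)."""
    x = ((cpu_ab & X_MASK) + oset_l) & X_MASK          # wraps at 1024
    y = (((cpu_ab >> 16) & Y_MASK) + oset_h) & Y_MASK  # wraps at 512
    return (y << 10) | x

# Plotting at (0,0) with an offset of (320,240) lands on the screen centre.
centre = video_address(cpu_address(0, 0), 320, 240)
# A "negative" coordinate wraps around and ends up just left of and above the centre.
upper_left = video_address(cpu_address(-1, -1), 320, 240)
```

In this model, `centre` decodes to X=320, Y=240 and `upper_left` to X=319, Y=239, which is the behaviour Arlet's suggestion relies on: the routine plots around (0,0) and the hardware moves the origin.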

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Wed Apr 03, 2013 1:43 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
After 16 runs, the best timing score it could do was 84.

I don't think offsets would be useful for lines, though. I'll work on that next.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Wed Apr 03, 2013 3:46 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Daryl, stupid question but what variables get plotted in your Bresenham line? IX and IY?

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Wed Apr 03, 2013 4:26 pm

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
ElEctric_EyE wrote:
Daryl, stupid question but what variables get plotted in your Bresenham line? IX and IY?


No, it's not stupid. My graphics module overlapped several variables. In the code I gave back on page 22, the X1,Y1 variables are the ones that get plotted. IX and IY are used as the fractional portion, and DX,DY are used as the increment value.

As a quick speed enhancement, you could shift both DX and DY left until the msb of either is 1; this will reduce the loop iterations. (If this point is not clear, let me know and I will expand my explanation.)
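The following Python sketch shows one way the variable roles and the shift trick described above could fit together. It is not Daryl's attached code (which is not reproduced in this thread); it assumes 16-bit fractional accumulators, and the function names `normalize` and `draw_line` are hypothetical. X1,Y1 are the plotted point, IX,IY the fractional portions, and DX,DY the increments.

```python
WIDTH = 16                 # assumed accumulator width
CARRY = 1 << WIDTH         # value at which the fraction overflows
MSB = 1 << (WIDTH - 1)

def normalize(dx, dy):
    """Shift DX and DY left together until the msb of either one is 1."""
    while (dx or dy) and not ((dx | dy) & MSB):
        dx <<= 1
        dy <<= 1
    return dx, dy

def draw_line(x1, y1, x2, y2, optimize=True):
    """Return (plotted points, loop count) for a fractional-accumulator line."""
    dx, dy = abs(x2 - x1), abs(y2 - y1)   # DX, DY: increment values
    sx = 1 if x2 >= x1 else -1            # step direction in X
    sy = 1 if y2 >= y1 else -1            # step direction in Y
    if optimize:
        dx, dy = normalize(dx, dy)
    ix = iy = 0                           # IX, IY: fractional portions
    points = [(x1, y1)]
    loops = 0
    while (x1, y1) != (x2, y2):
        loops += 1
        ix += dx
        iy += dy
        moved = False
        if ix >= CARRY:                   # carry out of the fraction: step X1
            ix -= CARRY
            x1 += sx
            moved = True
        if iy >= CARRY:                   # carry out of the fraction: step Y1
            iy -= CARRY
            y1 += sy
            moved = True
        if moved:
            points.append((x1, y1))
    return points, loops

points, fast_loops = draw_line(400, 100, 150, 250)
_, slow_loops = draw_line(400, 100, 150, 250, optimize=False)
```

In this sketch the unshifted increments are tiny compared with the 16-bit fraction, so most passes through the loop add to IX and IY without producing a carry. Shifting both increments by the same amount keeps their ratio (the slope) but makes carries far more frequent: for the line from (400,100) to (150,250), the loop runs 256 times with the shift and 65536 times without it.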

If you need any more help, just ask!

Daryl

_________________
Please visit my website -> https://sbc.rictor.org/


Posted: Wed Apr 03, 2013 6:16 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Thanks!

I believe I got it by blind luck! It plotted a line from (400,100) to (150,250).

It's far from complete though.

8BIT wrote:
...As a quick speed enhancement, you could shift both DX and DY left until the msb of either is 1, this will reduce the loop iterations...

How?


Attachments:
File comment: 1st draft
Bresenham line.asm [4.98 KiB]
Downloaded 60 times

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

Posted: Wed Apr 03, 2013 8:27 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I'm pretty confident it's working. In this pic I plotted 4 lines with the 4 different possible slopes:
1) (5,0) to (635,480)
2) (640,475) to (0,5)
3) (635,0) to (5,480)
4) (0,475) to (640,5)

It's pretty slow though... Time for optimizations!
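For reference, the four test lines above cover all four combinations of X and Y step direction. A quick Python check (the helper name `step_signs` is made up for this illustration):

```python
def step_signs(x1, y1, x2, y2):
    """Return (sx, sy): +1, 0 or -1 depending on the direction of travel on each axis."""
    sx = (x2 > x1) - (x2 < x1)
    sy = (y2 > y1) - (y2 < y1)
    return sx, sy

TEST_LINES = [
    ((5, 0), (635, 480)),    # line 1
    ((640, 475), (0, 5)),    # line 2
    ((635, 0), (5, 480)),    # line 3
    ((0, 475), (640, 5)),    # line 4
]

signs = [step_signs(*start, *end) for start, end in TEST_LINES]
```

Here `signs` comes out as (+1,+1), (-1,-1), (-1,+1) and (+1,-1), so a routine that draws all four correctly has exercised every sign case for DX and DY.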


Attachments:
P1000982.JPG [ 45.26 KiB | Viewed 833 times ]

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

Posted: Wed Apr 03, 2013 10:38 pm

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
ElEctric_EyE wrote:
Thanks!

I believe I got it by blind luck! It plotted a line from (400,100) to (150,250).

It's far from complete though.

8BIT wrote:
...As a quick speed enhancement, you could shift both DX and DY left until the msb of either is 1, this will reduce the loop iterations...

How?


I'll work on this...give me 2 days though...full plate right now.

Glad to see it is at least working... optimizations will be straightforward.

Daryl

_________________
Please visit my website -> https://sbc.rictor.org/


Posted: Wed Apr 03, 2013 11:05 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
No rush at all, it'll be a week before I post again...

I need a rest as well. Had too many iced coffees today and I'm burnt out, but happy I've reached my goal.
I think my earlier attempt at Bresenham lines in Verilog helped me in knowing what to look for in your code. I shouldn't have said it was blind luck. Maybe, blind luck with one eye open.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Wed Apr 03, 2013 11:10 pm

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
I cleaned up the code some and made a few adjustments. I still need to do the optimization on dx,dy, but I want to make sure this still works first.

Give it a try when you find time. Also, I'm not really sure this can be called Bresenham Line drawing as I created the code myself.

File removed - found an error - see next post

Daryl

_________________
Please visit my website -> https://sbc.rictor.org/


Posted: Thu Apr 04, 2013 5:07 am

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
I found and corrected an error in the first post. This post also includes the optimizations to reduce loop cycles.

The next step, if this still works, is to convert it to use multiple accumulators like the circle code.

Will await your test results. In the meantime, enjoy your break!!!

Daryl


Attachments:
Line16.asm [4.67 KiB]
Downloaded 57 times

_________________
Please visit my website -> https://sbc.rictor.org/

Posted: Thu Apr 04, 2013 2:15 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Cool, I'll try it out tonight. Thanks!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Fri Apr 05, 2013 2:14 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
OK, I finally had a chance to plug your code in...
First, As65 has an issue with 'bcc loop1' on line 48: loop1 is undefined. I assumed it was meant to be 'bcc loop2' and went ahead with testing the speed using the same 4 lines as before, which took 32+ million cycles. Whew, that's a lot!

It looks like DX and DY are skewed for the first 2 lines: (5,0) to (635,480) & (640,475) to (0,5). However, the cycle count is way down, to 4+ million.

EDIT: I had found one 'BRA' opcode, which the .b core doesn't have; As65 translates it to a $0080 opcode value, which is undefined. I just now saw a second one and changed both to 'JMP'. Retesting...

The first 2 lines are still skewed. The lines whose coordinates are plotted correctly are now "sloppier", although the cycle count for the routine is way down, to 444013.

Since I modified your code, I should paste an update so we are on the same page.


Attachments:
Line16.b.asm [4.68 KiB]
Downloaded 47 times
File comment: Anomalies, but good speed. Correct coordinate plotting.
P1000985b.JPG [ 749.7 KiB | Viewed 879 times ]
File comment: Coordinate plotting error for (5,0) to (635,480) & (640,475) to (0,5).
P1000984.JPG [ 103.94 KiB | Viewed 879 times ]

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

Posted: Fri Apr 05, 2013 12:23 pm

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
The BCC loop1 was correct. I somehow erased the loop1 label. It has been added back. That should fix the skewed lines. The sloppy lines might be a byproduct of my optimization. There are two sections with comments " dx,dy optimization." Try commenting these sections out to see if the sloppiness goes away. Cycles will increase again, but I have another possible option to fix that.

Two questions about the .b core.

1) Can you do branching (bcc, bne, etc) beyond 128 bytes now?

I have several places where there is this type of code:

bne label1
jmp label2
label1 next opcode

because label2 was just beyond the +/- 128 byte window.
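As a reminder of where that window comes from, here is a short Python sketch of the relative-branch range check. It assumes the classic 8-bit 6502 encoding (a 2-byte instruction with a signed 8-bit displacement measured from the address after the branch); whether the .b core keeps that limit is exactly the open question here. The function names are hypothetical.

```python
def branch_offset(branch_addr, target_addr):
    """Displacement a classic 6502 branch at branch_addr would need to reach target_addr."""
    return target_addr - (branch_addr + 2)   # relative to the next instruction

def needs_long_branch(branch_addr, target_addr):
    """True when the target is out of reach and the inverted-branch-plus-JMP idiom is needed."""
    return not -128 <= branch_offset(branch_addr, target_addr) <= 127

forward_ok = needs_long_branch(0x1000, 0x1081)    # displacement +127
forward_far = needs_long_branch(0x1000, 0x1082)   # displacement +128
```

Strictly speaking the window is -128 to +127 from the instruction following the branch, so `forward_ok` is False and `forward_far` is True.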

2) Does the BIT command now place bit15 and bit14 in the N and V flags?

If the optimizations are causing the sloppiness, I might be able to use the BIT command to improve it.
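For clarity about what is being asked, this Python sketch models the flag results of BIT with a memory operand. On the 8-bit 6502, N and V are copied from bits 7 and 6 of the operand and Z reflects A AND M. The `width=16` case is only the natural extension being asked about, not confirmed behavior of the .b core, and the function name `bit_flags` is made up for illustration.

```python
def bit_flags(acc, mem, width=8):
    """Return the N, V and Z flags that BIT would produce for a given data width."""
    mask = (1 << width) - 1
    return {
        "N": bool(mem & (1 << (width - 1))),   # top bit of the operand
        "V": bool(mem & (1 << (width - 2))),   # next-to-top bit of the operand
        "Z": (acc & mem & mask) == 0,          # accumulator AND memory is zero
    }

classic = bit_flags(0x00, 0xC0)               # 8-bit: bits 7 and 6 set
wide = bit_flags(0xFFFF, 0x4000, width=16)    # assumed 16-bit: bit 14 set only
```

If the core does behave this way, a single BIT on DX or DY would expose bit 15 and bit 14 in N and V without disturbing the accumulator.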

Will await your further tests.

Daryl


Attachments:
Line16.c.asm [4.71 KiB]
Downloaded 53 times

_________________
Please visit my website -> https://sbc.rictor.org/