All times are UTC




 Post subject: Re: 65ORG16.b Core
PostPosted: Fri May 04, 2012 11:57 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
MichaelM wrote:
... I think that the tools will resolve to the 16k x 1 organization because that provides it the most flexibility in the placement of the BRAMs to meet your PERIOD constraint.

The problem is meeting the constraint. The goal here is a 10 ns period for the core with the minimum blockRAM for stack and zeropage. I can reach this using 1K zeropage and 1K stackpage blockRAM on the FPGA. Using 4K zeropage and 4K stackpage set the speed back by 0.6 ns, which translates to a loss of about 5-7 MHz. Personally, I can live with 256x16 zeropage and 256x16 stackpage.
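
As a quick check of the figures above, the period-to-frequency conversion can be sketched as follows. The function name `fmax_mhz` is only illustrative, and the sketch assumes the 0.6 ns setback is simply added to a 10 ns baseline period.

```python
def fmax_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


baseline = fmax_mhz(10.0)      # the 10 ns constraint
slower = fmax_mhz(10.0 + 0.6)  # assumed: 0.6 ns added to the 10 ns period
loss = baseline - slower

print(f"{baseline:.1f} MHz -> {slower:.1f} MHz, loss {loss:.1f} MHz")
```

Under that assumption the loss comes out a little under 6 MHz, which sits inside the 5-7 MHz range quoted.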

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Sat May 05, 2012 1:51 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
That's not what I meant. Regardless of how you slice up the Block RAMs, they are 18 Kbit components (16k x 1 when the parity bits are not used). That's their fundamental size.

If you code for Block RAMs smaller than this size, you will get 18kb Block RAMs with address lines tied off by the tools.

If instead you code for a common 16k x 16 memory, you will get 16 block RAMs which can be more optimally placed. As the data bus organization of a block RAM is widened, the number of routing paths that converge on that tiny part of the chip increases. Since each Block RAM is also a dual-port device, I contend that the separate address, data input, and data output busses (of both our cores) result in congestion in the vicinity of the block RAM. The result is longer routing delays between the Block RAMs and your 65Org16.b core's logic elements.

In other words, I contend that the 32 (16 in, 16 out) data lines to each Block RAM are a lot of localized routing, and on top of that you have to add the address lines and the memory data select multiplexer. All of this will add routing delays far in excess of the 2-2.5 ns (Spartan-3A) access time of the RAMs themselves. With this as the baseline, decreasing the memory depth will most likely decrease the number of inferred Block RAMs and increase the data width of these inferred RAMs in the allowed increments until a memory size of 1k x 16 is requested. Below this limit, the address lines of the device are tied off by the tools.
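
The relationship between requested memory size and the number of block RAMs can be sketched roughly as below. This is only a model of the argument above, not what the synthesis tools actually do: the list of aspect ratios is assumed from the Spartan-3 generation data sheets (parity bits ignored), the name `brams_needed` is hypothetical, and ties are broken in favour of the deepest organization.

```python
from math import ceil

# Aspect ratios of one 18 Kbit block RAM as (depth, width), parity ignored.
# Assumed from the Spartan-3 generation data sheets, not from the posts above.
ASPECT_RATIOS = [(16384, 1), (8192, 2), (4096, 4), (2048, 8), (1024, 16), (512, 32)]


def brams_needed(depth: int, width: int) -> tuple[int, tuple[int, int]]:
    """Return the smallest block RAM count and the aspect ratio that gives it."""
    best = None
    for bram_depth, bram_width in ASPECT_RATIOS:
        count = ceil(depth / bram_depth) * ceil(width / bram_width)
        # Strict comparison keeps the deepest organization when counts tie.
        if best is None or count < best[0]:
            best = (count, (bram_depth, bram_width))
    return best


for depth in (16384, 4096, 1024, 256):
    count, shape = brams_needed(depth, 16)
    print(f"{depth} x 16 -> {count} block RAM(s) as {shape[0]} x {shape[1]}")
```

In this model a 16k x 16 memory needs 16 block RAMs, the count falls as the depth shrinks, and anything at or below 1k x 16 still occupies one whole block RAM.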

I will try this experiment shortly with the Block RAM memory I have inferred in my second M65C02 core. I'll send you my results.

Presently in that core I have a 16k x 8 Block RAM memory inferred. The tools have inferred 8 16k x 1 block RAMs. Like Arlet expected, the performance reported for synthesis and PAR decreased when I first instantiated this much Block RAM. However, I was able to go into the Floorplanner and establish some more reasonable placements for them, instead of the placer's chosen locations. That little manual extra effort on my part improved the reported performance, decreased the MAP/PAR time, and made the overall project more repeatable as I continued to make modifications and add functionality.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Sat May 05, 2012 4:43 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Completed the Block RAM study. Varied the RAM required from 256 bytes to 16kB in powers of 2. All other parameters were kept the same. The new M65C02 core with the Block RAM interface, whose microprogram I am currently changing and testing, targets a 10 ns period in an XC3S200AN-5 and did not demonstrate any significant variations in performance (synthesis or MAP/PAR). I've attached a summary of the data I gathered using ISE 10.1i SP3.

Similarly, all implementation attempts satisfied the single period constraint for all configurations:

RAM size      Period     Setup      Hold
256 x 8       9.946ns    0.054ns    0.736ns
512 x 8       9.958ns    0.042ns    0.783ns
1024 x 8      9.980ns    0.020ns    0.730ns
2048 x 8      9.955ns    0.045ns    0.768ns
4096 x 8      9.959ns    0.041ns    0.699ns
8192 x 8      9.976ns    0.024ns    0.785ns
16384 x 8     9.951ns    0.049ns    0.831ns

Attachment:
BRAM_Study_Summary.zip [1.23 KiB]


Well, so much for the theory advanced earlier.
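
The figures in the table above can be sanity-checked with a short script. It reads the "Setup" column as setup slack against the 10 ns constraint, which is my interpretation rather than something the post states.

```python
# (RAM configuration, achieved period in ns, setup figure in ns) from the table above.
RESULTS = [
    ("256 x 8", 9.946, 0.054),
    ("512 x 8", 9.958, 0.042),
    ("1024 x 8", 9.980, 0.020),
    ("2048 x 8", 9.955, 0.045),
    ("4096 x 8", 9.959, 0.041),
    ("8192 x 8", 9.976, 0.024),
    ("16384 x 8", 9.951, 0.049),
]
TARGET_NS = 10.0  # the single PERIOD constraint

for config, period, setup in RESULTS:
    # Assumed: achieved period plus setup slack reproduces the constraint.
    assert abs(period + setup - TARGET_NS) < 1e-6, config

periods = [period for _, period, _ in RESULTS]
spread_ns = max(periods) - min(periods)

print(f"period spread: {spread_ns:.3f} ns")
print(f"fmax range: {1000 / max(periods):.2f} to {1000 / min(periods):.2f} MHz")
```

The spread between the best and worst case is a few hundredths of a nanosecond and does not track RAM size, which fits the conclusion that the block RAM depth made no significant difference here.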

Since the registers in your core are now synthesizing into a LUT, which significantly reduces the number of slices used, you can try this experiment on your design and get a result which matches mine. In its current state, my new core is only about 580 LUTs in size, and occupies less than 20% of the XC3S200AN-5 FPGA. Thus, there is great leeway for the placer in placing the design subject to one PERIOD constraint, LOCed pins, and Block RAMs.
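
For a rough sense of the "less than 20%" figure, the utilization can be worked out as below. The device capacity is an assumption taken from the Spartan-3A/3AN data sheet (1,792 slices with two LUTs each for the XC3S200A/AN), not from the post.

```python
# Assumed device capacity: 1,792 slices x 2 LUTs per slice (data sheet figure).
DEVICE_LUTS = 1792 * 2
CORE_LUTS = 580  # "about 580 LUTs" from the post above

utilization = CORE_LUTS / DEVICE_LUTS
print(f"{CORE_LUTS} / {DEVICE_LUTS} LUTs = {utilization:.1%}")
```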

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Sat May 05, 2012 9:43 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Interesting. I think I understand now. For maximum speed, I should use a 16Kx16 configuration and put the zeropage, stack and 'OS' all in one. Then possibly place the blockRAM manually using Floorplanner.

Also, since it sounds like your core is complete, it certainly should be placed in our softcore comparison chart. I need to add Andre's as well. I'll need to install ISE 10.1 again, but I think I can get to this in a few days.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Sat May 05, 2012 10:40 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Most of the results followed expected behavior, but as always there's more to learn. I had expected that the synthesis results would be more uniform than the map reports indicated. Although the overall number of LUTs in the implementation remained fairly close to the 588 baseline (16kB BRAM), I was surprised by the amount of variation in the slice/LUT counts in the lower level modules as the size of the BRAM was varied. I don't see a rational explanation for those variations. The linkage between the BRAM size and the LUT/slice counts of modules not even at the same hierarchical level just eludes me.

On the subject of loading ISE 10.1i in order to pull the MAM65C02 core into your directory, I think that it would be more work than it's worth. Some changes would be required to the UCF file to put the core into a Spartan-6, but it's not difficult to comment out all of the pin LOCs and just retain the PERIOD constraint. I've done that a couple of times in order to see how much more performance a Spartan-6 offers over a Spartan-3AN. I just prefer to stay with ISE 10.1i until such time as I am developing products based on the Spartan-6 FPGA family, which will happen later in the summer. I will stay with ISE 10.1i for the Spartan-3AN products that I will also be developing over the next few months because of the performance differences between 10.1 and 12.4/13.4 that I've noted with respect to the Spartan-3AN architecture.

Therefore, unless you just want to run the MAM65C02 (and/or Andre's) core through your own 10.1i installation, I would be glad to send you any ISE 10.1i files you would like. An update of your table of cores using ISE 13.4 may be easier for you than adding another installation of the Xilinx tools.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Sat May 05, 2012 11:35 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
MichaelM wrote:
...Therefore, unless you just want to run the MAM65C02 (and/or Andre's) core through your own 10.1i installation, I would be glad to send you any ISE 10.1i files you would like. An update of your table of cores using ISE 13.4 may be easier for you than adding another installation of the Xilinx tools.

I have to keep true to the methods I'd used in the thread.
I had other intentions back then. I believe I was still designing with 5V parts. Plus it wouldn't hurt to have an older ISE installed, especially when it supports so many of the ICs that newer versions of ISE don't support but that are still being produced, e.g. the very capable Spartan 2 series. For sure this family will not run faster than Spartan 3 or Spartan 6, but a nice tradeoff may be that a smaller package/footprint is offered for the older family.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 09, 2012 6:08 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I'll try to add the new cores to that thread today. I need a breather!

I just wanted to pop in and say I got the .b core working in my devboard! 3 months of development for this core and now the payoff is finally here!
I found the last of the major flaws in the core today, after 2 straight days of troubleshooting. The last 2 issues I found were erroneously including INC/DEC/<shift,rotate> opcodes that did operations on memory, in the load_reg section.

Now C'mon is working with all the pixel plotting for characters and using some new shift opcodes and some PHY, etc. 65C02 opcodes.

I'm looking forward now to integrating some of the neat stuff Arlet has made in the coming weeks.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 09, 2012 9:36 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I don't want to load Mike's server down with a lot of megabytes of my project file(s), but I figure I can post at least some software developed on As65 (slightly modified with new opcodes, not so much yet!), to which more than a few have contributed, and it's less than 14K. Available now for anyone's perusal...
The Reset Vector is labelled Start on line 3464 when using PSPAD editor. IRQ, NMI, and RES vectors are all set to the same Start address.
There are just a few of the new .b core macro definitions in lines 78-3463. The following code doesn't currently take advantage of anything but a few PUSH/PULL 65C02 opcodes which are compatible with As65 (lines 331-342), and some new macros defining the barrel shifting opcodes (lines 3405-3462).

Bruce's assembly for C'mon is in there from lines 3602-3942.
Arlet's assembly for I2C is in there from lines 3944-4025.
Teamtempest helped me out for the main pixel plot routine, way back when there was only 1 accumulator, from lines 4184-4399.


Attachments:
File comment: boot.rar
boot.rar [12.88 KiB]

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502
 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 09, 2012 10:27 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
That's great. It probably feels very good to move on to something more tangible with your core.

I also look forward to wrapping up the testing I am doing. My efforts are supposed to be recreational, but at times it seems like work.

Look forward to reviewing your core and test programs.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 09, 2012 10:57 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
MichaelM wrote:
That's great. It probably feels very good to move on to something more tangible with your core...

It really does, and thanks for your most recent review of the .b core and your mod of the QAWXYS array register to give it an I/O bus, even if it is asynchronous. It seemed to do top speed a favor, although a warning suggested making the register synchronous. I'm not sure how to do this yet.
I wasn't sure if it would be OK to add your name to the copyrights, so currently I've added only a comment. Say the word and this can continue to be a 4 person core! :mrgreen:

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 09, 2012 11:04 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
That's a good sentiment, and I appreciate it, but you, BigEd, and Arlet are the actual contributors to the core's IP.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 10, 2012 2:24 am 

Joined: Thu Mar 11, 2004 7:42 am
Posts: 362
ElEctric_EyE wrote:
Bruce's assembly for C'mon is in there from lines 3602-3942.


Um, you might want to replace the BCS M2 (at line 3708, just before the COMMA label) with JMP M2, since it only branches because of the lucky coincidence that your output routine returns with the carry set. If you wind up modifying your output routine for whatever reason in the future, that may no longer be the case.

Alternatively, you should be able to remove the 3 lines you added before the BCS M2 (to output CRLF and a prompt) and replace BCS M2 with BCS M1. (Think of all the things you can do with 8 extra bytes! :P)


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 10, 2012 2:11 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
dclxvi wrote:
...Alternatively, you should be able to remove the 3 lines you added before the BCS M2 (to output CRLF and a prompt) and replace BCS M2 with BCS M1. (Think of all the things you can do with 8 extra bytes! :P)

I see that now. I used a JMP M1, so I have 7 extra bytes, heh!

Verilog is such a strict discipline, hard on the brain for a newbie like me. It will be refreshing to get into the software again. Some of my own code looks foreign though.

I retested the speed. No changes. Still passes 10ns mark.

BTW, Installing ISE10.1 now. Nice to see Xilinx archived all versions of their ISE's!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 10, 2012 11:12 pm 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I would like to continue the discussion of the use of the .b core in the software thread of V1.1 of the 65Org16.x devboard. I sort of went off on a tangent towards the end there... I'll bring it back into focus using the .b core as it is now, to use most of the 16 accumulators along with the barrel shifter opcodes, in order to tighten the PLTCHR routine. It should be faster using the accumulators only, without using zero page memory for variable storage.

I really have got my eye on multiplication next. Maybe for this thread? Maybe! If it can pass a 10ns constraint for 100MHz+ operation. I would give up some hard earned features of the present .b core for a 16x16 multiplier, for another type of working core that can run at 100MHz. Maybe one geared for video/audio which wouldn't need so many addressing modes, but I digress... I do find myself visiting BigEd's thread here very often recently!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Fri May 11, 2012 4:23 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Hi EEye
if you're considering a different set of tradeoffs then I would again suggest you start a new core variant and a new thread. If you keep changing the b core and start removing old features to make room for new ones, where is the old one? (This could be as simple as taking a stable copy of your files, if you don't want to use git's features of branching and tagging.)

Cheers
Ed

