6502.org Forum

 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 02, 2012 9:31 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Good spot. Note that Verilog is quite forgiving about mismatched widths: for unsigned values it will zero-extend the top end of the shorter word. But you did have N and V in the wrong place!
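As a rough illustration of the width rule mentioned above, here is a small Python model of what happens when an unsigned value of one width is assigned to a target of another width. The function name `assign` and the sample values are made up for this sketch; it models the behaviour only and is not Verilog.

```python
def assign(value, src_width, dst_width):
    """Model assigning an unsigned src_width-bit value to a dst_width-bit target."""
    value &= (1 << src_width) - 1          # keep only the bits the source really has
    # Wider target: the extra upper bits are zero. Narrower target: upper bits are dropped.
    return value & ((1 << dst_width) - 1)


# 8-bit value into a 16-bit register: upper byte is filled with zeros.
print(f"{assign(0xC3, 8, 16):04X}")    # 00C3
# 16-bit value into an 8-bit register: upper byte is silently truncated.
print(f"{assign(0x12C3, 16, 8):02X}")  # C3
```

The silent truncation in the second case is why a flag placed in the wrong bit position can slip through without a hard error.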


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 02, 2012 9:54 pm 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
EEyE:

After seeing your "WHEW" post this morning, I stopped looking over the github version of your CPU file. Seeing that you are still hunting down a problem, I gave the DECODE state another look. Like you, I couldn't see anything obviously wrong with that case statement. If there is an overlap, the synthesizer should issue an error or a warning. I tried forking it on github in order to have a local copy to examine more closely, but I must be doing something wrong, and your repo would not fork. So I downloaded a zip instead. Without the testbench, I can only use the synthesizer to look for problems.

When I tried to synthesize the github code, I got two errors:

Code:
=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling verilog file "/../../../Git-Repo/EEyE-65Org16/ALU.v" in library work
Compiling verilog file "/../../../Git-Repo/EEyE-65Org16/cpu.v" in library work
Module <ALU> compiled
ERROR:HDLCompilers:28 - "/../../../Git-Repo/EEyE-65Org16/cpu.v" line 138 'zp_reg' has not been declared
ERROR:HDLCompilers:28 - "/../../../Git-Repo/EEyE-65Org16/cpu.v" line 139 'st_reg' has not been declared
Module <cpu> compiled
Analysis of file <"cpu.prj"> failed.
-->

Total memory usage is 122932 kilobytes


I don't want to be blathering on about a non-existent issue since you are obviously synthesizing your CPU. If you'll point me to, or send me your current source, I'll try to lend a helping hand. I'll be glad to give you as much help in finding the problem you are tracking down as I can.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 02, 2012 10:26 pm 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I must have gotten away from copying and pasting an entire cpu.v file. My apologies... And thanks for your help!

I'm going to do that now on Github...

I've commented out all code regarding zeropage and stackpage pointer registers and reimplemented the original STACKPAGE and ZEROPAGE parameters, so zeropage is $00000000-$0000FFFF and stack is from $00010000-$0001FFFF.
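The two address ranges quoted above follow from treating the page parameter as the upper 16 bits of a 32-bit address. A minimal Python sketch of that arithmetic, assuming page values of 0 for zeropage and 1 for the stack (inferred from the ranges, not copied from cpu.v); `PAGE_BITS` and `page_range` are names invented for this example:

```python
PAGE_BITS = 16  # assumption: one page spans 2**16 locations, as the quoted ranges imply


def page_range(page):
    """Return (first, last) address of the given page number."""
    start = page << PAGE_BITS
    return start, start + (1 << PAGE_BITS) - 1


ZEROPAGE = 0x0000   # assumed parameter value
STACKPAGE = 0x0001  # assumed parameter value

for name, page in (("zeropage", ZEROPAGE), ("stack", STACKPAGE)):
    first, last = page_range(page)
    print(f"{name}: ${first:08X}-${last:08X}")
```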

I've corrected the Processor status byte error that I found and BigEd has confirmed...

I've moved my code that initializes some registers to after the point where those registers are declared.

You'll also see changes to the Microcode state machine as I tried to put them in order due to my lack of understanding of Verilog priorities within certain statements/cases...

It's messy, sorry. A lot of changes... But this is what I am working with currently.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Wed May 02, 2012 10:34 pm 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Not a problem. Leaving shortly to pick up some dinner, so I will look for your update in about an hour.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 12:40 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
EEyE:

Did you intend to remove the register file from the original? Synthesis is showing that the register file has been replaced by a large number of registers.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 1:05 am 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
MichaelM wrote:
EEyE:

Did you intend to remove the register file from the original? Synthesis is showing that the register file has been replaced by a large number of registers.

No I didn't... Where do you see I went wrong?

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 1:13 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
I am not completely sure, but I suspect it is in the initial block.

I've never seen it used in that manner. Reminiscent of the way VHDL performs RAM/ROM initialization. That's not to say it's wrong. There's a lot to these languages, and I know I use only a fraction of Verilog's capabilities for the stuff that I do.

The synthesizer does report that QAWXYS is only partially initialized, and that it is going to ignore all of the initialization. This is why I suspect the initial block. For Arlet's implementation with only 4 registers in the register file, it is easy to initialize all. I suspect that since some of your extended registers are initialized and others are not, the synthesizer is confused by the construct, and simply falling back to basics.

Let me play with it a bit and see if I can get the synthesizer to implement a 32x16 Distributed Single Port RAM as you intended.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 3:21 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
EEyE:

Almost sent you back something with an error, but I caught it while putting together the description below.

I've attached a ZIP file with the changes that I made to CPU.v and ALU.v.

With respect to ALU.v, the changes are formatting to help me read through some of the code, otherwise I made no changes.

WRT CPU.v, I changed the way the register file is initialized. Near the top of the file, line 53, I added a parameter that declares a memory initialization file that I use down in the file to initialize the RAM of the register file. This is the way that I initialize all RAM/ROM. The method that you were using is plainly compatible with the Synthesis Guide. However, I've always found that the method I use works without issue. There is a limitation that you have to watch out for, and that is, the length of the memory initialization file must match the declared length of the memory being initialized. This makes it a bit inconvenient for parameterizable components which use memory blocks of varying lengths, i.e. FIFOs.
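To make the length restriction concrete, here is a hedged Python sketch that builds the contents of a memory initialization file with exactly as many words as the declared memory, zero-filling anything not set explicitly. It assumes a plain one-hex-word-per-line layout such as Verilog's `$readmemh` reads; the actual file format used in the attached ZIP is not shown in this thread, and the helper name `make_init_lines` and the sample values are invented for illustration.

```python
def make_init_lines(depth, width, values):
    """Return `depth` hex strings, one per memory word, zero-filling unset entries."""
    digits = (width + 3) // 4           # hex digits needed per word
    mask = (1 << width) - 1
    words = [0] * depth                 # every location gets a defined value
    for addr, val in values.items():
        if not 0 <= addr < depth:
            raise ValueError(f"address {addr} is outside 0..{depth - 1}")
        words[addr] = val & mask
    return [f"{w:0{digits}X}" for w in words]


# 32 x 16 register file; the two non-zero entries are purely illustrative.
lines = make_init_lines(32, 16, {0: 0x1234, 31: 0x01FF})
assert len(lines) == 32                 # file length matches declared depth
print("\n".join(lines[:2]))
```

Generating the file from the declared depth, rather than writing it by hand, is one way to avoid the mismatch when the memory size is a parameter.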

A bit lower, you'll find a comment // define register file components: address, RAM, data input/output .... I grouped together in this part of the file the address (regsel), the RAM (QAWXYS), and the input and output data busses (reg_di, reg_do). I renamed regfile to reg_do so that there is a closer correlation of the name with the function. Down below where you implemented the RAM write, I pulled out the input multiplexer from the always block and assigned it to reg_di. I did this so that the synthesizer doesn't have to think too much about the construct. (When in doubt follow the examples given in the lightbulb function's synthesis coding examples. Simpler constructs, less case statement nesting, etc. is always better. HDLs are not hardware description languages, they are HMLs - hardware modeling languages designed primarily for simulation. So it's better to put two simple lines in rather than one complex line; it's more likely to yield the result that you desire.)

Immediately before the always block which infers the QAWXYS RAM is where I always put the initial block that initializes a RAM/ROM that I'm using. Thus, right before the assignment of the reg_di signal, I placed the initial statement that reads in the memory configuration file for the RAM. Above I noted that there are some problems with the technique. But there is a benefit, particularly for simulation, and that is that I can change the contents of these memory initialization files in an editor while in ISim (or ModelSim, etc.), and all that is needed to load the simulation with their contents is a restart of the simulation. This is instead of a recompilation of the source.

The attached ZIP file contains the two modified Verilog files from your latest github update, and I've also added a simple UCF (PERIOD Constraint 11.25ns), a TCL script file that captures the project and tool settings I used, and the synthesis and map reports. The reports show that a distributed RAM is being used instead of 512 FFs. In 10.1i SP3, targeting a XC3S200AN-6 part, the code implements to 11.199ns cycle period, or ~89.3 MHz. In a Spartan-6 LX9 you should be able to get above 90 MHz.
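The figures in the paragraph above can be checked with a few lines of Python. This is only a sanity check of the arithmetic; the function names are invented for the example.

```python
def period_ns_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


def regfile_bits(depth, width):
    """Storage bits in a depth x width register file (one flip-flop per bit if not mapped to RAM)."""
    return depth * width


print(round(period_ns_to_mhz(11.199), 1))  # 89.3 (achieved period)
print(round(period_ns_to_mhz(11.25), 1))   # 88.9 (the PERIOD constraint)
print(regfile_bits(32, 16))                # 512
```

The 512 result is the flip-flop count the reports no longer show once the 32x16 register file is inferred as distributed RAM.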

Attachment:
EEyE-65Org16.zip [31.37 KiB]


Hope this helps. I saw no indication in the synthesis and map/par messages that would indicate that the tool is having a problem with your decoder's case statement.

_________________
Michael A.


Last edited by MichaelM on Thu May 03, 2012 10:37 am, edited 1 time in total.

 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 6:17 am 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Your additions do not look too intrusive. Thank you!
I'm going to add to Github straightaway, in order to see all changes. I immediately see about 3 changes to the cpu.v code, and 1 or 2 changes to alu.v code.

From the little I understand, my main problem is that I'm initializing only part of the [15:0] QAWXYS [31:0] file, namely the 16 acc's, X, Y, W, & S.

I'll try it out right now!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 7:00 am 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
A few observations, all of them good!

With the 3 files (ALU.v, cpu.v, and INIT.coe) in the testbench, and after cleaning up the project files, it runs a good simulation without crashing. Max speed with a constraint of 9.8ns (9.6 fails, no SmartXplorer yet) is 105MHz!

Putting the same 3 files in my devboard project yields a predictable action now, although there is still an error plotting character pixels and it seems to be locking up. I suspect a stack problem or some other oversight on my part. Should be easier to find my errors now though based on what I'm seeing AND I can run simulations. Unfortunately, tomorrow I head back to work, but I will study your additions to the code. Now I must grab the last 2hrs of sleep that I can.

Thanks a million for your help Michael!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 10:40 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
That's good. Glad I could help.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Thu May 03, 2012 11:26 pm 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Ok, just to keep all the facts on the up and up for anyone who is not following the progress on Github. I had commented out the stackpage and zeropage pointer part of the core when I was doing the speed test that made it to 105MHz. Just made it back from work and I re-enabled that part of the core and speed suffered a bit. It does pass a 10.4 ns constraint (fails a 10.2ns constraint) for a top speed of 100.5MHz. This is still awesome IMO! Running SmartXplorer can push it those few more MHz, and make for a good safety margin for an operating speed of 100MHz.

Ok, but as Arlet has pointed out, this speed only applies for this testbench, which now has the 1K zeropage, stackpage and 'OS' blockRAMs. I never did get around to comparing speeds using 4K vs. 1K blockRAMs, sorry about that. Too much to do. The real problem lies in comparing my erroneous code as a speed test. Now I have good code and the new testbench is already there, so...

If one was a real fanatic, you could put the core alone in a single Spartan 6 and utilize the full 100MHz, but we are back to the question: What if we were to use all of the blockRAM for the cpu alone... I will do this test and see how it affects speed, I promise, because I can see that when it finally does come time for 65Org32, I can easily see the pincount of the 144-pin package max'd out for that core alone, and we will want to know some speed/blockRAM utilization comparisons.

That is first on my list for next week. It does require some modification to the address decoding. Not to mention the mods to the As65 makefile, and the testcode itself. Using more blockRAM will simplify the decoding which should be a good thing.

Second on my list is to figure out why this core is still not 100% functional on the 65Org16.b devboard. But it is very consistent which is a good sign.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: 65ORG16.b Core
PostPosted: Fri May 04, 2012 12:28 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
EEyE:

The Spartan-6 Block RAMs are 18kb components (16kb of data plus 2kb of parity). Thus, anything less than 2kB will not result in a change in the implementation in your target. A better test would be to implement a 16k x 16 (32kB) BRAM memory and compare that to a 1k x 16 (2kB) BRAM memory. In the first case, the result that I would expect is that 16 BRAMs would be used to implement the requirement, and that they would be instantiated as 16k x 1 BRAM, i.e. one bit per BRAM. In the second case, that would be implemented in a single BRAM.
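The block counts above can be reproduced with a short calculation. This Python sketch assumes 16 Kb of usable data per block (parity bits ignored) and simply divides total bits by block capacity; the real mapper may choose differently, and the names are invented for the example.

```python
DATA_BITS_PER_BRAM = 16 * 1024  # assumption: 16 Kb of data per 18 Kb block, parity ignored


def brams_needed(depth, width):
    """Minimum number of block RAMs for a depth x width memory, by capacity alone."""
    total_bits = depth * width
    return -(-total_bits // DATA_BITS_PER_BRAM)  # integer ceiling division


print(brams_needed(16 * 1024, 16))  # 16 -> e.g. sixteen 16k x 1 blocks
print(brams_needed(1024, 16))       # 1  -> a single 1k x 16 block
print(brams_needed(512, 16))        # 1  -> anything under 2kB still costs one block
```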

As Arlet suggests, there should be a reduction in the performance in the case where more BRAMs are used. However, the amount that performance will be degraded, I suspect, will be fairly small. There is probably more localized routing congestion in the case of a 1k x16 BRAM than there is in the case of a 16k x 1 BRAM. There is, however, the problem of gathering the data from the 16k x 1 BRAM array over a wider area, and those routing delays may be greater than those introduced by the localized congestion of the 1k x16 BRAM.

The Spartan-6 BRAMs do support several other organizations (x2, x4, x8, x9, x16, x18, ...), but I suspect that the final mapping will use the x1 organization since your core is pretty small relative to the size of the part you are targeting. Thus, I think that the tools will resolve to the 16k x 1 organization because that provides it the most flexibility in the placement of the BRAMs to meet your PERIOD constraint.

_________________
Michael A.


 Post subject: Re: 65ORG16.b Core
PostPosted: Fri May 04, 2012 1:00 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
If it runs in the vicinity of 100MHz, what I/O is in the works? The I/O ICs we're familiar with that go directly on the bus won't work anywhere near that speed. If you add a few SPI ports to the design, there are a lot of SPI ones to take advantage of, some of which will run at SPI clock speeds of over 50MHz, but that's still not much over 5MB/s. I've at least skimmed every single post, but maybe I'm forgetting something. 22 pages is a lot.
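For a sense of scale, the SPI figure above works out as follows. This is a back-of-envelope Python sketch that assumes one bit transferred per SPI clock and ignores all protocol overhead; the function names are made up for the example.

```python
def spi_mbytes_per_s(sclk_mhz):
    """Peak SPI throughput in MB/s, assuming 1 bit per clock and no overhead."""
    return sclk_mhz / 8.0


def cpu_cycles_per_spi_byte(cpu_mhz, sclk_mhz):
    """CPU clock cycles that elapse while one byte is shifted over SPI."""
    return cpu_mhz / spi_mbytes_per_s(sclk_mhz)


print(spi_mbytes_per_s(50))              # 6.25
print(cpu_cycles_per_spi_byte(100, 50))  # 16.0
```

So even a 50MHz SPI port delivers one byte roughly every 16 cycles of a 100MHz core.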

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


 Post subject: Re: 65ORG16.b Core
PostPosted: Fri May 04, 2012 1:21 am 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
GARTHWILSON wrote:
If it runs in the vicinity of 100MHz, what I/O is in the works?...

Hi Garth. Arlet's main 8-bit core, which the .b core is built around, does have a RDY pin. Extra I/O interfaces would use more pins and, I would imagine, lower speeds. So in regards to the 65Org32, the Spartan 6 would house the cpu and ALU and that's it. I am picturing a bidirectional data bus at the top level. I've not gotten down to the specifics at this point, but close to 70+ pins max would be needed for a real 'old school' type CPU where you would have 32bit databus, 32bit+ address bus, r/w, phase2, NMI, IRQ, RDY, RES.
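The 70-pin estimate can be tallied directly from the signal list. A quick Python sketch, counting the address bus at exactly 32 bits (the post says "32bit+", so this is a lower bound) and leaving out power, ground and configuration pins:

```python
# Signal pins for the 'old school' 65Org32 pinout described above.
pins = {
    "data bus": 32,
    "address bus": 32,  # lower bound; the post allows for more than 32
    "r/w": 1,
    "phase2": 1,
    "NMI": 1,
    "IRQ": 1,
    "RDY": 1,
    "RES": 1,
}

total = sum(pins.values())
print(total)  # 70
```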

This is a bit OT, but I think it's a healthy discussion so please forgive if you disagree...

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

