PostPosted: Sun Feb 23, 2014 4:22 am 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
I was wondering if someone who is pretty knowledgeable on CPLDs might want to give a newbie a hand in optimization.

I have an open source VIC-20 cartridge design that requires some programmable logic.

I have learned Verilog
I have learned how to use the Xilinx design tools
I have created a design that will synthesize, and I know much of it works (I have a small CPLD board here to test with)
I am trying to finish the design, and I am running into trouble.

I am using (or would like to use) a 9572xl in the design. But, when I try to put the last things in the design, I get a fit error saying that the design is trying to map 65 (or 66) equations into 4 function blocks, and evidently the 9572xl can't do that. But, when I simplify the design to use just 64 macrocells, the report states 64/72 macrocells used.

* How do I unlock those last few macrocells?
* If I can't, why does it claim 72 when only 64 can be used?

There *might* be some obvious optimizations I can do to the logic equations (I am new to this, obviously), but I can't see them at present.

Full source available (I'm happy to zip up my entire project and send it to anyone who wants to see or help).

The CPLD implements banking for the RAM1,RAM2,RAM3,BLK1,BLK2,BLK3,BLK5 banks, and allows one to select RAM, ROM, or write protected RAM at each bank. That all works. The two pieces lacking are:

* ability to reset the PC (but not the cart) under software control
* ability to lock the registers until hard reset.

Just adding:

assign reset_out = (cart_config1_reg_ce & data[7]); // drive 1 on reset_out when register 1 data[7] is high and write is active

puts me over 64.

Again, any help appreciated. I can go to a larger CPLD, but would really just like to know why this is happening.

Jim


PostPosted: Sun Feb 23, 2014 12:05 pm 
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I'll try to help if I can although I have more experience with a few settings for FPGAs.
I'll need your project file.

EDIT: You can try editing the Design Strategy under Project. In the strategy, set the Optimization Goal to Area instead of Speed. Then set the next setting under that, Optimization Effort, to High. Another one you might want to try is further down; it's called Implementation Template. Set that to Optimize Density. There are a few more, but you probably get the idea.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


PostPosted: Sun Feb 23, 2014 12:28 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10943
Location: England
It's probably worth a look at the Fitter Report. There are several different resources which might be the limiting factor on a more complex design. As EEye says, applying different tactics might make the crucial difference. Failing that, expressing yourself differently in Verilog might do it.

If your equations are complex, they may need more than one macrocell to implement.

Here's an example fitting report (for a very small and simple thing):
Code:
cpldfit:  version O.87xd                            Xilinx Inc.
                                  Fitter Report
Design Name: jc2_top                             Date:  2-23-2014, 12:22PM
Device Used: XC9572XL-7-CS48
Fitting Status: Successful

*************************  Mapped Resource Summary  **************************

Macrocells     Product Terms    Function Block   Registers      Pins           
Used/Tot       Used/Tot         Inps Used/Tot    Used/Tot       Used/Tot       
6  /72  (  8%) 28  /360  (  8%) 24 /216 ( 11%)   6  /72  (  8%) 8  /38  ( 21%)

** Function Block Resources **

Function    Mcells      FB Inps     Pterms      IO         
Block       Used/Tot    Used/Tot    Used/Tot    Used/Tot   
FB1           3/18        8/54       10/90       2/10
FB2           2/18        8/54       12/90       3/ 9
FB3           1/18        8/54        6/90       2/11
FB4           0/18        0/54        0/90       1/ 8
             -----       -----       -----      -----   
              6/72       24/216      28/360      8/38

* - Resource is exhausted

Cheers
Ed


PostPosted: Sun Feb 23, 2014 2:27 pm 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
brain:

I'll also be glad to lend a hand. You can PM me and attach the project file.

It is not the number of macro cells which is likely to be the issue you are having. Each macro cell in the XC95xxXL CPLDs has 5 product terms. The factor that limits the fitter is the number of product (AND) terms in the equation. Each macro cell, whether a register or combinatorial output, is limited to the sum (OR) of 5 product terms generated within the macro cell itself.

The XC95xxXL CPLDs have the ability to expand the number of product terms allowed in an equation, but it requires the use of product terms from adjacent macrocells. When this situation arises, i.e. product term expansion, the architecture begins to have some trouble fitting designs. In particular, the Xilinx XC95xxXL fitter has to make decisions regarding the sharing of intermediate product term equations among the various macro cells.

There is a limit to how many inputs a function block may have; a function block is a group of 18 macro cells. The limit is 54. There are 18 dedicated inputs for each function block. Each macro cell has direct access to the TRUE and NOT TRUE state of each of these inputs. That makes 36 inputs to the product term matrix of each macro cell. The remaining 18 inputs to this matrix come from the feedback of the macro cell outputs into the local matrix.

There are three Xilinx CPLD Fitter options that you should check and set. First is the Exhaustive Fit Mode option. It is an option for Fit, and is set to OFF (unchecked) by default. If you are having trouble fitting, the first thing to try is to set this option to ON (checked). You can control the number of devices tried in the Exhaustive Fit Mode by selecting a specific device, or by selecting a device such as "Automatic XC9500XL". Selecting the "Automatic XC9500XL" device option for your project will increase the run time of the fitter, but it will certainly let you know if your project will fit into a smaller or larger part.

Second is the Collapsing Input Limit. This is typically set to 54. If it is not set to the maximum allowed, I would recommend setting it to the maximum. Third is the Collapsing Pterm Limit. This limit sets the maximum allowable number of pterms in a "collapsed" equation. Multi-level equations in the source files are fitted to the CPLD by collapsing them as much as possible. Depending on the complexity of the equation, which is easily masked by all of the fancy Verilog operators available, the number of pterms in the resulting equation can be quite large. The end result is that all of the logic optimization that is generally set by default is working against fitting a design into a particular CPLD. I would recommend setting this value to 25 or less. A value of 25 essentially allows a collapsed equation to use the pterms of 5 macro cells in its implementation. A macro cell sharing all of its pterms is very likely to be unavailable for use. Before mucking around with these two parameters as discussed above, set the Exhaustive Fit Mode and let the fitter try the various settings of these two parameters automatically.

Finally, check the fitter report and verify that the number of pterms required in your design is less than the 360 (72 * 5) pterms available in the XC9572XL you are trying to fit into. Another thing to try is to view the equations for each of the outputs. Viewing the equations is an option in the fitter report.
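
As a rough illustration of how a compact Verilog operator can hide a large product-term count, consider the sketch below. The module and signal names are made up for this example and are not from the UltiMem source. The pterm counts in the comments are for the fully flattened sum-of-products form; the fitter may well do better (for instance by turning the OR into a single inverted AND term, since I believe the macrocell can invert its output), so treat them as upper bounds rather than as what the fitter report will show.

```verilog
// Hypothetical illustration only; not taken from the UltiMem project.
module pterm_demo (
    input  wire [7:0] d,
    output wire       any_set,  // 8-input OR
    output wire       parity    // 8-input XOR
);
    // Flattened to sum-of-products, this is 8 single-literal product terms,
    // already more than the 5 pterms local to one macrocell.
    assign any_set = |d;

    // Flattened to sum-of-products, this is 2^(8-1) = 128 product terms
    // (one per input pattern with an odd number of ones), which is more
    // than the 90 pterms in a whole function block.
    assign parity = ^d;
endmodule
```

With a Collapsing Pterm Limit of 5 or 25, something like parity cannot be flattened into one equation, so the fitter would presumably have to keep intermediate nodes, and each of those costs a macrocell.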

_________________
Michael A.


PostPosted: Sun Feb 23, 2014 7:05 pm 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
OK, try to contain your laughter and ridicule, as my Verilog is no doubt horrid. I am open to any suggestions on improvements.

I put the entire prj here: http://jbrain.com/incoming/UltiMem.zip

To answer a few questions:

Though I switched back, I did try the strategy changes suggested. They did not seem to help, so I turned them off. But, maybe I didn't turn enough of them on.

The design takes 67 MCs, 268 pterms, 46 registers, 41 pins, and 200 FBIs. Those seem inside the 9572 target (right now, I set the design to Auto XC9500xl to get it to compile).

Collapsing Input was set to 54 already, as I recall.

I added in the two items I am trying to get into the design (a register to lock the writes to registers out, and a SW reset). If those are taken out, the design moves down to 64 MCs, and will fit in a 9572xl
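
For reference, the two additions described above could be written as registered logic along these lines. This is only a sketch: the module, the port names, the choice of data[6] for the lock bit, and the use of the falling clock edge are assumptions for illustration and are not taken from the actual UltiMem source.

```verilog
// Hypothetical sketch; names and bit assignments are assumptions.
module lock_and_reset (
    input  wire       clock,         // bus clock; registers update on the falling edge
    input  wire       hard_reset_n,  // active-low hardware reset
    input  wire       cfg_we,        // decoded write strobe for the config register
    input  wire [7:0] data,          // data bus, input side
    output reg        locked,        // once set, blocks further register writes
    output reg        reset_req      // software reset request
);
    // Writes are honoured only while the lock bit is clear.
    wire write_ok = cfg_we & ~locked;

    always @(negedge clock or negedge hard_reset_n) begin
        if (!hard_reset_n) begin
            locked    <= 1'b0;
            reset_req <= 1'b0;
        end else begin
            // Sticky: software can set it, only a hard reset clears it.
            if (write_ok & data[6])
                locked <= 1'b1;
            // High for the clock period following a write of 1 to bit 7.
            reset_req <= write_ok & data[7];
        end
    end
endmodule
```

Each of the two flip-flops needs only a couple of product terms, so on paper this costs two macrocells; whether it fits still depends on the function block input and pterm limits.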

Jim


PostPosted: Sun Feb 23, 2014 8:41 pm 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8411
Location: Midwestern USA
brain wrote:
OK, try to contain your laughter and ridicule, as my Verilog is no doubt horrid. I am open to any suggestions on improvements.

I don't think anyone is going to ridicule anything that involves learning, especially when it comes to an arcane subject like programmable logic. You think your Verilog looks bad, you should have seen some of my early attempts at writing 6502 assembly language. :D A bowl of spaghetti looked clear and meaningful in comparison. :lol:

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sun Feb 23, 2014 10:59 pm 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
brain:

BigDumbDinosaur wrote:
brain wrote:
OK, try to contain your laughter and ridicule, as my Verilog is no doubt horrid. I am open to any suggestions on improvements.


I don't think anyone is going to ridicule anything that involves learning, especially when it comes to an arcane subject like programmable logic. You think your Verilog looks bad, you should have seen some of my early attempts at writing 6502 assembly language. :D A bowl of spaghetti looked clear and meaningful in comparison. :lol:

I think that BDD's sentiments are right on target.

I have spent a few moments with the code you provided. The following are some observations:

(1) CPLDs, in particular the XC9500/XL/XV families, are not well suited to being used as general storage. That is, implementing multi-bit multiplexers within the CPLD is going to use a lot of the function block (FB) resources. In addition, I try very hard to limit bidirectional access on a data bus to registers within an XC9500XL CPLD. Their internal architectures are just not designed to function in this manner.

In a TQ100 package you should not be pin limited, so one suggestion will be for you to implement an input-only bus for the input data path, and one (or more) output-only buses for the output data path(s). This suggestion may or may not apply in your case. I would prefer, if at all possible, that the data written to the CPLD be shadowed elsewhere so that you do not need to read the contents of the registers. If your system timing is correct, you should not need to verify that the registers are written with the data you computed and stored in some kind of memory in your 6502 system. (If you need to verify system timing, then expose some bits of your internal registers on unused pins of the CPLD.)

(2) Like you, I found that your project synthesizes rather easily in an XC95144XL. However, it uses all of the FBs. I played around with the fitting parameters that I referred to in an earlier post, and I got your project to fit. I made one change, but I don't think that it had any effect. I changed the equation for your reset_en signal; I converted it into a FF:
Code:
reg reset_en = 0;
always @(negedge clock) reset_en <= (cart_config1_reg_ce & data[7]);

Using all of the default settings for synthesis and fitting, I changed the "Collapsing Input Limit" parameter to 28, and the "Collapsing Pterm Limit" parameter to 5. With these settings, your project fits into an XC9572XL-5TQ100 CPLD.

I've attached the fitted design and the entire project in the attached ZIP file. Consider implementing the recommendations I made above with regard to reading back your internal registers. Either recommendation should provide additional resources beyond the 3 remaining macro cells.

Attachment:
File comment: Updated - Fitted Design
UltiMem.zip [1017.33 KiB]
Downloaded 109 times

_________________
Michael A.


PostPosted: Sun Feb 23, 2014 11:28 pm 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
MichaelM wrote:
(1) CPLDs, in particular the XC9500/XL/XV families, are not well suited to being used as general storage. That is, implementing multi-bit multiplexers within the CPLD is going to use a lot of the function block (FB) resources.

I did not know that.
Quote:
In addition, I try very hard to limit bidirectional access on a data bus to registers within an XC9500XL CPLD. Their internal architectures are just not designed to function in this manner.

In a TQ100 package you should not be pin limited, so one suggestion will be for you to implement an input-only bus for the input data path, and one (or more) output-only buses for the output data path(s). This suggestion may or may not apply in your case. I would prefer, if at all possible, that the data written to the CPLD be shadowed elsewhere so that you do not need to read the contents of the registers. If your system timing is correct, you should not need to verify that the registers are written with the data you computed and stored in some kind of memory in your 6502 system. (If you need to verify system timing, then expose some bits of your internal registers on unused pins of the CPLD.)

It is true that folks should not *NEED* to know the values of the registers, so I could make them write-only registers... As a software developer, I cringe at write-only registers, though, which is why I made them read/write. You make a valid point, though.
Quote:

Code:
reg reset_en = 0;
always @(negedge clock) reset_en <= (cart_config1_reg_ce & data[7]);

Using all of the default settings for synthesis and fitting, I changed the "Collapsing Input Limit" parameter to 28, and the "Collapsing Pterm Limit" parameter to 5. With these settings, your project fits into an XC9572XL-5TQ100 CPLD.

I downloaded your file, and synthesized it for a 9572xl-tq100. Did you send the xds strategy file you used?
Quote:

I've attached the fitted design and the entire project in the attached ZIP file. Consider implementing the recommendations I made above with regard to reading back your internal registers. Either recommendation should provide additional resources beyond the 3 remaining macro cells.

Attachment:
UltiMem.zip


Thanks. I appreciate it.

Doing input and output-only busses (ignoring your comment about bidirectional busses for a moment) would mean putting a '245 or similar outside the CPLD. It'd probably be better to just make the registers write only...

On the multiplexed bank outputs, not sure how to redo that to use different busses. The idea is that each BLK in the VIC-20 can have its own bank value, but there is only 1 FLASH ROM IC, so somehow all 5 of the bank values need to be multiplexed onto the FLASH ROM high order address lines.

Jim


PostPosted: Mon Feb 24, 2014 12:20 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
brain wrote:
Did you send the xds strategy file you used?
I am not familiar with the xds file extension. If you are referring to the Xilinx Design Strategy option for ISE 11 or higher, I don't use that feature. For my day-to-day work, I use ISE 10.1i SP3, and that feature is not well developed; I generally get better results by not using the ISE design strategy options in ISE 10.1i or ISE 14.4i.

I am assuming that you synthesized it with the project file included in the ZIP file. If you are not using the same revision of ISE as I use on a day-to-day basis (ISE 10.1i SP3), then you should have been prompted to upgrade to the format of whatever version you are using.
brain wrote:
Doing input and output-only busses (ignoring your comment about bidirectional busses for a moment) would mean putting a '245 or similar outside the CPLD. It'd probably be better to just make the registers write only...
That's not necessary. Connect your external 6502 bidirectional data bus to both the input-only bus and the output-only bus. The output-only bus is enabled only when the control signals indicate a read operation, and the input-only bus is always connected to the internal registers. Its value is only written to the registers when a write operation is on your 6502 bus.
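
A minimal sketch of that arrangement, for a single 8-bit register, might look like the following. The module name, port names, R/W polarity (1 = read), and the falling-edge clocking are assumptions for illustration; the real UltiMem decode and timing would need to be substituted.

```verilog
// Hypothetical sketch of the split input/output bus; names are assumptions.
module split_bus_reg (
    input  wire       clock,   // bus clock; register latches on the falling edge
    input  wire       reg_ce,  // register selected by the address decode
    input  wire       rw,      // 6502 R/W: 1 = read, 0 = write
    input  wire [7:0] din,     // input-only pins, wired to D0-D7 on the board
    output wire [7:0] dout,    // output-only pins, also wired to D0-D7
    output reg  [7:0] cfg      // the internal register itself
);
    // The output pins drive the bus only during a read of this register;
    // otherwise they are tri-stated.
    assign dout = (reg_ce & rw) ? cfg : 8'bzzzz_zzzz;

    // The input pins always feed the register, but it is only loaded
    // when a write to this register is on the bus.
    always @(negedge clock)
        if (reg_ce & ~rw)
            cfg <= din;
endmodule
```

On the board, din[n] and dout[n] would both be tied to Dn of the 6502 data bus, so the cost is 8 extra pins rather than an external '245.
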
brain wrote:
On the multiplexed bank outputs, not sure how to redo that to use different busses. The idea is that each BLK in the VIC-20 can have its own bank value, but there is only 1 FLASH ROM IC, so somehow all 5 of the bank values need to be multiplexed onto the FLASH ROM high order address lines.
I wasn't trying to comment on this subject.

_________________
Michael A.


PostPosted: Mon Feb 24, 2014 1:37 am 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
MichaelM wrote:
I am not familiar with the xds file extension. If you are referring to the Xilinx Design Strategy option for ISE 11 or higher, I don't use that feature. For my day-to-day work, I use ISE 10.1i SP3, and that feature is not well developed; I generally get better results by not using the ISE design strategy options in ISE 10.1i or ISE 14.4i.

Ah, yes, I am probably using a newer version. The ISE Project Navigator About says 14.6 (nt64), so I assume that's not 10.X. Can you walk me through how to apply the design strategy options without using ISE? I have make here, and can create a makefile.

Quote:
I am assuming that you synthesized it with the project file included in the ZIP file. If you are not using the same revision of ISE as I use on a day-to-day basis (ISE 10.1i SP3), then you should have been prompted to upgrade to the format of whatever version you are using.

Strangely, it did not ask me to upgrade anything.
Quote:
That's not necessary. Connect your external 6502 bidirectional data bus to both the input-only bus and the output-only bus. The output-only bus is enabled only when the control signals indicate a read operation, and the input-only bus is always connected to the internal registers. Its value is only written to the registers when a write operation is on your 6502 bus.

Didn't think of that. Yep, that would work fine.

Jim


PostPosted: Mon Feb 24, 2014 3:30 am 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
brain:

I don't use make. The BrainCPLD.ise project file I sent you in the zip files has all of the synthesis and fitter directives embedded. The attached BrainCPLD.ZIP file has a single file included: BrainCPLD.tcl. You will likely need to edit the myProject variable in the script using the ISE editor (or other editor as appropriate) to your project name: UltiMem.

You can run the TCL script from the TCL Console tab of ISE.
TCL Shell wrote:
xtclsh BrainCPLD.tcl set_process_props
It will set all of the process properties for CPLD synthesis and fitting.

You can generate that script from within ISE to save your tool settings. Under the Project menu, select the Generate TCL script submenu. I always use the generate option that sets all of the options, whether they are the default settings or settings that I may have set.
Attachment:
File comment: TCL Script to set Project Properties
BrainCPLD.zip [4.78 KiB]
Downloaded 123 times

_________________
Michael A.


PostPosted: Mon Feb 24, 2014 4:03 am 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
Hmmm, looks like I might have to download 10.X and install it. When I run xtclsh BrainCPLD.tcl set_process_props

I get:

Project UltiMem.ise not found. Use project_rebuild to recreate it.

So, I do project_rebuild:

WARNING:TclTasksC:2145 - The .ise file is no longer supported starting with the
12.1 release. The corresponding .xise file will be used instead.

Then, later, it complains it cannot find the ise file

:-)

Jim


PostPosted: Mon Feb 24, 2014 2:29 pm 
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
brain:

I wouldn't recommend that. It's too big, and you can get the same results with ISE 14.x. I opened up my ISE 14.4 and re-created the project. Attached is a ZIP of the complete project. I tested the TCL file it (ISE 14.4) generated, and it had some errors. It tried to set the process properties for some unrecognized processes. I edited the UltiMem.tcl file to comment out all of those set process property instances. The command "xtclsh UltiMem.tcl set_process_props" correctly sets the properties and terminates without error.
Attachment:
File comment: Complete ISE 14.4 project with edited UltiMem.tcl file to set process properties.
UltiMem-ISE14.4.zip [875.88 KiB]
Downloaded 112 times

_________________
Michael A.


PostPosted: Mon Feb 24, 2014 3:26 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10943
Location: England
About using the Xilinx tools from the command line, whether scripted or using make, you can usually pick off the necessary invocations from the Console window of the GUI mode - which you only need to do once.
For example, I see
Code:
Started : "Fit".
Running cpldfit...
Command Line: cpldfit -intstyle ise -p xc9572xl-7-CS48 -ofmt vhdl -optimize speed -htmlrpt -loc on -slew fast -init low -inputs 54 -pterms 25 -unused float -power std -terminate keeper jc2_top.ngd


Cheers
Ed


PostPosted: Tue Feb 25, 2014 12:36 am 
Joined: Tue May 05, 2009 2:49 pm
Posts: 109
Ed, yes, I figured I could sniff around in the Tcl for these things, but I fear I still need the crutch of the GUI at present (I'm comfy with AVR C/C++ using just VIM and make, but this is all new to me).

Comparing the generated tcl script from my UltiMem and Michael's, I see the issue, but I don't see how it happened. I changed my design strategy, but it didn't "take". If I go into "Design Goals and Strategies", and select "view strategy", I can see the current values do not reflect the strategy I created (and saved).

I'd like to understand (via the GUI) how to make it use my parms (28 and 5 are, I think, the two that matter most of the above items).

Jim

