EEyE:
Almost sent you back something with an error, but I caught it while putting together the description below.
I've attached a ZIP file with the changes that I made to CPU.v and ALU.v.
With respect to ALU.v, the only changes are formatting, to help me read through some of the code. I made no functional changes.
WRT CPU.v, I changed the way the register file is initialized. Near the top of the file, at line 53, I added a parameter that names a memory initialization file, which I use further down to initialize the RAM of the register file. This is the way that I initialize all RAM/ROM. The method that you were using is plainly compatible with the Synthesis Guide, but I've always found that the method I use works without issue. There is one limitation to watch out for: the length of the memory initialization file must match the declared length of the memory being initialized. This makes it a bit inconvenient for parameterizable components that use memory blocks of varying lengths, e.g. FIFOs.
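For anyone following along without the ZIP, here is a minimal sketch of that kind of initialization. It is not the code from CPU.v: the module name, the parameter name pRegFile_Init, the file name, the port list, and the 32 x 16 RAM size are all assumptions made for illustration.

```verilog
// Sketch only: names, widths, depth and the file name are assumptions,
// not the actual contents of CPU.v.
module regfile_init_sketch #(
    parameter pRegFile_Init = "regfile_init.txt"  // hypothetical init file
)(
    input  wire        clk,
    input  wire        we,       // write enable
    input  wire [4:0]  regsel,   // register address
    input  wire [15:0] reg_di,   // write data
    output wire [15:0] reg_do    // read data
);
    // 32 words of 16 bits (assumed size)
    reg [15:0] RAM [0:31];

    // Load the initial contents from a text file of hex words.
    // The file should hold exactly 32 words so that its length
    // matches the declared length of the memory.
    initial
        $readmemh(pRegFile_Init, RAM, 0, 31);

    // Synchronous write
    always @(posedge clk)
        if (we) RAM[regsel] <= reg_di;

    // Asynchronous read
    assign reg_do = RAM[regsel];
endmodule
```

The explicit start and end addresses in the $readmemh call are optional; I include them in the sketch so the expected length is visible at the point of use.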
A bit lower, you'll find a comment // define register file components: address, RAM, data input/output .... In this part of the file I grouped together the address (regsel), the RAM (QAWXYS), and the input and output data busses (reg_di, reg_do). I renamed regfile to reg_do so that the name corresponds more closely to the function. Down below, where you implemented the RAM write, I pulled the input multiplexer out of the always block and assigned it to reg_di. I did this so that the synthesizer doesn't have to think too much about the construct. (When in doubt, follow the examples given in the lightbulb function's synthesis coding examples. Simpler constructs, less case statement nesting, etc. are always better. HDLs are not hardware description languages, they are HMLs - hardware modeling languages designed primarily for simulation. So it's better to put in two simple lines rather than one complex line; it's more likely to yield the result that you desire.)
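As a rough illustration of separating the input multiplexer from the RAM write, something along these lines is what I mean. Only regsel, QAWXYS, reg_di and reg_do are names from the actual file; the module name, the select signal, the three data sources, the write enable and the 32 x 16 size are hypothetical stand-ins.

```verilog
// Sketch only: di_sel, alu_out, mem_in, sp_next and write_register are
// hypothetical signals; the RAM size is an assumption.
module regfile_mux_sketch (
    input  wire        clk,
    input  wire        write_register,  // hypothetical write enable
    input  wire [1:0]  di_sel,          // hypothetical mux select
    input  wire [15:0] alu_out,         // hypothetical data sources
    input  wire [15:0] mem_in,
    input  wire [15:0] sp_next,
    input  wire [4:0]  regsel,
    output wire [15:0] reg_do
);
    reg  [15:0] QAWXYS [0:31];
    wire [15:0] reg_di;

    // Input multiplexer as its own simple assignment,
    // outside the always block that writes the RAM.
    assign reg_di = (di_sel == 2'b01) ? mem_in  :
                    (di_sel == 2'b10) ? sp_next :
                                        alu_out;

    // The RAM write is now the plain textbook pattern.
    always @(posedge clk)
        if (write_register) QAWXYS[regsel] <= reg_di;

    // Asynchronous read
    assign reg_do = QAWXYS[regsel];
endmodule
```

With the write reduced to a single enable, address and data bus, the always block looks like the standard RAM coding examples, which is the form the tools are most likely to recognize.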
Immediately before the always block that infers the QAWXYS RAM is where I always put the initial block that initializes a RAM/ROM that I'm using. Thus, right before the assignment of the reg_di signal, I placed the initial statement that reads in the memory initialization file for the RAM. As I noted above, the technique has a limitation. But there is also a benefit, particularly for simulation: I can change the contents of these memory initialization files in an editor while in ISim (or ModelSim, etc.), and all that is needed to load their contents into the simulation is a restart of the simulation rather than a recompilation of the source.
The attached ZIP file contains the two modified Verilog files from your latest github update. I've also added a simple UCF (PERIOD constraint of 11.25 ns), a TCL script file that captures the project and tool settings I used, and the synthesis and map reports. The reports show that a distributed RAM is being used instead of 512 FFs. In ISE 10.1i SP3, targeting a XC3S200AN-6 part, the code implements to an 11.199 ns cycle period, or ~89.3 MHz. In a Spartan-6 LX9 you should be able to get above 90 MHz.
Attachment:
EEyE-65Org16.zip [31.37 KiB]
Hope this helps. I saw nothing in the synthesis and map/par messages to indicate that the tool is having a problem with your decoder's case statement.