Page 4 of 5
Re: n00b Verilog Questions
Posted: Tue Dec 24, 2013 7:24 pm
by ElEctric_EyE
All those 'if' statements:
Code: Select all
always @(posedge clk) begin
if (lineCS && cpuAB == 5'b00000) //variables for line generator
x0t <= cpuDO;
if (lineCS && cpuAB == 5'b00001)
y0t <= cpuDO;
if (lineCS && cpuAB == 5'b00010)
x1t <= cpuDO;
if (lineCS && cpuAB == 5'b00011)
y1t <= cpuDO;
if (lineCS && cpuAB == 5'b00100) //variables for circle generator
xc <= cpuDO;
if (lineCS && cpuAB == 5'b00101)
yc <= cpuDO;
if (lineCS && cpuAB == 5'b00110)
rad <= cpuDO;
if (lineCS && cpuAB == 5'b00111) //color variable
color <= cpuDO;
if (lineCS && cpuAB == 5'b01000) //X length of blitter
bXlen <= cpuDO;
if (lineCS && cpuAB == 5'b01001) //Y length of blitter
bYlen <= cpuDO;
if (lineCS && cpuAB == 5'b01010) //X start copy
bXc <= cpuDO;
if (lineCS && cpuAB == 5'b01011) //Y start copy
bYc <= cpuDO;
if (lineCS && cpuAB == 5'b01100) //X start paste
bXp <= cpuDO;
if (lineCS && cpuAB == 5'b01101) //Y start paste
bYp <= cpuDO;
if (lineCS && cpuAB == 5'b01110) //border color
border <= cpuDO;
if (lineCS && cpuAB == 5'b10000) //horizontal offset
hoffset <= cpuDO;
if (lineCS && cpuAB == 5'b10001) //vertical offset
voffset <= cpuDO;
end
can be replaced by a more succinct:
Code: Select all
// Register addresses
always @(posedge clk)
if ( lineCS )
case ( cpuAB [4:0] )
5'b00000: x0t <= cpuDO; //variables for line generator
5'b00001: y0t <= cpuDO;
5'b00010: x1t <= cpuDO;
5'b00011: y1t <= cpuDO;
5'b00100: xc <= cpuDO; //variables for circle generator
5'b00101: yc <= cpuDO;
5'b00110: rad <= cpuDO;
5'b00111: color <= cpuDO; //pixel color variable
5'b01000: bXlen <= cpuDO; //X length of blitter
5'b01001: bYlen <= cpuDO; //Y length of blitter
5'b01010: bXc <= cpuDO; //X start copy
5'b01011: bYc <= cpuDO; //Y start copy
5'b01100: bXp <= cpuDO; //X start paste
5'b01101: bYp <= cpuDO; //Y start paste
5'b01110: border <= cpuDO; //border color
endcase
Re: n00b Verilog Questions
Posted: Wed Dec 25, 2013 3:27 am
by player55328
if (lineCS && cpuAB == 5'b10000) //horizontal offset
hoffset <= cpuDO;
if (lineCS && cpuAB == 5'b10001) //vertical offset
voffset <= cpuDO;
Are you not missing these in your new case statement?
I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
Re: n00b Verilog Questions
Posted: Wed Dec 25, 2013 2:46 pm
by ElEctric_EyE
if (lineCS && cpuAB == 5'b10000) //horizontal offset
hoffset <= cpuDO;
if (lineCS && cpuAB == 5'b10001) //vertical offset
voffset <= cpuDO;
Are you not missing these in your new case statement?...
Good eye! But I actually don't need those delay registers on my final output board.
...I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
That seems to be a good practice, but I'm not sure how I would go about that in this case.
Re: n00b Verilog Questions
Posted: Wed Dec 25, 2013 3:53 pm
by MichaelM
I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
When I read EEyE's post yesterday, I had the same thought initially, but I changed my mind.
First, the
if construction that he's using in a single
always block is correct. It could be expanded into 17 separate
always blocks and the structure would generate the expected logic behavior. That is, only when the condition is true is the register updated, and at all other times the register holds its last value.
Second, it is true that the
case statement in the second example, nested inside an
if statement, is missing the recommended
default clause that prevents the generation of inferred latches. However, since these two statements are inside a clocked
always block, there is no need to add the
default clause.
That seems to be a good practice, but I'm not sure how I would go about that in this case.
I also have the same thought. I generally include a
default clause to all
case statements, and in particular, to
case statements in combinatorial
always blocks. However, I think this is an example where that general recommendation is not appropriate. In my opinion, adding a
default clause to this case statement will reduce the readability of the code.
Re: n00b Verilog Questions
Posted: Wed Dec 25, 2013 6:14 pm
by Rob Finch
Just a note: I've found out that sometimes case statements provide better performance than if/else if statements. This is with the Webpack synthesis tools.
Re: n00b Verilog Questions
Posted: Wed Mar 05, 2014 12:42 pm
by ElEctric_EyE
I've got another hopefully simple question regarding ROM's, i.e. initialized blockRAM.
I used to use a clock for my character ROM, but in trying to optimize for speed and lose 1 cycle I decide to try replacing the
with
so the whole ROM module looks like
Code: Select all
//1Kx16 Character ROM, i.e. initialized FPGA blockRAM
`timescale 1ns / 1ps
module CharROM ( input [9:0] addr,
output reg [15:0] dout
);
reg [15:0] RAM [1023:0];
initial $readmemh("C:/FPGA/PVBRAMalt2/character.coe",RAM,0,1023);
always @*
dout <= RAM[addr];
endmodule
It does seem to work and save a cycle, but I see no mention of this style in "ROM HDL Coding Techniques" in the
Xilinx manual, so chances are it is incorrect?
Re: n00b Verilog Questions
Posted: Wed Mar 05, 2014 9:11 pm
by MichaelM
This is the same style that I've used successfully in a commercial product. The size of your implementation and mine differ significantly. My implementation is a 128 x 32 implementation, and yours is a 1024 x 16 implementation. In a clocked style, your implementation would be put into a single 16kbit Block RAM, but in the unclocked, asynchronous style you have used you are going to require 256 64 x 1 LUTs in a Spartan 6 FPGA.
The only caution of using LUTs for ROMs is given in the quote I cut out of the Spartan 6 data sheet: ds160.
CLBs, Slices, and LUTs
Each configurable logic block (CLB) in Spartan-6 FPGAs consists of two slices, arranged side-by-side as part of two vertical
columns. There are three types of CLB slices in the Spartan-6 architecture: SLICEM, SLICEL, and SLICEX. Each slice
contains four LUTs, eight flip-flops, and miscellaneous logic. The LUTs are for general-purpose combinatorial and
sequential logic support. Synthesis tools take advantage of these highly efficient logic, arithmetic, and memory features.
Expert designers can also instantiate them.
SLICEM
One quarter (25%) of Spartan-6 FPGA slices are SLICEMs. Each of the four SLICEM LUTs can be configured as either a
6-input LUT with one output, or as dual 5-input LUTs with identical 5-bit addresses and two independent outputs. These
LUTs can also be used as distributed 64-bit RAM with 64 bits or two times 32 bits per LUT, as a single 32-bit shift register
(SRL32), or as two 16-bit shift registers (SRL16s) with addressable length. Each LUT output can be registered in a flip-flop
within the CLB. For arithmetic operations, a high-speed carry chain propagates carry signals upwards in a column of slices.
SLICEL
One quarter (25%) of Spartan-6 FPGA slices are SLICELs, which contain all the features of the SLICEM except the
memory/shift register function.
SLICEX
One half (50%) of Spartan-6 FPGA slices are SLICEXs. The SLICEXs have the same structure as SLICELs except the
arithmetic carry option and the wide multiplexers.
As long as your logic requirements are being met, then using such a large number of LUTs for a ROM that can be easily placed in a BRAM is not a bad trade. However, the Spartan 6 architecture, as nice as it is, does have a tendency to orphan a bunch of resources because of its characteristic of only having 25% of its slices associated with SliceMs.
Did you try clocking the character generator ROM on the falling edge of the clock? That's one approach to reducing the delay effect of the synchronous BRAM on the overall pipeline delay of a design. Depending on the path delays from your character shift register to the address input of the BRAM and the path delays from the BRAM output to your video shift register, the read access time of the BRAM, ~2.5ns, should allow plenty of set up to your output video shift register.
Re: n00b Verilog Questions
Posted: Fri Mar 07, 2014 2:32 am
by ElEctric_EyE
Looking at the console, it still looks to be synthesizing as a blockRAM with a clock signal.
Code: Select all
Synthesizing (advanced) Unit <top_level>.
INFO:Xst:3226 - The RAM <CharROM/Mram_RAM> will be implemented as a BLOCK RAM, absorbing the following register(s): <HVSync/CRaddr>
-----------------------------------------------------------------------
| ram_type | Block | |
-----------------------------------------------------------------------
| Port A |
| aspect ratio | 1024-word x 16-bit | |
| mode | write-first | |
| clkA | connected to signal <clk> | rise |
| enA | connected to internal node | high |
| weA | connected to signal <GND> | high |
| addrA | connected to signal <HVSync/char[7]_char[7]_mux_304_OUT> | |
| diA | connected to signal <GND> | |
| doA | connected to signal <CRdata> | |
-----------------------------------------------------------------------
| optimization | speed | |
-----------------------------------------------------------------------
Re: n00b Verilog Questions
Posted: Fri Mar 07, 2014 10:07 pm
by MichaelM
That's good news for your project.
To keep it the BRAM you need to keep away from looking at the character ROM output data before the CRdata register. To remove another clock delay, if necessary, you might consider attaching the address of the BRAM to the input of the HVSync/char[7]_char[7]_mux_304_OUT register.
It's a nice feature of the tools to use downstream registers so that large asynchronous RAM/ROM elements can be placed in BRAM. It certainly helps preserve the more flexible SLICEM slices.
Re: n00b Verilog Questions
Posted: Sat Mar 08, 2014 1:48 am
by ElEctric_EyE
Thanks for your input, it's much appreciated! I had thought something was going awry.
Re: n00b Verilog Questions
Posted: Wed Mar 19, 2014 10:34 am
by ElEctric_EyE
Yesterday I was adding 2 large (16Kx11) blockRAMs in the same fashion that my smaller 1Kx16 zeropage and 1Kx16 stackpage blockRAMs were written (see OLD below). The smaller RAMs have been working fine for over a year. However I noticed I was having problems with the functioning of the larger RAMs, in simulation the outputs were undefined X's.
I thought I would share what I had done to get it working. The 2 kinds of RAMs are similar, just widths and depths are changed.
A few other notes: I removed the .bmm file which forces the location of the ROM, since this was a major change to the design. After the tools reassigned positions, I added the .bmm file back in so that I could same alot of development time as mentioned
here. Also, changing the width to [15:0] and depth down to [8191:0] had no change in incorrect behavior.
Any thoughts? Again this style is not in the Xylinx manuals, but the new style seemed more natural. I got lucky!
OLD:
Code: Select all
// 1Kx16 stack page FPGA blockRAM
`timescale 1ns / 1ps
module RAM ( input clk,
input we,
input rst,
input [9:0] addr,
input [15:0] din,
output reg [15:0] dout
);
reg [15:0] RAM [1023:0];
always @(posedge clk) begin
if (we)
RAM[addr] <= din;
else
dout <= RAM[addr];
if (rst)
dout <= 0;
end
endmodule
NEW:
Code: Select all
// 16Kx11 scratch page FPGA blockRAM
`timescale 1ns / 1ps
module SCRAM ( input clk,
input we,
input rst,
input [13:0] addr,
input [15:0] din,
output reg [15:0] dout
);
reg [10:0] RAM [16383:0];
always @(posedge clk)
if (we)
RAM[addr] <= din;
else
dout <= rst ? 16'h0000 : RAM[addr];
endmodule
Re: n00b Verilog Questions
Posted: Thu Mar 20, 2014 3:26 am
by Rob Finch
in simulation the outputs were undefined X's.
The contents of the RAM are typically X's (for unknown) unless you have hardware that initializes them. Unless the RAM's are initialized to something before they are dumped in the simulator, they will read as X.
One way to initialize the RAM's is to include an 'initial begin' block like the following:
Code: Select all
// 1Kx16 stack page FPGA blockRAM
`timescale 1ns / 1ps
module RAM ( input clk,
input we,
input rst,
input [9:0] addr,
input [15:0] din,
output reg [15:0] dout
);
integer n;
reg [15:0] RAM [1023:0];
initial begin
for (n = 0; n <= 1023; n = n + 1)
RAM[n] <= 0;
end
always @(posedge clk) begin
if (we)
RAM[addr] <= din;
else
dout <= RAM[addr];
if (rst)
dout <= 0;
end
endmodule
You don't always needs to initialize RAM components. It's just that the memory will read as X until something is written into it.
Re: n00b Verilog Questions
Posted: Thu Mar 20, 2014 10:16 am
by ElEctric_EyE
That's good info Rob, but I don't think that's what was going on, as those RAMs are usually in a rst condition when they aren't selected, so they should be outputting '0's. In simulation the small RAM's were outputting '0's, the larger one's 'X's. If they output anything other than '0's when they're not selected by the cpu, it crashes the cpu since all 4 RAMs (and ROM) data outs are OR'd going to cpu data in.
Re: n00b Verilog Questions
Posted: Thu Mar 20, 2014 12:52 pm
by Rob Finch
Well, in that case,
All I can see is the two code samples for the big and small RAMS are slightly different in how the rst signal is applied.
In the smaller RAM rst is applied independently of we.
Code: Select all
always @(posedge clk) begin
if (we)
RAM[addr] <= din;
else
dout <= RAM[addr];
if (rst)
dout <= 0;
end
In the 16k case rst is applied in 'else' statement of we:
Code: Select all
always @(posedge clk)
if (we)
RAM[addr] <= din;
else
dout <= rst ? 16'h0000 : RAM[addr];
I think this may cause dout to latch the previous value when we is applied. Hence multiple RAM dout's may be active during we resulting in an X value.
I'm just eyeballing this, so could be wrong.
Re: n00b Verilog Questions
Posted: Fri Mar 21, 2014 12:31 am
by ElEctric_EyE
No, the system works fine now when
all the BRAMs are coded like this:
Code: Select all
always @(posedge clk)
if (we)
RAM[addr] <= din;
else
dout <= rst ? 16'h0000 : RAM[addr];
And the system
had worked fine when only the smaller stackpage and zeropage BRAMs
were coded like below. I ran into issues when I had tried adding the larger BRAMs with this same code:
Code: Select all
always @(posedge clk) begin
if (we)
RAM[addr] <= din;
else
dout <= RAM[addr];
if (rst)
dout <= 0;
end