n00b Verilog Questions

Topics relating to PALs, CPLDs, FPGAs, and other PLDs used for the support or creation of 65-family processors, both hardware and HDL.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

All those 'if' statements:

Code: Select all

always @(posedge clk) begin
   if (lineCS && cpuAB == 5'b00000)			//variables for line generator
      x0t <= cpuDO;
   if (lineCS && cpuAB == 5'b00001)
      y0t <= cpuDO;
   if (lineCS && cpuAB == 5'b00010)
      x1t <= cpuDO;
   if (lineCS && cpuAB == 5'b00011)
      y1t <= cpuDO;
	
	if (lineCS && cpuAB == 5'b00100)			//variables for circle generator
		xc <= cpuDO;
	if (lineCS && cpuAB == 5'b00101)
		yc <= cpuDO;
	if (lineCS && cpuAB == 5'b00110)
		rad <= cpuDO;
		
	if (lineCS && cpuAB == 5'b00111)			//color variable
		color <= cpuDO;
		
	if (lineCS && cpuAB == 5'b01000)			//X length of blitter
		bXlen <= cpuDO;
	if (lineCS && cpuAB == 5'b01001)			//Y length of blitter
		bYlen <= cpuDO;
	if (lineCS && cpuAB == 5'b01010)			//X start copy
		bXc <= cpuDO;
	if (lineCS && cpuAB == 5'b01011)			//Y start copy
		bYc <= cpuDO;
	if (lineCS && cpuAB == 5'b01100)			//X start paste
		bXp <= cpuDO;
	if (lineCS && cpuAB == 5'b01101)			//Y start paste
		bYp <= cpuDO;
	
	if (lineCS && cpuAB == 5'b01110)			//border color
		border <= cpuDO;	
	
	if (lineCS && cpuAB == 5'b10000)			//horizontal offset
		hoffset <= cpuDO;
	if (lineCS && cpuAB == 5'b10001)			//vertical offset
		voffset <= cpuDO;
		
end
can be replaced by a more succinct:

Code: Select all

// Register addresses
always @(posedge clk)
 if ( lineCS )
	case ( cpuAB [4:0] )
		5'b00000: x0t <= cpuDO;					//variables for line generator
		5'b00001: y0t <= cpuDO;
		5'b00010: x1t <= cpuDO;
		5'b00011: y1t <= cpuDO;
		
		5'b00100: xc  <= cpuDO;					//variables for circle generator
		5'b00101: yc  <= cpuDO;
		5'b00110: rad <= cpuDO;
		
		5'b00111: color <= cpuDO;				//pixel color variable
		
		5'b01000: bXlen <= cpuDO;				//X length of blitter
		5'b01001: bYlen <= cpuDO;				//Y length of blitter
		5'b01010: bXc   <= cpuDO;				//X start copy
		5'b01011: bYc   <= cpuDO;				//Y start copy
		5'b01100: bXp   <= cpuDO;				//X start paste
		5'b01101: bYp   <= cpuDO;				//Y start paste
		
		5'b01110: border <= cpuDO;				//border color
		
	endcase
User avatar
player55328
Posts: 23
Joined: 06 Aug 2013
Location: Oregon

Re: n00b Verilog Questions

Post by player55328 »

ElEctric_EyE wrote:
if (lineCS && cpuAB == 5'b10000) //horizontal offset
hoffset <= cpuDO;
if (lineCS && cpuAB == 5'b10001) //vertical offset
voffset <= cpuDO;
Are you not missing these in your new case statement?

I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

player55328 wrote:
if (lineCS && cpuAB == 5'b10000) //horizontal offset
hoffset <= cpuDO;
if (lineCS && cpuAB == 5'b10001) //vertical offset
voffset <= cpuDO;
Are you not missing these in your new case statement?...
Good eye! But I actually don't need those delay registers on my final output board.
player55328 wrote:
...I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
That seems to be a good practice, but I'm not sure how I would go about that in this case.
User avatar
MichaelM
Posts: 761
Joined: 23 Apr 2012
Location: Huntsville, AL

Re: n00b Verilog Questions

Post by MichaelM »

player55328 wrote:
I would suggest you add a default to your case statement just in case you get a situation you are not expecting.
When I read EEyE's post yesterday, I had the same thought initially, but I changed my mind.

First, the if construction that he's using in a single always block is correct. It could be expanded into 17 separate always blocks and the structure would generate the expected logic behavior. That is, only when the condition is true is the register updated, and at all other times the register holds its last value.

Second, it is true that the case statement in the second example, nested inside an if statement, is missing the recommended default clause that prevents the generation of inferred latches. However, since these two statements are inside a clocked always block, there is no need to add the default clause.
ElEctricEyE wrote:
That seems to be a good practice, but I'm not sure how I would go about that in this case.
I also have the same thought. I generally include a default clause to all case statements, and in particular, to case statements in combinatorial always blocks. However, I think this is an example where that general recommendation is not appropriate. In my opinion, adding a default clause to this case statement will reduce the readability of the code.
Michael A.
User avatar
Rob Finch
Posts: 465
Joined: 29 Dec 2002
Location: Canada
Contact:

Re: n00b Verilog Questions

Post by Rob Finch »

Just a note: I've found out that sometimes case statements provide better performance than if/else if statements. This is with the Webpack synthesis tools.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

I've got another hopefully simple question regarding ROM's, i.e. initialized blockRAM.

I used to use a clock for my character ROM, but in trying to optimize for speed and lose 1 cycle I decide to try replacing the

Code: Select all

always @(posedge clk)
with

Code: Select all

always @*
so the whole ROM module looks like

Code: Select all

//1Kx16 Character ROM, i.e. initialized FPGA blockRAM
`timescale 1ns / 1ps

module CharROM ( input [9:0] addr,
					  output reg [15:0] dout
					);

reg [15:0] RAM [1023:0];

initial $readmemh("C:/FPGA/PVBRAMalt2/character.coe",RAM,0,1023);

always @*
	dout <= RAM[addr];
					
endmodule
It does seem to work and save a cycle, but I see no mention of this style in "ROM HDL Coding Techniques" in the Xilinx manual, so chances are it is incorrect?
User avatar
MichaelM
Posts: 761
Joined: 23 Apr 2012
Location: Huntsville, AL

Re: n00b Verilog Questions

Post by MichaelM »

This is the same style that I've used successfully in a commercial product. The size of your implementation and mine differ significantly. My implementation is a 128 x 32 implementation, and yours is a 1024 x 16 implementation. In a clocked style, your implementation would be put into a single 16kbit Block RAM, but in the unclocked, asynchronous style you have used you are going to require 256 64 x 1 LUTs in a Spartan 6 FPGA.

The only caution of using LUTs for ROMs is given in the quote I cut out of the Spartan 6 data sheet: ds160.
Quote:
CLBs, Slices, and LUTs
Each configurable logic block (CLB) in Spartan-6 FPGAs consists of two slices, arranged side-by-side as part of two vertical
columns. There are three types of CLB slices in the Spartan-6 architecture: SLICEM, SLICEL, and SLICEX. Each slice
contains four LUTs, eight flip-flops, and miscellaneous logic. The LUTs are for general-purpose combinatorial and
sequential logic support. Synthesis tools take advantage of these highly efficient logic, arithmetic, and memory features.
Expert designers can also instantiate them.
SLICEM
One quarter (25%) of Spartan-6 FPGA slices are SLICEMs. Each of the four SLICEM LUTs can be configured as either a
6-input LUT with one output, or as dual 5-input LUTs with identical 5-bit addresses and two independent outputs. These
LUTs can also be used as distributed 64-bit RAM with 64 bits or two times 32 bits per LUT, as a single 32-bit shift register
(SRL32), or as two 16-bit shift registers (SRL16s) with addressable length. Each LUT output can be registered in a flip-flop
within the CLB. For arithmetic operations, a high-speed carry chain propagates carry signals upwards in a column of slices.
SLICEL
One quarter (25%) of Spartan-6 FPGA slices are SLICELs, which contain all the features of the SLICEM except the
memory/shift register function.
SLICEX
One half (50%) of Spartan-6 FPGA slices are SLICEXs. The SLICEXs have the same structure as SLICELs except the
arithmetic carry option and the wide multiplexers.
As long as your logic requirements are being met, then using such a large number of LUTs for a ROM that can be easily placed in a BRAM is not a bad trade. However, the Spartan 6 architecture, as nice as it is, does have a tendency to orphan a bunch of resources because of its characteristic of only having 25% of its slices associated with SliceMs.

Did you try clocking the character generator ROM on the falling edge of the clock? That's one approach to reducing the delay effect of the synchronous BRAM on the overall pipeline delay of a design. Depending on the path delays from your character shift register to the address input of the BRAM and the path delays from the BRAM output to your video shift register, the read access time of the BRAM, ~2.5ns, should allow plenty of set up to your output video shift register.
Michael A.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

Looking at the console, it still looks to be synthesizing as a blockRAM with a clock signal.

Code: Select all

Synthesizing (advanced) Unit <top_level>.
INFO:Xst:3226 - The RAM <CharROM/Mram_RAM> will be implemented as a BLOCK RAM, absorbing the following register(s): <HVSync/CRaddr>
    -----------------------------------------------------------------------
    | ram_type           | Block                               |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 1024-word x 16-bit                  |          |
    |     mode           | write-first                         |          |
    |     clkA           | connected to signal <clk>           | rise     |
    |     enA            | connected to internal node          | high     |
    |     weA            | connected to signal <GND>           | high     |
    |     addrA          | connected to signal <HVSync/char[7]_char[7]_mux_304_OUT> |          |
    |     diA            | connected to signal <GND>           |          |
    |     doA            | connected to signal <CRdata>        |          |
    -----------------------------------------------------------------------
    | optimization       | speed                               |          |
    -----------------------------------------------------------------------
User avatar
MichaelM
Posts: 761
Joined: 23 Apr 2012
Location: Huntsville, AL

Re: n00b Verilog Questions

Post by MichaelM »

That's good news for your project.

To keep it the BRAM you need to keep away from looking at the character ROM output data before the CRdata register. To remove another clock delay, if necessary, you might consider attaching the address of the BRAM to the input of the HVSync/char[7]_char[7]_mux_304_OUT register.

It's a nice feature of the tools to use downstream registers so that large asynchronous RAM/ROM elements can be placed in BRAM. It certainly helps preserve the more flexible SLICEM slices.
Michael A.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

Thanks for your input, it's much appreciated! I had thought something was going awry.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

Yesterday I was adding 2 large (16Kx11) blockRAMs in the same fashion that my smaller 1Kx16 zeropage and 1Kx16 stackpage blockRAMs were written (see OLD below). The smaller RAMs have been working fine for over a year. However I noticed I was having problems with the functioning of the larger RAMs, in simulation the outputs were undefined X's.
I thought I would share what I had done to get it working. The 2 kinds of RAMs are similar, just widths and depths are changed.
A few other notes: I removed the .bmm file which forces the location of the ROM, since this was a major change to the design. After the tools reassigned positions, I added the .bmm file back in so that I could same alot of development time as mentioned here. Also, changing the width to [15:0] and depth down to [8191:0] had no change in incorrect behavior.
Any thoughts? Again this style is not in the Xylinx manuals, but the new style seemed more natural. I got lucky!
OLD:

Code: Select all

// 1Kx16 stack page FPGA blockRAM
`timescale 1ns / 1ps

module RAM ( input clk,
					   input we,
					   input rst,
					   input [9:0] addr,
					   input [15:0] din,
					   output reg [15:0] dout
					   );
					
reg [15:0] RAM [1023:0];

always @(posedge clk) begin
	if (we) 
		RAM[addr] <= din;
		else 
			dout <= RAM[addr]; 
	if (rst)	
		dout <= 0;
end

endmodule
NEW:

Code: Select all

// 16Kx11 scratch page FPGA blockRAM
`timescale 1ns / 1ps

module SCRAM ( input clk,
					input we,
					input rst,
					input [13:0] addr,
					input [15:0] din,
					output reg [15:0] dout
					);
					
reg [10:0] RAM [16383:0];

always @(posedge clk) 
	if (we) 
		RAM[addr] <= din;
		else 
			dout <= rst ? 16'h0000 : RAM[addr];

endmodule
User avatar
Rob Finch
Posts: 465
Joined: 29 Dec 2002
Location: Canada
Contact:

Re: n00b Verilog Questions

Post by Rob Finch »

ElEctric_EyE wrote:
in simulation the outputs were undefined X's.
The contents of the RAM are typically X's (for unknown) unless you have hardware that initializes them. Unless the RAM's are initialized to something before they are dumped in the simulator, they will read as X.

One way to initialize the RAM's is to include an 'initial begin' block like the following:

Code: Select all

// 1Kx16 stack page FPGA blockRAM
`timescale 1ns / 1ps

module RAM ( input clk,
                  input we,
                  input rst,
                  input [9:0] addr,
                  input [15:0] din,
                  output reg [15:0] dout
                  );

integer n;               
reg [15:0] RAM [1023:0];

initial begin
   for (n = 0; n <= 1023; n = n + 1)
      RAM[n] <= 0;
end

always @(posedge clk) begin
   if (we) 
      RAM[addr] <= din;
      else 
         dout <= RAM[addr]; 
   if (rst)   
      dout <= 0;
end

endmodule
You don't always needs to initialize RAM components. It's just that the memory will read as X until something is written into it.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

That's good info Rob, but I don't think that's what was going on, as those RAMs are usually in a rst condition when they aren't selected, so they should be outputting '0's. In simulation the small RAM's were outputting '0's, the larger one's 'X's. If they output anything other than '0's when they're not selected by the cpu, it crashes the cpu since all 4 RAMs (and ROM) data outs are OR'd going to cpu data in.
User avatar
Rob Finch
Posts: 465
Joined: 29 Dec 2002
Location: Canada
Contact:

Re: n00b Verilog Questions

Post by Rob Finch »

Well, in that case,

All I can see is the two code samples for the big and small RAMS are slightly different in how the rst signal is applied.

In the smaller RAM rst is applied independently of we.

Code: Select all

always @(posedge clk) begin
   if (we) 
      RAM[addr] <= din;
      else 
         dout <= RAM[addr]; 
   if (rst)   
      dout <= 0;
end
In the 16k case rst is applied in 'else' statement of we:

Code: Select all

always @(posedge clk) 
   if (we) 
      RAM[addr] <= din;
      else 
         dout <= rst ? 16'h0000 : RAM[addr];
I think this may cause dout to latch the previous value when we is applied. Hence multiple RAM dout's may be active during we resulting in an X value.

I'm just eyeballing this, so could be wrong.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: n00b Verilog Questions

Post by ElEctric_EyE »

No, the system works fine now when all the BRAMs are coded like this:

Code: Select all

always @(posedge clk)
   if (we)
      RAM[addr] <= din;
      else
         dout <= rst ? 16'h0000 : RAM[addr];
And the system had worked fine when only the smaller stackpage and zeropage BRAMs were coded like below. I ran into issues when I had tried adding the larger BRAMs with this same code:

Code: Select all

always @(posedge clk) begin
   if (we)
      RAM[addr] <= din;
      else
         dout <= RAM[addr];
   if (rst)   
      dout <= 0;
end
Post Reply