HDL Implementation of Video Generator Test for 16-bit PVB's
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Arlet wrote:
... you don't keep track of the latency of the SRAM. If you were to implement that properly, you'll find that you need to update the address earlier in the pipeline than the RGB channel data.
Re: HDL Implementation of Video Generator Test for 16-bit PV
The problem is that you'll need a memory controller with read/write capabilities, otherwise there's not much point in reading the correct pixel out of the SRAM. But in addition to a memory controller, you also need a video generator that doesn't mind if the data from the memory doesn't arrive with predictable timing (otherwise you can't do any writes during active video), so it needs a FIFO.
If I were doing this, I'd start with a simple test image generator, like a black/white checker board pattern with a blue border and figure out how to properly align the sync signals so that all pixels end up where you intended. After that, try doing some more complicated stuff, like a rotozoom, or a low resolution bit mapped screen from block RAM, or a character generator from block RAM. When that all works, go back to external RAM. That's basically how I started.
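A minimal sketch of such a test image generator follows. It is an illustration only, not code from this thread: the module name, the port names, the 640x480 resolution, the RGB565 pixel format, and the assumption that `hcount`/`vcount` count visible pixels from 0 are all hypothetical.

```verilog
// Hypothetical test-pattern generator: black/white checkerboard with a
// blue border. Assumes hcount/vcount count visible pixels starting at 0,
// 'active' is high during the visible area, and pixels are RGB565.
module test_pattern #(
    parameter H_ACTIVE = 640,
    parameter V_ACTIVE = 480,
    parameter BORDER   = 8          // border thickness in pixels
)(
    input  wire        clk,         // pixel clock
    input  wire        active,
    input  wire [10:0] hcount,
    input  wire [10:0] vcount,
    output reg  [15:0] rgb
);
    localparam [15:0] BLACK = 16'h0000,
                      WHITE = 16'hFFFF,
                      BLUE  = 16'h001F;

    // High within BORDER pixels of any screen edge
    wire border = (hcount < BORDER) || (hcount >= H_ACTIVE - BORDER) ||
                  (vcount < BORDER) || (vcount >= V_ACTIVE - BORDER);

    // Bit 5 toggles every 32 pixels, giving 32x32 squares
    wire check = hcount[5] ^ vcount[5];

    always @(posedge clk)
        if (!active)     rgb <= BLACK;
        else if (border) rgb <= BLUE;
        else             rgb <= check ? WHITE : BLACK;
endmodule
```

Because `rgb` is registered, it lags the counters by one pixel clock, so the sync signals need the same one-cycle delay for the border to land exactly on the screen edges. That is the alignment exercise described above.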
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Sounds like a good plan of attack. I've successfully made my first 100% Verilog project now as well, even the top_level.
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Arlet wrote:
...If I were doing this, I'd start with a simple test image generator, like a black/white checker board pattern with a blue border and figure out how to properly align the sync signals so that all pixels end up where you intended. After that, try doing some...
Then, I believe the next step will be to learn about FIFOs from a Verilog point of view. There are a lot of possibilities with a FIFO, just from initially looking at it through the .xco "lightbulb tool". It seems the width may be limited to 1024 bits, though... In the Virtex 6 and higher devices one can add widths, but not in the S6. I could be wrong though, tired....
Re: HDL Implementation of Video Generator Test for 16-bit PV
Yes, a FIFO based VGA generator is very convenient to use. I'd recommend an asynchronous FIFO, 16 bits wide (one pixel) and 1K deep, so it will use exactly one block RAM. The VGA module uses its own pixel clock, asynchronous to your main clock. Using the pixel clock, data is read from the FIFO and sent to the VGA output with the exact timing. The VGA module exposes the writing end of the FIFO through its interface, as well as a 'start' signal that indicates it's going to need a new frame. As soon as you see the start pulse, you need to write 640x480 (or whatever the resolution) pixels into the FIFO, but there's no need to worry about exact timing. You can write them faster than the pixel clock, and then stop for a while, which is perfect if you need to read the data from external RAM. The start pulse should be generated some time before the first pixel is needed, for example at the end of the VSYNC interval. That way there's plenty of time to fill the FIFO, so you get some head start.
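The writing side of that arrangement could be sketched as follows. This is a hedged illustration, not the actual VGA module's interface: `frame_feeder`, `fifo_full`, `pixel_valid`, `pixel_ready`, and the other names are hypothetical, and `start` is assumed to arrive as a one-clock pulse already synchronized to the main clock.

```verilog
// Hypothetical producer for a FIFO-based VGA module.
// Assumptions: 'start' is a one-clock pulse in the clk domain, 'fifo_full'
// comes from the FIFO's write side, and 'pixel_valid'/'pixel' come from
// whatever supplies pixels (for example an external RAM controller).
module frame_feeder #(
    parameter PIXELS = 640 * 480    // pixels per frame
)(
    input  wire        clk,
    input  wire        reset,
    input  wire        start,
    input  wire        fifo_full,
    input  wire        pixel_valid,
    input  wire [15:0] pixel,
    output wire        pixel_ready, // source advances when valid & ready
    output wire        fifo_wr,
    output wire [15:0] fifo_data
);
    // 640*480 = 307200 fits in 19 bits
    reg [18:0] remaining = 0;

    wire busy = (remaining != 0);

    // Write whenever a pixel is available and the FIFO has room;
    // no fixed timing relationship to the pixel clock is needed.
    assign pixel_ready = busy & ~fifo_full;
    assign fifo_wr     = pixel_ready & pixel_valid;
    assign fifo_data   = pixel;

    always @(posedge clk)
        if (reset)
            remaining <= 0;
        else if (start)
            remaining <= PIXELS;            // new frame requested
        else if (fifo_wr)
            remaining <= remaining - 1'b1;  // one pixel written
endmodule
```

The source can stall for as long as it likes (for instance while the RAM serves a CPU write), provided the FIFO never runs empty on the read side.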
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
I downloaded the Spartan6 HDL library and focused on the 'RAMB16BWER' today. Past couple days I was looking over UG383 - The Spartan 6 Block Ram User's Guide, and I didn't find what I was looking for in there, although fine details for operation are there and may need to be analyzed later.
So at the heart of the FIFO we have a 1Kx16 dualport BRAM with true dual Read & Write ports. It is easy to see that 1 read port goes to the videoDAC and the other read port goes to the controlling hardware. Now at this point, it's also easy to see the 1 write port is also coming from a controlling CPU. Easy enough for me to understand at least, not to implement yet, but...
I began to think that there's an unused port for write access. So it might be possible, in a cascaded videoboard system, to take data from a previous board and put it (whether through AND/OR/EOR/ADD/SUB/MUL/DIV) into the succeeding boards memory?
Re: HDL Implementation of Video Generator Test for 16-bit PV
Quote:
So at the heart of the FIFO we have a 1Kx16 dualport BRAM with true dual Read & Write ports. It is easy to see that 1 read port goes to the videoDAC and the other read port goes to the controlling hardware. Now at this point, it's also easy to see the 1 write port is also coming from a controlling CPU. Easy enough for me to understand at least, not to implement yet, but...
I began to think that there's an unused port for write access. So it might be possible, in a cascaded videoboard system, to take data from a previous board and put it (whether through AND/OR/EOR/ADD/SUB/MUL/DIV) into the succeeding boards memory?
The independent and synchronous operation of each port of the BRAM does improve your chances of implementing a FIFO which uses non-synchronous clocks for the input and output ports. One key concept to keep in mind is that the write address counter is synchronous to the write port clock, and the read address counter is synchronous to the read port clock. That is the easy part. The hard part is tracking the number of filled cells in the FIFO, i.e. the FIFO fill counter. The input side fill counter is clocked by the write clock, but this fill count has to be transferred to the read clock domain. On the other hand, as the reads occur the fill counter needs to be decremented. Since this control is synchronous to the read clock, the read enable pulse must be crossed over to the write clock domain so that the counter is decremented on the write side. If you get the CoreGen tool to generate a two clock BRAM FIFO, you'll find that the count sequence for the first few bits is not a simple binary sequence. It is a Gray code sequence, so that only one bit changes at a time for some small number of data counts. This makes it easier for a single level synchronizer of the FIFO fill count to operate reliably.
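To illustrate the Gray code idea, here is a minimal sketch of a write pointer being carried into the read clock domain. It is not the CoreGen implementation: the module and signal names are hypothetical, the two-flop synchronizer is a common choice assumed here, and the full/empty flag logic that a real FIFO needs is left out.

```verilog
// Hypothetical sketch: Gray-coded write pointer crossing into the read
// clock domain. Only the pointer transfer is shown, not the FIFO flags.
module gray_ptr_sync #(
    parameter W = 10                 // 1K-deep FIFO -> 10 address bits
)(
    input  wire         wr_clk,
    input  wire         wr_en,       // one pulse per word written
    input  wire         rd_clk,
    output wire [W-1:0] wr_addr,     // binary pointer, addresses the BRAM
    output wire [W-1:0] wr_gray_rd   // Gray pointer seen in rd_clk domain
);
    reg  [W-1:0] wr_bin  = 0;
    reg  [W-1:0] wr_gray = 0;
    wire [W-1:0] wr_bin_next = wr_bin + 1'b1;

    always @(posedge wr_clk)
        if (wr_en) begin
            wr_bin  <= wr_bin_next;
            // Binary to Gray: consecutive values differ in exactly one bit
            wr_gray <= wr_bin_next ^ (wr_bin_next >> 1);
        end

    // Two-flop synchronizer in the read domain. Since only one bit of
    // wr_gray changes per increment, a sample taken mid-transition is
    // either the old or the new pointer, never an unrelated value.
    reg [W-1:0] sync1 = 0, sync2 = 0;
    always @(posedge rd_clk) begin
        sync1 <= wr_gray;
        sync2 <= sync1;
    end

    assign wr_addr    = wr_bin;
    assign wr_gray_rd = sync2;
endmodule
```

A plain binary counter can change many bits at once (0111 to 1000, for example), so sampling it from another clock domain could capture a mix of old and new bits.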
As the frequencies of the two clocks approach each other, there will be long periods of time where the setup and hold times of the logic in one domain or the other will be violated. When the clocks differ from each other substantially, the rate at which this phenomenon occurs will decrease. But rest assured, this phenomenon will occur and when it does, without the right clock domain crossing circuits in place, the FIFO flags and fill count register will go completely bonkers.
I do recommend the use of the Xilinx CoreGen FIFOs when two clock domains are necessary. They are compact, and have never exhibited any metastable behaviour; you do have to be aware that the least significant bits of their counters are not necessarily linear binary sequences.
Michael A.
Re: HDL Implementation of Video Generator Test for 16-bit PV
ElEctric_EyE wrote:
So at the heart of the FIFO we have a 1Kx16 dualport BRAM with true dual Read & Write ports. It is easy to see that 1 read port goes to the videoDAC and the other read port goes to the controlling hardware. Now at this point, it's also easy to see the 1 write port is also coming from a controlling CPU. Easy enough for me to understand at least, not to implement yet, but...
I began to think that there's an unused port for write access. So it might be possible, in a cascaded videoboard system, to take data from a previous board and put it (whether through AND/OR/EOR/ADD/SUB/MUL/DIV) into the succeeding boards memory?
The FIFO itself isn't really made for pixel manipulation, since it only contains at most 1K pixels, and you don't really keep track of which pixels there are in it. Instead what you do is put all the pixel calculation, like merging different video streams, before the FIFO. For instance, in my sprite engine, I had a staging area consisting of 1 horizontal line of pixels, and a state machine that did:
- Write background color in line buffer
- Make a list of all the sprites that intersect current line
- For each sprite, read the pixels from memory
- Draw the sprite pixels in the line buffer (optionally combining them with the background pixel for transparency)
You could do something similar with other operations.
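The line buffer write path for those steps might look roughly like this. It is a sketch under stated assumptions, not the sprite engine referred to above: all names are hypothetical, the controlling state machine is omitted, and transparency is assumed to be signalled by a reserved pixel value (`TRANSPARENT`) rather than by blending.

```verilog
// Hypothetical line buffer for compositing one scan line.
// A state machine (not shown) is assumed to first write the background
// color for every x, then stream sprite pixels for the current line.
module line_buffer #(
    parameter        WIDTH       = 640,
    parameter [15:0] TRANSPARENT = 16'h0000  // assumed see-through value
)(
    input  wire        clk,
    // background fill phase
    input  wire        bg_wr,
    input  wire [9:0]  bg_x,
    input  wire [15:0] bg_color,
    // sprite draw phase
    input  wire        spr_wr,
    input  wire [9:0]  spr_x,
    input  wire [15:0] spr_pixel,
    // read-out toward the video FIFO
    input  wire [9:0]  rd_x,
    output reg  [15:0] rd_pixel
);
    reg [15:0] line [0:WIDTH-1];

    always @(posedge clk) begin
        if (bg_wr)
            line[bg_x] <= bg_color;
        else if (spr_wr && spr_pixel != TRANSPARENT)
            // Transparent sprite pixels skip the write, so the
            // background already in the buffer shows through.
            line[spr_x] <= spr_pixel;

        rd_pixel <= line[rd_x];     // synchronous read
    end
endmodule
```

Skipping the write for transparent pixels avoids a read-modify-write cycle; true blending with the background would need the old pixel read back first.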
Last edited by Arlet on Tue Oct 16, 2012 8:42 am, edited 1 time in total.
Re: HDL Implementation of Video Generator Test for 16-bit PV
MichaelM wrote:
The independent and synchronous operation of each port of the BRAM does improve your chances of implementing a FIFO which uses non-synchronous clocks for the input and output ports. One key concept to keep in mind is that the write address counter is synchronous to the write port clock, and the read address counter is synchronous to the read port clock. That is the easy part. The hard part is tracking the number of filled cells in the FIFO, i.e. the FIFO fill counter. The input side fill counter is clocked by the write clock, but this fill count has to be transferred to the read clock domain. On the other hand, as the reads occur the fill counter needs to be decremented. Since this control is synchronous to the read clock, the read enable pulse must be crossed over to the write clock domain so that counter is decremented on the write side. If you get the CoreGen tool to generate a two clock BRAM FIFO, you'll find that the count sequence for the first few bits is not a simple binary sequence. It is a grey-code sequence so that only one bit changes at a time for some small number of data counts. This makes it easier for a single level synchronizer of the FIFO fill count to operate reliably.
It may not be as high performance as the Xilinx CoreGen generated FIFOs, but there's a certain satisfaction in understanding how it works.
Re: HDL Implementation of Video Generator Test for 16-bit PV
Quote:
but there's a certain satisfaction in understanding how it works.
Michael A.
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Arlet, what does this circuit look like in your CS4954.v module? Is it just that hblank1 is delayed by 1 clock cycle, hblank2 is delayed by 2 cycles, etc. with respect to hblank0?
Code:
// synchronize in other clock domain
always @(posedge clk) begin
    hblank1 <= hblank0;
    hblank2 <= hblank1;
    hblank3 <= hblank2;
end
Re: HDL Implementation of Video Generator Test for 16-bit PV
You can look at it as a 2 clock cycle delay, but it's used here as a synchronizer to solve metastability issues.
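As a stand-alone illustration of that kind of chain, the sketch below synchronizes a signal from another clock domain and derives a rising-edge pulse from it. The module and port names are hypothetical, and using the third flop for edge detection is an assumption, not something confirmed about CS4954.v.

```verilog
// Hypothetical synchronizer with rising-edge detect.
module sync_edge (
    input  wire clk,        // destination clock domain
    input  wire async_in,   // signal generated in another clock domain
    output wire level,      // synchronized copy of async_in
    output wire rise        // one-clk pulse when level goes 0 -> 1
);
    reg s1 = 0, s2 = 0, s3 = 0;

    always @(posedge clk) begin
        s1 <= async_in;     // may go metastable if async_in changes near the edge
        s2 <= s1;           // s1 gets a full clock period to settle
        s3 <= s2;           // previous synchronized value
    end

    assign level = s2;
    assign rise  = s2 & ~s3;
endmodule
```

Downstream logic should only use `s2` and later stages; `s1` is the flop that absorbs any metastable event.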
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Arlet wrote:
... it's used here as a synchronizer to solve metastability issues.
Re: HDL Implementation of Video Generator Test for 16-bit PV
ElEctric_EyE wrote:
This is because the 'clk' signal was anticipated to be a high frequency around 100MHz? I'm wondering why you do this only to the hblank signal.
Code:
// no need to synchronize vtrigger here,
// because vcount is not near an edge
always @(posedge clk)
    vtrigger <= htrigger0 & (vcount == top);
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: HDL Implementation of Video Generator Test for 16-bit PV
Arlet, thanks for explanations.
So, I will be trying to put your advice into action in my next 2 days off:
Arlet wrote:
...If I were doing this, I'd start with a simple test image generator, like a black/white checker board pattern with a blue border and figure out how to properly align the sync signals so that all pixels end up where you intended....
I also remember a former discussion that BigEd brought up regarding SSO (Simultaneous Switching Outputs, in the Concept & Design Thread, 4 posts down) that would be an 'ultimate test'.
So I'm thinking of adding a horizontal counter in your HVSync vga.v generator that can then send out alternating colors based on even or odd bits. Wish I had a current meter, but I'll be able to observe the effect at different resolutions.
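A rough sketch of that alternating-pixel idea is shown below. It does not reflect the real vga.v interface: the module name, the `active` input, the internal counter, and the choice of all-ones versus all-zeros pixels are assumptions made for illustration.

```verilog
// Hypothetical SSO stress pattern: every visible pixel flips all 16
// output bits relative to its neighbor, based on the counter's LSB.
module sso_pattern (
    input  wire        clk,     // pixel clock
    input  wire        active,  // high during visible pixels (assumed)
    output reg  [15:0] rgb
);
    reg [10:0] hcount = 0;      // horizontal pixel counter

    always @(posedge clk) begin
        // Count across the visible part of the line, restart in blanking
        hcount <= active ? hcount + 1'b1 : 11'd0;

        // Odd pixels all ones, even pixels all zeros; black in blanking
        rgb <= (active && hcount[0]) ? 16'hFFFF : 16'h0000;
    end
endmodule
```

Switching between 16'hFFFF and 16'h0000 toggles every data output on each pixel clock, which should be close to the worst case for simultaneous switching on this bus.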