Concept & Design of 3.3V Parallel 16-bit VGA Boards
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
The number of lines shouldn't matter for max clock speed. The number is just a counter, and it draws the lines sequentially.
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I dialed the resolution settings back down to 640x480.
I was able to get the system clock up to 200MHz (so your state machines are good for that speed, awesome!), pixel clock @25MHz, and it runs well. One thing is, maybe my eyes deceive me, but there seems to be no appreciable increase in speed?
Also, I set all the initial seg_h[x] to 320. Watching all the lines trace in parallel to a common point far off the screen is cool. I'll have to make a video soon.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
ElEctric_EyE wrote:
I was able to get the system clock up to 200MHz (so your state machines are good for that speed, awesome!), pixel clock @25MHz, and it runs well. One thing is, maybe my eyes deceive me, but there seems to be no appreciable increase in speed?
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I tried again, this time successfully, to get the project working at 1024x768 with a pixel clock of 71MHz. It let me get the system clock to operate at 118MHz. Earlier I said 88MHz was max, but I must have skipped a test in all the excitement.
So I guess this is the sweet spot as far as resolution/performance, since the BRAM used for the scanline buffer is only 1024bits deep.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
So, with a 25 MHz pixel clock you were able to push the main clock to 200 MHz, but with a 71 MHz pixel clock the main clock only reached 118 MHz? That's strange.
The BRAM is 2kB, so 1024 pixels. I agree that's the sweet spot. Also, higher pixel clocks would mean fewer cycles available for processing.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Arlet wrote:
So, with a 25 MHz pixel clock you were able to push the main clock to 200 MHz, but with a 71 MHz pixel clock the main clock only reached 118 MHz? That's strange.
But why would you say strange?
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
It's strange that the main clock domain would suffer from a change in the pixel clock domain. They are unrelated blocks of logic. Of course, when you're making registers wider, they'll be slower, but from 200 to 118 MHz is a big drop for just one extra bit. I'll have to look at this myself when I get a chance, and go through the timing analyzer reports to see where the long paths are.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I did a quick test with a 200 MHz main clock and a 200 MHz pixel clock, and it passed all constraints without modification. The worst case timing path for the pixel clock still has almost 0.7 ns slack, and the main clock domain has 0.44 ns slack. The longest main clock path is the 'last_segment' calculation and its influence on the state machine, followed by paths for the size calculation in the async FIFO. Both parts can be rewritten to incorporate an extra pipeline stage, without impact on the performance.
By the way: the [9:0] address vectors for the line buffer are wide enough for the single block RAM and will allow 1024 pixel horizontal resolution.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Arlet wrote:
I did a quick test with a 200 MHz main clock and a 200 MHz pixel clock, and it passed all constraints without modification. The worst case timing path for the pixel clock still has almost 0.7 ns slack, and the main clock domain has 0.44 ns slack. The longest main clock path is the 'last_segment' calculation and its influence on the state machine, followed by paths for the size calculation in the async FIFO. Both parts can be rewritten to incorporate an extra pipeline stage, without impact on the performance...
I must have my PLL hooked up wrong. I see in the warning it converted my PLL_base to PLL_adv, but I see no PLL_adv primitive. Right now I just have a wire from CLKFBOUT to CLKFBIN.
Arlet wrote:
...By the way: the [9:0] address vectors for the line buffer are wide enough for the single block RAM and will allow 1024 pixel horizontal resolution.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
ElEctric_EyE wrote:
You're going to do this even though both clocks can run @200MHz?
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
How did you hook up the FBIN/OUT in your PLL? I think that may be why my system slowed down with the higher pixel clock.
Also some other things I was wondering:
How difficult is it going to be to code for the plotting for the end of the lines?
How are we going to specify the total number of lines and their coordinates?
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
ElEctric_EyE wrote:
How did you hook up the FBIN/OUT in your PLL? I think that may be why my system slowed down with the higher pixel clock.
Quote:
How difficult is it going to be to code for the plotting for the end of the lines?
Quote:
How are we going to specify the total number of lines and their coordinates?
The biggest challenge will be to rewrite the code so that BRAMs can be used instead of the distributed RAM, and optimize it as much as possible to allow a large number of lines to be drawn. In theory, the 3 parallel BRAMs should allow 512 lines to be on the screen at one time, but there will be limitations to how many pixels can be on a given scanline. It's also important to cycle through the list of lines as quickly as possible. I'm thinking it should be possible to handle 1 line per cycle, including drawing 1 pixel. For each additional pixel, another cycle is needed. So, for 512 vertical lines, it requires 512 cycles per scanline to draw them all. If that can be accomplished, it is even feasible to support 1024 lines, at the cost of 6 BRAMs.
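As a rough worked version of that cycle budget (a sketch under stated assumptions, not measured numbers): let $p_i$ be the number of pixels vector $i$ contributes to the current scanline, and assume every vector costs one cycle even when it draws nothing. The horizontal total of 800 pixel clocks is the standard 640x480 VGA figure and is an assumption about this design; the 200 MHz and 25 MHz clocks are the ones reported earlier in the thread.

```latex
% Cycles needed on one scanline for N vectors, where vector i paints p_i pixels:
C = \sum_{i=1}^{N} \max(1,\, p_i)

% Main-clock cycles available per scanline (H_total = 800 is an assumption):
B = H_{\text{total}} \cdot \frac{f_{\text{main}}}{f_{\text{pixel}}}
  = 800 \cdot \frac{200\ \text{MHz}}{25\ \text{MHz}} = 6400

% 512 vertical lines, each at most one pixel wide on a scanline:
C = 512 \le B
```

Under these assumptions the 512 vertical lines fit easily, and the remaining cycles are what is left for the extra pixels of wider spans.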
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
I rewrote nearly all the line drawing code, see GitHub for sources. It's still a work in progress, and right now it can't even draw diagonal lines. Instead, the code draws filled-in rectangles. This is just for testing purposes, to allow me to focus on the state machine and pipeline without having to deal with Bresenham at the same time. The code works as follows:
The line module now retrieves the vectors from the outside through a new interface:
Code:
output reg [9:0] vector = 0, // vector number
output read_vector, // read vector enable
input [9:0] x0, // top x coordinate
input [9:0] y0, // top y coordinate
input [9:0] x1, // end x coordinate
input [9:0] y1, // end y coordinate
input [15:0] col, // vector color
input last_vector, // last vector
It outputs the number of the vector it wants, and a read enable signal. When the main module sees the read_vector enable, it writes the coordinates of the vector to (x0,y0) and (x1,y1), and its color to col. It also indicates if this was the last valid vector by setting the last_vector bit. For each scanline, the line module cycles through all the vectors, figures out which ones cross the current scanline, and draws them. Except instead of drawing a diagonal line, it paints the entire range between x0 and x1.
The idea is that the main code can choose how to generate the vectors. They could be read from a BRAM, or generated on the fly. Also, the interface could be extended with a 'wait' signal so that the main code could take some extra time to generate the vectors. This could allow a read from external SRAM, for instance. Of course, we're still racing the beam, so you can't afford to waste too many cycles.
The next step is to combine this with the Bresenham code, so that instead of rectangles, it draws diagonal lines. Below is the demo output of the current code. Each of the green horizontal and vertical lines is a 1-pixel-wide rectangle. Of course, the red square is also a rectangle.
Attachment: 0000.png (790 Bytes)
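To make the rectangle test concrete, here is a minimal combinational sketch of how a vector could be checked against the current scanline and a pixel column. It is not the code from the repository: the module name and the scanline, px, crosses and paint signals are hypothetical, and it assumes the coordinates are ordered so that x0 <= x1 and y0 <= y1.

```verilog
// Sketch only: module and signal names other than x0/y0/x1/y1 are hypothetical.
// Assumes x0 <= x1 and y0 <= y1 (top-left corner first).
module span_test (
    input  [9:0] scanline,          // scanline currently being rendered
    input  [9:0] px,                // candidate pixel column
    input  [9:0] x0, y0, x1, y1,    // rectangle corners from the vector interface
    output       crosses,           // vector touches this scanline
    output       paint              // pixel lies inside the filled span
);

// A vector matters on this scanline if the scanline is within its y range.
assign crosses = (scanline >= y0) && (scanline <= y1);

// Inside the span, every column from x0 through x1 is painted.
assign paint = crosses && (px >= x0) && (px <= x1);

endmodule
```

A 1-pixel-wide vertical line is then just the case x0 == x1, and a horizontal line is the case y0 == y1.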
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Looking at the code in line.v module:
Code:
x0 <= 16 + (vector_nr[0] ? {vector_nr[4:1], 4'h0} : 0);
y0 <= 16 + (vector_nr[0] ? 0 : {vector_nr[5:1], 4'h0});
x1 <= 16 + (vector_nr[0] ? {vector_nr[4:1], 4'h0} : 240);
y1 <= 16 + (vector_nr[0] ? 240 : {vector_nr[5:1], 4'h0});
last_vector <= 0;
col <= GREEN;
32 green lines are drawn using vector_nr as a simple counter. When it's odd it draws vertical lines, when it's even it draws horizontal lines. Back in the line.v module, the logic that increments the vector counter:
Code:
always @(posedge clk)
    if( ~drawing ) vector <= 0;
    else if( read_vector ) vector <= vector + 1;
So, if we wanted to read coordinates from memory, let's say, we would test the read_vector bit the same way as above, and have 32 consecutive reads (using the current example)? Do you think there would have to be another FIFO from the SRAM to the part of the code that assigns x0,y0,x1,y1?
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
You don't need a FIFO. Just use the vector_nr as the RAM address input, and return the data. Of course, you need 56 bits worth of data, plus a 'last_vector' bit, so you don't have time to read from external memory.
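A minimal sketch of that idea, assuming an on-chip memory with one 57-bit entry per vector (4 x 10 coordinate bits + 16 color bits + the last_vector bit). The module name, the bit packing order, the 512-entry depth and the vectors.hex init file are all assumptions made for illustration; only the port names follow the interface quoted earlier in the thread.

```verilog
// Hypothetical vector store: vector_nr is used directly as the RAM address.
// Packing order, depth and init file name are assumptions, not from the repo.
module vector_ram #(
    parameter ADDR_BITS = 9             // 512 entries (assumed depth)
) (
    input             clk,
    input             read_vector,      // read enable from the line module
    input      [9:0]  vector_nr,        // vector number = RAM address
    output reg [9:0]  x0,
    output reg [9:0]  y0,
    output reg [9:0]  x1,
    output reg [9:0]  y1,
    output reg [15:0] col,
    output reg        last_vector
);

// One entry per vector: {last_vector, col, y1, x1, y0, x0} = 57 bits.
reg [56:0] mem [0:(1 << ADDR_BITS) - 1];

// Hypothetical contents file, one 57-bit hex word per line.
initial $readmemh("vectors.hex", mem);

// Registered read: data appears one clock after read_vector is asserted.
always @(posedge clk)
    if (read_vector)
        {last_vector, col, y1, x1, y0, x0} <= mem[vector_nr[ADDR_BITS-1:0]];

endmodule
```

The registered read matches the style of the generated-on-the-fly example, where x0, y0, x1, y1 and col are also assigned on a clock edge, so the line module would see the same one-cycle latency either way.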