Concept & Design of 3.3V Parallel 16-bit VGA Boards

ElEctric_EyE · Post by **ElEctric_EyE** » Sun Dec 16, 2012 5:53 pm

So a Block RAM should be used to store up to 1024 lines x 56 bits of line info? Data transfer to this BRAM could happen during every screen vertical refresh (vblank & vsync)?

Arlet · Post by **Arlet** » Sun Dec 16, 2012 5:59 pm

Yes, that's one way to do it. And because the BRAM is dual ported, this is quite easily implemented. Connect one port to the line module, and use the other port to write during the blanking interval.
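
For illustration only, here is a minimal sketch of that arrangement. The module name, port names and the `blank` signal are assumptions made for this example, and the 1024 x 56 size is simply the figure from the question above. It is not the code from the actual project.

```verilog
// Sketch of a simple dual-port RAM: one read port for the line module,
// one write port that is only allowed to write during blanking.
// All names here are hypothetical.
module line_bram (
    input  wire        clk,
    // Port A: read side, feeds the line module
    input  wire [9:0]  rd_addr,
    output reg  [55:0] rd_data,
    // Port B: write side, used during the blanking interval
    input  wire        blank,    // assumed high while the display is blanked
    input  wire        wr_en,
    input  wire [9:0]  wr_addr,
    input  wire [55:0] wr_data
);
    reg [55:0] mem [0:1023];     // 1024 entries of 56 bits

    always @(posedge clk) begin
        rd_data <= mem[rd_addr];         // synchronous read, 1 clock latency
        if (wr_en && blank)
            mem[wr_addr] <= wr_data;     // write is gated by blanking
    end
endmodule
```

Gating the write with `blank` means the contents seen by the read port never change in the middle of the visible part of a frame.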

Arlet · Post by **Arlet** » Sun Dec 16, 2012 7:59 pm

Some progress with Bresenham, but still a few bugs. Only the first frame works, because I don't reset the per-vector state. Also, as you can see in the picture, the red lines have some weird effects on the left side.

ElEctric_EyE · Post by **ElEctric_EyE** » Sun Dec 16, 2012 8:41 pm

Nice progress! How many lines total? ~70?

MichaelM · Post by **MichaelM** » Sun Dec 16, 2012 8:53 pm

And they flicker if you scroll the display back and forth quickly.

Pretty cool.

Arlet · Post by **Arlet** » Mon Dec 17, 2012 6:23 am

ElEctric_EyE wrote:

Nice progress! How many lines total? ~70?

They are spaced every 10 pixels, so 64 green lines and 48 red ones. Maximum for now is 512, because I have a 512x32 bit BRAM for the Bresenham state. That doesn't mean you can display 512 arbitrary vectors. Too many on a single scanline, and the FIFO will underrun.

ElEctric_EyE · Post by **ElEctric_EyE** » Sat Dec 22, 2012 4:18 pm

I got my Christmas present early from work and was told to 'hit the road Jack', so I spent last week finding a new job and have been away from the project. Now I have the next 4 days off and wish to forget about my situation.

Arlet, any chance we can see your new code even though it may not be fully functional?

Arlet · Post by **Arlet** » Sat Dec 22, 2012 4:33 pm

I just put it up on github. I fixed some of the bugs, but there's still some things not working properly. I've been too busy with work last week, so I haven't done much verilog lately.

ElEctric_EyE · Post by **ElEctric_EyE** » Sat Dec 22, 2012 4:39 pm

Cool, thanks!

ElEctric_EyE · Post by **ElEctric_EyE** » Sat Jan 05, 2013 9:35 pm

At the risk of sounding self-important, I wish to say that I am on a firm path to get my old job back, so stress levels are beginning to decrease.

Soon, I hope to get back into the project and make a useful contribution with a fresh point of view.

BigEd · Post by **BigEd** » Sat Jan 05, 2013 9:52 pm

Good news!

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Jan 15, 2013 3:46 am

Thanks Ed.

I think the next step is to assemble another parallel video board and then to experiment with the digital mixing, even if this means plotting some pixels by plotting a scrolling character and sending data to be mixed into the next board. The next board would scroll another character of a different color 'through' the previous video board's output. Monochrome is easy IIRC, one can just use the XOR math function.
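
As a rough sketch of the monochrome case, assuming a 1-bit pixel arriving from the previous board and a 1-bit pixel generated locally (the module and signal names are made up for this example):

```verilog
// Hypothetical monochrome mixer: XOR the incoming pixel with the local one.
module mono_mix (
    input  wire pix_in,     // pixel from the previous video board (assumed)
    input  wire pix_local,  // pixel generated on this board (assumed)
    output wire pix_out     // mixed pixel passed on to the next stage
);
    // Where only one source is lit, the pixel is lit.
    // Where both are lit, the pixel is inverted back to dark.
    assign pix_out = pix_in ^ pix_local;
endmodule
```

With XOR, the overlap of the two characters shows up as dark pixels, which is what makes one character appear to pass 'through' the other.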

I'm going to continue on my initial path of utilizing the external SyncRAM to store the bitmap, and also add the 65Org16.b CPU into the design along with a bi-directional data path to the SyncRAM. It will be interesting to observe the top speed limitations.

ElEctric_EyE · Post by **ElEctric_EyE** » Wed Jan 16, 2013 5:40 pm

So my first step was to do a block diagram with all signals and signal directions present. I tried to just start typing in the Verilog for the top-level interconnects, but found there were just too many signals to lay it all out from memory.

The idea for the 'initial' operation is for the CPU to be able to read and write to the external synchronous RAM during the horizontal and vertical retrace periods. All the square blocks will be the individual Verilog modules, although I may have to add another module to accommodate the bidirectional data bus of the SRAM. Also, I expect the timing will be off as I use a common clock for everything, with no provisions for the delay of the FPGA and SRAM, but I would expect to see some recognizable activity if the .b core software is actually running. I will just try to clear the video RAM for the first test.

This part of the project is basically a compilation of everything I've learned from previous FPGA projects using 6502 soft cores with blockRAM together with what I've most recently learned about video using the parallel video boards.

ElEctric_EyE · Post by **ElEctric_EyE** » Thu Jan 24, 2013 1:14 am

I've added the 2 blockRAMs for the .b core zero-page and stack, but not the ROM yet. R1 and R2 are the FPGA blockRAM databus outputs. Also, that ORs module in the above pic turned out to be an extremely simple

Code: Select all

assign cpuDI = R1Dout | R2Dout | VramDout;

line of code inside the top_level module, so a separate ORs module wasn't even needed. This is what attracts me to Verilog; building the equivalent structure with schematic entry, ORing 4 16-bit databus inputs into 16 outputs, was a PITA. Now, using Verilog, it takes 1 simple line of code!

So, I've yet to add the ROM and address decoding module and to finish the SRAMif module before some real world testing can begin. I plan on 1024x768 with a 70MHz system clock.

Also, screw that part of trying to write to the video RAM when HSYNC & VSYNC were inactive, only because the cpu would have to be interrupted or it would have to read some bits from a port when HSYNC or VSYNC was inactive. At this point the 'snow effect' should be acceptable before transitioning to a FIFO buffer. I will need help with this though!

MichaelM · Post by **MichaelM** » Sat Jan 26, 2013 4:17 am

Most FPGAs no longer support internal tri-state busses. There are some respected sources that recommend against using internal tri-state buffers. I, on the other hand, believe in letting the tools transform internal tri-state busses (in FPGAs) into the appropriate implementation. I find it error prone and tedious to specify each and every bus, and to explicitly define the required multiplexers.

EEyE's block diagram/schematic in an earlier post is an example of the structure I try to avoid. I am not saying it is incorrect, or otherwise a bad thing to do. I am simply saying that I try to avoid having to explicitly manage the buses and the required bus multiplexers as shown in EEyE's block diagram/schematic.

To accomplish this I generally create an internal "tri-state" bus, which can't be implemented in the FPGA. This "tri-state" bus allows me to create a single common port on all modules that will connect to the "bus". In this manner, the synthesizer is implicitly increasing or decreasing the size of the OR gate that combines all of the buses together as EEyE shows in his block diagram/schematic. Because the FPGA itself has no way of implementing a "tri-state" bus, the synthesizer has to transform the Verilog "tri-state" bus into a multiplexer of some sort and use point to point connections to tie everything together.

One way the multiplexer can be constructed is illustrated by EEyE's block diagram/schematic: a module with an enable and an OR gate to collect all of the connections together. An AND gate, when not enabled, outputs a logic 0, and an OR gate only outputs a logic 0 when all of its inputs are 0. Thus, in EEyE's circuit, the three memories are mutually exclusively selected. If unselected, their outputs are forced to logic 0. The OR gate samples each of the memories' output data. Since only one is selected at a time, there is no problem correctly resolving the logic level of the output of the enabled memory.
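
The AND-OR structure just described can be written out explicitly. The following is only a sketch under assumed names (`and_or_mux`, `sel_*`, `do_*`) and an assumed 16-bit width. It is not taken from EEyE's design.

```verilog
// Hypothetical explicit AND-OR multiplexer for three memory outputs.
module and_or_mux #(
    parameter W = 16                 // data width (assumed for illustration)
) (
    input  wire         sel_a, sel_b, sel_c,  // assumed mutually exclusive
    input  wire [W-1:0] do_a, do_b, do_c,     // memory data outputs
    output wire [W-1:0] cpu_di                // combined bus into the CPU
);
    // AND stage: an unselected memory contributes all zeros.
    wire [W-1:0] gated_a = do_a & {W{sel_a}};
    wire [W-1:0] gated_b = do_b & {W{sel_b}};
    wire [W-1:0] gated_c = do_c & {W{sel_c}};

    // OR stage: with at most one select active, the output equals the
    // data of the selected memory, or zero if nothing is selected.
    assign cpu_di = gated_a | gated_b | gated_c;
endmodule
```

Adding a fourth memory to this form means touching the port list, adding another gated wire and widening the OR by hand, which is the bookkeeping the single-bus approach avoids.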

An alternative is to define a single bus. Let's name it something like EEyE has named the output of the OR gate, CPU_DI. At each memory, I would connect its output data to CPU_DI using a tri-state construction:

Code: Select all

localparam pDataWidth = 32;

wire    [9:0] Addrs;
wire    Clk;            // clock, assumed to be driven elsewhere
wire    CPU_WE, CPU_RE;
wire    [(pDataWidth - 1):0] CPU_DI, CPU_DO;

reg     CS_RAM_A, CS_RAM_B, CS_RAM_C;

wire    WE_RAM_A, OE_RAM_A;
wire    WE_RAM_B, OE_RAM_B;
wire    WE_RAM_C, OE_RAM_C;

reg     [(pDataWidth - 1):0] RAM_A [0:255];
reg     [(pDataWidth - 1):0] RAM_B [0:255];
reg     [(pDataWidth - 1):0] RAM_C [0:511];

reg     [(pDataWidth - 1):0] RAM_A_DO, RAM_B_DO, RAM_C_DO;

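// Address decode. This is combinational logic, so blocking assignments are used.
always @(*)
begin
      casex(Addrs[9:8])
            2'b00 : {CS_RAM_A, CS_RAM_B, CS_RAM_C} = {1'b1, 1'b0, 1'b0};
            2'b01 : {CS_RAM_A, CS_RAM_B, CS_RAM_C} = {1'b0, 1'b1, 1'b0};
            2'b1x : {CS_RAM_A, CS_RAM_B, CS_RAM_C} = {1'b0, 1'b0, 1'b1};
            default : {CS_RAM_A, CS_RAM_B, CS_RAM_C} = {1'b0, 1'b0, 1'b0};
      endcase
end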

assign WE_RAM_A = CS_RAM_A & CPU_WE;
assign WE_RAM_B = CS_RAM_B & CPU_WE;
assign WE_RAM_C = CS_RAM_C & CPU_WE;

assign OE_RAM_A = CS_RAM_A & CPU_RE;
assign OE_RAM_B = CS_RAM_B & CPU_RE;
assign OE_RAM_C = CS_RAM_C & CPU_RE;

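// RAM_A: 256 words, indexed by the low 8 address bits.
// Write data comes from CPU_DO; CPU_DI is the shared "tri-state" read bus.
always @(posedge Clk)
begin
     if(WE_RAM_A)
         RAM_A[Addrs[7:0]] <= CPU_DO;
     RAM_A_DO <= #1 RAM_A[Addrs[7:0]];
end

assign CPU_DI = ((OE_RAM_A) ? RAM_A_DO : {pDataWidth{1'bZ}});

// RAM_B: 256 words, indexed by the low 8 address bits.
always @(posedge Clk)
begin
     if(WE_RAM_B)
         RAM_B[Addrs[7:0]] <= CPU_DO;
     RAM_B_DO <= #1 RAM_B[Addrs[7:0]];
end

assign CPU_DI = ((OE_RAM_B) ? RAM_B_DO : {pDataWidth{1'bZ}});

// RAM_C: 512 words, indexed by the low 9 address bits.
always @(posedge Clk)
begin
     if(WE_RAM_C)
         RAM_C[Addrs[8:0]] <= CPU_DO;
     RAM_C_DO <= #1 RAM_C[Addrs[8:0]];
end

assign CPU_DI = ((OE_RAM_C) ? RAM_C_DO : {pDataWidth{1'bZ}});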

In this example, the OR gate is implicitly created by the synthesizer.

This approach may or may not help, but it is the technique that I use when I want to connect a varying number of modules/components together. The approach (1) is automatically transformed into cascaded AND-OR gates as discussed above, (2) automatically accounts for all of the connections, and (3) keeps me from having to define the multiplexer and/or the variable width OR manually.
