Concept & Design of 3.3V Parallel 16-bit VGA Boards

ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards

Post by ElEctric_EyE »

Forgot the '2*' in

Code: Select all

radiusError+=2*(y-x+1);
Now it is round. I also adjusted the HSize on the monitor to make up for the 1024x1024, non-4:3 ratio. Not sure why the cube on the right appears lighter in color; I will have to investigate that.
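
For anyone following along, here is a minimal C sketch of the midpoint circle loop that the corrected line belongs to. It assumes the routine follows the common integer form of the algorithm; `circle_octant` and its buffer arguments are made-up names for illustration, and it only collects the first-octant points instead of plotting them.

```c
#include <stddef.h>

/* Collect the first-octant points of a circle of the given radius,
 * centred on the origin, using the integer midpoint algorithm.
 * Returns the number of points written (at most max).
 * Hypothetical helper: the real routine plots pixels instead. */
size_t circle_octant(int radius, int *xs, int *ys, size_t max)
{
    int x = radius;
    int y = 0;
    int radiusError = 1 - x;
    size_t n = 0;

    while (x >= y && n < max) {
        xs[n] = x;
        ys[n] = y;
        n++;

        y++;
        if (radiusError < 0) {
            radiusError += 2 * y + 1;        /* step in y only */
        } else {
            x--;
            radiusError += 2 * (y - x + 1);  /* step in y and x; the '2*' is the fix */
        }
    }
    return n;
}
```

The other seven octants follow by mirroring each point as (±x, ±y) and (±y, ±x) about the centre.
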
Attachments
P1040062.JPG

Post by ElEctric_EyE »

Fixed now. It was a delay problem, which I fixed by shifting the SyncRAM clock and videoDAC clock by 315deg.
Now the software sends line coordinates to plot the cube in hardware, sends 3 coordinates to draw the circle in hardware, and copy/pastes a section in software.
The points you see at the very top of the circle and the very left of the circle are the result of removing the multiplication in the algorithm. I put the multiply back.
Now to continue marching towards ellipses.
Attachments
P1040069b.JPG

Post by ElEctric_EyE »

After some more testing, I am suspecting I have an issue that has been around since the beginning of the project.
The pixels are too thick.
It is deceiving because a diagonal line seems smooth as it should be @1024x1024 resolution.
I suspect this because once in a while during previous testing, erroneous pixels would be present. These pixels would be very small, maybe 1/4 size, hard to tell...
I had attributed the pixel size difference to brightness/color of the pixel, or maybe a fuzzy focus of the monitor.
Now, I don't think this is the case, especially since more testing has been successfully concluded.

My apologies to 8bit, as I used his circle routine and I thought it was the software double plotting. I'm pretty certain it is my hardware.

Post by ElEctric_EyE »

ElEctric_EyE wrote:
...I'm pretty certain it is my hardware.
Idiot! The SyncRAM interface module and cpu/blockrom/blockram were all running at 1/2 the speed of the videoDAC and SyncRAM clocks, in order to read data correctly from the SyncRAM. Of course it's going to read 2 data words, and also write 2 data words, per WE pulse from the cpu...

I am back to running the entire system @80MHz pixel clock, and it passes synthesis.
As in the past, PlotGen module & CPU writing are good. CPU reading is not correct yet. :evil:

EDIT: BTW, the entire system running @25MHz pixel clk 640x480 is yielding the same results. This is actually encouraging and tells me my Verilog bidirectional interface to the SyncRAM is wrong. I will focus on this interface over the next two days. Need to do some more research on bidirectional Verilog.

Post by ElEctric_EyE »

This is a pic taken from the CY7C1463 2Mx18 SyncRAM datasheet:
It looks like everything is timed on the negative edge of the clock. All the code I've written is timed on the posedge.
Is this going to be a problem? Or maybe this is why I've seemed to have some success phase shifting the clock by certain amounts?
Attachments
SyncRAM timing.jpg

Post by ElEctric_EyE »

I think I may have answered my own question. Reads from video memory by the cpu are finally working, now that everything is clocked at the same 25MHz pixel clock!
First, to explain my usage of clocks inside the FPGA: I use 3 separate taps off of an internal PLL for flexibility. The clock signal going to the SyncRAM is phase shifted 90deg:

Code: Select all

PLL_BASE
  #(.BANDWIDTH              ("OPTIMIZED"),
    .CLK_FEEDBACK           ("CLKFBOUT"),
    .COMPENSATION           ("SYSTEM_SYNCHRONOUS"),
    .DIVCLK_DIVIDE          (2),							//100MHz/2=50MHz*14=700MHz/10=70MHz for 1024x768
    .CLKFBOUT_MULT          (14),						//100MHz/2=50MHz*14=700MHz/28=25MHz for 640x480
    .CLKFBOUT_PHASE         (0.000),					//100MHz/2=50MHz*18=900MHz/60=15MHz for 320x200
    .CLKOUT0_DIVIDE         (28),							//video pixel clk
    .CLKOUT0_PHASE          (0.000),
    .CLKOUT0_DUTY_CYCLE     (0.500),
	 .CLKOUT1_DIVIDE         (28),							//cpu&system clk
    .CLKOUT1_PHASE          (0.000),
    .CLKOUT1_DUTY_CYCLE     (0.500),
	 .CLKOUT2_DIVIDE         (28),							//SyncRAM clk
    .CLKOUT2_PHASE          (90.000),
    .CLKOUT2_DUTY_CYCLE     (0.500),
    .CLKIN_PERIOD           (10.0),
    .REF_JITTER             (0.010))
CLK0 goes to the HVSync module and offchip to the VideoDAC
CLK1 goes to the cpu, reset logic, BRAMs.
CLK2 goes offchip to the SyncRAM.
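
The arithmetic in the PLL comments can be checked with a small C sketch. `pll_out_mhz` is a made-up helper name, and it only models the divide/multiply/divide chain described in those comments, not the PLL's frequency limits.

```c
/* Output frequency of one PLL tap in MHz, following the comments above:
 * f_out = f_in / DIVCLK_DIVIDE * CLKFBOUT_MULT / CLKOUTn_DIVIDE.
 * Hypothetical helper; it does not check any device limits. */
double pll_out_mhz(double f_in_mhz, int divclk_divide,
                   int clkfbout_mult, int clkout_divide)
{
    double f_vco_mhz = f_in_mhz / divclk_divide * clkfbout_mult;
    return f_vco_mhz / clkout_divide;
}
```

With the values shown, `pll_out_mhz(100.0, 2, 14, 28)` gives the 25MHz used for 640x480, and a divider of 10 gives the 70MHz noted for 1024x768. At 25MHz the period is 40ns, so the 90deg shift on CLK2 delays the SyncRAM clock by a quarter period, 10ns.
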

The bit of software that copies/pastes a chunk (76x76 pixel cube) of video memory from (300,375) to (400,375):

Code: Select all

                  LDA #300
                  STA SCRLO               ;x position
                  LDA #375
                  STA SCRHI               ;y position
                  LDX #76                ;use X reg for easy test for end of y position
CHSH2             LDY #$00
                  LDWi $0064              ;LDW #100
CHSH              LDA (SCRLO),Y
                  STAiw (SCRLO),W         ;STA (SCRLO),W
                  INW
                  INY
                  CPY #76
                  BNE CHSH
                  INC SCRHI
                  DEX
                  BNE CHSH2
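
As a rough C model of what that loop does (a sketch only: `vram_index` and `copy_block` are made-up names, and the address layout of Y in bits 19:10 and X in bits 9:0 is assumed from the SRAMif address mapping):

```c
#include <stdint.h>

/* Linear video RAM index for pixel (x, y): Y in bits 19:10, X in bits 9:0.
 * Assumed layout, 1024 pixels per row. */
uint32_t vram_index(uint32_t x, uint32_t y)
{
    return ((y & 0x3FFu) << 10) | (x & 0x3FFu);
}

/* Copy a w-by-h block whose top-left corner is (sx, sy) to (sx + dx, sy),
 * one row at a time, mirroring the Y (source) and W (destination) offsets. */
void copy_block(uint16_t *vram, uint32_t sx, uint32_t sy,
                uint32_t dx, uint32_t w, uint32_t h)
{
    for (uint32_t row = 0; row < h; row++) {        /* outer loop: DEX / INC SCRHI */
        for (uint32_t col = 0; col < w; col++) {    /* inner loop: INY / INW */
            uint16_t pixel = vram[vram_index(sx + col, sy + row)];  /* LDA (SCRLO),Y */
            vram[vram_index(sx + dx + col, sy + row)] = pixel;      /* STA (SCRLO),W */
        }
    }
}
```

The assembly corresponds to `copy_block(vram, 300, 375, 100, 76, 76)`: W starts at 100 on every row, so each pixel lands 100 columns to the right of where it was read.
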
Attachments
P1040071.JPG

Post by ElEctric_EyE »

I almost forgot to mention that in this attempt I had used RDY to halt the cpu in my SRAMif module. I will take it out and see if it still works.

Code: Select all

`timescale 1ns / 1ps

module SRAMif( input clk,
					input [15:0] cpuDO,
					input vramCS,						//cpu is reading/writing to videoRAM
					input lineCS,						//lineGen is writing to videoRAM
					input cpuWE,
					input RAMWE,						//LineGen is drawing
					inout [15:0] SRD,
					input [15:0] QACCout,			//pixel color from cpu
					input [9:0] X,						//LSB of LineGen address
					input [9:0] Y,						//MSB
					input [20:0] Vaddr,				//pixel clock address from HVSync Generator
					input [31:0] cpuAB,
					output [20:0] SRaddr,
					output reg [15:0] SRDO,			//to cpu
					output SRWEn,
					input CB0,							//Control Bit 0 from cpu to control page 0 or 1, when page flipping
					input CB1,							//Control Bit 1 from cpu to control page flipping or scrolling
					input CB2,							//Control Bit 2 scrolling horizontal or vertical
					output reg RDY = 1						//RDY = 0 halts cpu
					);

reg SRWEn2;
reg [15:0] SRDI;

always @(posedge clk) begin
	SRWEn2 <= SRWEn;
		if (RAMWE && !vramCS) begin									//hardware module selected
						  SRDO <= 16'h0000;
						  SRDI <= QACCout;
						  RDY <= 1;
		end
		if (RAMWE && (vramCS && cpuWE)) begin						//collision, if both selected favor hardware module
						  SRDO <= 16'h0000;
						  SRDI <= QACCout;
						  RDY <= 1;
		end
		if (RAMWE && (vramCS && !cpuWE)) begin						//collision, favor cpu read from video
						  SRDO <= SRD;
						  SRDI <= 16'hZZZZ;
						  RDY <= 0;
		end
		if (!RAMWE && (vramCS && cpuWE)) begin						//cpu write to video
						  SRDO <= 16'h0000;
						  SRDI <= cpuDO;
						  RDY <= 1;
		end
		if (!RAMWE && (vramCS && !cpuWE)) begin					//cpu read from video
						  SRDO <= SRD;
						  SRDI <= 16'hZZZZ;
						  RDY <= 0;
		end
		if (!RAMWE && !vramCS) begin									//nothing selected
						  SRDO <= 16'h0000;
						  SRDI <= 16'hZZZZ;
						  RDY <= 1;
		end		
end

reg [20:0] cpuABopt;

always @* 													//optimize the videoRAM address for plotting (X,Y) in the (LSB,MSB) for indirect indexed
	 begin													//CB1 = 0, page flipping. Blocking assignments, since this block is combinational
			cpuABopt    [20] = CB0;						//bank bit
			cpuABopt [19:10] = cpuAB [25:16];		//Y[9:0], low 10 bits of the high word
			cpuABopt   [9:0] = cpuAB [9:0];			//X[9:0], low 10 bits of the low word
	end

assign SRWEn = !(RAMWE || (vramCS && cpuWE));		
assign SRaddr = RAMWE ? { CB0, Y, X } : vramCS ? cpuABopt : Vaddr;
assign SRD = SRWEn2 ? 16'hZZZZ : SRDI;				//I/O MUX'd latch to SyncRAM databus. High 'Z' during a read	

endmodule
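
To make the priority in the last three assigns easier to follow, here is a small C model of the address mux and write-enable logic. It is a sketch of the combinational part only: the struct and function names are made up, and it ignores the registered data path and RDY.

```c
#include <stdint.h>

/* Inputs that drive the address and write-enable logic of SRAMif. */
struct sramif_in {
    int      RAMWE;   /* LineGen is drawing */
    int      vramCS;  /* cpu is accessing video RAM */
    int      cpuWE;   /* cpu access is a write */
    int      CB0;     /* page (bank) select bit */
    uint16_t X, Y;    /* LineGen pixel coordinates, 10 bits each */
    uint32_t cpuAB;   /* cpu address: Y in the high word, X in the low word */
    uint32_t Vaddr;   /* scan-out address from the HVSync generator */
};

/* 21-bit SyncRAM address: LineGen wins, then the cpu, then video scan-out. */
uint32_t sramif_addr(const struct sramif_in *in)
{
    uint32_t bank = (uint32_t)(in->CB0 & 1) << 20;

    if (in->RAMWE)
        return bank | ((uint32_t)(in->Y & 0x3FF) << 10) | (uint32_t)(in->X & 0x3FF);
    if (in->vramCS)
        return bank | (((in->cpuAB >> 16) & 0x3FFu) << 10) | (in->cpuAB & 0x3FFu);
    return in->Vaddr & 0x1FFFFFu;
}

/* Active-low write enable: 0 for a LineGen write or a cpu write, else 1. */
int sramif_wen(const struct sramif_in *in)
{
    return !(in->RAMWE || (in->vramCS && in->cpuWE));
}
```

The model shows that whenever RAMWE is high the LineGen address is used, even if the cpu has vramCS asserted at the same time.
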

Post by ElEctric_EyE »

Using the RDY to halt the cpu is not necessary, at 25MHz anyway. Will attempt to ramp speeds up now...
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands

Post by Arlet »

ElEctric_EyE wrote:
This is a pic taken from the CY7C1463 2Mx18 SyncRAM datasheet:
It looks like everything is timed on the negative edge of the clock. All the code I've written is timed on the posedge.
It's based on the positive edge, but you have to respect the setup/hold times. If there's a non-zero setup time, the clock must be delayed with respect to the signals.
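
A simplified way to picture this, as a C sketch: the function below checks whether a RAM clock edge, delayed by some amount after the FPGA clock edge, lands inside the window where the newly launched signals are stable. `capture_ok` and all of its parameters are made up for illustration; the timing values must come from the datasheets, and board trace delay and clock skew are ignored.

```c
/* Returns 1 if a RAM clock edge delayed by delay_ns after the FPGA clock edge
 * samples the newly launched data cleanly, 0 otherwise.
 * All timing figures are caller-supplied; none are taken from a datasheet. */
int capture_ok(double period_ns, double tco_min_ns, double tco_max_ns,
               double setup_ns, double hold_ns, double delay_ns)
{
    /* Data launched at t = 0 is stable from tco_max until the next launch
     * starts to change it at period + tco_min. */
    double stable_from  = tco_max_ns;
    double stable_until = period_ns + tco_min_ns;

    int setup_met = (delay_ns - setup_ns) >= stable_from;
    int hold_met  = (delay_ns + hold_ns) <= stable_until;
    return setup_met && hold_met;
}
```

Sweeping `delay_ns` from 0 up to one period shows a window of delays that work, with failures on either side of it.
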

Post by ElEctric_EyE »

Arlet wrote:
It's based on the positive edge, but you have to respect the setup/hold times. If there's a non-zero setup time, the clock must be delayed with respect to the signals.
Thanks for responding. Told you I struggle with timing diagrams, but I see now that the dotted line is the clue that it is posedge clocked...


I've now done many tests changing the SyncRAM clock phase, i.e. delay, and I think it's going to be real tight with a 70MHz clock. Tight meaning a <15deg difference between a working and a non-working design, if it works at all. 70MHz = 14.3ns cycle time, and 15deg is 1/24 of a cycle (360deg/15deg = 24), so that is a shift of about 14.3ns/24 = 0.6ns.
If it takes 2 cycles for 1 transaction, that cuts the effective 133MHz max speed of the SyncRAM down to 66.5MHz which is <70MHz. So it's probably not going to work. Am I thinking clearly on this matter?

EDIT: There are very definite, reliable changes in observed behavior when changing the delay of the clock by only about 0.6ns (15deg). I've covered almost the entire 360deg range, in 15deg intervals, starting at 360. Right around 90deg is where the biggest changes are observed. No luck. I lowered the pixel clock to 64.3MHz on a hunch, still a good picture with 48.5kHz HSync and 60.3Hz VSync. Going in for the kill...

Post by ElEctric_EyE »

ElEctric_EyE wrote:
...it takes 2 cycles for 1 transaction, that cuts the effective 133MHz max speed of the SyncRAM down to 66.5MHz...
This seems to be the case. Taking the delay of the FPGA and other factors into consideration now, the max speed seems to be in the neighborhood of 45MHz. All delays totalled must be greater than 20ns. Sort of disappointing. I couldn't find the sweet spot for the delay running @50MHz, although I may try again now that I've seen the size of this window and the actual delay that seems to work.
An interesting thing about SyncRAM: nothing is random if the timing is off, unlike with asynchronous RAM. The pixels will be in the wrong place, but consistently so.
So at 45MHz (22.2ns period) each degree of delay for the SyncRAM clock is 22.2ns/360=.062ns. I think I calculated the degree of delay incorrectly in the above post. Here are my observations @45MHz, 800x600 resolution, read-write-read:
156-157degrees: 1 pixel misplaced
153-155degrees: looks like 1/2pixel aberrations. barely noticeable, but no misplaced pixels
149-152degrees: looks best
148degrees: misplaced pixels everywhere
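
The degrees-to-nanoseconds conversion used in these posts can be written as a short C sketch (`period_ns` and `phase_delay_ns` are made-up helper names):

```c
/* Clock period in ns for a frequency given in MHz. */
double period_ns(double f_mhz)
{
    return 1000.0 / f_mhz;
}

/* Delay in ns produced by shifting a clock of f_mhz by phase_deg degrees. */
double phase_delay_ns(double f_mhz, double phase_deg)
{
    return period_ns(f_mhz) * phase_deg / 360.0;
}
```

At 45MHz one degree is about 0.062ns, so the 149-152 degree range that looks best corresponds to a delay of roughly 9.2ns to 9.4ns. For comparison, a 15 degree step at 70MHz is about 0.6ns.
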

A couple other considerations: drive = 4 and slew = slow for the Spartan 6.

A test I would like to do next is to copy a small program from the BRAM and paste it in the video memory and jump to it and see if it works correctly. :)

Post by ElEctric_EyE »

Drive of 4mA is too low for this SyncRAM. Just changing the screen color to black from blue resulted in a lot of bad pixels.
Increasing drive to 6mA cleared 99% of them. More testing is needed to find the most efficient drive strength.

EDIT: Observations are: Changing slew rate from slow to fast had no effect. Changing drive strength from 4mA to 6mA had a large effect, as mentioned. Increasing drive strength from 6mA to either 8mA or 12mA had negligible effect. So, I continue forward with 6mA and more testing.

Post by ElEctric_EyE »

Another problem: I changed the FPGA output pins driving the videoDAC and monitor signals from 6mA to 12mA with no change. Maybe I need the impedance matching resistors now? The yellow is disappearing into the white. Should be green cubes on white background.
I'm done for today, thoroughly worn out!
Attachments
P1040074.JPG

Post by ElEctric_EyE »

ElEctric_EyE wrote:
...A test I would like to do next is to copy a small program from the BRAM and paste it in the video memory and jump to it and see if it works correctly. :)
I chose to use Bruce's C'mon. Again...
This time only a few variables need to be set, and a lot then happens in the video hardware.

Post by ElEctric_EyE »

Hmmm... I realize this is a bit complicated, as I have to assemble the program for the location in video memory, and I would also have to mimic an input. Maybe I can work my way up to C'MON. I did have it working direct from the BRAM when I was using a small TFT module...

For today, I have a goal of putting a small simple program in video memory. I chose a 16-bit hex value character plotter. It reads a value from the random number generator module, then jumps to the PLTCHR routine which is in the BRAM. It is a modified program which was used to display a 32-bit timer value when I was using a timer module, so forgive the comments.

I'm sure there's a better way? But for now I will assemble it for $00000244, which is @(0,580) on the screen and hand type in the values.

Code: Select all

PLTRNG            LDA $C0000000           ; READ RNG VALUE
                  PHA
                  PHA
                  PHA
                  PHA
                  
                  CLC
                  LDX TXSTART
                  PLA                     ;TENS
                  AND #%1111000000000000  ;GET TENTHS OF A SECOND BCD DIGIT
                  LSRAopA12               ;SHIFT IT RIGHT
                  TAY
                  LDA HEXLUT,Y              
                  LDY TY
                  JSR PLTCHR
                  
                  CLC
                  LDA TXSTART
                  ADC #8
                  TAX
                  PLA                     ;HUNDREDS
                  AND #%0000111100000000
                  LSRAopA8
                  TAY
                  LDA HEXLUT,Y
                  LDY TY
                  JSR PLTCHR
                  
                  CLC
                  LDA TXSTART
                  ADC #16
                  TAX
                  PLA                     ;THOUSANDS
                  AND #%0000000011110000
                  LSRAopA4
                  TAY
                  LDA HEXLUT,Y
                  LDY TY
                  JSR PLTCHR
                  
                  CLC
                  LDA TXSTART
                  ADC #24
                  TAX
                  PLA                     ;TEN THOUSANDS
                  AND #%0000000000001111
                  TAY
                  LDA HEXLUT,Y
                  LDY TY
                  JSR PLTCHR
                  
                  RTS
Then a small program will copy the values starting @ $00000244.
Then a few lines to init the subroutine so it will plot the value @(400,0):

Code: Select all

                  LDA #400
                  STA TXSTART
                  LDA #0
                  STA TY
                  JSR $00000244
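
In C terms, the PLTRNG routine above boils down to the sketch below: split the 16-bit value into four hex digits, most significant first, and place each one 8 pixels to the right of the previous one. `hex_digits` is a made-up name, and the contents of the lookup table are an assumption, since HEXLUT itself is not shown in the post.

```c
#include <stdint.h>

/* For each of the four hex digits of value (most significant first), store
 * the character to draw and the x position to draw it at. Characters are
 * assumed to be 8 pixels wide, matching the ADC #8 / #16 / #24 steps. */
void hex_digits(uint16_t value, int txstart, char chars[4], int xpos[4])
{
    static const char HEXLUT[] = "0123456789ABCDEF";  /* assumed table contents */

    for (int i = 0; i < 4; i++) {
        int shift = 12 - 4 * i;                     /* 12, 8, 4, 0 */
        chars[i] = HEXLUT[(value >> shift) & 0xF];  /* AND mask, shift, table lookup */
        xpos[i]  = txstart + 8 * i;                 /* TXSTART, +8, +16, +24 */
    }
}
```

With TXSTART set to 400 as in the init code, the four digits land at x = 400, 408, 416 and 424.
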