Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Tue Feb 25, 2014 6:09 pm
by ElEctric_EyE
Next step is to put a blockRAM between the onboard 65Org16.b and the hardware pixel generator.
It will be a wide blockRAM but not too deep, maybe 8 addresses at this point. The blockRAM data bus will accommodate the required 6-bit address and also the 16-bit data going to the pixel generator module. It will be write-only to start, getting its programming from the 65Org16.b.
After this stage is successful, the 65Org16.b will be removed from each of the PVBs, and all of them will then receive commands from offboard.
This is why the address is 6 bits wide. These are the current registers:
Code: Select all
// Register addresses
always @(posedge clk2) //write to registers
  if (cpuWE & vgaCS)
    case ( cpuAB [5:0] )
      6'b000000: x0t <= cpuDO; //variables for line generator
      6'b000001: y0t <= cpuDO;
      6'b000010: x1t <= cpuDO;
      6'b000011: y1t <= cpuDO;
      6'b000100: xc <= cpuDO; //variables for circle generator
      6'b000101: yc <= cpuDO;
      6'b000110: rad <= cpuDO;
      6'b000111: color <= cpuDO; //pixel color variable for line/circle/fill/pixel plot
      6'b001000: bXlen <= cpuDO; //X length of blitter
      6'b001001: bYlen <= cpuDO; //Y length of blitter
      6'b001010: bXc <= cpuDO; //X start copy
      6'b001011: bYc <= cpuDO; //Y start copy
      6'b001100: bXp <= cpuDO; //X start paste
      6'b001101: bYp <= cpuDO; //Y start paste
      6'b001110: Xp <= cpuDO; //X for pixel plot
      6'b001111: Yp <= cpuDO; //Y for pixel plot
      //6'b010000 reserved for hoffset on previous PVB's
      //6'b010001 reserved for voffset on previous PVB's
      6'b010010: fXlen <= cpuDO; //X length of fill
      6'b010011: fYlen <= cpuDO; //Y length of fill
      6'b010100: fXs <= cpuDO; //X start
      6'b010101: fYs <= cpuDO; //Y start
      6'b010110: htiming[VIDEO] <= cpuDO; // htiming[VIDEO] = 1920;
      6'b010111: htiming[FRONT] <= cpuDO; // htiming[FRONT] = 160;
      6'b011000: htiming[SYNC] <= cpuDO; // htiming[SYNC] = 438;
      6'b011001: htiming[BACK] <= cpuDO; // htiming[BACK] = 120;
      6'b011010: vtiming[VIDEO] <= cpuDO; // vtiming[VIDEO] = 1080;
      6'b011011: vtiming[FRONT] <= cpuDO; // vtiming[FRONT] = 44;
      6'b011100: vtiming[SYNC] <= cpuDO; // vtiming[SYNC] = 85;
      6'b011101: vtiming[BACK] <= cpuDO; // vtiming[BACK] = 2;
      6'b011110: cX <= cpuDO; //X for character plot
      6'b011111: cY <= cpuDO; //Y for character plot
      6'b100000: cC <= cpuDO; //background pixel color
      6'b100001: Charsize <= cpuDO;
      6'b100010: Att <= cpuDO; //Attributes [15:8], 8-bit character value [7:0]
    endcase
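As a quick check on the 6-bit width: the highest register address in the list above is 6'b100010 (Att), which no longer fits in 5 bits. The short Python sketch below only illustrates that arithmetic; the function name is hypothetical and nothing here is part of the actual design.

```python
# Highest register address taken from the case statement (6'b100010, Att).
HIGHEST_REGISTER_ADDRESS = 0b100010


def address_bits_needed(highest_address):
    """Number of address bits required to reach highest_address."""
    return max(1, highest_address.bit_length())


bits = address_bits_needed(HIGHEST_REGISTER_ADDRESS)
print("address bits needed:", bits)
```

With 5 bits only addresses 0 to 31 are reachable, so cC, Charsize and Att are what push the decode to cpuAB[5:0].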
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Tue Mar 18, 2014 7:11 pm
by ElEctric_EyE
I decided to take a bit of a rest from the project this past week.
My thoughts were on focusing efforts on designing the controller board, but I never did get into it... There was an itch for the rotating cube I had wanted to do since the beginning of this project. EDIT: Andre's video in this thread was my original inspiration.
Earlier I had wanted to do a hardware ellipse plotter, but I now think this may be best suited to a cpu plotting it in 'scratchpad' memory. The only issue I foresee is that the coordinates will have to be sorted, or somehow generated in the proper order.
So I went back to work on the project today with this in mind and was able to fit 2x 16Kx11 block RAMs (11 bits for 2K resolution), with total blockRAM usage now up to 90%. One RAM holds the X coordinates and the other holds the Y coordinates. The cpu only has access to the 'scratchpad' and can read or write to it. So far it works, plotting a simple line.
Now to make things more exciting and non-linear, I'll use the ingenious pseudo sine wave generator I remember seeing here on the forums.
Simple line drawing test using the 'scratchpad' blockRAMs:
Code: Select all
LDX #255
LDY #0
graph TXA
STA scratchx,Y
TYA
STA scratchy,Y
INY
DEX
BNE graph
LDX #2
LDA CLUT,X
STA color
LDX #0
plot LDA scratchx,X
STA Xp
LDA scratchy,X
STA Yp ;auto plot pixel and halt cpu until done
INX
CPX #255
BNE plot
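For anyone who would rather not trace the assembly, here is a small Python model of what the two loops above appear to do: the first fills the scratchpad with X counting down from 255 and Y counting up from 0, and the second plots each stored pair. The Python names are hypothetical, and the hardware side effect (writing Yp triggers the pixel plot) is only modeled as a list append.

```python
def fill_scratchpad():
    """Mirror the 'graph' loop: X counts down from 255, Y counts up from 0."""
    scratchx, scratchy = [], []
    x, y = 255, 0
    while True:
        scratchx.append(x)   # TXA / STA scratchx,Y
        scratchy.append(y)   # TYA / STA scratchy,Y
        y += 1               # INY
        x -= 1               # DEX
        if x == 0:           # BNE graph falls through once X reaches 0
            break
    return scratchx, scratchy


def plot_points(scratchx, scratchy):
    """Mirror the 'plot' loop: indices 0..254, one pixel per stored pair."""
    plotted = []
    for i in range(255):     # INX / CPX #255 / BNE plot
        plotted.append((scratchx[i], scratchy[i]))  # STA Xp / STA Yp
    return plotted


points = plot_points(*fill_scratchpad())
```

Every point satisfies x + y = 255, which is the simple diagonal line the test is meant to show.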
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Wed Mar 19, 2014 5:30 pm
by ElEctric_EyE
I gave up early on the software sine generator, in favor of the Xilinx CORDIC Sin/Cos hardware engine. I had success very early on, although it does push the project's occupied slices to 92%.
The CORDIC Sin/Cos engine was very easy to interface to. I set it up for a bit width of [10:0] and optimal pipeline mode; a cycle after it receives an 11-bit phase input, it outputs 11-bit values on the x_out and y_out lines. I sort of rushed through it, so I'll have to go back and re-read the CORDIC data sheet.
I made a program to put the sine wave data into the scratchpad X&Y RAMs, then it draws lines from the center of the screen to the coordinates in the RAMs.
Code: Select all
;--------- sine wave generator
LDX #0
LDY #0
sinwave1 STX phase
TYA
STA scratchx,Y
LDA sinyout
CLC
ADC #512
STA scratchy,Y
INX
INY
CPY #1023
BNE sinwave1
LDX #1060
sinwave2 STX phase
TYA
STA scratchx,Y
LDA sinxout
CLC
ADC #512
STA scratchy,Y
INX
INY
CPY #2047
BNE sinwave2
;--------- plot
LDX #2
LDA CLUT,X
STA color
LDA #1920/2
STA lx0
LDA #1080/2
STA ly0
LDY #0
pline LDA scratchx,Y
STA lx1
LDA scratchy,Y
STA ly1
INY
CPY #1919
BNE pline
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Wed Mar 19, 2014 6:21 pm
by BigEd
So you get single-cycle trig functions at 11-bit precision?
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Wed Mar 19, 2014 6:43 pm
by ElEctric_EyE
Here is the CORDIC data sheet.
I was looking at the very bottom of page 15 (also pg.2): "A parallel CORDIC core with N bit output width has a latency of N cycles and produces a new output every cycle."
I use parallel mode, which uses more silicon. There is a serial mode as well, but it requires more cycles and more signals. These are my top-level pin assignments for the .xco file, the bare minimum (I'm experimenting w/higher resolutions again):
Code: Select all
sincos sincos ( .clk(clk),
.phase_in(phase_in [11:0]),
.x_out(x_out [11:0]),
.y_out(y_out [11:0]));
The lightbulb tool only lets you choose 1 function from the several trig possibilities mentioned under Features. It lets you choose the # of bits too. I initially chose 11 bits, but I'm experimenting with more bits and shifting right for less error. I've gone up to 16-bit precision, and Slice resources went from 92% to 96%.
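To make the "N bits, N cycles of latency" statement from the data sheet a bit more concrete, here is a textbook rotation-mode CORDIC in Python. This is only a floating-point sketch of the general algorithm, not the Xilinx core's implementation; the function name and the default of 11 iterations are assumptions chosen to match the 11-bit setup mentioned earlier.

```python
import math


def cordic_sin_cos(angle, iterations=11):
    """Return (cos, sin) of angle in radians, for |angle| <= pi/2."""
    # Elementary rotation angles atan(2^-i), one per iteration.
    step_angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; start from 1/gain to compensate.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        direction = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in a hardware version
        x, y = x - direction * y * shift, y + direction * x * shift
        z -= direction * step_angles[i]
    return x, y


cos_value, sin_value = cordic_sin_cos(0.6)
print(cos_value, sin_value)
```

Each pass through the loop needs only shifts and adds. If a parallel core unrolls one iteration per pipeline stage, that would be consistent with the quoted latency of N cycles while still accepting a new phase every clock.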
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Wed Mar 19, 2014 6:53 pm
by BigEd
Seems pretty handy! Once you've chosen a model of FPGA, might as well fill it up with useful functions.
Cheers
Ed
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Fri Mar 21, 2014 2:03 am
by ElEctric_EyE
Under trig functions, besides the CORDIC generator, there is also the option to use the DDS compiler. I set that up tonight successfully and compared resources used.
Since all I really need here are simultaneous SIN & COS outputs for storage in the X & Y BRAMs for circle/ellipse generation using simple LUTs, the DDS compiler option looks more appealing. Slice LUTs and occupied slices only went up 2% using DDS. I chose to use blockRAM as opposed to distributed RAM. The curve looks much better too. Fewer errors.
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Fri Mar 21, 2014 2:37 am
by ElEctric_EyE
Much easier to manage software-wise:
Code: Select all
;--------- sine wave generator
LDX #0
LDY #0
sinwave1 STX phase
TYA
STA scratchx,Y
LDA sinxout
CLC
ADC #1080/2
STA scratchy,Y
INX
INX
INX
INX
INY
CPY #2047
BNE sinwave1
;------plot
LDX #2
LDA CLUT,X
STA color
LDA #1920/2
STA lx0
LDA #1080/2
STA ly0
LDY #0
pline LDA scratchx,Y
STA lx1
LDA scratchy,Y
STA ly1
INY
CPY #1919
BNE pline
Again, my weak line drawing algorithm is not suitable for drawing across the entire 1920x1080 resolution yet. This is why the origin of the lines is @(1920/2, 1080/2).
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 24, 2014 11:40 am
by ElEctric_EyE
Oh yeah! I've got it: The KEY.
Code: Select all
;--------- Circle/Ellipse Generator from sine wave LUT
LDWi $0000 ;LDW #0
LDX #0
LDY #50
sinwave1 STX phase
LDA sinout
STAaw scratchx ;STA scratchx,W
STY phase
LDA cosout
STAaw scratchy ;STA scratchy,W
CPX #2047
BNE XNZ
LDX #0
XNZ INX
CPY #2047
BNE YNZ
LDY #0
YNZ INY
INW
CPWi $07FF ;CPW #2047
BNE sinwave1
LDX #5 ;green
LDA CLUT,X
STA color
LDY #0
plot LDA scratchx,Y
CLC
ADC #1920/2
STA Xp
LDA scratchy,Y
CLC
ADC #1080/2
STA Yp
INY
CPY #2047
BNE plot
Not sure why it seems to be doing every other dot...
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 24, 2014 12:47 pm
by BigEd
Do you have a multiply instruction in this CPU?
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 24, 2014 1:22 pm
by ElEctric_EyE
No, I've not had time to pursue that yet...
Why do you ask? Do you have something in mind?
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 24, 2014 1:30 pm
by BigEd
I was just thinking that multiplication is pretty handy for scaling things. And it's cheap - hardware multipliers already on the FPGA. You may recall I tried to add it to a branch of my 65Org16 but got into a mess and never dug it out completely.
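A rough illustration of why a multiplier helps with scaling: shifts can only divide by powers of two, while a fixed-point multiply followed by a shift allows in-between factors. The sketch below is plain Python with hypothetical function names and an assumed 8-bit fraction; it does not describe any instruction that exists in the 65Org16.

```python
def scale_by_shift(value, shift):
    """Scale by 1 / 2**shift, which is all a chain of LSRs can do."""
    return value >> shift


def scale_fixed_point(value, factor_q8):
    """Scale by factor_q8 / 256 using one multiply and one shift."""
    return (value * factor_q8) >> 8


half_by_shift = scale_by_shift(1000, 1)
half_by_multiply = scale_fixed_point(1000, 128)        # 128/256 = 0.5
three_quarters = scale_fixed_point(1000, 192)          # 192/256 = 0.75
```

The 0.75 case is the kind of factor that shifts alone cannot reach in a single step.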
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 24, 2014 3:04 pm
by ElEctric_EyE
BigEd wrote: "I was just thinking that multiplication is pretty handy for scaling things. And it's cheap - hardware multipliers already on the FPGA. You may recall I tried to add it to a branch of my 65Org16 but got into a mess and never dug it out completely."
Chances are I will get back into that sooner rather than later, as this stage of the project is almost finished. A long time ago I tried once to add a multiply feature, based on your core, and ran into some issues I was not able to resolve.
But as far as this project goes, I would like to put the ellipse/circle generator, which is in software now, into a Verilog module so it can run much faster. After this I will be very close to some good 3-d vector animation!
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Sun Mar 30, 2014 9:21 pm
by ElEctric_EyE
ElEctric_EyE wrote: "...But as far as this project, I would like to put the ellipse/circle generator which is in software now, into a Verilog module so it can run much faster. After this I will be very close to some good 3-d vector animation!"
That was a tangent.
I tried the idea, but it got hairy real quick. After about 2 hrs of writing the Verilog code, simulations revealed that the way I had envisioned this was not going to work due to timing issues, and I was beginning to realize it was not worth the time investment. To compound the issue, I forgot to save the initial working project before the heavy modification. It took 4 days to get back to the last pic with the ellipse, 3 of which were due to a 1-character mistype in the address decoding. I found that error this morning after extensive sims. Stupid mistake, but finding the needle (in the haystack) gives one confidence. So I post...
Long story short: I have 3Kx11 worth (3x 1Kx11) of data values that the cpu modified from an original 1Kx11 sine LUT. I don't think there's any need yet to do this part very fast...
There is 16Kx11 available.
This is how I program it, for those who dare analyze. It's actually cake. The Y index values are cosines in the 'sinwave' routines. The X index values are sines. So by plotting sine/cosine in an X/Y system, you get a circle or ellipse depending on the X/Y index registers. The W index steps through the length of the LUT.
Code: Select all
;--------- Circle/Ellipse Generator from sine wave LUT
LDWi $0000 ;LDW #0
LDX #128
LDY #256
sinwave1 STX phase
LDA sinout
STAaw scratchx ;STA scratchx,W
STY phase
LDA sinout
STAaw scratchy ;STA scratchy,W
CPX #1023
BMI XNZ
LDX #0
XNZ INX
CPY #1023
BMI YNZ
LDY #0
YNZ INY
INW
CPWi $03FF ;CPW #1023
BNE sinwave1
LDWi $0400 ;LDW #1024
LDX #850
LDY #256
sinwave2 STX phase
LDA sinout
STAaw scratchx ;STA scratchx,W
STY phase
LDA sinout
STAaw scratchy ;STA scratchy,W
CPX #1023
BMI XNZ2
LDX #0
XNZ2 INX
CPY #1023
BMI YNZ2
LDY #0
YNZ2 INY
INW
CPWi $07FF ;CPW #2047
BNE sinwave2
LDWi $0800 ;LDW #2048
LDX #0
LDY #256
sinwave3 STX phase
LDA sinout
STAaw scratchx ;STA scratchx,W
STY phase
LDA sinout
STAaw scratchy ;STA scratchy,W
CPX #1023
BMI XNZ3
LDX #0
XNZ3 INX
CPY #1023
BMI YNZ3
LDY #0
YNZ3 INY
INW
CPWi $0BFF ;CPW #3071
BNE sinwave3
LDX #7 ;yellow
LDA CLUT,X
STA color
LDY #0
plot LDA scratchx,Y
LSR
CLC
ADC #150
STA Xp
LDA scratchy,Y
LSR
CLC
ADC #500
STA Yp
JSR DELAY2
INY
CPY #1023
BMI plot
LDX #5 ;green
LDA CLUT,X
STA color
LDY #1024
plot2 LDA scratchx,Y
LSR
LSR
LSR
CLC
ADC #700
STA Xp
LDA scratchy,Y
LSR
LSR
LSR
CLC
ADC #100
STA Yp
JSR DELAY2
INY
INY
INY
INY
CPY #2047
BMI plot2
LDX #1 ;white
LDA CLUT,X
STA color
LDY #2048
plot3 LDA scratchx,Y
LSR
LSR
CLC
ADC #1250
STA Xp
LDA scratchy,Y
LSR
LSR
CLC
ADC #350
STA Yp
JSR DELAY2
INY
INY
CPY #3071
BMI plot3
STALL JMP STALL
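The sine/cosine indexing used in the 'sinwave' routines can be sketched in Python. This is a simplified model under stated assumptions: the table here is a signed sine of amplitude 1023 with 1024 entries (the real LUT format may differ), the wrap is a plain modulo rather than the CPX/BMI sequence, and all names are hypothetical.

```python
import math

LUT_SIZE = 1024     # 1Kx11 sine table, as described in the post
AMPLITUDE = 1023    # assumed signed amplitude, for illustration only

SINE_LUT = [round(AMPLITUDE * math.sin(2 * math.pi * i / LUT_SIZE))
            for i in range(LUT_SIZE)]


def trace(x_phase_start, y_phase_start, steps=LUT_SIZE):
    """Read the same table at two running phases to get (x, y) pairs."""
    points = []
    for w in range(steps):
        x = SINE_LUT[(x_phase_start + w) % LUT_SIZE]
        y = SINE_LUT[(y_phase_start + w) % LUT_SIZE]
        points.append((x, y))
    return points


def radius_range(points):
    """Smallest and largest distance from the origin over all points."""
    radii = [math.hypot(x, y) for x, y in points]
    return min(radii), max(radii)


circle = trace(0, 256)     # a quarter-period apart: sine vs cosine
ellipse = trace(128, 256)  # any other offset gives a tilted ellipse
print(radius_range(circle), radius_range(ellipse))
```

When the two start phases are a quarter of the table apart (256 of 1024), Y is the cosine of X's angle and the radius stays constant, so the trace is a circle. Other offsets, such as 128 versus 256, stretch it into an ellipse.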
So I have 3 vertex modifiers. A simple triangle would be a first test. These ellipses and the circle would not be plotted. Lines would be drawn to their circumferences in some regular order.
BTW, here is the DELAY2 routine, so I can see the pixels plotted at a relatively slow rate. This is still the 65Org16.b @74.25MHz.
Code: Select all
DELAY2 LDX #$007
LXJ LDWi $FFFF
GHJ DEW
BNE GHJ
DEX
BNE LXJ
RTS
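A rough estimate of how long DELAY2 takes per call, sketched in Python. The loop counts (7 outer passes, $FFFF inner passes) and the 74.25MHz clock come from the post; the cycles per inner iteration is an assumption, since DEW is a custom opcode, and the small outer-loop overhead is ignored.

```python
CLOCK_HZ = 74.25e6        # CPU clock from the post
OUTER_PASSES = 7          # LDX #$007
INNER_PASSES = 0xFFFF     # LDWi $FFFF, counted down to zero by DEW/BNE


def delay_seconds(cycles_per_inner_iteration):
    """Approximate DELAY2 duration, ignoring outer-loop and JSR/RTS overhead."""
    inner_iterations = OUTER_PASSES * INNER_PASSES
    return inner_iterations * cycles_per_inner_iteration / CLOCK_HZ


# ASSUMPTION: 5 cycles per DEW + taken BNE, by analogy with DEX/BNE on a 6502.
estimate = delay_seconds(5)
print(round(estimate * 1000, 1), "ms per call")
```

Under that 5-cycle assumption each call is about 31 ms, so plotting roughly a thousand pixels with one delay each would take on the order of half a minute.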
Re: 1080p HD Video on custom FPGA/VDAC/2MBx18 SyncRAM board
Posted: Mon Mar 31, 2014 5:15 pm
by ElEctric_EyE
Alright! Some more progress. Changed the software that writes to each of the scratchpad blockRAMs. Each blockRAM has its own label now... Ellipses are still there for proof, now with a line drawn from each to another.
Heh, this is gonna be good! 3 other PVB's are on standby awaiting updates. They will run in serial, passing their RGB parallel graphics onto this, the final output board.
Sorry for the bad pic. The sun is actually out today which messed with the camera settings.
Code: Select all
;--------- Circle/Ellipse Generator from sine wave LUT
LDWi $0000 ;LDW #0
LDX #128
LDY #256
sinwave1 STX phase
LDA sinout
LSR
CLC
ADC #150
STAaw scratchx1 ;STA scratchx1,W
STY phase
LDA sinout
LSR
CLC
ADC #500
STAaw scratchy1 ;STA scratchy1,W
CPX #1023
BMI XNZ
LDX #0
XNZ INX
CPY #1023
BMI YNZ
LDY #0
YNZ INY
INW
CPWi $03FF ;CPW #1023
BNE sinwave1
LDWi $0000
LDX #850
LDY #256
sinwave2 STX phase
LDA sinout
LSR
LSR
LSR
CLC
ADC #700
STAaw scratchx2
STY phase
LDA sinout
LSR
LSR
LSR
CLC
ADC #100
STAaw scratchy2
CPX #1023
BMI XNZ2
LDX #0
XNZ2 INX
CPY #1023
BMI YNZ2
LDY #0
YNZ2 INY
INW
CPWi $03FF
BNE sinwave2
LDWi $0000
LDX #0
LDY #256
sinwave3 STX phase
LDA sinout
LSR
LSR
CLC
ADC #1250
STAaw scratchx3
STY phase
LDA sinout
LSR
LSR
CLC
ADC #350
STAaw scratchy3
CPX #1023
BMI XNZ3
LDX #0
XNZ3 INX
CPY #1023
BMI YNZ3
LDY #0
YNZ3 INY
INW
CPWi $03FF
BNE sinwave3
LDX #7 ;yellow
LDA CLUT,X
STA color
LDY #0
plot LDA scratchx1,Y
STA Xp
LDA scratchy1,Y
STA Yp
INY
CPY #1023
BMI plot
LDX #5 ;green
LDA CLUT,X
STA color
LDY #0
plot2 LDA scratchx2,Y
STA Xp
LDA scratchy2,Y
STA Yp
INY
INY
INY
INY
CPY #1023
BMI plot2
LDX #1 ;white
LDA CLUT,X
STA color
LDY #0
plot3 LDA scratchx3,Y
STA Xp
LDA scratchy3,Y
STA Yp
INY
INY
CPY #1023
BMI plot3
LDX #5 ;green
LDA CLUT,X
STA color
LDY #0
LDA scratchx1,Y
STA lx0
LDA scratchy1,Y
STA ly0
LDA scratchx2,Y
STA lx1
LDA scratchy2,Y
STA ly1 ;plot line from yellow to green
LDA scratchx2,Y
STA lx0
LDA scratchy2,Y
STA ly0
LDA scratchx3,Y
STA lx1
LDA scratchy3,Y
STA ly1 ;plot line from green to white
LDA scratchx3,Y
STA lx0
LDA scratchy3,Y
STA ly0
LDA scratchx1,Y
STA lx1
LDA scratchy1,Y
STA ly1 ;plot line from white to yellow
STALL JMP STALL