Concept & Design of 3.3V Parallel 16-bit VGA Boards
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
ElEctric_EyE wrote:
I sort of gave up on the IODELAY since I couldn't figure it out.
Anyway, I kept the code, so if you ever need delayed I/O, let me know.
-
ElEctric_EyE
- Posts: 3260
- Joined: 02 Mar 2009
- Location: OH, USA
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Arlet wrote:
...Anyway, I kept the code, so if you ever need delayed I/O, let me know.
In other matters, I am thinking of ways to maximize plotting communication between the CPU and the HVSync modules. Old school 6502 software would have had to repetitively add the horizontal resolution to obtain the Y value and monitor the carry in order to increment the MSB, then add in the X value and again monitor the carry.
Today I was thinking of an idea to implement in Verilog that would be superior as far as execution speed is concerned when plotting individual pixels:
Integrate the X and Y pixel counters from the HVSync module, feed them directly into the 65Org16, and add a plot opcode that triggers when the X counter and Y counter both equal the values held in two X/Y accumulators. The CPU would then automatically write a color value from a third accumulator to the SyncRAM. The only issue is that the CPU runs at a multiple of the pixel clock frequency, so other video timing signals may have to be brought into the CPU directly for it to run its own X and Y counters at that multiple.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
A simple thing you could do is to make a memory interface between CPU and video bitmap that's optimized for the CPU. For instance, if you had a 640x480 resolution, you could map that to a bitmap of 1024x480, or even 65536x480, so you could put the X coordinate in zero page register 'p', and put the Y coordinate in register 'p+1', and just write to the pixel as 'sta (p), y'.
In the FPGA you would take the address, split the bits in X/Y coordinates, and quietly access the memory at [Y * 640 + X] instead.
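As a rough illustration of that translation, here is a minimal Python model of what the FPGA logic would compute. It assumes the 65536x480 layout (X in the low 16 bits of the address, Y in the bits above) and ignores the base address of the video window; the name `cpu_to_linear` is made up for this sketch.

```python
WIDTH = 640      # visible pixels per row, from the post
X_BITS = 16      # assumption: 65536x480 layout, X occupies the low 16 bits


def cpu_to_linear(addr: int) -> int:
    """Split a CPU-side offset into X/Y and return the linear bitmap index."""
    x = addr & ((1 << X_BITS) - 1)   # low 16 bits hold the X coordinate
    y = addr >> X_BITS               # remaining high bits hold the Y coordinate
    return y * WIDTH + x             # position in the packed 640-wide bitmap
```

In hardware the multiply by 640 could be built from two shifts and an add (640 = 512 + 128), but that detail is not something the post specifies.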
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Excellent food for thought!
My initial thoughts are to bring the IR out of the core so the interface knows when that special indirect STA opcode (STQ ($xxxx),y or STQ ($xxx),w) is executed and to pluck the 2 addresses off of the databus?
I have to think on this some more, but obviously you already have. Maybe this should be done inside the core itself.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
You can treat all memory accesses this way. For instance, if you take a vertical resolution of 480, you need 480*64K for the bitmap, which is "only" 30MB. Since you have a 4GB address space, it wouldn't be so bad to waste less than 1% of it on this trick. So, if a memory access falls in that 30MB window, it is treated as a special video bitmap access, and otherwise it is treated normally.
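A small Python sketch of that window check, for illustration only. The base address `WINDOW_BASE` is an assumption chosen for the example; the post does not say where the window would sit.

```python
ROWS = 480
ROW_STRIDE = 1 << 16                 # 64K addresses reserved per scanline
WINDOW_BASE = 0x8000_0000            # assumption: hypothetical window start
WINDOW_SIZE = ROWS * ROW_STRIDE      # 480 * 64K addresses


def in_video_window(addr: int) -> bool:
    """True if the access should be handled as a translated bitmap access."""
    return WINDOW_BASE <= addr < WINDOW_BASE + WINDOW_SIZE
```

Any access for which `in_video_window` is false would go through the normal memory path untouched.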
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
On the other hand, with a good multiply instruction, it may be easier to just do this in software.
Inserting the multiplier in the critical address path may not be such a good idea after all...
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
After thinking about this more, could it be as simple as this?:
Instead of having a single [20:0] counter for the video address in the HVSync generator, have 2 counters, x [9:0] and y [8:0], and present them to the VideoRAM as [18:0]. Some memory would be wasted though.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
The idea was to only have this translation in the memory interface between CPU and video RAM.
When displaying the bitmap, the video module directly grabs the memory without any translation.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
But if the CPU and video module access the memory with the same nonlinear pattern everything is good I would think.
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
For the video module it's easy to access the memory as just a straight bitmap. You don't even have to generate the address from X/Y coordinate. Just initialize address to first pixel when there's a VSYNC, and then just increment for every pixel.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
For what I'm thinking of, the video and CPU have to have identical 'memory maps' to work. I'm at work now, or I would try this idea out immediately!
The address for pixel (0,1) would be 00_00_0000_0000_0_0000_0001 (bb,xx_xxxx_xxxx,y_yyyy_yyyy where b is bank, x is X and y is Y). For pixel (639,479) it would be 00_10_0111_1111_1_1101_1111.
In a straight bitmap the address for pixel (0,1) would be 00_0000_0010_1000_0000 (1*640 + 0 = 640), and for (639,479) it would be 100_1010_1111_1111_1111 (479*640 + 639 = 307,199). If the video module were set up for linear addressing, things wouldn't match up with the CPU.
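To compare the two layouts for any pixel, a short Python sketch may help. The function names are made up for this example, and the field widths (2 bank bits, 10 X bits, 9 Y bits) are taken from the bit pattern given in the post.

```python
WIDTH = 640  # pixels per row


def packed_xy(x: int, y: int, bank: int = 0) -> int:
    """Nonlinear layout bb_xxxxxxxxxx_yyyyyyyyy: X above Y, bank on top."""
    return (bank << 19) | (x << 9) | y


def linear(x: int, y: int) -> int:
    """Straight bitmap layout: rows of 640 pixels stored back to back."""
    return y * WIDTH + x


# Print both addresses in binary for the two corner cases discussed above.
for x, y in [(0, 1), (639, 479)]:
    print(f"({x},{y}) packed={packed_xy(x, y):021b} linear={linear(x, y):019b}")
```

The two functions only agree at (0,0), which is the mismatch the post is pointing at.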
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Brainfart. I deviated from my original suggestion, sorry. I realized it's pretty much already set up for yyy_yyyy_yyxx_xxxx_xxxx. I was thinking if I could get the y value then barrel shift it 10x to the left and add it to the X value I would get a nice quick address. But I can't do that easily with a 16-bit CPU.
I'm trying to think ahead for when a controller board is sending raw coordinates to be plotted, in pixel mode.
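The shift-and-add idea can be sketched in Python to show why it is awkward with 16-bit registers: the shifted Y value straddles the boundary between the low and high address words. The function names here are hypothetical, and the layout assumed is the yyy_yyyy_yyxx_xxxx_xxxx one mentioned above.

```python
def yx_address(x: int, y: int) -> int:
    """Address in the y...y_xx_xxxx_xxxx layout: Y shifted left 10, plus X."""
    return (y << 10) | x          # valid while x < 1024


def split_words(x: int, y: int) -> tuple[int, int]:
    """Return (high word, low word) as a 16-bit CPU would have to store them."""
    high = y >> 6                         # top bits of Y spill into the high word
    low = ((y & 0x3F) << 10) | x          # low 6 bits of Y sit above the 10 X bits
    return high, low
```

So a software version needs two shifts and a mask on Y before the pointer words can be stored, rather than a single barrel shift.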
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Arlet wrote:
A simple thing you could do is to make a memory interface between CPU and video bitmap that's optimized for the CPU. For instance, if you had a 640x480 resolution, you could map that to a bitmap of 1024x480, or even 65536x480, so you could put the X coordinate in zero page register 'p', and put the Y coordinate in register 'p+1', and just write to the pixel as 'sta (p), y'.
In the FPGA you would take the address, split the bits in X/Y coordinates, and quietly access the memory at [Y * 640 + X] instead.
So this would have to mean the real CPU address going to the videoRAM would be intercepted by another module that would rearrange the bits at all times. I think this might be within my skill range.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
Time to take some baby steps. I'm starting from a known working foundation and will change one variable at a time:
1) 640x480x16 bit color. Pixel clock @25MHz.
2) 1K zeropage starting @0000_0000. 1K stackpage starting @0001_0000.
3) 65Org16.b Core and external SyncRAM running @100MHz after using smartExplorer. mapgloboptioreg design strategy.
4) Software performs a 'clear screen' by writing a value of '1111_1111_1111_1111' to the SyncRAM. It then successfully reads character data from blockRAM and plots an 8x8 pixel blue character '#', starting at (0,0). A read from the videoRAM, where the character data is, is incorrect: the read data is plotted 16 pixels to the right and in dark red. I've not tried to correct for the 2 cycle delay yet.
5) HVSync module is using a linear address counter for the SyncRAM.
6) Address decoding for the VideoRAM is active from $8000_0000-$8FFF_FFFF.
7) All signals from FPGA to SyncRAM are constrained to 4mA 'DRIVE' strength. All signals from FPGA to videoDAC are constrained to 12mA 'DRIVE' strength (character bit placement errors were noted driving the videoDAC @4mA).
8) .1uF bypass capacitors still need to be added to the FPGA, SyncRAM and videoDAC.
ElEctric_EyE
Re: Concept & Design of 3.3V Parallel 16-bit VGA Boards
So, the addition of step #6 has not resulted in lowering the core below 100MHz.
EDIT: This is the CLeaRSCReen routine. It is about to drastically change if Arlet's suggestion is successful!
Code:
CLRSCR  LDA #$0000      ;$8000_0000 START OF VIDEO MEMORY
        STA SCRLO
        TAY             ;Y = 0
        LDA #$8000
        STA SCRHI
        PHA             ;save MSB of videoRAM pointer
        LDX #$0004      ;4x65536= 262,144. WE NEED 307,200 PIXELS for 640x480
        LDA SCRCOL
AA      STA (SCRLO),Y
        DEY
        BNE AA
        INC SCRHI       ;next 64K block
        DEX
        BNE AA
        STA (SCRLO),Y   ;get that last pixel!
        LDY #$AFFF
AB      STA (SCRLO),Y   ;CLEAR REMAINING 45056 PIXELS
        DEY
        BNE AB
        PLA
        STA SCRHI       ;reset MSB of videoRAM pointer
        RTS
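As a sanity check on the loop counts, the routine can be modelled in Python. This is only a sketch that mimics the 16-bit wraparound of DEY and records each store address; the function name and the default `base_hi` value are chosen for the example.

```python
def clrscr_addresses(base_hi: int = 0x8000) -> list[int]:
    """Return every address the CLRSCR loops would write, in order."""
    hi, y = base_hi, 0
    written = []
    for _ in range(4):                       # LDX #$0004 ... DEX / BNE AA
        while True:
            written.append((hi << 16) | y)   # STA (SCRLO),Y
            y = (y - 1) & 0xFFFF             # DEY with 16-bit wraparound
            if y == 0:                       # BNE AA falls through at zero
                break
        hi += 1                              # INC SCRHI
    written.append((hi << 16) | y)           # extra store with Y = 0
    y = 0xAFFF                               # LDY #$AFFF
    while y != 0:                            # AB loop: stores $AFFF down to 1
        written.append((hi << 16) | y)
        y -= 1
    return written


addrs = clrscr_addresses()
print(len(addrs), hex(min(addrs)), hex(max(addrs)))
```

Under these assumptions the model writes each of the 640x480 pixel addresses exactly once, which matches the counts given in the comments.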