
All times are UTC




Posted: Sun Jun 22, 2014 9:28 am

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
MARC-2 has been running for some weeks now, and I've been working on a VGA generator. Individually both work satisfactorily, and my intention is to merge the two with as little CPU performance loss as possible. The picture shows the first 64000 bytes of MARC-2's memory, displayed with the firmware deactivated and the memory held constantly in read mode.
Attachment:
1.jpg [ 567.97 KiB ]


First some details on MARC-2.

The CPLD sits between the CPU and SRAM. Every pin of the CPU and SRAM, except for the power supply, is connected to the CPLD's I/O pins individually.

VGA is also connected to the CPLD. A resistor DAC provides 256 colors by the following scheme: Red, Green, Blue and Intensity, each with two bits of resolution. RRGGBBII = 4x4x4x4 = 256 colors. HSYNC and VSYNC are also directly connected to the CPLD.
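A minimal sketch of the RRGGBBII byte layout described above. It assumes the fields sit in the byte in the order the name is written (RR in the top two bits, II in the bottom two), and it does not model how the resistor DAC mixes the intensity bits into the output. The function names are illustrative only.

```python
def pack_rrggbbii(red, green, blue, intensity):
    """Pack four 2-bit fields into one pixel byte laid out as RRGGBBII."""
    for value in (red, green, blue, intensity):
        if not 0 <= value <= 3:
            raise ValueError("each field must fit in two bits (0..3)")
    return (red << 6) | (green << 4) | (blue << 2) | intensity


def unpack_rrggbbii(pixel):
    """Split a pixel byte back into (red, green, blue, intensity)."""
    return (pixel >> 6) & 3, (pixel >> 4) & 3, (pixel >> 2) & 3, pixel & 3


# Every combination of the four fields yields a distinct byte: 4*4*4*4 = 256.
all_pixels = {
    pack_rrggbbii(r, g, b, i)
    for r in range(4) for g in range(4) for b in range(4) for i in range(4)
}
print(len(all_pixels))
```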

The CPLD receives a clock signal of 14.7456MHz. This could be changed to any of the commonly available crystal values up to 20MHz, or even a bit more by overclocking the AVR.

The VGA timing is as follows: I've chosen the VGA standard timing for 640 x 480 @ 60Hz, which would need a dot clock of 25.175MHz. I've altered this to 14.7456MHz which provides ca. 375 visible pixels per line. I chose to use 320 of them and center them on the screen, leaving the rest as a black border. Of the 480 lines I chose to use 400 of them and draw each line twice to effectively get 200 lines per screen. One pixel is represented by one byte of memory to achieve 256 colors. So for one horizontal line I need 320 bytes at a rate of 67.8ns per byte.

MARC-2's PHI2 is 7.3728MHz, so phase 1 and phase 2 are each 67.8ns. That means I need 2 bytes in 1 PHI2 cycle.
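The figures quoted above can be reproduced with a few lines of arithmetic. This is only a worked check; the constant names are illustrative, and the 640-pixel and 25.175MHz values are the standard 640 x 480 @ 60Hz numbers mentioned in the post.

```python
DOT_CLOCK_HZ = 14_745_600             # CPLD clock, used as the dot clock
PHI2_HZ = 7_372_800                   # CPU clock
STANDARD_DOT_CLOCK_HZ = 25_175_000    # standard 640x480@60 dot clock
STANDARD_VISIBLE_PIXELS = 640

pixel_period_ns = 1e9 / DOT_CLOCK_HZ
phi2_period_ns = 1e9 / PHI2_HZ
phi2_phase_ns = phi2_period_ns / 2

# The visible part of a line lasts as long as in the standard mode, so the
# number of pixels that fit scales with the dot clock.
visible_pixels = STANDARD_VISIBLE_PIXELS * DOT_CLOCK_HZ / STANDARD_DOT_CLOCK_HZ

# Pixel bytes that must be fetched during one full PHI2 cycle.
bytes_per_phi2_cycle = DOT_CLOCK_HZ // PHI2_HZ

print(f"pixel period   : {pixel_period_ns:.1f} ns")
print(f"PHI2 period    : {phi2_period_ns:.1f} ns ({phi2_phase_ns:.1f} ns per phase)")
print(f"visible pixels : {visible_pixels:.0f}")
print(f"bytes per PHI2 : {bytes_per_phi2_cycle}")
```

This gives roughly 67.8ns per pixel, 135.6ns per PHI2 cycle, about 375 visible pixels and 2 bytes per PHI2 cycle, matching the numbers in the text.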

My question is: how do I do that as elegantly as possible while costing the least CPU time?
/Edit
To be more specific, I'd like to know how to arrange access for both the CPU and the VGA subsystem to the same memory and fetch two bytes within one PHI2 cycle. The VGA generator needs one byte every 67.8ns, while a PHI2 cycle takes twice as long, 135.6ns.


I'm including 3 firmwares written in ABEL:

1. M2_26OK.abl MARC-2 working without VGA
2. M2_VGA01.abl the same as 1. but only VGA included and working
3. SBC3NTSC16.abl Daryl Rictor's working firmware of SBC-3 on which my design is based

/Edit
The file "2_M2_VGA01.abl.txt" contains both the working MARC-2 and the working VGA-generator ABEL code. I can get either MARC-2 or the VGA generator working by commenting out portions of the code. I made several attempts to get both working at the same time, without success. :-/


I want to add that I'm only on the verge of understanding stopping the CPU, clock stretching, DMA, cycle stealing and Don Lancaster's micro scan procedure. So please be as basic as possible. :)


Attachments:
1_M2_26OK.abl.txt [11.26 KiB]
2_M2_VGA01.abl.txt [15.55 KiB]
3_SBC3NTSC16.abl.txt [15.61 KiB]

_________________
Marco


Last edited by lordbubsy on Sun Jun 22, 2014 1:28 pm, edited 3 times in total.
Posted: Sun Jun 22, 2014 11:29 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Is your RAM byte-wide, and able to sustain a 14MHz read rate? If so, your CPLD has a chance of reading the RAM fast enough to feed 8-bit pixel colour values to the VGA output.

If your RAM is 16 bits wide, or can be hooked up at that width, then you can read two pixels in one access using the CPLD. Of course the CPU still accesses a byte at a time.

If your RAM just isn't fast enough, maybe you need to consider packing two pixels into each byte instead: each byte contains two 4-bit values, which gives you 16 colours per pixel. If you index into a lookup table, then each pixel could have one of 16 colours taken from the full palette.
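A small sketch of the two-pixels-per-byte idea, written as a software model rather than CPLD logic. The nibble order (high nibble is the left pixel) and the contents of `PALETTE` are assumptions made for illustration; nothing in the thread fixes either.

```python
# Hypothetical 16-entry lookup table mapping a 4-bit index to an 8-bit DAC value.
PALETTE = [index * 17 for index in range(16)]


def unpack_nibbles(byte):
    """Split one byte into two 4-bit pixel indices (high nibble first, by assumption)."""
    return (byte >> 4) & 0x0F, byte & 0x0F


def expand_line(packed_bytes):
    """Turn packed video bytes into one DAC value per pixel via the lookup table."""
    pixels = []
    for byte in packed_bytes:
        for index in unpack_nibbles(byte):
            pixels.append(PALETTE[index])
    return pixels


# A 320-pixel line needs only 160 bytes, i.e. one fetch per two pixel clocks.
line = expand_line(bytes(160))
print(len(line))
```

With this packing the video fetch rate halves, so one byte per PHI2 cycle would be enough for a 320-pixel line.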

Hope that helps
Ed


Posted: Sun Jun 22, 2014 12:22 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
BigEd wrote:
Is your RAM byte-wide, and able to sustain a 14MHz read rate?


The SRAM's data bus (see attached datasheet) is 8 bits wide. It's a 55ns 512k x 8 part and fast enough for the required speed. The VGA generator is working perfectly from that SRAM, so that is covered.


Attachments:
AS6C4008.pdf [504.86 KiB]

_________________
Marco
Posted: Sun Jun 22, 2014 12:24 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Maybe I'm confused... it looked like you were asking how you might get two bytes per CPU clock cycle.

Perhaps you're not asking about how to get the data rate, but how to arrange access for both the CPU and the VGA subsystem from the same memory?

Ed


Posted: Sun Jun 22, 2014 12:34 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
BigEd wrote:
Maybe I'm confused... it looked like you were asking how you might get two bytes per CPU clock cycle.
Sorry for the confusion. Indeed, I'd like to know how to arrange access for both the CPU and the VGA subsystem from the same memory.

I made several attempts, of course, but had no success.

_________________
Marco


Posted: Sun Jun 22, 2014 1:58 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
If you only needed a single access per clock phase, it would be easier, as the 6502 doesn't need the bus during phi1 - see the Beeb's approach for example.

But you need more access than that, more like a VIC-powered machine like the C64. So you will need to keep the 6502 happy, somehow, while the bus is inaccessible. One way is to use RDY to stall the CPU, as the C64 does, and another way is to stop the CPU's clock (or, equivalently, stretch the clock) so the CPU doesn't see a falling edge until the RAM is back in play and has responded to the CPU's address lines.

Note that the original 6502 and the 65816 both honour RDY only during read cycles[*]. So, if not using a 65C02, you have to wait up to three cycles after pulling RDY low before you know the CPU is stalled and the bus is free for the VGA.

In either case, you need some logic in your CPLD to decouple the CPU databus and address bus and instead attach the VGA's busses to the RAM. But maybe that was already obvious!

Cheers
Ed

[*] Edit: I got this info from the 2007 programmer's manual from WDC. The datasheet, by contrast, presently says that both 65C02 and 65816 will honour RDY during writes. I found a Usenet posting from 1996 which says the same.


Last edited by BigEd on Mon Jun 23, 2014 8:47 am, edited 1 time in total.

Posted: Sun Jun 22, 2014 2:20 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
To be more specific, I'd like to know how to arrange access for both the CPU and the VGA subsystem to the same memory and fetch two bytes within one PHI2 cycle. The VGA generator needs one byte every 67.8ns, while a PHI2 cycle takes twice as long, 135.6ns.


This is the VGA generator I'd like to fit in.

Code:
 ///////////////////////
//VGA EQUATIONS START//
///////////////////////
@CARRY 1;                                               "ripple-carry addition is fast enough
hcnt.ACLR     = !N_RESET;                               "clear counter while N_RESET is low (active-low reset)
vcnt.ACLR     = !N_RESET;                               "clear counter while N_RESET is low (active-low reset)
hcnt.CLK      = CLOCK;                                  "horz. pixel counter increments on each dot clock
vcnt.CLK      = N_HSYNC;                                "line counter increments after every horiz. line
N_HSYNC.ASET  = !N_RESET;
N_VSYNC.ASET  = !N_RESET;
N_HSYNC.CLK   = CLOCK;
N_VSYNC.CLK   = N_HSYNC;
VADDR.CLK     = CLOCK;
VADDR.ACLR    = !N_RESET;

"horiz. pixel counter rolls-over after 469 pixels
WHEN (hcnt<468) THEN hcnt:=hcnt+1 ELSE hcnt:=0;

"horiz. sync is low during this interval to signal start of a new line
WHEN ((hcnt>=384)&(hcnt<441)) THEN N_HSYNC:=0 ELSE N_HSYNC:=1;

"vertical line counter rolls-over after 525 lines
WHEN (vcnt<524) THEN vcnt:=vcnt+1 ELSE vcnt:=0;

"vert sync is low during this interval to signal the start of a frame
WHEN ((vcnt>=490)&(vcnt<492)) THEN N_VSYNC:=0 ELSE N_VSYNC:=1;

"blank video outside of visible region: (0,0)->(320,400) left border 26 top border 40
WHEN (hcnt>=346)#(hcnt<26)#(vcnt>=440)#(vcnt<40) THEN blank=1 ELSE blank=0;

"store the blanking signal for use in the next pipeline stage
pblank.ACLR   = !N_RESET;
pblank.CLK    = CLOCK;
pblank       := blank;

VDATA.ACLR    = !N_RESET;
VDATA.CLK     = CLOCK;

WHEN pblank==0 THEN VDATA := RDATA                      " active display: latch the pixel byte read from RAM
  ELSE              VDATA := [0,0,0,0,0,0,0,0];         " blanked: the RGB value is forced to 0 (blanks edges of screen)

WHEN vcnt==0       THEN VADDR := [0,0,0 ,0,0,0,0 ,0,0,0,0 ,0,0,0,0 ,0,0,0,0]
                                             " set new page at end of frame
  ELSE WHEN pblank==0
                   THEN VADDR := VADDR + 1   " inc video memory pointer during active display and reset to load RAM from EEPROM

  ELSE WHEN double THEN VADDR := VADDR - 320 " on odd lines, draw the previous line again

  ELSE                  VADDR := VADDR;      " all other fclks, keep VADDR the same

double = ((vcnt>40) & (hcnt==467) & (vcnt0==1)); "decides which line to draw twice

RADDR = VADDR;
RDY = 0;
BE  = 0;
N_MEMRD = 0;
N_MEMWR = 1;
N_RAMCS =0;
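
As a cross-check of the counter limits in the equations above, here is a small Python model of the blanking window and the resulting scan rates. It is not part of the firmware; the names are illustrative, and it assumes hcnt runs 0..468 and vcnt runs 0..524 as the roll-over comparisons suggest.

```python
DOT_CLOCK_HZ = 14_745_600
H_TOTAL = 469    # hcnt counts 0..468 before rolling over
V_TOTAL = 525    # vcnt counts 0..524 before rolling over


def is_blank(hcnt, vcnt):
    """Mirror of the blanking equation: true outside the 320 x 400 window."""
    return hcnt >= 346 or hcnt < 26 or vcnt >= 440 or vcnt < 40


# Count visible positions along one visible line and down one visible column.
visible_per_line = sum(not is_blank(h, 100) for h in range(H_TOTAL))
visible_lines = sum(not is_blank(100, v) for v in range(V_TOTAL))

line_rate_hz = DOT_CLOCK_HZ / H_TOTAL
frame_rate_hz = line_rate_hz / V_TOTAL
hsync_width_us = (441 - 384) * 1e6 / DOT_CLOCK_HZ

print(f"visible pixels per line : {visible_per_line}")
print(f"visible lines per frame : {visible_lines}")
print(f"line rate  : {line_rate_hz / 1000:.2f} kHz")
print(f"frame rate : {frame_rate_hz:.2f} Hz")
print(f"hsync width: {hsync_width_us:.2f} us")
```

The window comes out as 320 pixels by 400 lines, with a line rate near 31.44kHz and a frame rate just under 60Hz.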




And this is the relevant portion where the VGA generator should be granted access.

Code:
// CPU RAM address bus control
RADDR      = [CA18..CA0];                               "Connect CPU address to RAM address bus

// CPU RAM data bus control (bi-directional) [taken from ABEL reference]
CDATA      = RDATA;                                     "RAM data moves to CPU data bus
RDATA      = CDATA;                                     "CPU data moves to RAM data bus
CDATA.oe   = N_RW & PHI2 & N_RESET &                    "i.e. read from RAM
             N_VIA0CS & N_VIA1CS;                       "When VIA's selected -> tri-state CDATA
RDATA.oe   = !N_RW & PHI2;                              "i.e. write to RAM

// MEMORY Read Write Qualification
N_MEMRD    = !(PHI2 & N_RW);                            "MEMORY Read qualified with PHI2
N_MEMWR    = !(PHI2 & !N_RW);                           "MEMORY Write qualified with PHI2


The first thing I tried was to give the VGA generator access while PHI2 is low.
That's easy enough for the address lines:
Code:
WHEN PHI2 THEN RADDR = [CA18..CA0]                      " On PHI2 high, place CPU address on Memory Address bus
  ELSE         RADDR = VADDR;                           " On PHI2 low, put Video data address on Memory Address Bus
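
A rough model of what this PHI2-based address multiplexing gives the video side. It assumes one bus slot per 14.7456MHz clock and that PHI2 is low on even slots; both are simplifications for illustration, and the function name is hypothetical.

```python
def bus_owners(pixel_clocks):
    """Return the bus owner for each pixel-clock slot when PHI2 selects the address."""
    owners = []
    for slot in range(pixel_clocks):
        phi2_high = slot % 2 == 1          # assumption: PHI2 is low on even slots
        owners.append("CPU" if phi2_high else "VGA")
    return owners


line = bus_owners(320)                     # one visible line of 320 pixel clocks
vga_slots = line.count("VGA")
print(f"VGA gets {vga_slots} of 320 slots")
```

Under these assumptions the VGA generator sees the RAM on only 160 of the 320 pixel clocks in a line, which is why interleaving on PHI2 alone cannot supply one byte per pixel.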


The SRAM's data bus should be connected to the VIDEO data buffer during visible pixels and while PHI2 is low.
Code:
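" parentheses added around the comparison: ABEL relational operators have the lowest priority
WHEN (pblank==0) & !PHI2 THEN VDATA := RDATA        " active display and PHI2 low: latch the pixel byte from RAM
  ELSE              VDATA := [0,0,0,0,0,0,0,0];     " otherwise the RGB value is forced to 0 (blanks edges of screen)

With those alterations memory access and address decoding etc. work correctly.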


N_MEMRD should be low when PHI2 is low.
Code:
!N_MEMRD    = ((PHI2 & N_RW) # (PHI2));                            "as written this reduces to !N_MEMRD = PHI2; low when PHI2 is low would need # (!PHI2)



Here's the first problem. The CPU crashes.



BigEd wrote:
If you only needed a single access per clock phase, it would be easier, as the 6502 doesn't need the bus during phi1 - see the Beeb's approach for example.
Yes, that's part of the problem: Daryl's SBC-3 uses composite video, which only needs one byte per clock phase.

BigEd wrote:
One way is to use RDY to stall the CPU
I roughly tried that with:

Code:
WHEN pblank==0 THEN RDY := 0
  ELSE              RDY := 1;

WHEN pblank==0 THEN BE := 0
  ELSE              BE := 1;

But it seems the CPU doesn't like that and crashes.

BigEd wrote:
another way is to stop the CPU's clock (or, equivalently, stretch the clock)
I also tried that with:
Code:
WHEN ((pblank==0) & PHI2) THEN PHI2 = 1     " hold PHI2 high during active display
  ELSE                         PHI2 = DIV0; " normal clock rate


Same again, the CPU crashes. Btw. I don't use any interrupts.

BigEd wrote:
Note that the original 6502 and the 65816 both honour RDY only during read cycles.
I'm using the 65816. Should I use (VPA # VDA)?

BigEd wrote:
In either case, you need some logic in your CPLD to decouple the CPU databus and address bus and instead attach the VGA's busses to the RAM. But maybe that was already obvious!
That has been taken care of, but I can't rule out that there is still a problem.


I think a good plan would be to first try stretching the clock or stopping the CPU. When that works, I could begin to fetch bytes for the VGA generator.

_________________
Marco


Posted: Sun Jun 22, 2014 2:57 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
If you can take the performance hit of holding RDY low throughout the active line (or all lines) that might be simplest. I think your approach of bringing RDY and BE low in the same cycle might be a mistake. I think the effect of RDY is to make the following cycle a stall cycle: so bring BE low one cycle later.

Do you have means of diagnosing what happens when the CPU crashes? Any trace of activity, or a logic analyser, or a simulation to help debug the whole system?

Cheers
Ed


Posted: Sun Jun 22, 2014 3:16 pm

Joined: Sun Jul 28, 2013 12:59 am
Posts: 235
BigEd wrote:
Note that the original 6502 and the 65816 both honour RDY only during read cycles. So, if not using a 65C02, you have to wait up to three cycles after pulling RDY low before you know the CPU is stalled and the bus is free for the VGA.


Or introduce logic to buffer some number of writes, although then you'll need to hold the CPU in RDY while you flush that buffer. Something to consider, at least.

Another angle is that three consecutive write cycles sounds like writing to the stack during interrupt response, so if you move the stack page into your programmable logic you can handle it independently from a write-buffer to "normal" memory.

Changing direction somewhat, your VGA generator needs one byte every 67.8ns, and your CPU needs one byte every 135.6ns? So, three bytes every 135.6ns total, is 45ns per access, which is faster than the limit on your RAM, so you need to lose CPU cycles every so often in order to keep up. If your RAM is fast enough, put a small FIFO on the VGA generator's RAM access so that the CPU can preempt it when it wants a cycle (and there will be sufficient throughput to completely re-fill the FIFO before the CPU needs another access). Or use a 16-bit datapath for the VGA, which would double your VGA data rate, and you can use A0 to toggle which half of the RAM bus gets used for CPU access.
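The access budget in the paragraph above works out as follows. The 55ns figure is the SRAM speed grade quoted earlier in the thread; the variable names are illustrative, and real timing would also need setup, hold and CPLD propagation margins that this ignores.

```python
PHI2_PERIOD_NS = 135.6     # one CPU cycle
PIXEL_PERIOD_NS = 67.8     # one video byte
SRAM_ACCESS_NS = 55        # speed grade of the SRAM

video_accesses = 2         # two pixel bytes per PHI2 cycle
cpu_accesses = 1           # one CPU access per PHI2 cycle
total_accesses = video_accesses + cpu_accesses

slot_ns = PHI2_PERIOD_NS / total_accesses
shared_fits = slot_ns >= SRAM_ACCESS_NS
video_only_fits = PIXEL_PERIOD_NS >= SRAM_ACCESS_NS

print(f"time per access when sharing: {slot_ns:.1f} ns (fits: {shared_fits})")
print(f"video alone: {PIXEL_PERIOD_NS} ns per access (fits: {video_only_fits})")
```

Three evenly spaced accesses leave about 45.2ns each, less than the 55ns the SRAM needs, while video on its own fits comfortably.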

If your RAM isn't fast enough, the FIFO approach can still be workable, suspending the CPU when the VGA FIFO hits a low-water mark, and resuming it when it hits a high-water mark.

And, of course, take all this with a grain of salt: I've never made a working video generator, never worked with programmable logic, and simpler solutions are probably available.


Posted: Sun Jun 22, 2014 3:22 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
BigEd wrote:
If you can take the performance hit of holding RDY low throughout the active line (or all lines) that might be simplest.
Certainly. For now I'd like to get it working that way; later on I'll try to save CPU time.

BigEd wrote:
I think your approach of bringing RDY and BE low in the same cycle might be a mistake.
Ah, I see, I don’t have to use BE anyway, so I'll leave it high.

BigEd wrote:
Do you have means of diagnosing what happens when the CPU crashes?
Yes, an old 20MHz scope and an 8-channel 24MHz logic analyzer.

_________________
Marco


Posted: Sun Jun 22, 2014 5:36 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
nyef wrote:
your VGA generator needs one byte every 67.8ns, and your CPU needs one byte every 135.6ns? So, three bytes every 135.6ns total, is 45ns per access
No, it's actually simpler: the CPU is running at 7.3728MHz and VGA at 14.7456MHz. VGA needs, and has priority for, one byte at a rate of 14.7456MHz, so one every 67.8ns. For now it doesn't matter what the CPU does while the VGA generator fetches its 320 bytes of memory in one burst. Later on it would be nice if the CPU could do its normal job, but that would only be possible if VGA could get its two bytes while PHI2 is low, which is impossible of course.

A good way would be what Ed said, to reduce the color depth to 16 colors to halve the data rate.

nyef wrote:
Or use a 16-bit datapath for the VGA,
Unfortunately the SRAM's data bus is only 8 bits wide.

For now I'm trying to simply halt the CPU while the vertical counter is in the active display range by pulling RDY low:

Code:
RDY.clk = PHI2;
WHEN (vcnt>442)#(vcnt<38) THEN RDY := 1
  ELSE                         RDY := 0;

This works partially: the computer crashes during DUART access.

How could I prevent that from happening?
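
For a sense of what the RDY window above costs, the sketch below counts the scan lines on which the CPU is allowed to run. It is a back-of-the-envelope estimate under simple assumptions: RDY follows the vcnt comparison exactly, and the one-cycle latency of clocking RDY on PHI2 is ignored. The names are illustrative.

```python
V_TOTAL = 525            # lines per frame (vcnt counts 0..524)
PHI2_MHZ = 7.3728


def cpu_runs(vcnt):
    """Mirror of the RDY equation: RDY is high outside the display window."""
    return vcnt > 442 or vcnt < 38


running_lines = sum(cpu_runs(v) for v in range(V_TOTAL))
running_fraction = running_lines / V_TOTAL
effective_mhz = PHI2_MHZ * running_fraction

print(f"CPU runs on {running_lines} of {V_TOTAL} lines")
print(f"fraction of time running: {running_fraction:.1%}")
print(f"effective clock rate    : {effective_mhz:.2f} MHz")
```

With these limits the CPU runs on 120 of 525 lines, a little under a quarter of the time, for an effective rate of roughly 1.7MHz.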

_________________
Marco


Posted: Sun Jun 22, 2014 5:43 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
At minimum I suggest you bring RDY low three cycles before you start using the bus.

Or can you say that there are no writes being done by the CPU during display, and yet it still crashes? Could you put it into an idle loop and somehow see that it doesn't crash?

About your 8-bit path to SRAM... if you have the spare pins, then two SRAM chips will give you a 16-bit path. Or, two SRAM chips and an external mux will give you an 8-bit path but with the ability to access two bytes in one memory access cycle.


Posted: Sun Jun 22, 2014 6:36 pm

Joined: Wed Sep 11, 2013 8:43 pm
Posts: 207
Location: The Netherlands
This arrangement is rather flaky: I have to press reset several times before the computer behaves as it should. After that it only crashes while accessing the DUART (keyboard and screen), though not right away; the point at which it crashes seems random.

Code:
RDY.clk = PHI2;
WHEN (vcnt>444)#(vcnt<36) THEN RDY := 1
  ELSE                         RDY := 0;                "During active display

WHEN (vcnt>442)#(vcnt<38) THEN RADDR = [CA18..CA0]
  ELSE                         RADDR = VADDR;           "During active display

WHEN (vcnt>442)#(vcnt<38) THEN N_MEMRD = !(PHI2 & N_RW)
  ELSE                         N_MEMRD = 0;             "During active display

WHEN (vcnt>442)#(vcnt<38) THEN N_MEMWR = !(PHI2 & !N_RW)
  ELSE                         N_MEMWR = 1;             "During active display




BigEd wrote:
At minimum I suggest you bring RDY low three cycles before you start using the bus.
How can I be certain of that? Couldn't the bus be accessed at any given time?

By this for example:
Code:
// CPU RAM data bus control (bi-directional) [taken from ABEL reference]
CDATA      = RDATA;                                     "RAM data moves to CPU data bus
RDATA      = CDATA;                                     "CPU data moves to RAM data bus
CDATA.oe   = N_RW & PHI2 & N_RESET &                    "i.e. read from RAM
             N_VIA0CS & N_VIA1CS;                       "When VIA's selected -> tri-state CDATA
RDATA.oe   = !N_RW & PHI2;                              "i.e. write to RAM


I could tri-state the data bus during active display?

_________________
Marco


Posted: Sun Jun 22, 2014 6:54 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Sorry, I should have said: leave the bus serving the CPU for at least three cycles after dropping RDY before you start using the bus for VGA. It may have up to three writes to perform before it stalls. If those writes were return addresses being stacked, and they don't get written to RAM, then a crash won't be far away.
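
The handover rule can be modelled as a small sequencer, sketched here in Python rather than ABEL. It assumes one decision per PHI2 cycle and a fixed guard of three cycles; `GUARD_CYCLES` and the function name are hypothetical, and whether a guard is needed at all depends on how the particular CPU treats RDY during writes.

```python
GUARD_CYCLES = 3   # cycles the CPU keeps the bus after RDY goes low


def handover_schedule(display_request):
    """For each PHI2 cycle return (rdy, bus_owner) given a video bus request."""
    schedule = []
    waited = 0
    for wants_bus in display_request:
        if not wants_bus:
            waited = 0
            schedule.append((1, "CPU"))    # no request: CPU runs normally
        elif waited < GUARD_CYCLES:
            waited += 1
            schedule.append((0, "CPU"))    # RDY low, CPU may still finish writes
        else:
            schedule.append((0, "VGA"))    # CPU stalled, bus handed to video
    return schedule


request = [False, True, True, True, True, True, False]
for cycle, (rdy, owner) in enumerate(handover_schedule(request)):
    print(cycle, rdy, owner)
```

In this model the request has to be raised three PHI2 cycles before the first pixel byte is needed, since the video side only gets the bus from the fourth requested cycle onward.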
Cheers
Ed


Posted: Mon Jun 23, 2014 4:15 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8394
Location: Midwestern USA
BigEd wrote:
Note that the original 6502 and the 65816 both honour RDY only during read cycles.

Actually, the 65C816 stops when RDY is asserted on both read and write cycles.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!

