Taking in the earlier comments, I'm currently considering using a Mega32 as a video co-processor.
I'm not too happy about the VGA approach: the cards I found were all sixteen-bit cards, all too big to fit my half-eurocard cube spec, and I couldn't get good details about what happens at slow clock rates.
Plus, where's the fun in someone else's design?
So I started thinking about using an M32 as the co-proc: it has enough legs to talk to an external 16k memory for 512*256 pixels, and with a 16MHz clock I *think* I can get it to generate all the correct timings for a 'nearly-PAL' spec 312/624 line 50Hz signal. About one fifth of the processor time is easily available for the pixelisation and writing to the ram.
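As a quick sanity check on those figures, here is a small sketch of the arithmetic. It assumes a standard 64us PAL line period and 256 active lines out of 312 per field; neither number is stated above, so treat both as assumptions. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed figures: 16 MHz clock and 512*256 pixels come from the post;
   the 64 us line period and 256 active lines per field are assumptions. */
#define CPU_MHZ          16UL
#define LINE_PERIOD_US   64UL
#define LINES_PER_FIELD  312UL
#define ACTIVE_LINES     256UL
#define PIXELS_PER_LINE  512UL

/* CPU cycles available in one scan line: 16 cycles/us * 64 us */
uint32_t cycles_per_line(void) {
    return CPU_MHZ * LINE_PERIOD_US;
}

/* Field period in microseconds: 312 lines * 64 us, a shade over 50 Hz */
uint32_t field_period_us(void) {
    return LINES_PER_FIELD * LINE_PERIOD_US;
}

/* Frame buffer size in bytes at one bit per pixel */
uint32_t framebuffer_bytes(void) {
    return PIXELS_PER_LINE * ACTIVE_LINES / 8UL;
}

/* Share of lines left over for rendering, in whole percent (rounded down) */
uint32_t blanking_percent(void) {
    return (LINES_PER_FIELD - ACTIVE_LINES) * 100UL / LINES_PER_FIELD;
}
```

Under those assumptions there are 1024 cycles per line, the frame buffer is exactly 16384 bytes, and the 56 non-active lines come to a little under 18% of the field, which is roughly the "one fifth" figure.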
A single HCT574 latches the main CPU's request byte onto an internal databus, and a couple of HCT74 flip-flops hold the request and ready bits, plus an error bit.
There's enough internal eeprom for a couple of font sets in different sizes.
The theory is, rather than hold the data in the chip and build pixels on the fly, I use 20% of the time - field blanking, effectively - to render the data onto the ram, and the active field period to address the ram and load an HCT166 PISO to get the data out.
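A minimal sketch of what rendering into the ram could look like, assuming one bit per pixel, 64 bytes per line, and the leftmost pixel in the most significant bit (that depends on how the HCT166 ends up wired, so it is a guess). The `vram` array stands in for the external memory, white is assumed to be a set bit, and the xor/black/white modes are the line functions from the command list; all names here are hypothetical.

```c
#include <stdint.h>

#define FB_WIDTH   512u
#define FB_HEIGHT  256u
#define FB_STRIDE  (FB_WIDTH / 8u)      /* 64 bytes per scan line */

/* Drawing modes, matching the xor / black / white line functions */
typedef enum { FN_XOR, FN_BLACK, FN_WHITE } line_fn;

/* Stand-in for the external 16k ram; real hardware would use port writes */
static uint8_t vram[FB_STRIDE * FB_HEIGHT];

/* Byte address of the pixel at (x, y) */
uint16_t pixel_addr(uint16_t x, uint16_t y) {
    return (uint16_t)(y * FB_STRIDE + (x >> 3));
}

/* Bit mask within that byte; assumes the MSB is shifted out first */
uint8_t pixel_mask(uint16_t x) {
    return (uint8_t)(0x80u >> (x & 7u));
}

/* Apply the drawing mode to one pixel; off-screen coordinates are ignored */
void plot(uint16_t x, uint16_t y, line_fn fn) {
    if (x >= FB_WIDTH || y >= FB_HEIGHT) {
        return;
    }
    uint16_t a = pixel_addr(x, y);
    uint8_t  m = pixel_mask(x);
    switch (fn) {
    case FN_XOR:   vram[a] ^= m;           break;
    case FN_BLACK: vram[a] &= (uint8_t)~m; break;  /* assumed: black = bit clear */
    case FN_WHITE: vram[a] |= m;           break;  /* assumed: white = bit set */
    }
}
```

Each plot is a read-modify-write on one byte, so the number of pixels that fit in the blanking period depends on how many cycles a ram access costs on the real bus.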
The bad news is that I can't - directly - use the four interrupts per line with lots of sleeping that I've used before. I still need to think about this.
The interface will be dead simple: read an IO address till the ready bit clears, write a command, and if you care, check for an error. I may include another 574 to latch a response from the coproc if it's required, but it would only have seven bits, I think.
Something along the lines of:
00 - reset video and clear screen
01 - move to xxxxyy
02 - write text at current position
03 - line to xxxxyy
04 - set line function xx = xor, black, white
... you get the idea.
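From the main CPU's side, the poll / write / check sequence might look something like the sketch below. The status bit positions, the parameter byte order (x as two bytes, high byte first, then y), and the idea that every parameter byte goes through the same handshake are all assumptions, as are the function and type names. IO access goes through function pointers so the sketch can run on a desktop against the fake device included for testing.

```c
#include <stdint.h>
#include <stdbool.h>

/* Command codes from the list above */
enum { CMD_RESET = 0x00, CMD_MOVE = 0x01, CMD_TEXT = 0x02,
       CMD_LINE  = 0x03, CMD_LINE_FN = 0x04 };

/* Hypothetical status bit positions; the post does not define them */
#define STATUS_NOT_READY 0x01u   /* clears when the coproc can accept a byte */
#define STATUS_ERROR     0x02u

/* IO access is abstracted so the same code runs on hardware or a fake */
typedef struct {
    uint8_t (*read_status)(void *ctx);
    void    (*write_data)(void *ctx, uint8_t b);
    void     *ctx;
} vid_port;

/* Send one byte: wait for ready, write, wait again, then check the error bit */
bool vid_put(const vid_port *p, uint8_t byte) {
    while (p->read_status(p->ctx) & STATUS_NOT_READY) { }
    p->write_data(p->ctx, byte);
    while (p->read_status(p->ctx) & STATUS_NOT_READY) { }
    return (p->read_status(p->ctx) & STATUS_ERROR) == 0;
}

/* "move to xxxxyy": assumed to mean a 16-bit x followed by an 8-bit y */
bool vid_move_to(const vid_port *p, uint16_t x, uint8_t y) {
    return vid_put(p, CMD_MOVE)
        && vid_put(p, (uint8_t)(x >> 8))
        && vid_put(p, (uint8_t)(x & 0xFFu))
        && vid_put(p, y);
}

/* Fake device for desktop testing: logs bytes, stays busy for two polls */
typedef struct { uint8_t log[8]; uint8_t n; uint8_t busy_polls; } fake_dev;

uint8_t fake_status(void *ctx) {
    fake_dev *d = (fake_dev *)ctx;
    if (d->busy_polls > 0) { d->busy_polls--; return STATUS_NOT_READY; }
    return 0;
}

void fake_write(void *ctx, uint8_t b) {
    fake_dev *d = (fake_dev *)ctx;
    if (d->n < sizeof d->log) { d->log[d->n++] = b; }
    d->busy_polls = 2;
}
```

On the real board the two function pointers would be replaced by reads and writes of whatever IO address the 574 and the flip-flops are decoded at.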
Neil (and it's only six or seven chips, so it fits on the half-eurocard...)