Video card question

Let's talk about anything related to the 6502 microprocessor.
Charlielamus
Posts: 8
Joined: 04 Jun 2025

Video card question

Post by Charlielamus »

Hello!

I posted earlier that I am putting together a be6502-style breadboard computer, which is proceeding apace and I will be publishing my beginner mistakes and progress on YouTube shortly.

I'm thinking ahead now here. I'd love to be able to develop it into a machine which can output graphics and even play a few games.

I have watched people like George Foot and Matt Regan and I'm now confident I understand the hardware requirements for a VGA video card, and a bus-sharing system.

But I have *no* idea how you would go about learning the software side: how to find, analyse and adapt games and other software to work, or how one might write software for such a device. I'm aware this is likely to be a big learning experience!

I would love to pick your collective minds over the following questions - please, put in the simplest terms for a total and complete idiot!

1: How realistic is this as a goal?
2: Are there assembly language games for 8 bit computers out there which are well-enough commented that I could learn to port them?
3: Would it be easier - or even possible - to build a system which is simply compatible with the original software of some well-known 6502 machine?
4: Is there a single (or few) chip alternative to building your own (say) VIC chip?
5: Let's say I build Matt Regan's VIC20 clone - what do I do then? Is there a VIC20 original ROM chip somewhere on the interwebs I could simply flash and use?

I'm assuming the answers to some of these questions involve a certain amount of sighing and explaining. Sorry....
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE

Re: Video card question

Post by barnacle »

This page may have what you need: https://www.zimmers.net/anonftp/pub/cbm ... index.html

Don't quote me, I've not checked.

Matt Regan doesn't as far as I know make any mention of where he gets his system eproms (though he's very good at explaining how to get video out). I strongly suspect that he has original machines from which he has read the eproms.

Adrian Black does a lot of 8-bit machine repairs and often refers to various online sources for code; it could be he has some pointers available, though you may have to watch a lot of videos to find out... (Not having any of the machines, I watch him mostly so I can say 'well duh!' at him rather than having someone saying it at me. :mrgreen: )

Neil
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Video card question

Post by BigEd »

Video output as such isn't too bad. A video frame buffer in the 6502's address space is a bit more difficult. People often get hung up on wanting high resolutions. Video need not have sprite support, but a lot of Commodore users are quite used to that, and it's more hardware (and more debugging.)

Personally I'd recommend following something like the Acorn Atom's video, or perhaps one of the screen modes of the Acorn Electron. Yes, you could arrange the addressing to be compatible.

All the retro machines are different, some more different than others, so you would need to figure out what you want. If you don't pick the simplest, you certainly will have more to learn and more work to do.

As for the software side, there's a cross-platform package
Multi-Platform Arcade Game Designer
https://jonathan-cauldwell.itch.io/mult ... e-designer
whereby various games have been written and ported to many platforms, including the Atom. Hundreds are linked here:
https://stardot.org.uk/forums/viewtopic ... 57&t=16275


As for disassemblies, some are of course better commented than others, but there's a host of links over on stardot:
https://stardot.org.uk/forums/viewtopic.php?t=23155
Yuri
Posts: 371
Joined: 28 Feb 2023
Location: Texas

Re: Video card question

Post by Yuri »

Another approach to dealing with the video memory is to place it behind a single address instead of mapping it to a large section of the 6502's memory space. This is how the TMS9918A and related chips work. The interesting thing about this idea is that the 6502 just needs to write to one or two addresses to do what it needs and doesn't have to spend time incrementing an index register. (The incrementing happens automatically in the VDP.)

This also has the advantage that the reads and writes can sit behind a simpler clock-domain-crossing circuit than would be possible if you tried to cover a full address range.
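
To make the "single address" idea concrete, here is a minimal Python model of a port-mapped video chip, loosely in the style of the TMS9918A. The class name `PortMappedVram`, the 16K size and the low-byte-then-high-byte convention are simplifying assumptions for illustration, not the real chip's register protocol.

```python
class PortMappedVram:
    """Toy model: video RAM reached through an address port and a data port."""

    def __init__(self, size=16 * 1024):
        self.vram = bytearray(size)
        self.addr = 0        # internal address pointer, hidden from the CPU
        self._latch = None   # holds the low address byte until the high byte arrives

    def write_address_port(self, value):
        """Two successive writes set the pointer: low byte first, then high byte."""
        if self._latch is None:
            self._latch = value & 0xFF
        else:
            self.addr = (((value & 0xFF) << 8) | self._latch) % len(self.vram)
            self._latch = None

    def write_data_port(self, value):
        """Store one byte, then auto-increment the pointer (the CPU never does)."""
        self.vram[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) % len(self.vram)


vdp = PortMappedVram()
vdp.write_address_port(0x00)   # low byte of 0x0100
vdp.write_address_port(0x01)   # high byte of 0x0100
for byte in b"HELLO":
    vdp.write_data_port(byte)  # same port every time; no index register needed
```

On the 6502 side the loop body would be a single store to a fixed address, which is where the saving over an indexed store plus increment comes from.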

Another person you could check out is James Sharman

He's been building his own CPU with basic 74 series logic, and has a section on developing VGA for his custom CPU. He does a good job of explaining his thought process on why he elects to do the things that he does.
drogon
Posts: 1671
Joined: 14 Feb 2018
Location: Scotland

Re: Video card question

Post by drogon »

Yuri wrote:
Another approach to dealing with the video memory is to place it behind a single address instead of mapping it to a large section of the 6502's memory space. This is how the TMS9918A and related chips work. The interesting thing about this idea is that the 6502 just needs to write to one or two addresses to do what it needs and doesn't have to spend time incrementing an index register. (The incrementing happens automatically in the VDP.)
In a similar vein you can use a high speed serial line - which is what I did in the 2nd iteration of my own 6502 board. I decided to use the Acorn VDU protocol, so it takes 5 bytes to plot a point or draw a sprite or circle, etc. and while I never did write any real games for it, it did feel more than fast enough to implement nice graphics. The other side of the link was a "smart" terminal program running on my Linux desktop, so really more software than hardware.
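
As an illustration of the byte-stream approach, the sketch below packs an Acorn-style PLOT command: the VDU 25 code followed by five parameter bytes (plot mode, then X and Y as 16-bit little-endian values). The helper name `vdu_plot` is made up for this example, and which plot modes the terminal program actually supports is not stated above, so treat mode 69 (plot a point at absolute coordinates on Acorn machines) as an example value.

```python
import struct


def vdu_plot(mode, x, y):
    """Return the bytes for VDU 25,mode,x;y; ready to send down a serial link."""
    # '<' = little-endian with no padding; B = VDU code, B = plot mode,
    # h h = signed 16-bit X and Y coordinates.
    return struct.pack("<BBhh", 25, mode, x, y)


packet = vdu_plot(69, 640, 512)
print(len(packet), "bytes:", packet.hex(" "))
```

Because every drawing operation has the same fixed shape, the 6502 side only needs a short routine that pushes bytes at the serial port.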

I've looked long and hard at video and I don't think there is a good solution here - we all want higher resolution, more colours, bigger, better, faster - and then we run into bandwidth issues and memory constraints. Even in the '816 having a video framebuffer that spans more than a 64K bank starts to become a bit of a headache from a programming point of view, but restrict yourself to 64K and you get 320x200x8bpp. Not the best, but adequate for a retro project, although do work out the number of clock cycles it takes just to clear the screen - at 7 cycles per byte it takes a very noticeable while.
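
Working out that clear-screen figure is simple arithmetic, sketched here. The 320x200x8bpp size and the 7 cycles per byte come from the paragraph above; the clock rates in the loop are only example values.

```python
def clear_screen_cost(width, height, bytes_per_pixel, cycles_per_byte, clock_hz):
    """Return (bytes, cycles, milliseconds) for a simple byte-at-a-time clear."""
    total_bytes = width * height * bytes_per_pixel
    cycles = total_bytes * cycles_per_byte
    return total_bytes, cycles, 1000 * cycles / clock_hz


# Example clock rates only; pick whatever your board actually runs at.
for mhz in (1, 4, 8, 14):
    size, cycles, ms = clear_screen_cost(320, 200, 1, 7, mhz * 1_000_000)
    print(f"{mhz:2d} MHz: {size} bytes, {cycles} cycles, {ms:.1f} ms")
```

Even at 8MHz the clear takes 56 ms, which is more than three frames at 60 Hz.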

Moving to separate video processors is the way forward - it's just working out the interface - fortunately there are a few out there now - the old TI one, the new "Vera" one on the X16 and there is another in the works by Andy Toone - the VideoBeast. https://feertech.com/microbeast/videobeast.html

-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
J64C
Posts: 239
Joined: 11 Jul 2021

Re: Video card question

Post by J64C »

barnacle wrote:
Matt Regan doesn't as far as I know make any mention of where he gets his system eproms
They are all just sitting there, if you download an emulator like VICE.
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Video card question

Post by BigEd »

Huge ROM collections can be found in the MAME section of the Internet Archive (as well as other places, I'm sure.)
https://archive.org/details/messmame

(You don't need to download gigabytes: you can drill down inside zip files and download specific files.)
jgharston
Posts: 181
Joined: 22 Feb 2004

Re: Video card question

Post by jgharston »

drogon wrote:
Moving to separate video processors is the way forward - it's just working out the interface - fortunately there are a few out there now - the old TI one, the new "Vera" one on the X16 and there is another in the works by Andy Toone - the VideoBeast. https://feertech.com/microbeast/videobeast.html
-Gordon
And if you are really enthusiastic, build the video processor itself as a 6502 project. Back in the '80s I designed (but never got around to building) a Z80 project that was a dedicated video display system with fast I/O port to send VDU commands to.
jgharston
Posts: 181
Joined: 22 Feb 2004

Re: Video card question

Post by jgharston »

drogon wrote:
there is another in the works by Andy Toone - the VideoBeast. https://feertech.com/microbeast/videobeast.html
Wow, that's exactly something I've been working on on and off over the last year. In my case: something that electrically looks like an 8K static RAM, but implements an SAA5050 teletext display.
plasmo
Posts: 1273
Joined: 21 Dec 2018
Location: Albuquerque NM USA

Re: Video card question

Post by plasmo »

jgharston wrote:
drogon wrote:
Moving to separate video processors is the way forward - it's just working out the interface - fortunately there are a few out there now - the old TI one, the new "Vera" one on the X16 and there is another in the works by Andy Toone - the VideoBeast. https://feertech.com/microbeast/videobeast.html
-Gordon
And if you are really enthusiastic, build the video processor itself as a 6502 project. Back in the '80s I designed (but never got around to building) a Z80 project that was a dedicated video display system with fast I/O port to send VDU commands to.
The W65C02 can be reliably overclocked to 25.175MHz, so the processor can move video data to the video display during the active video period, yet have enough throughput during the video retrace period to perform normal tasks.
Bill
Yuri
Posts: 371
Joined: 28 Feb 2023
Location: Texas

Re: Video card question

Post by Yuri »

plasmo wrote:
jgharston wrote:
drogon wrote:
Moving to separate video processors is the way forward - it's just working out the interface - fortunately there are a few out there now - the old TI one, the new "Vera" one on the X16 and there is another in the works by Andy Toone - the VideoBeast. https://feertech.com/microbeast/videobeast.html
-Gordon
And if you are really enthusiastic, build the video processor itself as a 6502 project. Back in the '80s I designed (but never got around to building) a Z80 project that was a dedicated video display system with fast I/O port to send VDU commands to.
The W65C02 can be reliably overclocked to 25.175MHz, so the processor can move video data to the video display during the active video period, yet have enough throughput during the video retrace period to perform normal tasks.
Bill

Kind of... Just because the CPU is overclocked to 25.175MHz doesn't mean it can keep up with the video signal. Keep in mind that is the dot clock, meaning the monitor has moved horizontally one pixel to the right on each clock pulse. Changing the value of the pixel part way through the clock pulse would cause color smearing on various TVs, and I think possibly on VGA as well. (Maybe some newer monitors with digital circuitry handle it better; I think this is why people tend to see "glitches" on these monitors when their circuits can't keep up. I'm no expert on the internals of those devices, so don't quote me on this.)

The CPU often takes multiple clock pulses to perform a single operation. For the WDC65C02 this is a minimum of 2 clock cycles (e.g. INX) up to 7 clock cycles (e.g. BRK or INC abs,X). And you need multiple instructions to get data from A to B, or from A and calculate on it and then to B.

If you double the CPU's clock rate, 50.35MHz would need RAM with less than 20ns response time, and you'd still only maybe get one instruction completed in a single pixel. If you doubled again, you're looking at 10ns RAM, at which point you start pushing into the realm of needing DRAM.
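
The nanosecond figures above are just the reciprocal of the clock frequency. A quick sketch of that conversion, using the VGA dot clock and its doublings:

```python
DOT_CLOCK_HZ = 25_175_000  # 640x480 VGA dot clock, as discussed above


def period_ns(hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / hz


for multiplier in (1, 2, 4):
    hz = DOT_CLOCK_HZ * multiplier
    print(f"{hz / 1e6:7.3f} MHz -> {period_ns(hz):5.2f} ns per cycle")
```

The full cycle time is only an upper bound on what the RAM gets, since address setup and data setup times have to fit inside the same cycle.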

And maybe someone's gotten their stock WDC65C02 up to 50MHz, but I'm guessing 100MHz is just out of the question without doing something extreme like some crazy huge heat sink + fans + who knows... (Hrm.... LN2 cooled 6502.... :twisted: )

You could halve the dot clock (320x480), but still only get one instruction per pixel, not the multiple desired.


DMA helps with this problem somewhat in that the DMA controller basically can tell one chunk of memory to output on the bus while the other reads, thus putting your read/write operation into a single clock; or read one cycle, and the write on the next, keeping the memory move to a two clock cycle per byte operation.

I've agonized over this in my head with trying to use an FPGA to do this, and how those old PPUs managed to pull:
- Pixel value for multiple background tiles and pixel values for the sprites
- Compare all of that to find the one that is on top (SNES had a priority value to set what appeared on top of what)
- Figure out which one doesn't have a transparent pixel at that point.
- And finally, output that to the monitor.

If you had 4 tile backgrounds and 4 sprites at 8bpp, that's 8 bytes to manage in a single pixel. As I was working through this mentally I started to realize why the various modes the SNES PPU had would compromise on some things, e.g. higher resolutions or more backgrounds at the expense of color depth. It all boiled down to how much data they could realistically pull from memory and operate on in a single dot clock cycle.
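
The per-pixel work listed above can be sketched in a few lines of Python. This is only a model of the decision a PPU makes for each dot, not how any real chip is wired; the `TRANSPARENT` value of 0 and the `(priority, colour)` layout are assumptions for the example.

```python
TRANSPARENT = 0  # assumption: colour index 0 means "see through"


def composite_pixel(layers):
    """layers: list of (priority, colour) pairs fetched for one screen position.

    Returns the colour of the highest-priority layer that is not transparent.
    """
    visible = [(prio, colour) for prio, colour in layers if colour != TRANSPARENT]
    if not visible:
        return TRANSPARENT
    return max(visible, key=lambda pc: pc[0])[1]


# 4 background layers and 4 sprites at 8bpp: 8 bytes fetched for ONE output pixel.
pixel_layers = [(0, 0x11), (1, 0x00), (2, 0x22), (3, 0x00),
                (4, 0x00), (5, 0x33), (6, 0x00), (7, 0x00)]
colour = composite_pixel(pixel_layers)
print(hex(colour))
```

Hardware has to do the equivalent of this fetch-filter-select within one dot clock, which is why the number of layers and the colour depth trade off against each other.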


If you plan to use a second 6502 as a PPU in and of itself, I think you'd probably want something where you can double buffer things.

E.g.: Hardware is incrementing through/displaying a frame buffer (simple counter ICs can do this) while the 6502 "PPU" goes through and updates the needed parts on an off screen frame buffer, perhaps using some DMA to quickly move large chunks around for some sprites and other things, and then you can flip the buffers on the vertical blank.

Effectively you want the CPU to have to update as small a chunk of the memory as possible so it can keep up with the frame rate.
I wish I had known about this sooner. The page doesn't look like it has been updated since last year though. Any idea what the status of this project is?
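
A minimal sketch of the double-buffer idea, in Python rather than hardware. The class name `DoubleBuffer` and the 100x64 size are arbitrary choices for illustration; in a real build the "flip" would be a single select bit that swaps which RAM the scan-out counters address.

```python
class DoubleBuffer:
    """Two frame buffers: one being displayed, one being drawn into."""

    def __init__(self, size):
        self.buffers = [bytearray(size), bytearray(size)]
        self.front = 0  # index of the buffer the display hardware is scanning out

    @property
    def back(self):
        """The off-screen buffer the CPU (or a 6502 'PPU') is free to modify."""
        return self.buffers[self.front ^ 1]

    def scanout(self):
        """What the monitor currently shows."""
        return bytes(self.buffers[self.front])

    def flip(self):
        """Swap the two buffers; only do this during the vertical blank."""
        self.front ^= 1


fb = DoubleBuffer(100 * 64)
fb.back[0:4] = b"\x3f\x3f\x3f\x3f"   # draw off screen
before = fb.scanout()[:4]            # display is unaffected so far
fb.flip()                            # vertical blank: new frame becomes visible
after = fb.scanout()[:4]
```

Nothing drawn into the back buffer is visible until the flip, so partially drawn frames never reach the screen.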
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California

Re: Video card question

Post by GARTHWILSON »

Quote:
If you double the CPU's clock rate, 50.35MHz would need RAM with less than 20ns response time,
The RAM would have to be a lot faster than 20ns, when you subtract the address setup time and the processor's read-data setup time.  See Jeff's excellent animated, drawn-to-scale (unlike most in data sheets), visualizations of timing margins, in the forum topic "Timing Diagrams. Visualizing 65xx Timing."  These .gif files help understand what timings are constant and what varies with clock speed—info that's very helpful when selecting and planning your glue logic.
viewtopic.php?f=4&t=2909
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
plasmo
Posts: 1273
Joined: 21 Dec 2018
Location: Albuquerque NM USA

Re: Video card question

Post by plasmo »

A 25.175MHz 6502 can drive monochrome 640x480 VGA by moving a byte of graphic data to the pixel shift register every 8 clocks. The graphic data is thus compressed so 640x480 can be represented by 38K of memory. This simplistic approach allows text and monochrome graphics with a simple hardware solution. Maybe it is possible to do colored sprites in software, but I have not thought that through.
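
For anyone checking the numbers, here is the arithmetic behind the 38K figure plus a tiny model of the pixel shift register. The MSB-first bit order is an assumption; the hardware could just as well shift the other way.

```python
WIDTH, HEIGHT = 640, 480
BYTES_PER_LINE = WIDTH // 8             # 8 monochrome pixels per byte
FRAME_BYTES = BYTES_PER_LINE * HEIGHT   # whole-screen bitmap size


def shift_out(byte):
    """Return the 8 pixels a shift register would emit, one per dot clock."""
    return [(byte >> bit) & 1 for bit in range(7, -1, -1)]  # MSB first (assumed)


print(BYTES_PER_LINE, "bytes per line,", FRAME_BYTES, "bytes per frame")
print(shift_out(0b10100001))
```

One byte written by the CPU keeps the shift register busy for 8 dot clocks, which is the whole time budget for fetching the next byte.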
Bill
NormalLuser
Posts: 48
Joined: 24 Sep 2023

Re: Video card question

Post by NormalLuser »

If you really want it to be 'compatible' with something else search for videos and pages on building an Apple II compatible with modern discrete components. Super neat resources for that out there.

But since you are already doing a BE6502, the 'World's Worst Video Card' is a pretty good place to start.
100x64 with 6 bit 64 color output and a simple resistor change to switch to full 8 bit 256 color output.
If you clock your CPU at 5MHz you still have 1.4MHz worth of CPU even with the halting DMA of that simple frame buffer.
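
A rough back-of-envelope for where a figure like 1.4MHz can come from. The timing totals below are NOT given in the post: they are the standard 800x600@60Hz line and frame totals, which I am assuming the card's signal is derived from, and the model assumes the CPU is halted for the visible part of every visible scanline and runs only during blanking.

```python
CPU_CLOCK_HZ = 5_000_000           # from the post

# Assumed 800x600@60 totals (dot clocks per line, scanlines per frame).
H_VISIBLE, H_TOTAL = 800, 1056
V_VISIBLE, V_TOTAL = 600, 628

# Assumption: CPU halted whenever the display is fetching visible pixels.
halted_fraction = (H_VISIBLE / H_TOTAL) * (V_VISIBLE / V_TOTAL)
effective_hz = CPU_CLOCK_HZ * (1 - halted_fraction)

print(f"CPU runs {100 * (1 - halted_fraction):.1f}% of the time")
print(f"Effective clock: {effective_hz / 1e6:.2f} MHz")
```

Under those assumptions the result lands close to the 1.4MHz quoted above; a different halting scheme would give a different number.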
This makes for a really easy system hardware-wise and software-wise with performance similar to systems like the C64.
To get an idea of what it can do check out my posts here:
EhBasic GFX: viewtopic.php?f=5&t=7763
BadApple: https://github.com/NormalLuser/Ben-Eater-Bad-Apple

Regardless of what you build, you will not have much choice but to get comfortable with 6502 Assembly. It takes a little bit to get used to but it is a lot of fun once you do.
The EhBasic graphics routines at my link above would be helpful to get you going.
[Animated GIF: Double Buffered Draw Routines]
If you want to port existing games over I would look around for commented 6502 assembly versions of things like Snake, Tetris, Pong, Breakout, etc. Anything with large chunky graphics and not much else going on with it would be the best place to start.

Then it is a 'simple' matter of replacing the draw routines with ones that work with the hardware you have.
'Simple'.... as in not always simple and often hard. This is because programs are often very tightly coupled with the hardware by using specific memory layouts, timers, ROM routines, etc.
Given this, I find code for the C64, BBC Micro and similar machines will be most helpful. Apple II game-logic code is just as helpful, but the screen layout is bizarre and the draw and collision code is often tightly optimized, so anything draw related can be very confusing. YMMV
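
The "replace the draw routines" idea, sketched in Python to show the shape of it rather than real 6502 code. The names `Screen` and `step_ball` are invented for this example: the point is that game logic which only talks to a small drawing interface can be moved to new hardware by rewriting that interface alone.

```python
class Screen:
    """Hardware-specific part: the only code that changes when porting."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [[0] * width for _ in range(height)]

    def plot(self, x, y, colour):
        if 0 <= x < self.width and 0 <= y < self.height:
            self.pixels[y][x] = colour


def step_ball(x, y, dx, dy, width, height):
    """Hardware-independent game logic: move a ball and bounce it off the edges."""
    if not 0 <= x + dx < width:
        dx = -dx
    if not 0 <= y + dy < height:
        dy = -dy
    return x + dx, y + dy, dx, dy


screen = Screen(100, 64)
x, y, dx, dy = step_ball(99, 10, 1, 1, screen.width, screen.height)
screen.plot(x, y, 0x3F)
```

Original games rarely separate things this cleanly, which is exactly why the porting work is only 'simple' in quotes.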

The 'WWVC' is a pretty good start to a VGA system.
One upgrade I did you'll see on this forum post was that I added a hardware double buffer:
viewtopic.php?f=4&t=7859

This is easy enough with a couple of extra 74 series chips, or I also have a better PLD based version that gives access to the full 32k of RAM:
https://github.com/NormalLuser/BeEhBasi ... 6502DR.PLD

While the 'WWVC' has great color and simple layout it is quite low resolution, so I've also worked on but not finished adding a character based output with higher resolution than the color output, with a transparency color. Meaning that it could be used somewhat like sprites, higher resolution text, or in a manner like the ZX Spectrum where you have a color background grid of lower resolution and a higher resolution foreground. But instead of 32x24 for color it would be 100x64 with 8 bit color.
This was made simply using a small dual port RAM hooked up to a ROM chip with font images:
[Image: Worlds Worst Character Display]
Lastly, an upgrade I plan in the future is that since the screen memory layout is 128 bytes wide, but the display is only 100 wide, it means that hardware horizontal scrolling would be a 'relatively easy' addition of a couple of pre-set counter chips and registers. 'Relatively easy' as in a couple more full breadboards and maybe an additional VIA to act as the register.
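
To show why the 128-byte row width matters, here is a sketch of the address calculation. The 128-byte stride and the 100x64 visible area come from the post; the `VRAM_BASE` value and the `scroll_x` behaviour (a start-column offset that wraps within the row) are my assumptions about how such a scroll register might work.

```python
STRIDE = 128          # bytes per row in memory (from the post)
VISIBLE_WIDTH = 100   # pixels actually displayed per row (from the post)
HEIGHT = 64
VRAM_BASE = 0x2000    # assumed base address, for illustration only


def pixel_address(x, y, scroll_x=0):
    """Address of the byte shown at screen position (x, y)."""
    if not (0 <= x < VISIBLE_WIDTH and 0 <= y < HEIGHT):
        raise ValueError("off screen")
    column = (x + scroll_x) % STRIDE   # hypothetical hardware scroll offset
    return VRAM_BASE + (y << 7) + column  # y * 128 is just a 7-bit shift


print(hex(pixel_address(0, 0)), hex(pixel_address(99, 63)))
print("hidden bytes per row:", STRIDE - VISIBLE_WIDTH)
```

With a power-of-two stride the row number drops straight into the upper address bits, and the 28 hidden bytes per row give a scroll register somewhere to pull new columns from.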

Best of luck on your project!
Yuri
Posts: 371
Joined: 28 Feb 2023
Location: Texas

Re: Video card question

Post by Yuri »

GARTHWILSON wrote:
Quote:
If you double the CPU's clock rate, 50.35MHz would need RAM with less than 20ns response time,
The RAM would have to be a lot faster than 20ns, when you subtract the address setup time and the processor's read-data setup time.  See Jeff's excellent animated, drawn-to-scale (unlike most in data sheets), visualizations of timing margins, in the forum topic "Timing Diagrams. Visualizing 65xx Timing."  These .gif files help understand what timings are constant and what varies with clock speed—info that's very helpful when selecting and planning your glue logic.
viewtopic.php?f=4&t=2909
Indeed,

I wasn't really thinking too hard about it, just converting 50.35MHz (double the 25.175MHz dot clock) to ns; but if you figure about half that is active work starting on the rising edge of the clock, that's less than 10ns to access the RAM, not considering the setup and hold times, which are also critical as you point out.
plasmo wrote:
A 25.175MHz 6502 can drive monochrome 640x480 VGA by moving a byte of graphic data to the pixel shift register every 8 clocks. The graphic data is thus compressed so 640x480 can be represented by 38K of memory. This simplistic approach allows text and monochrome graphics with a simple hardware solution. Maybe it is possible to do colored sprites in software, but I have not thought that through.
That does give you 8 clock cycles per byte (each byte holding 8 monochrome pixels), but that's still really tight and you won't be doing too much.

The original Macintosh was monochrome for this reason, as well as the fact that it had all of 128KB of RAM to work with, about 21.4KB of which was for the video display. The video hardware also would block the CPU from accessing the DRAM every 4 clock cycles (according to Wikipedia), I would presume this was to load up a bit shift register.


A ZP read, a ZP write and an increment come out to exactly 8 clock cycles, but you'd need to use the ZP indexed ops which are 4 clock cycles each.
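
The cycle counting above can be tabulated like this. The cycle counts are the standard 65C02 figures for these addressing modes; the tuple-based lookup is just a convenient way to write the sum down.

```python
# Base 65C02 cycle counts for the instructions discussed above.
CYCLES = {
    ("LDA", "zp"): 3,
    ("STA", "zp"): 3,
    ("LDA", "zp,x"): 4,
    ("STA", "zp,x"): 4,
    ("INX", "impl"): 2,
}


def total_cycles(sequence):
    """Sum the cycle cost of a list of (mnemonic, addressing mode) pairs."""
    return sum(CYCLES[instruction] for instruction in sequence)


fixed_address = [("LDA", "zp"), ("STA", "zp"), ("INX", "impl")]
indexed = [("LDA", "zp,x"), ("STA", "zp,x"), ("INX", "impl")]

print("fixed addresses:", total_cycles(fixed_address), "cycles")
print("indexed        :", total_cycles(indexed), "cycles")
```

The fixed-address version fits the 8-clock budget but always touches the same byte; the indexed version that actually walks through a buffer overshoots it.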


But the ZP is only 256 bytes. You need 38,400 bytes for 640x480 monochrome. Perhaps we can get clever though, we can bank that zero page, which gets updated during the blanking periods? That would then only need 80 bytes in the ZP.

Thinking about this more, perhaps the use of a FIFO would help. The CPU can shove the bytes into the FIFO and the output hardware can then shift a byte into the output bit shift register.

This would avoid the need to increment any index registers on the 6502 and would present as a single byte ZP address, which is a 3 clock cycle write (STA zp), and yields about 10 total clock cycles per byte (8 pixels) if you add in the horizontal blanking period as part of your compute time.
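
A sketch of the FIFO idea and the per-byte budget. The 800 dot clocks per line is the standard 640x480@60 total (640 visible plus 160 blanking), which is not stated in the post; the `cpu_write`/`video_fetch` names and the "emit blank on underrun" behaviour are assumptions for illustration.

```python
from collections import deque

VISIBLE_CLOCKS = 640
TOTAL_CLOCKS = 800               # assumed standard 640x480@60 line length
BYTES_PER_LINE = VISIBLE_CLOCKS // 8

budget_active_only = VISIBLE_CLOCKS / BYTES_PER_LINE   # clocks per byte, no slack
budget_with_hblank = TOTAL_CLOCKS / BYTES_PER_LINE     # clocks per byte, using blanking

fifo = deque()


def cpu_write(byte):
    """CPU side: one store to a fixed address pushes a byte into the FIFO."""
    fifo.append(byte & 0xFF)


def video_fetch():
    """Video side: pull the next byte for the shift register; blank if empty."""
    return fifo.popleft() if fifo else 0x00


cpu_write(0xAA)
first = video_fetch()    # the byte the CPU queued
second = video_fetch()   # underrun: FIFO was empty
print(budget_active_only, budget_with_hblank)
```

The FIFO lets the CPU run ahead during blanking and fall behind during the visible part of the line, as long as the average stays within the per-byte budget.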


Another option of course is to reduce the resolution, e.g. halve the horizontal/vertical of 640x480, giving about 1600 clock cycles to compute one line at 320 pixels that would get rendered out twice. You could also fake a 320x200 resolution by giving an extra 40 lines to the vertical blank period, you'd have some ugly blank space on the monitor though. :(

Or you can sacrifice frame rates; instead of updating at 60 FPS, leave the buffer alone for a frame, yielding 30 FPS, might require a double buffer if you're actually doing video work during this time. (Unless there's something clever I'm not thinking of.)
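
The budgets for these last two options, worked through. The 800 clocks per line and 525 lines per frame are the standard 640x480@60 totals, assumed here rather than taken from the post, and the CPU is assumed to run at the dot clock.

```python
CLOCKS_PER_LINE = 800    # assumed 640x480@60 total, including horizontal blanking
LINES_PER_FRAME = 525    # assumed 640x480@60 total, including vertical blanking


def cycles_per_source_line(repeat):
    """CPU cycles available per unique line when each line is shown `repeat` times."""
    return CLOCKS_PER_LINE * repeat


def cycles_per_update(frames_held):
    """CPU cycles between buffer updates when each image is held for N frames."""
    return CLOCKS_PER_LINE * LINES_PER_FRAME * frames_held


print("line doubled (320x240):", cycles_per_source_line(2), "cycles per line")
print("60 FPS update:", cycles_per_update(1), "cycles")
print("30 FPS update:", cycles_per_update(2), "cycles")
print("letterbox lines for 320x200:", 240 - 200)
```

Both tricks buy time by doing less unique work per second: fewer distinct lines in one case, fewer distinct frames in the other.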