BigEd wrote:
Defender is a great example. I fear that if you build the best sprite engine, you'll only get sprite based games, and so much more is possible. A flexible DMA engine might be a more useful addition to a 6502 machine. A video system which allows for overlays - where there are two framebuffers and the concept of a transparent pixel - is another kind of thing which is much easier in hardware than software.
Yeah, multiple layers are certainly a must, though I would still prefer multiple layers of tiled graphics as opposed to bitmaps, for this class of machine. Bitmaps just take up too much memory and too much bandwidth to draw to them. You're generally going to be drawing the same things over and over onto them, so why not just use tiles/sprites?
With plenty of sprites to go around, they tend to be attached together into any shape you want, so you're not limited to squares. You can do interesting transitions with them coming together/apart, fake zooming, etc., pretty easily, without "redrawing" anything. You can redefine tile definitions & palette entries to change a LOT of onscreen pixels with the CPU or blitter having to do hardly anything.
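To put a rough number on the tile-redefinition point, here is a minimal sketch. The 8x8 tile size, the 40x25 map, and the `pixels_affected` helper are all assumptions made up for illustration, not any particular machine's layout.

```python
# Hypothetical tiled layer: 8x8 tiles at 8bpp covering a 320x200 screen.
TILE_SIZE = 8
MAP_COLS, MAP_ROWS = 320 // TILE_SIZE, 200 // TILE_SIZE  # 40 x 25 map entries
BYTES_PER_TILE = TILE_SIZE * TILE_SIZE                    # 64 bytes per tile definition


def pixels_affected(tilemap, tile_index):
    """Count the on-screen pixels that change when one tile definition is rewritten."""
    uses = sum(row.count(tile_index) for row in tilemap)
    return uses * TILE_SIZE * TILE_SIZE


# Example map: tile 0 (say, sky) everywhere, tile 1 (ground) along the bottom row.
tilemap = [[0] * MAP_COLS for _ in range(MAP_ROWS)]
for col in range(MAP_COLS):
    tilemap[MAP_ROWS - 1][col] = 1

print(BYTES_PER_TILE, "bytes written ->", pixels_affected(tilemap, 0), "pixels affected")
```

In this made-up map, rewriting the 64 bytes of tile 0 touches every map cell that points at it, which is most of the screen, and nothing has to be copied into a framebuffer.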
Let's think about 60fps 2d dynamic & animated gaming, at 320x200, in 256 colors, with framebuffers + blitters. Per frame, the video DAC will take 64,000 + 768 bytes of read bandwidth (pixels plus palette). Considering overdraw, let's be conservative and say a DMA blitter will have to take around 200,000 total reads & writes to redraw the screen with everything at its new position & animation frame. At 60fps, that's ~16MB/sec of bandwidth dedicated to video, and that only works if the DMA channel is kept busy (if there's idle time, you'd start to miss frames). However, this is just 1 layer with only about 50% overdraw. For each overlay, that'd be another ~3.8MB/sec (64,000 bytes x 60) just to read it, assuming the layer can stay static (which is kinda boring); if it's dynamic, that can double the blitter bandwidth to keep it updated.
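The arithmetic behind those figures can be sketched as below. The constants are the ones from the post; the function names and the "one read plus one write per blitted pixel" interpretation of the 200,000 accesses are my assumptions.

```python
# Back-of-the-envelope bandwidth for a 320x200, 256-colour, 60fps framebuffer.
WIDTH, HEIGHT, FPS = 320, 200, 60
PALETTE_BYTES = 256 * 3        # 768 bytes of RGB palette
BLITTER_ACCESSES = 200_000     # reads + writes per frame (estimate from the post)


def single_layer_bandwidth():
    """Bytes per second for scan-out plus blitter traffic on one layer."""
    scanout = WIDTH * HEIGHT + PALETTE_BYTES
    return (scanout + BLITTER_ACCESSES) * FPS


def static_overlay_bandwidth():
    """Bytes per second just to scan out one extra, never-redrawn layer."""
    return WIDTH * HEIGHT * FPS


def overdraw_fraction():
    """Assumes each blitted pixel costs one read and one write."""
    pixels_written = BLITTER_ACCESSES // 2
    return pixels_written / (WIDTH * HEIGHT) - 1


print(single_layer_bandwidth())    # bytes/sec, one layer
print(static_overlay_bandwidth())  # bytes/sec, each extra static layer
print(overdraw_fraction())         # fraction of the screen drawn more than once
```

Under those assumptions the single layer comes out just under 16 million bytes/sec and the overdraw a little over 50%, which matches the rough figures above.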
The C64 does what it does with around 1.1MB/sec of video bandwidth; the CPU constantly pushing around bitmap data would add only a fraction to that. The Amiga Chip RAM has 7-14MB/sec to do everything (video, audio, I/O), and at 60fps has to resort to panning around static low-bpp bitmap overlays without updating them too much, even with a blitter. It's hard to find specific info on the SNES, but I would guess its video system is 3.58MHz * 16bit = 7MB/sec for its dedicated video memory (including idle windows for DMA access from the CPU side, so still 7MB/sec total).
If you're going to have shared memory, and you want fully dynamic 60fps blitter-based 320x200x8bpp graphics, on an 8-bit data bus compatible with the 6502, I would say that you need a pretty fast bus, let's say 40MHz. The CPU would take half the bandwidth, whether you stretch a 10MHz 6502 across 4 bus cycles and leave the next 4 to the fast video stuff, or have a 20MHz 6502 share every 2 bus cycles, so 20MB/sec of bus access is left for the blitter and video. But again, that's just for a 1-layer output.
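As a rough check on that budget, the sketch below splits a 40MHz, 8-bit bus evenly between CPU and video and compares the video half against the single-layer estimate (64,000 + 768 + 200,000 bytes per frame at 60fps). The even split and the helper names are assumptions for illustration only.

```python
# Hypothetical shared-bus budget: 40MHz, 8 bits wide, half of it for the CPU.
BUS_HZ = 40_000_000
BUS_WIDTH_BYTES = 1
CPU_SHARE = 0.5

# Single-layer requirement: scan-out + palette + blitter accesses, at 60fps.
SINGLE_LAYER = (320 * 200 + 768 + 200_000) * 60
STATIC_OVERLAY = 320 * 200 * 60  # scan-out only, per extra layer


def video_budget():
    """Bytes per second left for blitter and scan-out after the CPU's share."""
    return int(BUS_HZ * BUS_WIDTH_BYTES * (1 - CPU_SHARE))


def headroom():
    """Bytes per second left over once one fully dynamic layer is paid for."""
    return video_budget() - SINGLE_LAYER


def static_overlays_that_fit():
    """How many extra scan-out-only layers the leftover bandwidth covers."""
    return headroom() // STATIC_OVERLAY


print(video_budget(), headroom(), static_overlays_that_fit())
```

With these numbers the leftover bandwidth is only slightly more than one static overlay's scan-out, so there is very little slack beyond the single dynamic layer.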
It just doesn't make sense to me for this class of hardware. For the constantly expanding PC hardware of the 386 to Pentium days, where the CPU can push those things that quickly to a non-shared framebuffer, no problem. The Amiga and ST really couldn't do this stuff well, though a lot of that can be chalked up to planar graphics as well as the lack of quickly-updated tiles. Home 2d consoles were all sprite-based, not blitter-based. The arcade hardware that I'm familiar with for pushing a lot of huge 2d stuff around at these sorts of resolutions (Street Fighter 2, Neo Geo) is all composite sprites, though the later systems (Y2K+) use modern texture-blitting GPUs even for 2d, since they're actually fast enough.