game machines
Re: game machines
Quote:
Rob Finch's web site we also have the bc65816.
-
White Flame
- Posts: 704
- Joined: 24 Jul 2012
Re: game machines
Hugh Aguilar wrote:
I never heard of .D64 images. These are just binary files representing C64 cartridges? That is probably illegal --- copyright infringement --- but I doubt that the owners of those cartridges care enough to do anything about it, as they haven't made any money on the cartridges in decades.
In any case, the TheC64 team has claimed that they legally licensed the use of all the included games from their current IP owners.
DerTrueForce wrote:
320x200 graphics, with 256 programmable colours and a text generator.
- The graphics would require 64,000 bytes for the display bitmap(bytemap?), and then 768 bytes on top of that for the palette(colour) data. 768 bytes are left over from this.
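As a quick check of that arithmetic, here is a small sketch. It assumes one byte per pixel, 3 bytes (R, G, B) per palette entry, and a 64 KiB (65,536-byte) address space; the post implies these figures but does not state them outright.

```python
# Memory budget for a 320x200, 256-colour display.
# Assumptions (not stated in the post): one byte per pixel, 3 bytes per
# palette entry, and a 64 KiB (65,536-byte) address space.
WIDTH, HEIGHT = 320, 200
PALETTE_ENTRIES = 256
BYTES_PER_ENTRY = 3
ADDRESS_SPACE = 64 * 1024

bitmap_bytes = WIDTH * HEIGHT                      # display bytemap
palette_bytes = PALETTE_ENTRIES * BYTES_PER_ENTRY  # colour table
left_over = ADDRESS_SPACE - bitmap_bytes - palette_bytes

print(bitmap_bytes, palette_bytes, left_over)
```

Under those assumptions the display and palette leave well under 1KB for anything else in the same 64 KiB space.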
Multiple layers of tile-based graphics (with selectable palettes per tile) are the key to colorful, animated graphics being dynamically pushed around fast, and they are also easier to create & manage within the realm of this class of hardware. Regarding sprites, that's another place where Commodore went wrong, trying to include a smaller number of larger sprites. The more graphically successful arcade & home game consoles created their sprites from a larger number of smaller (usually 8x8 pixel) sprites, which ends up being far less constraining.
Re: game machines
White Flame wrote:
Regarding sprites, that's another place where Commodore went wrong, trying to include a smaller number of larger sprites. The more graphically successful arcade & home game consoles created their sprites from a larger number of smaller (usually 8x8 pixel) sprites, which ends up being far less constraining.
However, 8 sprites per scanline was high-end for the day. It's dirt simple to create a crude sprite multiplexer on the C64 to get dozens of 24x21 pixel sprites all over the screen. The NES also had to multiplex its sprites, as did the SMS. Neither of them supported more than 8 per scanline, IIRC.
The Atari 7800 *might* have a legit fight in the 8-bit sprite wars. But the hardware was quite strange.
At the end of the day, it comes down to how the sprites were used. It's unfair to compare the latest NES games to C64 games because the NES had the ability to alter the hardware on a game-by-game basis (mappers). So comparing the C64 to the original NES (mapper 0, i.e. no mapper hardware in the cartridge) is a more accurate comparison. Between the two, I'd say the C64 was far superior.
Again, with one exception and that is color palettes.
Cat; the other white meat.
-
White Flame
- Posts: 704
- Joined: 24 Jul 2012
Re: game machines
I probably consider color more important than you do, so the NES's lack of wide pixels, and better paletted sprites come out on top for me, even if they might flicker more than the respective pixel area taken up by C64 sprites. The smaller sprites also give you more colors per "object" than a larger sprite with a single palette. Another issue of balance is that the NES had 256 pixels horizontally, vs the C64's 320, so that does mitigate the smaller sprites somewhat.
The NES and SNES, however, did hardware multiplexing for you. From the programmer's perspective, the NES had 64 sprites, and the SNES had 128; the hardware took care of selecting the 8 (NES) or 32 (SNES) per scanline for you. On the C64, you took up precious raster time and code footprint managing that yourself. (I'm not as familiar with the Sega hardware)
Of course, the biggest sprite pixel pusher of the 2d home consoles was the Neo Geo. It had linked structures for connecting its sprites together into larger objects, as well as some support for scaling. Even the background layers were just connected sprites, not a true tiled display. It did have a fixed tile mode only for the top overlay, for things like score/ammo/credits/etc. But the most interesting part IMO is that it was basically a pair of per-scanline frame buffers. It would clear one out and paint a row of sprite pixels from each active sprite in order, then display that linebuffer on the next raster output line. Most systems decided on a per-pixel level which element to output to the display, but the Neo Geo simply had too many sprites flying around to keep all that "hot" in internal registers.
From the older 8-bit systems, however, I do have a certain affinity for the sprite setup of the TMS9918 (Colecovision, MSX, etc), which was a very interesting balance for the 1970s. 32 1bpp sprites, either 8x8 or 16x16, hardware multiplexed up to 4 per scanline (and it reports which one it last drew, so you could rotate them out). If you want multicolor sprites, you simply overlay them. Emulators often give the option of eliminating the per-scanline sprite limit.
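To make the per-scanline limit and the "rotate them out" idea concrete, here is a minimal Python sketch. It is not a model of the real TMS9918 registers: the function names, the (x, y) sprite tuples, and the rotation policy are all assumptions for illustration, and the overflow report is modeled as the index of the first sprite that did not fit on the line.

```python
def sprites_on_line(sprites, line, height=8, limit=4):
    """Return (drawn, overflow) for one scanline.

    sprites  -- list of (x, y) positions in priority order (index 0 first)
    drawn    -- indices of the sprites actually shown on this line
    overflow -- index of the first sprite dropped for exceeding the limit,
                or None if everything fit
    """
    drawn = []
    for index, (_x, y) in enumerate(sprites):
        if y <= line < y + height:       # sprite covers this scanline
            if len(drawn) == limit:      # hardware limit already reached
                return drawn, index
            drawn.append(index)
    return drawn, None


def rotate(sprites, overflow):
    """Hypothetical software policy: move the dropped sprite to the front
    so it gets priority on the next frame (flicker instead of vanishing)."""
    return sprites[overflow:] + sprites[:overflow]


sprites = [(0, 10), (16, 10), (32, 10), (48, 10), (64, 10), (80, 50)]
drawn, overflow = sprites_on_line(sprites, 12)
if overflow is not None:
    sprites_next_frame = rotate(sprites, overflow)
```

An emulator option that removes the per-scanline limit would amount to calling this with a very large `limit`.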
Re: game machines
White Flame wrote:
I probably consider color more important than you do, so the NES's lack of wide pixels, and better paletted sprites come out on top for me, even if they might flicker more than the respective pixel area taken up by C64 sprites.
White Flame wrote:
Another issue of balance is that the NES had 256 pixels horizontally, vs the C64's 320, so that does mitigate the smaller sprites somewhat.
White Flame wrote:
The NES and SNES, however, did hardware multiplexing for you.
White Flame wrote:
From the programmer's perspective, the NES had 64 sprites, and the SNES had 128; the hardware took care of selecting the 8 (NES) or 32 (SNES) per scanline for you. On the C64, you took up precious raster time and code footprint managing that yourself. (I'm not as familiar with the Sega hardware)
White Flame wrote:
Of course, the biggest sprite pixel pusher of the 2d home consoles was the Neo Geo.
Seriously, yeah...it was a monster at pushing sprites around. Would make a TERRIBLE computer though. 80 column text would be a nightmare for it. lol
White Flame wrote:
From the older 8-bit systems, however, I do have a certain affinity for the sprite setup of the TMS9918 (Colecovision, MSX, etc), which was a very interesting balance for the 1970s. 32 1bpp sprites, either 8x8 or 16x16, hardware multiplexed up to 4 per scanline (and it reports which one it last drew, so you could rotate them out). If you want multicolor sprites, you simply overlay them. Emulators often give the option of eliminating the per-scanline sprite limit.
I'm in complete agreement on that. I have a huge fondness for anything using the TMS9918. Especially since my first computer was a TI-99/4A and my first console was a Colecovision. My only complaint is that I can't force myself to get serious about the TMS CPU or the Z80. I'd love to tinker with the TMS9918 some, though.
-
DerTrueForce
- Posts: 483
- Joined: 04 Jun 2016
- Location: Australia
Re: game machines
It wouldn't necessarily be just a framebuffer. And certainly not just a dumb one. At the very least, you'd have some kind of blitter. Some sort of hardware acceleration. It does make a lot of sense to have such things. I just hadn't given much thought to it beyond the hard requirements of the video mode.
Tiles sound good, but I do think there should be some way to determine what's already on the display, and provision for graphics primitives, like pixel setting and line-drawing.
EDIT: I don't remember where this was said, but I think that most of the Mario games did use sprites. A number of things were more than one sprite, I think, and I gather they had some concept of layers or sprite priority in there.
Re: game machines
I'm quite fond of the Williams Defender arcade machine. It has a 1MHz 6809E running a fast, high-action side-scroller on a 292x240 4bpp frame buffer, with no hardware sprites or hardware scroll support. It's all software.
The frame buffer spans around 36KB, and the memory is Y-ordered, rather than X-ordered. That is, if RAM address 0000h holds the data for the two pixels (0,0) and (1,0) [that is: the top left pixel in the display, and the pixel to its right] then address 0001h holds pixels (0,1) and (1,1) [that is, the two pixels below those first two]. Moving through incrementing memory addresses moves _down_ the screen... till you get to (0, 239) and (1, 239) at address 00EFh. The final 16 bytes of RAM in that 256-byte page (00F0h-00FFh) are free to hold software variables, as they're not screen visible.
The next screen-height column begins at address 0100h and continues down to 01EFh. This pattern continues to 91EFh, where you'll find pixels (290,239) and (291,239).
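The column-major addressing described above can be sketched in a few lines of Python. The function name and constants are mine, chosen for illustration. Which nibble of the byte holds the left-hand pixel isn't stated in this post, so the sketch only computes the byte address.

```python
COLUMN_STRIDE = 0x100   # each 2-pixel-wide column starts on a 256-byte page
SCREEN_WIDTH = 292
VISIBLE_ROWS = 240


def pixel_address(x, y):
    """Byte address holding pixel (x, y) in a Y-ordered, 2-pixels-per-byte
    frame buffer laid out as described in the post."""
    if not (0 <= x < SCREEN_WIDTH and 0 <= y < VISIBLE_ROWS):
        raise ValueError("pixel is off screen")
    column = x // 2                      # two horizontally adjacent pixels share a byte
    return column * COLUMN_STRIDE + y    # stepping the address by 1 moves down one row


print(hex(pixel_address(0, 239)), hex(pixel_address(291, 239)))
```

Because the row number is simply the low byte of the address, moving down a column is a plain increment, which is what makes the stack-based block copy so effective on this layout.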
Each 4-bit pixel value goes through a 16-entry lookup table to yield an 8-bit RRRGGBBB value, which is sent to the display.
If you've either played or seen videos of Defender then you might be amazed that such a low spec CPU could drive this game without hardware support (in fact, it did have a coprocessor for sound generation). The designers took advantage of the 6809's stack push/pop operations, which allow it to load or store a 2x8 rectangle (8 consecutive bytes) in one instruction. It's impressive.
Another choice they made, taking advantage of the 2-pixels per byte display, is that when plotting a software sprite they ignore conflicts. That is, if a new sprite overlaps (or is even 1-pixel adjacent to) an existing sprite, rather than do a read-modify-write in order to perfectly mask-in the new sprite, Defender doesn't bother... and so when sprites pass each other you might see a 1 pixel blank space where the 2nd sprite drawn overwrites the pixel for the sprite that was on the screen first. Defender is such a fast and furious game that you'll only notice this if you're looking for it (then that's all you see).
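The difference between the write-only plot and a properly masked plot can be sketched like this. This is an illustration, not Defender's code: it assumes colour 0 means "transparent" in the sprite data, and the function names are made up.

```python
def plot_overwrite(framebuffer, address, sprite_byte):
    """Write-only plot: store the sprite byte, clobbering both pixels."""
    framebuffer[address] = sprite_byte


def plot_masked(framebuffer, address, sprite_byte):
    """Read-modify-write plot: keep the existing pixel wherever the sprite
    nibble is 0 (assumed here to mean transparent)."""
    mask = 0
    if sprite_byte & 0xF0:
        mask |= 0xF0
    if sprite_byte & 0x0F:
        mask |= 0x0F
    old = framebuffer[address]
    framebuffer[address] = (old & ~mask & 0xFF) | (sprite_byte & mask)


# One byte already holds a pixel of colour A in its low nibble; a new sprite
# wants to put colour B in the other nibble of the same byte.
fast = bytearray([0x0A])
clean = bytearray([0x0A])
plot_overwrite(fast, 0, 0xB0)   # neighbour's pixel is blanked
plot_masked(clean, 0, 0xB0)     # neighbour's pixel survives
print(hex(fast[0]), hex(clean[0]))
```

The masked version costs an extra read plus the mask logic for every byte, which is the work the write-only approach avoids.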
With no double-buffer, to avoid flickering, the game code would erase a single sprite and then draw it. Compare this to, for example, erasing the whole screen and then drawing all the sprites - which could cause horrendous flickering if you "miss the beam". When things became extremely hectic, the game would slow down, but things would keep moving and there would be no flickering.
Finally, they used memory banking to page in more ROM (and RAM) than could otherwise fit. ROM could co-exist in the same memory range as video RAM - with CPU reads targeting ROM, and writes targeting video RAM. Again, this is because the game code only ever wrote to video RAM.
The 6502 doesn't have the grunt to drive a pixel array like this, so a more practical approach would be a character mapped screen, say 40x24 characters - under 1KB, with each 8x8 character described by either 16 bytes (2 bits per pixel, 4KB total) or, better, 32 bytes (4 bits per pixel, 8KB total).
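Those sizes work out as follows, assuming one byte per screen cell selecting one of 256 character definitions. The 256 figure is my assumption; it is what makes the 4KB and 8KB totals come out.

```python
COLS, ROWS = 40, 24
TILE_W, TILE_H = 8, 8
TILE_COUNT = 256   # assumption: one byte per screen cell selects 1 of 256 tiles


def tile_bytes(bits_per_pixel):
    """Bytes needed to describe one 8x8 character at the given depth."""
    return TILE_W * TILE_H * bits_per_pixel // 8


screen_map_bytes = COLS * ROWS             # one byte per character cell
tiles_2bpp = TILE_COUNT * tile_bytes(2)    # full character set at 2bpp
tiles_4bpp = TILE_COUNT * tile_bytes(4)    # full character set at 4bpp

print(screen_map_bytes, tiles_2bpp, tiles_4bpp)
```

Even the 4bpp option totals about 9KB including the screen map, roughly a quarter of the 36KB frame buffer discussed above.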
That can make for a colorful background. Now add hardware X and Y scrolling and 16 Commodore 64 style sprites (but with 4-bits per pixel).
Add in an old school Yamaha-style sound generator (or SID) and I think that's a great little toy.
EDIT: Another note to highlight just how smart the designers of Defender were: The 8MHz 68000-based Atari ST, with its smaller 32kB 4-bpp frame buffer, rich addressing modes, and 16-bit data bus, never did have as high a performance version of Defender. Its Achilles' heel was its frame buffer format, which consisted of 4 independent 320x200 bit video planes. That is, a single 16-bit word from video RAM held one bit-plane of data for 16 pixels. Four 16-bit words were combined to create 16 4-bpp pixels. Changing the color of a single pixel required a LOAD/AND_MASK/OR_DATA/STORE on 4 different words. Awful. When running the GEM GUI desktop, and scrolling text files, you would see color shimmer as the display software scrolled each color plane.
When plotting sprites there's no option to not cleanly mask in the data, as even byte-sized operations would impact 8 pixels, and the write-only trick used by Williams would look terrible.
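Here is a rough Python sketch of that four-word read-modify-write. It assumes the ST low-resolution layout as I understand it: the four plane words for each group of 16 pixels sit next to each other in memory, with the leftmost pixel in the most significant bit. The function names are illustrative only.

```python
PLANES = 4
WIDTH, HEIGHT = 320, 200
WORDS_PER_LINE = (WIDTH // 16) * PLANES   # 20 groups of 4 plane words


def set_pixel(screen, x, y, colour):
    """Set one 4-bpp pixel: a LOAD/AND_MASK/OR_DATA/STORE on four words."""
    group = y * WORDS_PER_LINE + (x // 16) * PLANES
    bit = 0x8000 >> (x % 16)              # assumed: leftmost pixel is bit 15
    for plane in range(PLANES):
        word = screen[group + plane]      # LOAD
        word &= ~bit & 0xFFFF             # AND_MASK
        if colour & (1 << plane):
            word |= bit                   # OR_DATA
        screen[group + plane] = word      # STORE


def get_pixel(screen, x, y):
    """Reassemble a pixel's colour from its bit in each of the four planes."""
    group = y * WORDS_PER_LINE + (x // 16) * PLANES
    bit = 0x8000 >> (x % 16)
    return sum(1 << p for p in range(PLANES) if screen[group + p] & bit)


screen = [0] * (WORDS_PER_LINE * HEIGHT)  # 16,000 words = 32,000 bytes
set_pixel(screen, 17, 3, 0b1010)
print(get_pixel(screen, 17, 3))
```

Compare that loop with the Defender layout, where a pixel pair is a single byte store.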
Last edited by sark02 on Fri Apr 13, 2018 4:58 am, edited 3 times in total.
-
White Flame
- Posts: 704
- Joined: 24 Jul 2012
Re: game machines
I really do think 4bpp is sufficient. I think the SNES had some of the best art of the 2d era (owing mostly to the quality of companies working on it), and it's pretty much all 4bpp graphics in practical use. It has a 256-color palette, broken up into 16x 16 color palettes. Each individual bg tile could choose from palette 0-7, and each sprite tile could choose from palette 8-15. That, combined with some raster effects for gradients, and a bit of alpha blending, yields tremendous flexibility and color capabilities. Compare to Amiga, which could only set a palette per full bitmap layer, not per tile, so games noticeably were forced to reuse the same colors over and over per plane.
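The palette arithmetic described there is simple enough to sketch. This is only an illustration of the indexing: the function names are made up, and it ignores details such as the transparent colour within each palette.

```python
COLOURS_PER_PALETTE = 16   # 4bpp pixels index into a 16-colour palette


def colour_index(palette, pixel):
    """Index into the 256-entry colour table for a 4bpp pixel value."""
    if not 0 <= pixel < COLOURS_PER_PALETTE:
        raise ValueError("pixel must be a 4-bit value")
    return palette * COLOURS_PER_PALETTE + pixel


def bg_colour_index(palette, pixel):
    """Background tiles choose from palettes 0-7, per the post."""
    if not 0 <= palette <= 7:
        raise ValueError("background palettes are 0-7")
    return colour_index(palette, pixel)


def sprite_colour_index(palette, pixel):
    """Sprite tiles choose from palettes 8-15, per the post."""
    if not 8 <= palette <= 15:
        raise ValueError("sprite palettes are 8-15")
    return colour_index(palette, pixel)


print(bg_colour_index(7, 15), sprite_colour_index(8, 0))
```

So each tile only stores 4 bits per pixel plus a small palette number, yet the screen as a whole can draw on all 256 entries.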
As mentioned, if you have a 256-color framebuffer, that eats up 64,000 bytes of RAM. Plus then you also need a bunch of off-screen memory to hold your sprite graphics, fonts, etc that you will be drawing to it. However, the SNES had 64KB total video memory for everything packaged together: tilemaps, pixel data, fonts, sprite coordinates, etc, that are reused across entire levels. You save a ton of memory by staying away from framebuffers, and that means far less bankswitching and other such headaches from the programmer and bus designer point of view, and far less bandwidth grunt to get things done.
An on-the-fly graphics generation system has a constant read bandwidth of low bpp data to generate a rich video display. A framebuffer system takes a constant read bandwidth of high bpp data to generate the video display, plus the blitters need both reads & writes, with overdraw, to fill every frame (assuming action gaming). A framebuffer + blitter can be easily 8x as expensive in memory bandwidth for the same level of graphics compared to tiles+sprites, getting worse the more overdraw there is. (Of course, Defender just does delta drawing, and the CPU has the grunt to manually blit little pieces around, but that Defender hardware still doesn't have a chance to run a full-graphics sidescroller. Still very impressive, though.)
Re: game machines
White Flame wrote:
Of course, Defender [...] doesn't have a chance to run a full-graphics sidescroller.
Re: game machines
Defender is a great example. I fear that if you build the best sprite engine, you'll only get sprite based games, and so much more is possible. A flexible DMA engine might be a more useful addition to a 6502 machine. A video system which allows for overlays - where there are two framebuffers and the concept of a transparent pixel - is another kind of thing which is much easier in hardware than software.
Re: game machines
DerTrueForce wrote:
EDIT: I don't remember where this was said, but I think that most of the Mario games did use sprites. A number of things were more than one sprite, I think, and I gather they had some concept of layers or sprite priority in there.
And yes, almost everything used more than one sprite because sprites are 8x8 (but they can be doubled to 8x16 IIRC). So other than a fireball or bullet, most moving objects used 2x1, 2x2, 1x2, etc. sprites. Which is why so many games flickered. There were only 64 of them and only 8 per scanline.
Re: game machines
sark02 wrote:
The 6502 doesn't have the grunt to drive a pixel array like this
Re: game machines
cbmeeks wrote:
sark02 wrote:
The 6502 doesn't have the grunt to drive a pixel array like this
Code:
; Draw a 10x8 object
; In:
; D -> dest
; Y -> struct obj {
; u8 cols, rows;
; u16 even_pixdata_ptr, odd_pixdata_ptr;
; u16 draw_func_ptr, erase_func_ptr;
; }
; CC.C = 0=even sprite, 1=odd sprite
; Clobber:
; D, Y, CC
__D193: 34 18 | DRAW_10X8: PSHS X,DP ; save X, DP
__D195: 10 DF 77 | STS $77 ; save S
__D198: 24 02 | BCC L_D19C ; branch if even
__D19A: 31 22 | LEAY $2,Y ; offset for odd data
__D19C: 10 EE 22 | L_D19C: LDS $2,Y ; S -> pix_data
__D19F: CB 08 | ADDB #$08 ; D -> bottom row
__D1A1: 1F 03 | TFR D,U ; U -> bottom row
__D1A3: 35 3F | PULS CC,A,B,DP,X,Y ; fetch 2x8 pix
__D1A5: 36 3F | PSHU Y,X,DP,B,A,CC ; store 2x8 pix @ 0,0
__D1A7: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D1AB: 20 9C | BRA L_D149 ; continue 8x8
...
__D149: 35 3F | L_D149: PULS CC,A,B,DP,X,Y ; fetch 2x8 pix
__D14B: 36 3F | PSHU Y,X,DP,B,A,CC ; store 2x8 pix @ 0,0
__D14D: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D151: 35 3F | PULS CC,A,B,DP,X,Y ; fetch col
__D153: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 2,0
__D155: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D159: 35 3F | L_D159: PULS CC,A,B,DP,X,Y ; fetch 2x8
__D15B: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 4,0
__D15D: 33 C9 01 08 | LEAU $0108,U ; U -> next col
__D161: 35 3F | PULS CC,A,B,DP,X,Y ; fetch 2x8
__D163: 36 3F | PSHU Y,X,DP,B,A,CC ; store @ 6,0
__D165: 10 FE A0 77 | LDS $A077 ; restore S
__D169: 35 98 | PULS DP,X,PC ; (PUL? PC=RTS)
The 6502 would not only need 1 LDA/STA per byte, but what addressing mode would you even use? (ZP),Y, with two ZP pointers - one for the source and one for the destination? That would be 8 x { LDA, STA, INY }, with each load/store being indirect Y - all those redundant reads.... to replace 2 6809 instructions.
You _could_ do it with a fast enough 6502... but I'm not sure I'd want to
Re: game machines
sark02 wrote:
To me that's just beautifully efficient code.
This is similar to the "2MHz 6502 is as fast/faster than a 4MHz Z80". I don't know how much faster/slower the 6809 is vs a 6502.
Re: game machines
whartung wrote:
sark02 wrote:
To me that's just beautifully efficient code.
This is similar to the "2MHz 6502 is as fast/faster than a 4MHz Z80". I don't know how much faster/slower the 6809 is vs a 6502.
In this case, though, as far as I've studied it (which isn't much) the 6809 has similar bus efficiency to the 6502. A PSH/PUL instruction type takes 5 fixed cycles + 1 cycle per byte, so transferring 8 bytes from ROM to video memory takes PUL (5+8) + PSH (5+8) = 26 cycles. That's pretty good.
If doing it the easy way in 6502, you'd have 8 of these:
Code:
LDA (source),Y ; 5 cycles (6 if a page boundary is crossed)
STA (dest),Y ; 6 cycles
INY ; 2 cycles
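Tallying those figures gives a rough comparison. The 6809 side uses the 5 + 1-per-byte PSH/PUL timing quoted above. The 6502 side takes LDA (zp),Y as 5 cycles (6 when the indexed access crosses a page), STA (zp),Y as 6, and INY as 2. Pointer setup, and the LEAU between columns on the 6809, are left out, so treat this as a sketch rather than a benchmark.

```python
def m6809_block_cycles(byte_count):
    """One PULS plus one PSHU: 5 fixed cycles + 1 per byte for each."""
    return 2 * (5 + byte_count)


def m6502_block_cycles(byte_count, lda=5, sta=6, iny=2):
    """Unrolled LDA (src),Y / STA (dst),Y / INY, repeated once per byte."""
    return byte_count * (lda + sta + iny)


cycles_6809 = m6809_block_cycles(8)
cycles_6502 = m6502_block_cycles(8)
cycles_6502_worst = m6502_block_cycles(8, lda=6)   # every load crosses a page

print(cycles_6809, cycles_6502, cycles_6502_worst)
```

On these numbers the 6502 needs about four times as many cycles to move the same 8 bytes.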
BTW, as far as processor comparisons go, whereas the Z80 and 6502 were direct competitors, the 6809 came out after those two and was much more expensive... according to Wikipedia being released over 3 years after the 6502 and containing almost 3x the number of transistors. I think it's ok to grant that it's more powerful than the 6502 without diminishing the 6502 in any way. The expense of the 6809 was a key reason it never really got anywhere in the home computer space (the home micros that did use it were not popular). If you're building an arcade machine, though... then you have the budget.