BigDumbDinosaur wrote:
Something to be aware of is reading data from a fixed address with a 65C816, as would be the case with disk I/O, will involve long indirection if the data is going into or coming out of a different bank than the one in which the I/O device is located. Indirection of any kind costs clock cycles because it involves additional internal steps in the MPU. Any 24-bit load or store will incur a one clock cycle penalty for each access.
Hm! So that sounds like yet another reason to move the direct page to cover your I/O addresses when loading data, and then run the load routine in the bank you're loading. That would fix this problem, would it not?
drogon wrote:
If you want to use SD efficiently from the 65xx side, then you might need some sort of hardware SPI driver for it. Bit-banging is not going to be fast enough although it might approximate old-school floppy drive speeds. There have been some CPLD designs in the past.
I'm currently hacking together a 6821 PIA-based SD card interface for my Apple 1 clone, which will clearly be bit-banged. But I recently ordered a couple of 6522 VIAs, which have a built-in serial shift register; I'm wondering if that would be usable to drive SPI. (I'll have a look myself, but not until after I get my bit-banged interface working. Which may be some time; I'm also juggling a couple of other projects right now.)
BigDumbDinosaur wrote:
Your choice of mass storage medium is obviously a limiting factor. Although SD cards can hold a lot of data and internally access it at a relatively quick rate, they communicate serially at a relatively slow pace.
I agree with Garth; this doesn't look to me like it's an issue at all. I grabbed the "Physical Layer Simplified Specification Version 7.10" from the SD Association download page and found a table in section 3.17.3 (document page 33, PDF file page 53) stating that at "Default Speed" (which any SD card, no matter how old, should support) the maximum bus speed is 25 MHz. I suspect that most cards that are not ancient would support "High Speed" at 50 MHz. So assuming we stick with one-bit-wide SPI mode, that's about 3 MB/sec or 6 MB/sec across the bus, both of which are pretty reasonable. Older cards will actually be limited by their memory speed, not the bus speed.
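For anyone who wants to check the bus arithmetic, here's a quick sketch. It only converts clock rate to raw bytes per second on a one-bit-wide bus; the function name is mine, and it ignores command, token and CRC overhead, so treat the results as upper bounds rather than what you'd actually measure.

```python
def bus_bytes_per_second(clock_hz, bus_width_bits=1):
    """Raw transfer ceiling: one bit per clock per data line, eight bits per byte."""
    return clock_hz * bus_width_bits / 8

# Clock limits quoted above from the SD physical layer spec.
default_speed = bus_bytes_per_second(25_000_000)  # "Default Speed"
high_speed = bus_bytes_per_second(50_000_000)     # "High Speed"

print(f"Default Speed: {default_speed / 1e6:.3f} MB/sec")
print(f"High Speed:    {high_speed / 1e6:.3f} MB/sec")
```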
And even at 3 MB/sec, I think you'd probably need to be doing DMA to use the full SD Card bus speed. The fastest 8-bit wide programmed IO I can imagine off-hand would be a direct page read from the card (3 cycles) followed by an absolute indexed write of the data (maybe 5 cycles?) and an index register increment (2 cycles), for about 10 cycles per byte, for about 1 MB/sec at 10 MHz. (Maybe I'm calculating that wrong.) Perhaps the speed nearly doubles if you build an interface that allows reading in 16-bit chunks rather than 8-bit chunks.
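Here's that estimate as a small sketch, so the numbers are easy to play with. The 8-bit cycle counts are just my guesses from above, and the 16-bit figure (14 cycles per two bytes) is a further assumption of mine: one extra cycle each for the load and the store, plus two index increments. Neither count includes a loop branch, so a real loop would be somewhat slower unless unrolled.

```python
def pio_bytes_per_second(cpu_hz, cycles_per_transfer, bytes_per_transfer=1):
    """Throughput of a programmed I/O copy loop, given cycles spent per transfer."""
    return cpu_hz * bytes_per_transfer / cycles_per_transfer

CPU_HZ = 10_000_000

# 8-bit transfers: direct page read (3) + absolute indexed write (5) + increment (2).
cycles_8bit = 3 + 5 + 2
rate_8bit = pio_bytes_per_second(CPU_HZ, cycles_8bit)

# 16-bit transfers: ASSUMED 4 + 6 + 2 + 2 cycles, moving two bytes per pass.
cycles_16bit = 4 + 6 + 2 + 2
rate_16bit = pio_bytes_per_second(CPU_HZ, cycles_16bit, bytes_per_transfer=2)

print(f"8-bit:  {rate_8bit / 1e6:.2f} MB/sec")
print(f"16-bit: {rate_16bit / 1e6:.2f} MB/sec")
```

Under those assumptions the 16-bit case is faster but well short of double, because each 16-bit access costs more cycles than its 8-bit equivalent.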
At any rate, all of this is well above the 200-300 KB/sec that folks seem to think is good enough. Though it probably would be of interest if you wanted to play back 320×200×256 colour 30 FPS video from your card, which would need close to 2 MB/sec. (Uncompressed, obviously; I imagine that there's no way a 10 MHz 65816 is going to be able to run an algorithm to decompress video at that output rate.)
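The video figure comes from straightforward multiplication, sketched below. I'm assuming one byte per pixel (256 colours) and no palette, audio or framing overhead; the function name is just for illustration.

```python
def video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate: bytes per frame times frames per second."""
    return width * height * bytes_per_pixel * fps

# 320x200, 256 colours (one byte per pixel), 30 frames per second.
video_rate = video_bytes_per_second(320, 200, 1, 30)

print(f"{video_rate / 1e6:.2f} MB/sec")
```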
BigDumbDinosaur wrote:
Simple or complex, filesystem manipulation adds overhead to the basic process of fetching data from mass storage, as I/O cycles are required to read directories, locate file descriptors, etc.
So long as you're willing to dedicate an SD card (or even a specific block range on an SD card) to a game, you can simply generate the specific block numbers during the game build process and hardcode them into the code you generate. This wasn't uncommon in the 8-bit era. The Apple II version of Prince of Persia, for example, got particularly clever with its disk read routines in that it would start reading a track with whatever sector happened to be the first passing under the head, thus always reading the entire track in just over a single revolution.
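As a rough illustration of the build-time approach, here's a sketch of a script that lays assets out back to back on the card and emits the block numbers as constants. Everything in it is hypothetical: the asset names and sizes, the starting block, and the `NAME = value` output format, which you'd adapt to whatever your assembler expects. The only thing taken as given is 512-byte blocks.

```python
BLOCK_SIZE = 512  # bytes per SD card block

def assign_blocks(asset_sizes, first_block):
    """Place assets consecutively, each starting on a block boundary.

    asset_sizes: dict of asset name -> size in bytes (insertion order is kept).
    Returns a dict of asset name -> (start_block, block_count).
    """
    layout = {}
    next_block = first_block
    for name, size in asset_sizes.items():
        count = (size + BLOCK_SIZE - 1) // BLOCK_SIZE  # round up to whole blocks
        layout[name] = (next_block, count)
        next_block += count
    return layout

def emit_constants(layout):
    """Render the layout as 'NAME_BLOCK = n' lines (hypothetical assembler syntax)."""
    lines = []
    for name, (start, count) in layout.items():
        lines.append(f"{name.upper()}_BLOCK = {start}")
        lines.append(f"{name.upper()}_COUNT = {count}")
    return "\n".join(lines)

# Made-up assets and a made-up reserved region starting at block 2048.
assets = {"title": 16_000, "level1": 131_072, "level2": 100_000}
layout = assign_blocks(assets, first_block=2048)
print(emit_constants(layout))
```

The game code then never touches a directory: it just asks the card for `count` blocks starting at `start`.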
BigDumbDinosaur wrote:
66 seconds would seem like an eternity to an impatient game-player. :D If you are going to page parts of your program from mass storage you're going to need something faster.
Well, first of all, if you're paging in parts of your data you won't be pulling in 2 MB at a shot, so it won't be 66 seconds. I'd imagine you'd be pulling in something more like 128 KB for a level or section of a level, which would be 4 seconds or so, which could easily be covered by a message or an interesting graphic to look at, or maybe even a simple animation.
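The 4 seconds is just the 66-second figure scaled down, as in this sketch. It assumes the same sustained rate as the 2 MB in 66 seconds case, and that 1 MB means 1024 KB; the names are mine.

```python
KB = 1024
MB = 1024 * KB

def load_seconds(size_bytes, bytes_per_second):
    """Time to load size_bytes at a steady transfer rate."""
    return size_bytes / bytes_per_second

# Rate implied by the quoted figure: 2 MB in 66 seconds.
implied_rate = 2 * MB / 66

level_time = load_seconds(128 * KB, implied_rate)

print(f"Implied rate: {implied_rate / KB:.1f} KB/sec")
print(f"128 KB load:  {level_time:.2f} seconds")
```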
As for an initial load, if you can do it in 66 seconds you can advertise your system as being "faster than a PlayStation 4." :-) That said, retro gamers (with probably the exception of those used to original drives on a Commodore 64) probably have higher standards for load time than modern gamers. But they also have lower standards for the quality of graphical assets, which means loading less data.