dmsc wrote:
If you read from two files one byte at a time it is very slow, as it reads a full data sector from file 1, returns one byte, then reads a full data sector from file #2 overwriting the buffer, returns one byte, etc. But if you read (or write) big blocks from one file at a time, it is as fast as possible.
Well, not quite, unless you can ask the DOS to read into its own buffer and then process the data from it there yourself,
so that you don't need to do a bulk copy of the entire buffer elsewhere in RAM (e.g., because it's one block in the sequence of a program you're loading).
Giving a kernel subsystem an arbitrary address in memory and asking it to place data directly there, without using intermediate kernel buffers, is known as
"zero-copy I/O", and was heavily used in networking protocol stacks in the '90s (at least in Unix and its clones). I think that there's a lot of potential in this for older eight-bit systems, particularly since
PIO is often used instead of DMA, even for block devices. Not only could this result in even more savings than on DMA systems when it comes to moves of block data (because PIO is significantly slower), but also, on at least some PIO systems, scatter-gather I/O may be an option.
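For concreteness, here's a minimal sketch of the idea in Python (the interface and names are my own invention, not any real DOS's API): the caller hands the "driver" an arbitrary destination address, and the data lands there directly, with no intermediate buffer to copy out of afterwards.

```python
import io

def bio_read(dev, buf, offset, nbytes):
    """Zero-copy style read: place nbytes from dev directly into buf
    at offset.  'dev' stands in for the block device; 'buf' is wherever
    the caller wants the data (e.g., a program's load area)."""
    view = memoryview(buf)[offset:offset + nbytes]
    return dev.readinto(view)   # data lands in place; nothing to copy later

# Demo with an in-memory "device":
dev = io.BytesIO(bytes(range(10)))
buf = bytearray(16)             # the caller's own memory
n = bio_read(dev, buf, 4, 10)   # read 10 bytes directly to buf[4:14]
```

The point is that `readinto` writes through the caller's memoryview, so there is never a second, intermediate copy of the data.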
I've heard that zero-copy I/O was used in some custom loaders for games and the like on the Apple II in order to double load speed. With Apple DOS and ProDOS, the sectors were interleaved on the track in order to provide time to copy a sector after reading it but before reading the next one; thus a full-track read needed at least two revolutions. I think it should be possible to do a full-track read in one revolution (although the timing would be
very tight) if one can avoid buffer copies between sectors.
Even on block storage systems without seek delays, such as flash RAM (SD cards, etc.), this can make a large difference in speed. Modern flash RAM is much faster than memory on old machines, so the slowest part of I/O is usually copying data. Copying it only once instead of twice can double I/O speed.
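A back-of-envelope model of that claim (the numbers are invented for illustration, not measured on real hardware):

```python
# Suppose one pass over a 256-byte sector on a slow 8-bit CPU costs roughly
# the same whether it's "device -> buffer" PIO or "buffer -> RAM" copying,
# and the flash device can supply data faster than the CPU can move it.
pass_ms   = 2.0                  # invented cost of one pass over a sector
buffered  = 2 * pass_ms          # device -> DOS buffer, then buffer -> caller
zero_copy = 1 * pass_ms          # device -> caller's memory, once
speedup   = buffered / zero_copy # copying once instead of twice: 2x
```

When the device is no longer the bottleneck, the CPU's data moves dominate, so halving them roughly doubles throughput.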
There are some other tricks one can use to help along systems like this, too.
Trailer encapsulation, also used in networking (it appeared in 4.2BSD, around 1982-1984), puts metadata about a chunk of information
after the information itself. That's why, in the filesystem design I described earlier, which has per-block metadata (file number, data length, and next and previous block numbers), I put the metadata at the end of the block. It's then possible (if the BIO interface supports it) to read one block at a given starting address and read the next block at the address where the first block's metadata starts, overwriting that metadata. This produces a contiguous sequence of file data in memory from a non-contiguous sequence on disk, without any memory copies, so long as the BIO can do zero-copy I/O.
(As it turns out, Atari DOS also stores file metadata in its blocks, at the end of each block. I don't know whether its designers were thinking of the trailer idea at the time. It's exceedingly similar to what I do in the filesystem I described earlier in this thread, though I didn't know about it when I was doing my design. More information can be found in
Chapter 9 of De Re Atari.)
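To make the trailer trick concrete, here's a toy model in Python. The block size, trailer layout, and field order are all invented for illustration (Atari DOS actually uses 125 data bytes plus 3 trailer bytes in a 128-byte sector); the point is only that each block read lands where the previous block's trailer was, so the file assembles contiguously with one move per block.

```python
import struct

BLOCK = 16                        # toy block size, invented for this sketch
TRAILER = struct.Struct("<BBH")   # file number, data length, next block number
DATA = BLOCK - TRAILER.size       # 12 payload bytes per block

def make_disk(file_no, payload):
    """Split payload into chained blocks, metadata trailing the data."""
    chunks = [payload[i:i + DATA] for i in range(0, len(payload), DATA)]
    disk = []
    for i, chunk in enumerate(chunks):
        # Block 0 is always the chain head here, so 0 can double as the
        # end-of-chain marker in the 'next' field.
        nxt = i + 1 if i + 1 < len(chunks) else 0
        disk.append(chunk.ljust(DATA, b"\0") + TRAILER.pack(file_no, len(chunk), nxt))
    return disk

def read_file(disk, buf):
    """Assemble the chain with one move per block: each block is placed
    DATA bytes after the previous one, so the read overwrites the previous
    block's trailer and the file data ends up contiguous in buf."""
    view = memoryview(buf)
    blk, offset, total = 0, 0, 0
    while True:
        view[offset:offset + BLOCK] = disk[blk]   # the one (and only) move
        _, length, nxt = TRAILER.unpack_from(view, offset + DATA)
        total += length
        if nxt == 0:
            return bytes(view[:total])
        blk, offset = nxt, offset + DATA

payload = bytes(range(40))
disk = make_disk(1, payload)      # 40 bytes -> 4 chained blocks
recovered = read_file(disk, bytearray(len(disk) * DATA + TRAILER.size))
```

Note that `read_file` never copies data after placing it: the trailer of block N is simply clobbered by the first bytes of block N+1, exactly as described above.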
Quote:
Remember that BW-DOS, including all its buffers, variables, the command line processor (with support for batch files) and all the filesystem code used exactly 6116 bytes, less than 6kB.
Such concision is admirable! Though one must remember that the drives themselves stored and ran the code for dealing with actual sector reads and writes, which I am guessing saved a half kilobyte or more. (The
Apple II RWTS was 1193 bytes, but due to the extreme simplicity of the disk controller that may have been larger than the code would have been had the disk controller hardware been doing more of the work.)
I don't know if the design I did could ever include all that and still be that small, but I am hoping so. And I'm also trying to design it in a way that one can leave out more sophisticated or unnecessary components and make it significantly smaller yet, while maintaining compatibility with filesystems written by more capable systems.