Adventures in FAT32 with 65c02
Re: Adventures in FAT32 with 65c02
barnacle wrote:
And so, after that minor mathematical misdemeanour, I tried it with the call to putchar removed. It now takes ten seconds to complete (mechanically timed with an analogue stopwatch!) so I'm reading 33.6kB/second. Which feels like it could be fast enough for something like virtual memory for a file editor or similar, and certainly not an issue for file save/load for files that will fit into the 64k memory space.
Re: Adventures in FAT32 with 65c02
The inner loop reads each byte of the sector, updates the file pointer, and compares it with the file length, before sending it to serial. (Approximate timings).
- with everything included, 30 seconds
- with the call to putchar deleted, 10 seconds
- with the entire inner loop removed, about 3 seconds
Neil
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Adventures in FAT32 with 65c02
barnacle wrote:
There's effectively no seek time on a solid state drive, so we're looking something around 110kB/second for raw read times.
If it is doing 110 KB/sec at 1.8432 MHz, extrapolating that to the 16 MHz at which my POC unit is running would theoretically improve the raw transfer rate to about 954KB/sec. That’s better than the 710 KB/sec I am seeing with POC V1.3’s SCSI subsystem, although disk characteristics are not a factor with that. The host adapter (HBA) has to be wait-stated and with two accesses to the HBA hardware per byte transferred, I figure I’m losing about 25 percent of the transfer rate that would be possible without wait-states.
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: Adventures in FAT32 with 65c02
With my current design, I feel I'm constrained to 2MHz for the UART's speed limit. I have only a simple design, with nothing by way of clock stretching. Though I may stick a 3.6864MHz oscillator in; I have some but I don't expect success. There's also the speed at which the CF might be accessed; the datasheet I have says 300ns for 5v, but it's a very old datasheet. More modern systems would use a DMA subsystem, too, and sixteen bit access...
Meanwhile, I'm thinking about how to rewrite things so I can use an allocated buffer rather than the static transient, and how much the double indirection is going to slow things down. As I said previously, there's an awful lot of 32 bit arithmetic, and at the moment it all uses a hard coded lba uint32 as a source or a target.
You can't easily do
Code: Select all
uint32 arithmetic (uint32 * p1, uint32 * p2)
{
}
in 65c02 land.
Neil
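To make the cost concrete, here is a rough C sketch of what such a pointer-based routine has to do on an 8-bit CPU: treat each uint32 as four little-endian bytes and ripple a carry through them. The name u32_add and the byte-array representation are assumptions for illustration, not code from this project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: add the 32-bit little-endian value at src into the
   one at dst, one byte at a time, the way an 8-bit CPU has to.
   Returns the carry out of the top byte (1 on overflow, else 0). */
uint8_t u32_add(uint8_t dst[4], const uint8_t src[4])
{
    assert(dst != NULL && src != NULL);
    unsigned carry = 0;
    for (int i = 0; i < 4; i++) {
        unsigned sum = (unsigned)dst[i] + src[i] + carry;
        dst[i] = (uint8_t)sum;   /* keep the low eight bits   */
        carry = sum >> 8;        /* pass the ninth bit upward */
    }
    return (uint8_t)carry;
}
```

On the 65C02 each pass of that loop would become an indirect-indexed load, add and store through two zero-page pointers, which is where the extra cycles of the double indirection come from.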
- BigDumbDinosaur
Re: Adventures in FAT32 with 65c02
barnacle wrote:
There's also the speed at which the CF might be accessed; the datasheet I have says 300ns for 5v, but it's a very old datasheet.
Quote:
More modern systems would use a DMA subsystem, too, and sixteen bit access...
Re: Adventures in FAT32 with 65c02
Another advantage of a 16 bit system would also be enough spare memory to load an entire cluster in one hit.
Code: Select all
cf_r2:
    lda CFREG0
    sta (trans_ptr),y
    iny
    bne cf_r2
is about as tight as it can go for the innermost loop, but it's still 4+6+5+2 = 17 clock cycles per byte, eight us.
Neil
Re: Adventures in FAT32 with 65c02
Hmmm.. wouldn't the cycle count be more like 4+6+2+3 = 15 clock cycles per byte? But I know it's late at night where you are, Neil, so maybe you're getting sleepy.
As for "about as tight as it can go," there is still some wiggle room, as you imply. For example, it's probably permissible to unroll the loop so that the jump back to the top executes only once for every 2 (or 4 or 8) bytes rather than once for every byte.
Code: Select all
cf_r2:
    lda CFREG0 ; get a byte and move it
    sta (trans_ptr),y
    iny
    lda CFREG0 ; get another byte and move it
    sta (trans_ptr),y
    iny
    bne cf_r2 ; *now* test for exit
One could also save one cycle per byte by mapping CFREG0 into Zero page, although not everyone has oodles of free space in Z-pg. (I do.)
-- Jeff
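As a back-of-envelope check on how much unrolling buys, the cycle arithmetic can be modelled in a few lines of C. This is only a sketch: page_cycles and the constant names are made up for illustration, the counts are the standard 65C02 timings for these addressing modes, and the loop is assumed not to straddle a page boundary (which would add a cycle to the taken branch).

```c
#include <assert.h>

/* 65C02 cycle counts for the instructions in the copy loop. */
enum {
    CYC_LDA_ABS   = 4,  /* lda CFREG0 (absolute addressing) */
    CYC_STA_IND_Y = 6,  /* sta (trans_ptr),y                */
    CYC_INY       = 2,  /* iny                              */
    CYC_BNE_TAKEN = 3,  /* bne taken, no page crossing      */
    CYC_BNE_FALL  = 2   /* bne not taken (final pass only)  */
};

/* Total cycles to move one 256-byte page when the lda/sta/iny group is
   repeated `unroll` times per bne. unroll must divide 256. */
unsigned long page_cycles(unsigned unroll)
{
    assert(unroll >= 1 && 256 % unroll == 0);
    unsigned long passes = 256 / unroll;
    unsigned long body = 256UL * (CYC_LDA_ABS + CYC_STA_IND_Y + CYC_INY);
    return body + (passes - 1) * CYC_BNE_TAKEN + CYC_BNE_FALL;
}
```

With these numbers page_cycles(1) is 3839 (about 15 cycles per byte), page_cycles(2) is 3455 (13.5 per byte) and page_cycles(8) is 3167 (about 12.4 per byte), so even eightfold unrolling saves only around 17%.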
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
- BigDumbDinosaur
Re: Adventures in FAT32 with 65c02
Dr Jefyll wrote:
Hmmm.. wouldn't the cycle count be more like 4+6+2+3 = 15 clock cycles per byte? But I know it's late at night where you are, Neil, so maybe you're getting sleepy.
...
Quote:
...it's probably permissible to unroll the loop so that the jump back to the top executes only once for every 2 (or 4 or 8 ) bytes rather than once for every byte.
Last edited by BigDumbDinosaur on Wed Dec 24, 2025 6:56 am, edited 1 time in total.
Re: Adventures in FAT32 with 65c02
Yeah, obviously sleepy time. I have no idea where the last numbers came from, I can't even see them this morning on the table I was looking at...
This is code that lives in ROM space. But it's always going to be a power of two, so unrolling is a possibility. I think it will take a lot of unrolling to make a significant difference, but every little helps (premature optimisation!).
The good news is that I think I have an easy way to pass in a source/destination for the data transfer, so I'm not restricted to transient. The bad news is that I'm stuck with the 32-bit file pointer increment and compare I mentioned earlier, for f_xxx routines that need to know about file position.
Oh well.
There is little pain moving to the next sector, just an increment for lba, but the next cluster requires converting sector to cluster, going back to the FAT (so at least one and possibly two sectors read there) and converting back to a sector before actually reading it. Navigating backwards requires tracing the FAT from the start of the file, which I'm not looking forwards to at all.
Neil
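The cluster-to-cluster step described above can be sketched in C roughly as follows. The volume_t structure and all function names are hypothetical, and 512-byte sectors are assumed; the FAT32 details used (4-byte little-endian entries, top four bits reserved, data clusters numbered from 2, end-of-chain at 0x0FFFFFF8 and above) come from the FAT specification, not from the code in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical volume geometry, normally derived from the boot sector. */
typedef struct {
    uint32_t fat_start;     /* LBA of the first FAT sector      */
    uint32_t data_start;    /* LBA of the first sector of cluster 2 */
    uint8_t  sec_per_clus;  /* sectors per cluster, e.g. 8      */
} volume_t;

/* LBA of the first sector of a data cluster. */
uint32_t cluster_to_lba(const volume_t *v, uint32_t cluster)
{
    assert(cluster >= 2);   /* clusters 0 and 1 are reserved */
    return v->data_start + (cluster - 2) * v->sec_per_clus;
}

/* FAT sector holding the entry for `cluster`: 128 four-byte entries
   fit in each 512-byte sector. */
uint32_t fat_sector_for(const volume_t *v, uint32_t cluster)
{
    return v->fat_start + (cluster >> 7);
}

/* Pull the next-cluster link out of a FAT sector already in memory. */
uint32_t fat_next(const uint8_t sector[512], uint32_t cluster)
{
    const uint8_t *e = sector + ((cluster & 127u) << 2);
    uint32_t next = (uint32_t)e[0] | ((uint32_t)e[1] << 8) |
                    ((uint32_t)e[2] << 16) | ((uint32_t)e[3] << 24);
    return next & 0x0FFFFFFFu;   /* top four bits are reserved */
}

/* Non-zero if a FAT entry marks the last cluster of a chain. */
int is_end_of_chain(uint32_t entry)
{
    return entry >= 0x0FFFFFF8u;
}
```

Because the chain is singly linked, there is no equivalent of fat_next for going backwards, which is why a backwards seek across a cluster boundary means walking the chain again from the file's first cluster.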
Re: Adventures in FAT32 with 65c02
Rip 'em up and Start Again
I've currently arrived at the stage of considering ripping things up and starting again... er, I mean, refactoring to reconsider some basic assumptions. There will likely be an amount of moving fast and breaking things...
Where I am at the moment:
- I can create a named file
- I can delete a named file
- I can list a given directory
- I can cat a named file (ideally a text file, but anything works)
But...
This is fine as long as I only ever want to open one file at a time. Unfortunately I can see at least a handful of use cases where that's not enough:
- Copying from one file into another
- Operating e.g. a text editor with a working file larger than available memory
- Anything where it might be handy to have a .bak or .tmp file
- Something like an assembler, which might require multiple .asm files as input, with .hex and .lst as outputs
I _could_ simply move from transient to a named buffer each time a sector is read (and the reverse, obviously), but that has two serious implications: firstly, a simple read of a buffer takes twice as long, since every byte is moved twice, and secondly, every access to a file requires reloading at least the current sector, since we don't know what else has been done in the meantime.
So cf_read changes so that instead of using a fixed address of transient as a target, a pointer is set before each call. (I also ask cf_read to do the cf_set_lba before each read, since it's required every time and I previously had it called before the cf_read call anyway (when I remembered)). cf_write changes similarly. Indeed, those two routines are so similar I may combine them and simply set a flag to decide whether a read or write is desired. I'm not yet sure whether it's worth it.
At the moment, f_create, f_del, f_cat and f_dir don't require FILE structures and can work entirely within transient, and all the FAT record tracing works within transient, and I don't see a reason to change that.
FILE structure
This structure is still very tentative. It's also likely to be quite large, so it will have to live in main memory, not zero page. However, first thoughts indicate that it should contain the following pointers/flags:
- Read/write flag
- Dirty flag (set if the sector has been written to and so requires a rewrite to disc) (note that both these flags are contained in the directory entry's attribute byte)
- Buffer address (uint16, though if I insist on a page boundary alignment it could be just one byte)
- File Pointer (the position we're currently reading/writing from/to)
- Current cluster (uint32)
- Current sector (uint32) which also provides a useful indicator if we need a new cluster
- Copy of the directory entry - 32 bytes, but has lots of useful stuff in it, at least some of which I'll need like the filename and starting cluster
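As a first-thoughts illustration only, the list above might translate into something like the following C layout. Field names, flag bit positions and the separate flags byte are assumptions; the directory-entry offsets used by the two helpers (first cluster high word at 20, low word at 26, file size at 28) are the standard FAT32 ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { F_WRITE = 0x01, F_DIRTY = 0x02 };   /* hypothetical flag bits */

typedef struct {
    uint8_t  flags;          /* read/write and dirty flags            */
    uint16_t buffer;         /* address of this file's sector buffer  */
    uint32_t file_ptr;       /* current read/write position           */
    uint32_t cur_cluster;    /* cluster containing file_ptr           */
    uint32_t cur_sector;     /* LBA of the sector held in the buffer  */
    uint8_t  dir_entry[32];  /* copy of the on-disc directory entry   */
} file_t;

/* Starting cluster, assembled from the two 16-bit halves in the entry. */
uint32_t file_start_cluster(const file_t *f)
{
    assert(f != NULL);
    const uint8_t *d = f->dir_entry;
    return ((uint32_t)d[21] << 24) | ((uint32_t)d[20] << 16) |
           ((uint32_t)d[27] << 8)  |  (uint32_t)d[26];
}

/* File length in bytes, stored little-endian at offset 28. */
uint32_t file_size(const file_t *f)
{
    assert(f != NULL);
    const uint8_t *d = f->dir_entry;
    return ((uint32_t)d[31] << 24) | ((uint32_t)d[30] << 16) |
           ((uint32_t)d[29] << 8)  |  (uint32_t)d[28];
}
```

Laid out without padding, those fields add up to 47 bytes per open file, not counting the sector buffer itself.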
So f_open has to perform a number of tasks.
- Check whether a file is already open, and fail if it is (or maybe not; I can see where it might be handy to open the same file for both read _and_ write operations, but I'm not sure I want to get into that yet.)
- Check whether the file exists, and fail if it doesn't (or call f_create?)
- Copy the file's directory entry into the FILE structure
- Zero the file pointer entry and current cluster
- Allocate a memory buffer, and note its address
- Finally, load the sector into memory
Neil
- BigDumbDinosaur
Re: Adventures in FAT32 with 65c02
barnacle wrote:
Rip ’em up and Start Again
I’ve currently arrived at the stage of considering ripping things up and starting again... er, I mean, refactoring to reconsider some basic assumptions. There will likely be an amount of moving fast and breaking things...
Quote:
Where I am at the moment:
- I can create a named file
- I can delete a named file
- I can list a given directory
- I can cat a named file (ideally a text file, but anything works)
which means basically I can manipulate the FAT sectors with speed and despatch. It does take rather more 32-bit arithmetic than I would like, but it all seems to work.
Quote:
This is fine as long as I only ever want to open one file at a time. Unfortunately I can see at least a handful of use cases where that’s not enough:
- Copying from one file into another
- Operating e.g. a text editor with a working file larger than available memory
- Anything where it might be handy to have a .bak or .tmp file
- Something like an assembler, which might require multiple .asm files as input, with .hex and .lst as outputs
...a named file...must be opened for reading or writing and a file structure populated defining all sorts of information about the file in question.
The key structures are multiple “file control blocks” (FCB) and a table of 16-bit pointers (FCBTAB) to point to in-use FCBs. The FCBTAB would be a fixed size, which when divided by the size of a pointer, determines the maximum number of files that may be simultaneously opened. You would, of course, also have to have as many FCBs as slots in the FCBTAB. With care, an FCB can be kept down to a reasonable size...that will be important in a 65C02 system, which is inherently memory-challenged.
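A minimal sketch of that arrangement in C, assuming a limit of four open files and a cut-down FCB; MAX_OPEN, fcb_t, fcb_alloc and fcb_free are illustrative names, not anything from an existing driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OPEN 4                 /* assumed limit on open files */

typedef struct {
    uint8_t  flags;
    uint32_t file_ptr;
} fcb_t;                           /* cut-down FCB for illustration */

static fcb_t  fcb_pool[MAX_OPEN];  /* one FCB per possible open file */
static fcb_t *fcbtab[MAX_OPEN];    /* NULL marks a free slot         */

/* Claim the first free slot. Returns its index, which serves as the
   file handle, or -1 if every slot is in use. */
int fcb_alloc(void)
{
    for (int i = 0; i < MAX_OPEN; i++) {
        if (fcbtab[i] == NULL) {
            fcbtab[i] = &fcb_pool[i];
            fcbtab[i]->flags = 0;
            fcbtab[i]->file_ptr = 0;
            return i;
        }
    }
    return -1;
}

/* Release a slot so a later open can reuse it. */
void fcb_free(int handle)
{
    assert(handle >= 0 && handle < MAX_OPEN);
    fcbtab[handle] = NULL;
}
```

The table index doubles as the handle passed back to the caller, so the rest of the filesystem code only ever needs one byte to identify an open file.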
Quote:
At the moment, f_create, f_del, f_cat and f_dir don’t require FILE structures and can work entirely within transient, and all the FAT record tracing works within transient, and I don’t see a reason to change that.
Usually, there are two layers to this sort of thing: the disk buffer pool and the filesystem buffer pool. In a uni-tasking system, the disk buffer pool can be small (two buffers would be more than adequate), whereas the filesystem buffer pool would have to have enough buffers to work with the maximum number of files that may be simultaneously opened, unless you are prepared to let open files share buffers.
Quote:
FILE structure
This structure is still very tentative. It’s also likely to be quite large, so it will have to live in main memory, not zero page. However, first thoughts indicate that it should contain the following pointers/flags:
- Read/write flag
- Dirty flag (set if the sector has been written to and so requires a rewrite to disc) (note that both these flags are contained in the directory entry’s attribute byte)
- Buffer address (uint16, though if I insist on a page boundary alignment it could be just one byte)
- File Pointer (the position we’re currently reading/writing from/to)
- Current cluster (uint32)
- Current sector (uint32) which also provides a useful indicator if we need a new cluster
- Copy of the directory entry - 32 bytes, but has lots of useful stuff in it, at least some of which I’ll need like the filename and starting cluster
The file pointer should be file pointerS, as the read and write positions will likely be different for a file that has been opened for reading and writing. You would need to track both to avoid accidentally mangling a file when a read at one location is immediately followed by a write to another.
Re: Adventures in FAT32 with 65c02
Some food for thought, there, BDD; thanks.
In the interest of speed, I'm envisaging a system whereby once a file has been opened for reading, it will load the first sector into the buffer. That lets me move rapidly forwards through it (a likely use case, I think) with 'instant' access to the first sector, 'speedy' access to the next seven sectors (which are sequential), and reasonably quick access to the next cluster. Movement backwards within a cluster is also similarly speedy, though it gets slower when I get to the start of the cluster and have to start again from the first.
For writing, I think I'm looking at 'append' by default, and will start with the last sector loaded in the buffer and the file position pointer pointing at the first available space. It would still be possible to move the file position pointer to other points in the file for overwriting existing data.
But in either case, I need every open file to have its own buffer. But I think I like the list of pointers to FCBs.
One reason for keeping a copy of the file's directory entry in the FCB is that it holds pretty much all I need by way of control data for the file (and for when it's time to write it back if things change). Conveniently, that holds the filename in the correct format for searching through the directory, so I know where the data goes back (if required).
Neil (still digesting)
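The "instant / speedy / reasonably quick" distinction falls out of how a file position splits into cluster, sector and byte. A sketch in C, assuming 512-byte sectors; split_position and file_pos_t are made-up names, and the sectors-per-cluster shift is passed in (3 for the eight-sector clusters mentioned above).

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cluster_index;   /* links to follow from the first cluster */
    uint8_t  sector_in_clus;  /* sector within that cluster             */
    uint16_t byte_in_sector;  /* 0..511                                 */
} file_pos_t;

/* Split a byte position into its three parts using shifts and masks.
   spc_shift is log2(sectors per cluster). */
file_pos_t split_position(uint32_t pos, uint8_t spc_shift)
{
    assert(spc_shift <= 7);   /* FAT allows at most 128 sectors per cluster */
    file_pos_t p;
    p.byte_in_sector = (uint16_t)(pos & 0x1FFu);
    p.sector_in_clus = (uint8_t)((pos >> 9) & ((1u << spc_shift) - 1u));
    p.cluster_index  = pos >> (9 + spc_shift);
    return p;
}
```

A seek that leaves cluster_index unchanged needs no FAT access at all, one that increases it follows that many extra links, and one that decreases it has to re-trace the chain from the start of the file.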
- BigDumbDinosaur
Re: Adventures in FAT32 with 65c02
barnacle wrote:
Some food for thought, there, BDD; thanks.
Quote:
In the interest of speed, I'm envisaging a system whereby once a file has been opened for reading, it will load the first sector into the buffer...
This whole thing begs for a layered approach. You have a disk driver which knows about the mechanics of accessing the storage medium (the array of blocks) but knows nothing about how that array of blocks is being used. You also have a filesystem driver which knows how the array of blocks is being used, but knows nothing about the mechanics of accessing the storage medium. The key difference is the FAT filesystem sees mass storage in terms of cardinally numbered clusters, but the computer sees mass storage in terms of cardinally numbered blocks.
The disk driver is always subservient to the filesystem driver, with the latter telling the former what to do by passing four pieces of information:
- Logical block address (LBA).
- Number of contiguous blocks to be accessed at that LBA.
- Buffer pointer.
- Operation: read or write.
In order to give the disk driver that information, your FAT filesystem driver must translate an open file’s read/write pointer into a cluster and an offset within the cluster. Below that step will be another one that converts the cluster’s cardinal number within the filesystem into LBA and number-of-blocks parameters which, along with a buffer pointer and the read/write flag, get fed to the disk driver. The disk driver does its business and when finished, tells the filesystem driver that all is well...or not. The SCSI library I wrote for my POC unit basically works in that fashion: the caller has no idea what is going on inside the driver, so much so, in fact, that none of the temporary workspace used in the driver is even visible (it’s all on the stack, as are the parameters fed to the driver by the caller).
With this sort of segregation, your FAT filesystem package can be made to work with virtually any random-access storage medium. The disk driver is specific to the medium type, and the filesystem driver only knows what it knows about how the storage is being used. With the right disk driver, you could even support the old C-H-S method of disk addressing.
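One way to picture that boundary is a single request structure carrying the four parameters, and a driver that is nothing more than a function taking it. The C below is a sketch under that assumption: disk_req_t, disk_driver_t and the RAM-disc stand-in are invented here so the interface can be exercised without hardware, and are not the SCSI library's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DISK_READ, DISK_WRITE } disk_op_t;

typedef struct {
    uint32_t  lba;      /* logical block address             */
    uint16_t  count;    /* number of contiguous blocks       */
    uint8_t  *buffer;   /* where the data goes to/comes from */
    disk_op_t op;       /* read or write                     */
} disk_req_t;

/* Any disk driver is a function of this shape; it returns 0 on success. */
typedef int (*disk_driver_t)(const disk_req_t *req);

/* Stand-in driver: a 16-block RAM disc. */
enum { BLOCK_SIZE = 512, RAMDISK_BLOCKS = 16 };
static uint8_t ramdisk[RAMDISK_BLOCKS * BLOCK_SIZE];

int ramdisk_driver(const disk_req_t *req)
{
    assert(req != NULL && req->buffer != NULL);
    if (req->lba >= RAMDISK_BLOCKS || req->count > RAMDISK_BLOCKS - req->lba)
        return -1;                       /* request runs off the medium */
    uint8_t *media = &ramdisk[(size_t)req->lba * BLOCK_SIZE];
    size_t bytes = (size_t)req->count * BLOCK_SIZE;
    if (req->op == DISK_READ)
        memcpy(req->buffer, media, bytes);
    else
        memcpy(media, req->buffer, bytes);
    return 0;
}
```

The filesystem layer would hold a disk_driver_t and never look behind it, so swapping the RAM disc for a CF or SCSI driver changes nothing above this line.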
Quote:
But in either case, I need that every open file has its own buffer. But I think I like the list of pointers to FCBs.
Obviously, each open file having its own buffer (as well as the FCB) could get expensive, memory-wise. Having more buffers potentially means less-frequent medium accesses, but more clock cycles being eaten up with buffer management. Then there is the potential data loss if a lot of buffers are dirty and the machine goes belly-up...fsck to the rescue!
Compromises, compromises!
Re: Adventures in FAT32 with 65c02
barnacle wrote:
FILE structure
This structure is still very tentative. It's also likely to be quite large, so it will have to live in main memory, not zero page. However, first thoughts indicate that it should contain the following pointers/flags:
- Read/write flag
- Dirty flag (set if the sector has been written to and so requires a rewrite to disc) (note that both these flags are contained in the directory entry's attribute byte)
- Buffer address (uint16, though if I insist on a page boundary alignment it could be just one byte)
- File Pointer (the position we're currently reading/writing from/to)
- Current cluster (uint32)
- Current sector (uint32) which also provides a useful indicator if we need a new cluster
- Copy of the directory entry - 32 bytes, but has lots of useful stuff in it, at least some of which I'll need like the filename and starting cluster
Code: Select all
24-bit channel workspace:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17
Flags <--Start--> <---PTR---> <---EXT---> <--DIR--> <DskID> <-Alloc-> <fptr> <Buff>
| \--drive
\-- OUT:IN:EOF:D:W:R:written:read
Quote:
f_open
So f_open has to perform a number of tasks.
- Check whether a file is already open, and fail if it is (or maybe not; I can see where it might be handy to open the same file for both read _and_ write operations, but I'm not sure I want to get into that yet.)
Quote:
- Check whether the file exists, and fail if it doesn't (or call f_create?)
--
JGH - http://mdfs.net