Now for some ideas about a filesystem.
I expect to provide several ways to access the contents of a disk - a "forensic track read" which simply dumps out the flux transition delta-times from index mark to index mark, and equivalently a way to write a raw track in the same format (great for dealing with copy protection); access to raw bytes on a sector or part-sector basis, with the host computer handling all details of the filesystem; and a filesystem handler embedded in the drive controller itself, making it an "intelligent peripheral" in a similar manner to the Commodore drives. In theory this could extend to FAT12 and even one or two "classic micro" formats, but the main attraction will be the native high-capacity format.
Allowing for the specified 1.5% deviation in motor speed, there is
just room for a single 18KB sector combining data and metadata on each track, plus a few dozen bytes' worth of sync and leadout before the write head catches up with the sync pattern again. In this context it's actually better for the motor to run slightly slow than fast, as that gives more time to lay down data. The TEAC drives actually have a decent motor speed controller, so I can reasonably expect them to hold the correct speed to within specified tolerance, and probably better. Spindle speed can be measured by timing the index pulses, so excessive deviation can be detected before it becomes a problem.
The EFM encoding allows for two "out-of-band" marker bytes of normal length, as well as the sync pattern which is 50% overlength, and we can use those markers to delimit metadata from file data. Also, the TEAC drive specifies an ability to switch off its write gate within 8µs, which is less than a byte - though I don't know how far ahead of the write position the erase head is. Overall, I'm reasonably confident that if I limit the amount of live data to 18KB per track, I won't need to worry much about accidental overlap.
So what, precisely, can we do with an 18KB sector size? I think we can use 2KB for metadata and 16KB for file data. 16KB per track, over 80 tracks and 2 sides, totals 2560KB of formatted capacity.
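Just to sanity-check that arithmetic in a couple of lines:

```python
# Sanity-check the capacity figures: an 18KB sector per track, split into
# a 2KB metadata zone and a 16KB file-data zone, over 80 tracks x 2 sides.
SECTOR = 18 * 1024
META_ZONE = 2 * 1024
DATA_ZONE = 16 * 1024
assert META_ZONE + DATA_ZONE == SECTOR

TRACKS = 80 * 2
formatted_kb = DATA_ZONE * TRACKS // 1024   # comes out at 2560KB, i.e. 2.5MB
```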
So what goes in the metadata? My answer is: after the current track number (to identify seek errors) and CRCs for both the metadata and file-data zones, there are up to 256 eight-byte records, defining file fragments on the same track with 64-byte granularity. That's a good deal finer (and thus more efficient for small files) than most filesystems bother with; FAT allocates with minimum 512-byte granularity, and NTFS uses 4KB clusters by default. If the fragments stored on a particular track are relatively large, then fewer metadata records will be needed to represent them, leaving more margin for error in the gap. Each record:
Code:
[   0   |   1   |   2   |   3   |   4   |   5   |   6   |   7   ] bytes
[     INODE     |    OFFSET     | PREV  | NEXT  | FIRST | LAST  ] fields
Since there are 160 tracks in total, and up to 256 fragments per track, a 16-bit inode number allows every single fragment to contain a different file if necessary, while still identifying every file uniquely.
Similarly, by assuming that the offset (within the file) of the beginning of every fragment is 64-byte aligned, even very large files which take up the whole disk can be accommodated by a 16-bit offset field. After shifting it left 6 bits, it allows for a 4MB file (which is bigger than the disk). A file split between two disks would need to have the offset field reset in between, however.
The 'prev' and 'next' fields form a doubly linked list indicating which track (out of 160) holds the adjacent fragments of the same file. They deliberately do not indicate where within that track the fragment lies; that's in the metadata of that track. There's only one fragment from any given file on any given track, since it's trivial to coalesce fragments before writing the track. The first and last fragments of a file are indicated by placing the current track number in 'prev' or 'next' respectively.
Finally, the 'first' and 'last' fields indicate which (and how many) consecutive 64-byte file-data blocks are occupied by the fragment. The 16KB data field is thus divided evenly and cleanly. It is an error for 'last' to be less than 'first'; if they are equal, exactly one block is occupied. Normally, 'first' will be one higher than the previous fragment's 'last', and while that may seem redundant, it allows for relatively easy error recovery if those bytes should get corrupted. Some of the dirtier hackers out there might even deliberately overlap fragments as a primitive compression scheme.
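As a sketch of how such a record might be packed and unpacked - the field order and little-endian byte order are my assumptions, only the widths are fixed by the description above:

```python
import struct

# Hypothetical on-disk layout: little-endian 16-bit inode and block offset,
# then the four single-byte fields.
FRAG_FMT = "<HHBBBB"   # 2 + 2 + 1 + 1 + 1 + 1 = 8 bytes

def pack_fragment(inode, offset_blocks, prev_trk, next_trk, first, last):
    # offset_blocks is the file offset in 64-byte units; shifting it left
    # 6 bits recovers the byte offset (up to 4MB for a 16-bit field).
    return struct.pack(FRAG_FMT, inode, offset_blocks,
                       prev_trk, next_trk, first, last)

def unpack_fragment(record):
    inode, off, prev_trk, next_trk, first, last = struct.unpack(FRAG_FMT, record)
    return {"inode": inode,
            "byte_offset": off << 6,          # 64-byte alignment assumed
            "prev": prev_trk, "next": next_trk,
            "first": first, "last": last,
            "blocks": last - first + 1}       # first == last -> one block
```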
Theoretically, this is already enough for a filesystem - as long as we don't mind purely numerical file handles, potentially having to read every track to find the file we're looking for or a space to write a new file, and 64-byte granularity of file *sizes*. We also don't yet have a way to permanently mark bad areas of the disk (though when it comes to floppies, it's probably better to just throw away the whole disk and use a good one) or to apply any sort of error-correction code.
Let's address the file-size granularity problem first, because it's easiest. Those two out-of-band EFM codes come to the rescue, as we can use one to represent "not data" bytes as end-of-file padding, and the other as a divider between the metadata and file-data zones of the sector and to mark the end of the sector as a whole; each zone can thus be shortened when its full capacity is not required, and in favourable cases it may be possible to read both tracks in a cylinder
and move to the next track within two rotations of the disk, rather than needing three.
To solve most of the other problems, we need a
directory, and we can reserve inode zero for that purpose. We still need to search for inode zero the hard way, but we can make it easier by always putting it in a consistent place - the first track is an obvious choice (that's what FAT does), but we could also choose somewhere in the middle of the seek range (as Commodore does) to reduce average seek times while writing tracks and updating the directory. A reasonable strategy here is to have the filesystem check track 0 first, then track 30, then track 60, then step through each track in turn looking for clues; the choice of where inode zero actually goes can then be made circumstantially with relatively little penalty on disk insertion.
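The probe order suggested above could be expressed as follows (the favoured tracks 0, 30 and 60 are the ones named in the text; nothing else here is specified):

```python
def inode_zero_probe_order(total_tracks=160):
    """Yield track numbers in the order to check them for inode zero:
    the favoured spots first, then every remaining track as a fallback."""
    favoured = (0, 30, 60)
    yield from favoured
    for t in range(total_tracks):
        if t not in favoured:
            yield t
```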
One of the most important jobs of the directory is to record which tracks have space available for new files. Most filesystems do this with a Block Allocation Map, often with one bit per sector or cluster. We can implement a Track Allocation Map for up to 256 tracks within one 64-byte block with two bits per track: an empty flag and a full flag. Tracks marked as full can be skipped entirely by the allocation algorithm, while tracks marked as empty can be preferred when writing a file that's known to be large. Meanwhile, tracks marked as neither empty nor full will still have space left for small files.
Since we actually have 160 tracks on a standard disk (80 per side), that's more than enough - and we mark those tracks which
don't exist as being simultaneously empty
and full, a combination which only makes sense if they have zero capacity. Actually, we need only mark the first few such tracks (160-167, perhaps) as nonexistent, and use the space beyond that for additional metadata.
An alternative strategy would be to indicate the exact number of free blocks on each track, but as that would be 0 for a full (or nonexistent) track but 256 for an empty one, it would be necessary to store at least one byte plus one bit per track to show both states accurately, occupying 3 blocks if we omitted the nonexistent tracks from the map. Most practical allocation algorithms don't need to know that level of detail, anyway; one that's still popular is "first fit" which only needs to look at the "full" flag.
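A sketch of the two-bit-per-track map and a first-fit scan over it; the particular bit assignment is my own choice, not part of the design:

```python
# Track Allocation Map: 2 bits per track, 256 tracks -> one 64-byte block.
# Bit assignment here is an assumption: bit 0 = empty, bit 1 = full.
EMPTY, FULL = 0b01, 0b10

class TrackAllocationMap:
    def __init__(self, real_tracks=160):
        self.bits = bytearray(64)
        for t in range(256):
            if t < real_tracks:
                self.set_flags(t, EMPTY)          # real tracks start empty
            else:
                self.set_flags(t, EMPTY | FULL)   # nonexistent: both flags set

    def set_flags(self, track, flags):
        byte, pos = divmod(track, 4)              # four tracks per byte
        shift = pos * 2
        self.bits[byte] = (self.bits[byte] & ~(0b11 << shift)) | (flags << shift)

    def flags(self, track):
        byte, pos = divmod(track, 4)
        return (self.bits[byte] >> (pos * 2)) & 0b11

    def first_fit(self):
        # First fit only needs to look at the 'full' flag, as noted above;
        # nonexistent tracks are skipped automatically since they carry it.
        for t in range(256):
            if not (self.flags(t) & FULL):
                return t
        return None
```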
Also important is to indicate which inode numbers are available for use, so as to keep them guaranteed unique. A simple way is to store the highest inode number in use, and increment that when a new number is needed. However, inodes with low numbers can be deleted and become available for reuse - something this simple "highest in use" marker doesn't capture. We'll come back to this when discussing file deletion in more detail.
The other important jobs the directory must do are to give each inode a user-friendly means of reference, ie. a name and possibly some directory structure, and to give some clue as to where the file can be found, ie. a track number. A straightforward way to implement a hierarchical directory structure is to simply make each directory (including the root) occupy a distinct inode. Inode zero then only needs to contain (among its other metadata) a single reference showing which inode the root directory occupies (this might normally be inode 1), and where to find it. This will easily fit into the same 64-byte block as the Track Allocation Map.
An inode reference is then just a 16-bit inode number, followed by one byte each indicating which tracks the file begins and ends on. The beginning track is most useful for reading a file in its entirety or sequentially from the beginning; the ending track is most useful for
appending to an existing file, or reading the last few lines from it. Since the tracks occupied by any given inode are doubly-linked in the track metadata, this is enough to locate the whole file with reasonable efficiency.
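Such a reference fits in four bytes; a minimal sketch, with the field order and endianness assumed:

```python
import struct

# Hypothetical encoding: little-endian 16-bit inode number, then one byte
# each for the begin and end tracks.
def pack_inode_ref(inode, begin_track, end_track):
    return struct.pack("<HBB", inode, begin_track, end_track)

def unpack_inode_ref(ref):
    return struct.unpack("<HBB", ref)
```

A file living entirely on one track simply has the begin and end tracks equal.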
What format is a directory inode in, then? Well, that depends, and we can handwave the details for now. But we can reasonably suppose that it'll start with a reference to its parent inode, then a list of entries, each including a name, an inode reference, and at least a flag indicating whether it's another directory or a normal file. More detailed filetype information is often useful as well - a BASIC program, machine code, some other language, plain text, image data, or miscellaneous raw data - and maybe some timestamps? Those are just details, to be decided later.
So the sequence of events when writing a new file is something like:
- Obtain a free inode number, perhaps by incrementing the highest-used-inode field.
- Locate a track with free space, using the Track Allocation Map.
- Read the destination track, if it wasn't marked empty. Otherwise, prepare a blank track buffer while seeking to the destination track.
- Append the new file's data, 64-byte aligned, to the data zone of the track, and a suitable fragment record to the metadata zone.
- Write the new/updated track to disk.
- Update the track's empty/full status in the cached Track Allocation Map.
- Append a directory entry to the appropriate directory inode. This may require extending the inode onto an additional track, in which case the TAM, the original track's metadata and the parent directory reference to the inode all need to be refreshed. However, using additional blocks on the same track is a local operation.
- Update the on-disk copy of inode zero, including the Track Allocation Map and the highest-used-inode field.
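The steps above might look like this in a toy in-memory model - everything here (the Disk structure, the helper names, the set-based stand-in for the TAM) is invented for illustration, not a real controller interface:

```python
BLOCKS_PER_TRACK = 256   # 16KB data zone / 64-byte blocks

class Disk:
    def __init__(self, tracks=160):
        self.tracks = [{"frags": [], "used": 0} for _ in range(tracks)]
        self.tam_full = set()             # stand-in for the real 64-byte TAM
        self.tam_empty = set(range(tracks))
        self.highest_inode = 0            # inode 0 is the directory itself

def write_new_file(disk, directory, name, data):
    # 1. Obtain a free inode number.
    disk.highest_inode += 1
    inode = disk.highest_inode
    # 2. Locate a track with free space (first fit on the 'full' flag).
    blocks = max(1, (len(data) + 63) // 64)
    track = next(t for t, buf in enumerate(disk.tracks)
                 if t not in disk.tam_full
                 and buf["used"] + blocks <= BLOCKS_PER_TRACK)
    buf = disk.tracks[track]
    # 3./4. Append the data (64-byte aligned) and a fragment record.
    first = buf["used"]
    buf["frags"].append({"inode": inode, "offset": 0,
                         "prev": track, "next": track,   # sole fragment
                         "first": first, "last": first + blocks - 1})
    buf["used"] += blocks
    # 5. (Writing the track back to disk is a no-op in this model.)
    # 6. Update the track's empty/full status.
    disk.tam_empty.discard(track)
    if buf["used"] == BLOCKS_PER_TRACK:
        disk.tam_full.add(track)
    # 7./8. Directory entry: name -> (inode, begin track, end track).
    directory[name] = (inode, track, track)
    return inode
```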
Naturally, an optimisation may be to perform several file operations before completing writes, minimising the number of seeks and track overwrites. Some care must be exercised to ensure that an interruption doesn't leave the filesystem inconsistent or corrupted; such optimisations also complicate the filesystem code, though the performance gains may justify the effort.
Deleting a file introduces several complications. It's necessary but not sufficient to remove the entry in the directory referring to the file; the inode then still exists on disk, occupying space in one or more tracks. Frequently, the whole point of deleting files is to free up that space, so we (eventually) need to go and rewrite those tracks to remove the inode's metadata, and then possibly update the TAM as a consequence. But in the short term, we can add the deleted inode to an "orphan list" maintained in inode zero, and postpone that relatively time-consuming task for when it's actually needed. If the disk becomes full, the orphan list can be scavenged to really free up the space.
If we happen to rewrite a track containing a fragment from an orphan inode, we can take the opportunity to reclaim the space it uses. Here we can identify when it was the
only fragment for that inode, by noting that both the begin and end tracks of the orphan reference point to the track now being processed, and remove the orphan reference entirely. We can also identify when we hit the first or last fragments of an orphan, and update the corresponding reference using the linked-list in the track metadata.
More problematic is when a middle fragment is found; in that case, the quickest option which maintains track metadata integrity is to remove the file data (freeing the space), but leave the metadata entry in place with the 'last' field pointing just
before the 'first' field; this is normally an error, as noted above. Edge cases to consider here are when the fragment was previously at the very beginning or end of the track, as calculating the 'last' entry as 'first' minus one would wrap to 255 if 'first' was 0, and conversely if 'first' were calculated from 'last'. It may be best to fix 'first' as 1 and 'last' as 0 to avoid mistakes.
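A sketch of that tombstone rule, using the fixed first=1/last=0 encoding suggested above (the dict mirrors the eight-byte record described earlier):

```python
def tombstone_middle_fragment(frag):
    """Free a middle fragment of an orphan inode while keeping its metadata
    record - and hence the prev/next chain - intact."""
    # Fixed values avoid the wrap-around traps at the track edges
    # (a 'first' of 0, or a 'last' of 255).
    frag["first"], frag["last"] = 1, 0
    return frag

def is_tombstone(frag):
    # 'last' below 'first' is otherwise an error, so it can't occur naturally.
    return frag["last"] < frag["first"]
```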
It's also desirable to keep a list of "free inodes" which can be used for new files without incrementing the highest-used-inode. Clearly, a disk that was continuously used to create and delete files would eventually run out of inode numbers if no means to reclaim them were provided. Indeed, if maintained as contiguous ranges of unused inodes, the highest-used-inode field could simply be replaced by an entry in that list. Since the list can only be lengthened by deleting and reclaiming the space from a file, there will always be at least one free block into which inode zero can be extended to accommodate the list - something that is not necessarily true of the orphan list, but the latter problem resolves itself by triggering immediate scavenging of orphans.
Phew - anything else? Oh yes - error correction.
The TEAC drives explicitly specify that they can access 82 tracks per side, not just 80; attempting to seek below track 0 or above track 81 will be ignored. The specification of Microsoft's extra-capacity DMF format actually
uses 82 tracks, suggesting that this is a widespread capability of PC-type drives. But I've deliberately ignored these two extra tracks per side when calculating the usable capacity of this filesystem - because we can use them to store 64-72KB of error-correction data, depending on whether we use the metadata zones of those tracks or only the file-data zones. That's enough to give every 64-byte block on the disk protection against at least one erased byte, though greater efficiency is achieved by grouping blocks and perhaps using a 16-bit variant of Reed-Solomon to operate on a whole-track basis. The standard 8-bit version of Reed-Solomon operates on blocks of under 256 bytes.
Reed-Solomon looks fairly complicated to implement. So that's a feature to not describe in detail here, and indeed not even to implement at first (experiment with it on a normal machine, not directly on the disk controller). We can instead indicate its presence on the disk by reserving inode number $FFFF for that purpose, and marking the tracks used for it as "full" rather than "nonexistent".
Meanwhile, error
detection is achieved by separate 32-bit CRCs on the metadata and file-data zones of each track, as well as the error states built into the EFM decoder. For the purpose of the CRC, the "no data" EFM codes are simply skipped. Those codes will be an additional headache for Reed-Solomon, which assumes an alphabet size of 2^N if you want a straightforward implementation; we will need to transform the data before applying the code.
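One way to model that CRC rule: here I represent the decoded track as a list of symbols, with ordinary bytes as 0-255 and the two out-of-band codes as arbitrary sentinels above that - a real decoder would use whatever representation the controller chooses.

```python
import zlib

# Hypothetical sentinels for the two out-of-band EFM codes.
NO_DATA = 0x100   # "not data" padding byte
DIVIDER = 0x101   # zone divider / end-of-sector marker

def zone_crc(symbols):
    """32-bit CRC over a zone's ordinary bytes, skipping out-of-band codes."""
    real = bytes(s for s in symbols if s <= 0xFF)
    return zlib.crc32(real) & 0xFFFFFFFF
```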