PostPosted: Wed Jan 30, 2019 3:19 pm 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
To pin down more precisely what the hardware needs to look like, I decided to think about the software (which until now has been handwaved). I'm assuming here that we've got a captive 6502 running at a fairly high speed, as in the Commodore drives, making the drive an "intelligent peripheral" that could be interfaced to many different machines. Since the mechanical drive has a TTL interface, it makes sense to run this captive CPU at 5V, which should enable a 12MHz clock (trivially divided from the 24MHz master). Our job is to encode and decode EFM and locate sync marks as close to realtime as is practical.

Encoding EFM should be easy, so long as a suitable method of calculating pre-compensation can be found (and I have little doubt that it can be). A simple table of delta-time values can be built and iterated over. On the outer tracks at least, the raw table is likely to work without applying pre-compensation. Reading this table should be fast enough to do in realtime, with only a small FIFO to cover accidents.
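
Here's a rough C sketch of that table-building step, as a high-level model rather than controller code. The efm_code[] table of 14-bit channel words is hypothetical (the real EFM codewords would be filled in), 16 master clocks per symbol is assumed, and merging-symbol selection and pre-compensation are omitted:
Code:
#include <stdint.h>

/* Hypothetical 8-to-14 code table; the real EFM codewords go here. */
extern const uint16_t efm_code[256];   /* 14 significant bits per entry */

#define CLKS_PER_SYMBOL 16             /* 24MHz master / 1.5MHz symbol rate */

/* Append the delta-times (in master clocks) for one data byte to out[],
   returning the new count. 'run' carries the gap accumulated so far
   across byte boundaries; valid EFM keeps every gap at 11 symbols or
   less, so each delta-time fits in a byte. Both merging symbols are
   assumed blank here. */
static int encode_byte(uint8_t byte, uint8_t *out, int n, int *run)
{
    uint16_t w = efm_code[byte];
    for (int bit = 13; bit >= 0; bit--) {
        (*run)++;                      /* one symbol period elapses */
        if (w & (1u << bit)) {         /* a 1 means a flux reversal here */
            out[n++] = (uint8_t)(*run * CLKS_PER_SYMBOL);
            *run = 0;
        }
    }
    *run += 2;                         /* two blank merging symbols */
    return n;
}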

The EFM sync mark is a 24-symbol sequence which uniquely contains two maximum-length gaps between flux reversals. The simplest and fastest code to recognise this pattern could be:
Code:
FindSync:
   LDA rfifo
   CMP #gap10_11
   BCC FindSync
   LDA rfifo
   CMP #gap10_11
   BCC FindSync
FoundSync:
This assumes that flux gaps longer than those in the sync sequence, as would be found on unformatted sections of the track, read as zeroes - which might not be true. But it requires just eight cycles to identify the second gap as completing the sync sequence, which at 12MHz is considerably shorter than the shortest flux gap the medium can support immediately after the sync sequence (ie. 2µs, or 24 cycles at 12MHz). This bodes well: we can add a check for an over-length gap (another 4 cycles) and still be ready before the next pulse arrives, even if we reduce the clock speed to 8MHz. It therefore becomes reasonable to have the pulse detector indicate "timeout with no pulse" by returning 255, which should be straightforward to arrange in hardware.

So we can precisely identify the moment at which a sync mark arrives, without having to do a "track read" from the index pulse. In principle we could also recognise any other single symbol in a similar manner.

Implementing a decoding state machine in (effectively) the Program Counter is possible and potentially very fast, but rather wasteful of code space. There's a maximum of 5 flux reversals per decoded byte, so we could theoretically anticipate some 1280 states in the decoder (256 byte values times up to 5 reversals each), each with at least 7 bytes of code to identify the next state to transition to (some of which will be error states). It might be more efficient to implement a data-driven state machine once past the sync mark.

Such a state machine would have 256 or so "leaf states", one for each valid data byte value, one each to identify the two metadata markers defined in EFM, three to handle the two merging symbols (which may be both blank, or at most one containing a reversal), and at least one error state in which a recovery operation can be scheduled. Such recovery might include deeper examination of the delta-time values to obtain a maximum-likelihood decoding, application of an error-correcting code (Reed-Solomon would be applicable; since EFM is a fixed-length code, the bytes won't easily get entirely out of step, and the errored byte can be marked as an erasure), or simply a second attempt at reading the whole sector.

At a valid-data-byte leaf state, it is only necessary to record the decoded byte, advance the pointer and check it against the sector length, then subtract the applicable gap value from the current delta-time (which will overlap the end of the 14-symbol group) and proceed to the merging-symbol states to align with the next group. From the merging-symbol states, a properly trimmed delta-time gap is used to re-enter the root state of the machine. The other leaf states require similar housekeeping, except that there is no decoded byte to record.
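
As a high-level model of that data-driven machine (not the 6502 implementation), here's a rough C sketch. Each inner state is assumed to be a table of next-state pointers indexed by the delta-time quantised to 8 clocks, with NULL entries standing in for the error states; the housekeeping at the leaves is omitted:
Code:
#include <stdint.h>
#include <stddef.h>

/* One decoder state. For inner states, next[] is indexed by the flux-gap
   delta-time (in clocks) quantised by 8, which keeps one fractional bit
   of the 16-clock symbol period for rounding. */
typedef struct State State;
struct State {
    const State *next[32];  /* NULL entries represent error states */
    int value;              /* -1 for inner states; otherwise the decoded
                               byte or a metadata/merging-symbol marker */
};

/* Follow flux gaps from the root until a leaf is reached. Returns the
   leaf's value, or -1 if an invalid gap length was seen. */
static int decode_group(const State *root, const uint8_t **fifo)
{
    const State *s = root;
    do {
        s = s->next[*(*fifo)++ >> 3];   /* consume one delta-time */
        if (s == NULL)
            return -1;                  /* gap outside the valid range */
    } while (s->value < 0);
    return s->value;
}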

The inner states of the machine (including the root) have an entirely different function; they must choose between multiple following states based on the flux gap presently in question. Valid states may include gaps of 1-11 symbols - the shortest of these only occurring after trimming from a previous leaf state - with longer gaps being explicitly excluded from the encoding and constituting error states. Since each symbol is nominally 16 clocks long, it seems feasible to mask the value and implement this decision in an interleaved lookup table:
Code:
StateTrans:
   LDA rfifo
   PHA      ; use stack as circular history buffer for error handling
   AND #$F8   ; retains one fractional bit for rounding
   TAY
   LDA (state),Y
   TAX
   INY
   LDA (state),Y
   STX state
   STA state+1
   CMP #maxinnerstate
   BCC StateTrans
LeafState:
This code, however, requires 36 CPU cycles, which means it will fall behind the denser sequences of flux reversals, which can appear every 24 CPU cycles at 12MHz. Unless further optimisations can be found to make this code run in realtime, it will be necessary to implement a very deep read FIFO (ie. a whole-track buffer) to ensure that pathological data can be read correctly. As noted previously, this can be done with a readily-available 256Kx8b SRAM.

Another option is to exploit the 16-bit capabilities of a 65816 in place of a 6502, which would avoid the awkward shuffling through the X register to avoid disturbing the state pointer while loading the second half of its new value, as well as the second use of the expensive indirect-indexed addressing mode. That appears to save 6 cycles in a direct translation, which isn't quite enough. However, we don't need to use indirect-indexed with the '816, as we can use X directly as a 16-bit pointer, and the Direct Page Register of all things as the index! This abuse results in the following:
Code:
StateTrans:
   ; A is 8-bit with the high half pre-cleared, indexes are 16-bit
   LDA rfifo   ; absolute, not indirect
   PHA      ; use stack as history buffer for error handling - must reset stack pointer regularly
   AND #$F8   ; immediate 8-bit
   TCD      ; masked delta-time becomes the Direct Page base
   LDY 0,X   ; direct-page indexed, not indirect: loads from D + X
   TYX      ; the fetched 16-bit entry is the next state pointer
   CPX #maxinnerstate   ; immediate 16-bit
   BCC StateTrans
LeafState:
…which takes 25 cycles. This may be close enough to permit a smaller FIFO, provided the leaf-state handling can be assured to be very fast. Further, slightly less clean optimisations can be envisaged, such as unrolling the loop so as to alternate the use of X and Y, and encoding the "leaf" flag into the high-order bit so as to elide the CPX/Y instructions - that would bring it down to 20 cycles per state transition, which is comfortably within requirements.

With that said, there is one key advantage to having a full-sized track buffer: the ability to perform a "forensic read" of a track, from index pulse to index pulse. That's the sort of capability normally reserved for the likes of the Catweasel FDC, which is reputed to be expensive and to have closed firmware. So it may be sufficient to merely run the state machine fast enough to not be visibly slow.


PostPosted: Sat Feb 02, 2019 11:31 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Now for some ideas about a filesystem.

I expect to provide several ways to access the contents of a disk - a "forensic track read" which simply dumps out the flux transition delta-times from index mark to index mark, and equivalently a way to write a raw track in the same format (great for dealing with copy protection); access to raw bytes on a sector or part-sector basis, with the host computer handling all details of the filesystem; and a filesystem handler embedded in the drive controller itself, making it an "intelligent peripheral" in a similar manner to the Commodore drives. In theory this could extend to FAT12 and even one or two "classic micro" formats, but the main attraction will be the native high-capacity format.

Allowing for the specified 1.5% deviation in motor speed, there is just room for a single 18KB sector combining data and metadata on each track, plus a few dozen bytes' worth of sync and leadout before the write head catches up with the sync pattern again. In this context it's actually better for the motor to run slightly slow than fast, as that gives more time to lay down data. The TEAC drives actually have a decent motor speed controller, so I can reasonably expect them to hold the correct speed to within specified tolerance, and probably better. Spindle speed can be measured by timing the index pulses, so excessive deviation can be detected before it becomes a problem.
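
As a sanity check on those numbers - assuming the 300 RPM spindle of a 3.5" drive, the 1.5MHz symbol rate discussed later, and 14 code plus 2 merging symbols per byte - a trivial calculation in C:
Code:
#include <stdio.h>

int main(void)
{
    double rev_s   = 60.0 / 300.0;             /* 0.2s per revolution */
    double nominal = 1.5e6 * rev_s / 16.0;     /* bytes per revolution */
    double fast    = nominal / 1.015;          /* motor running 1.5% fast */
    printf("nominal %.0f bytes/rev, worst case %.0f, slack beyond 18KB: %.0f\n",
           nominal, fast, fast - 18.0 * 1024.0);
    /* nominal 18750 bytes/rev, worst case 18473, slack beyond 18KB: 41 */
    return 0;
}
So even with the motor at the fast end of tolerance, an 18,432-byte sector leaves the few dozen bytes of slack mentioned above.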

The EFM encoding allows for two "out-of-band" marker bytes of normal length, as well as the sync pattern which is 50% overlength, and we can use those markers to delimit metadata from file data. Also, the TEAC drive specifies an ability to switch off its write gate within 8µs, which is less than a byte - though I don't know how far ahead of the write position the erase head is. Overall, I'm reasonably confident that if I limit the amount of live data to 18KB per track, I won't need to worry much about accidental overlap.

So what, precisely, can we do with an 18KB sector size? I think we can use 2KB for metadata and 16KB for file data. 16KB per track, over 80 tracks and 2 sides, totals 2560KB of formatted capacity.

So what goes in the metadata? My answer is: after the current track number (to identify seek errors) and CRCs for both the metadata and file-data zones, there are up to 256 eight-byte records, defining file fragments on the same track with 64-byte granularity. That's a good deal finer (and thus more efficient for small files) than most filesystems bother with; FAT allocates with minimum 512-byte granularity, and NTFS uses 4KB clusters by default. If the fragments stored on a particular track are relatively large, then fewer metadata records will be needed to represent them, and there would be more margin for error in the gap. Each record:
Code:
[   0   |   1   |   2   |   3   |   4   |   5   |   6   |   7   ]  bytes
[     INODE     |    OFFSET     | PREV  | NEXT  | FIRST | LAST  ]  fields
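In C terms, a sketch only - the field order is from the layout above, but little-endian 16-bit fields are an assumption:
Code:
#include <stdint.h>

/* One 8-byte fragment record from the 2KB metadata zone. */
typedef struct {
    uint16_t inode;   /* which file this fragment belongs to */
    uint16_t offset;  /* position within the file, in 64-byte units */
    uint8_t  prev;    /* track holding the previous fragment */
    uint8_t  next;    /* track holding the next fragment */
    uint8_t  first;   /* first 64-byte data block occupied on this track */
    uint8_t  last;    /* last 64-byte data block occupied, inclusive */
} FragmentRecord;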
Since there are 160 tracks in total, and up to 256 fragments per track, a 16-bit inode number allows every single fragment to contain a different file if necessary, while still identifying every file uniquely.

Similarly, by assuming that the offset (within the file) of the beginning of every fragment is 64-byte aligned, even very large files which take up the whole disk can be accommodated by a 16-bit offset field. Shifted left 6 bits, it allows for a 4MB file (which is bigger than the disk). A file split between two disks would need to have the offset field reset in between, however.

The 'prev' and 'next' fields form a doubly linked list indicating which track (out of 160) to look for adjacent fragments of the same file on. They deliberately do not indicate where within that track the fragment lies; that's in the metadata of that track. There's only one fragment from any given file on any given track, since it's trivial to coalesce fragments before writing the track; the first and last fragments of the file are indicated by pointing these fields at the current track itself.

Finally, the 'first' and 'last' fields indicate which (and how many) consecutive 64-byte file-data blocks are occupied by the fragment. The 16KB data field is thus divided evenly and cleanly. It is an error for 'last' to be less than 'first'; if they are equal, exactly one block is occupied. Normally, 'first' will be one higher than the previous fragment's 'last', and while that may seem redundant, it allows for relatively easy error recovery if those bytes should get corrupted. Some of the dirtier hackers out there might even deliberately overlap fragments as a primitive compression scheme.

Theoretically, this is already enough for a filesystem - as long as we don't mind purely numerical file handles, potentially having to read every track to find the file we're looking for or a space to write a new file, and 64-byte granularity of file *sizes*. We also don't yet have a way to permanently mark bad areas of the disk (though when it comes to floppies, it's probably better to just throw away the whole disk and use a good one) or to apply any sort of error-correction code.

Let's address the last problem first, because it's easiest. Those two out-of-band EFM codes come to the rescue, as we can use one to represent "not data" bytes as end-of-file padding, and the other one as a divider between the metadata and file-data zones of the sector and to mark the end of the sector as a whole; each zone can thus be shortened when its full capacity is not required, and in favourable cases it may be possible to read both tracks in a cylinder and move to the next track within two rotations of the disk, rather than needing three.

To solve most of the other problems, we need a directory, and we can reserve inode zero for that purpose. We still need to search for inode zero the hard way, but we can make it easier by always putting it in a consistent place - the first track is an obvious choice (that's what FAT does), but we could also choose somewhere in the middle of the seek range (as Commodore does) to reduce average seek times while writing tracks and updating the directory. A reasonable strategy here is to have the filesystem check track 0 first, then track 30, then track 60, then step through each track in turn looking for clues; the choice of where inode zero actually goes can then be made circumstantially with relatively little penalty on disk insertion.

One of the most important jobs of the directory is to record which tracks have space available for new files. This is commonly done with a Block Allocation Map in most filesystems, often with one bit per sector or cluster. We can implement a Track Allocation Map for up to 256 tracks within one 64-byte block with two bits per track; an empty flag and a full flag. Tracks marked as full can be skipped entirely by the allocation algorithm, while tracks marked as empty can be preferred when writing a file that's known to be large. Meanwhile, tracks marked as not empty and yet not full will still have space left for small files.
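
A sketch of how those flag pairs might be packed and tested - the two-bits-per-track layout is from the text, while the packing order within each byte is an arbitrary choice:
Code:
#include <stdint.h>

/* Track Allocation Map: two bits per track, four tracks per byte.
   EMPTY and FULL together mark a nonexistent (zero-capacity) track. */
enum { TAM_EMPTY = 1, TAM_FULL = 2 };

static uint8_t tam[64];        /* covers 256 tracks in one 64-byte block */

static unsigned tam_get(unsigned track)
{
    return (tam[track >> 2] >> ((track & 3) * 2)) & 3;
}

static void tam_set(unsigned track, unsigned flags)
{
    unsigned shift = (track & 3) * 2;
    tam[track >> 2] = (uint8_t)((tam[track >> 2] & ~(3u << shift))
                              | ((flags & 3u) << shift));
}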

Since we actually have 160 tracks on a standard disk (80 per side), that's more than enough - and we mark those tracks which don't exist as being simultaneously empty and full, a combination which only makes sense if they have zero capacity. Actually, we need only mark the first few such tracks (160-167, perhaps) as nonexistent, and use the space beyond that for additional metadata.

An alternative strategy would be to indicate the exact number of free blocks on each track, but as that would be 0 for a full (or nonexistent) track and 256 for an empty one, it would be necessary to store at least one byte plus one bit per track to show both states accurately - nine bits times 160 tracks is 180 bytes - occupying 3 blocks if we omitted the nonexistent tracks from the map. Most practical allocation algorithms don't need that level of detail anyway; one that's still popular is "first fit", which only needs to look at the "full" flag.

Also important is to indicate which inode numbers are available for use, so as to keep them guaranteed unique. A simple way is to store the highest inode number in use, and increment that when a new number is needed. However, low-numbered inodes can be deleted and become available for reuse, which this simple "highest in use" marker doesn't capture. We'll come back to this when discussing file deletion in more detail.

The other important jobs the directory must do are to give each inode a user-friendly means of reference, ie. a name and possibly some directory structure, and to give some clue as to where the file can be found, ie. a track number. A straightforward way to implement a hierarchical directory structure is to simply make each directory (including the root) occupy a distinct inode. Inode zero then only needs to contain (among its other metadata) a single reference showing which inode the root directory occupies (this might normally be inode 1), and where to find it. This will easily fit into the same 64-byte block as the Track Allocation Map.

An inode reference is then just a 16-bit inode number, followed by one byte each indicating which tracks the file begins and ends on. The beginning track is most useful for reading a file in its entirety or sequentially from the beginning; the ending track is most useful for appending to an existing file, or reading the last few lines from it. Since the tracks occupied by any given inode are doubly-linked in the track metadata, this is enough to locate the whole file with reasonable efficiency.

What format is a directory inode in, then? Well, that depends, and we can handwave the details for now. But we can reasonably suppose that it'll start with a reference to its parent inode, then a list of entries, each including a name, an inode reference, and at least a flag indicating whether it's another directory or a normal file. More detailed filetype information is often useful as well - a BASIC program, machine code, some other language, plain text, image data, or miscellaneous raw data - and maybe some timestamps? Those are just details, to be decided later.
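
Purely as a strawman for those details, one possible shape - the inode reference is as described above, but the name length and filetype codes here are invented:
Code:
#include <stdint.h>

/* Inode reference: inode number plus its first and last tracks. */
typedef struct {
    uint16_t inode;
    uint8_t  begin_track;
    uint8_t  end_track;
} InodeRef;

/* Hypothetical directory entry; field sizes are placeholders. */
typedef struct {
    char     name[16];   /* padded with zeroes */
    InodeRef ref;
    uint8_t  flags;      /* bit 0: is-directory; the rest reserved */
    uint8_t  filetype;   /* BASIC, machine code, text... to be decided */
} DirEntry;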

So the sequence of events when writing a new file is something like this (see the sketch after the list):
  1. Obtain a free inode number, perhaps by incrementing the highest-used-inode field.
  2. Locate a track with free space, using the Track Allocation Map.
  3. Read the destination track, if it wasn't marked empty. Otherwise, prepare a blank track buffer while seeking to the destination track.
  4. Append the new file's data, 64-byte aligned, to the data zone of the track, and a suitable fragment record to the metadata zone.
  5. Write the new/updated track to disk.
  6. Update the track's empty/full status in the cached Track Allocation Map.
  7. Append a directory entry to the appropriate directory inode. This may require extending the inode onto an additional track, in which case the TAM, the original track's metadata and the parent directory reference to the inode all need to be refreshed. However, using additional blocks on the same track is a local operation.
  8. Update the on-disk copy of inode zero, including the Track Allocation Map and the highest-used-inode field.
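In outline C, with every helper hypothetical (they stand in for the track-buffer, TAM and directory machinery described above) and the step numbers matching the list:
Code:
#include <stdint.h>

uint16_t alloc_inode(void);
uint8_t  find_track_with_space(void);
unsigned tam_get(unsigned track);
void     read_track(uint8_t track);
void     blank_track_buffer(void);
void     append_fragment(uint16_t inode, const uint8_t *data, uint16_t len64);
void     write_track(uint8_t track);
void     update_tam(uint8_t track);
void     add_directory_entry(uint16_t dir, uint16_t inode, uint8_t track);
void     flush_inode_zero(void);
enum { TAM_EMPTY = 1 };

int write_new_file(const uint8_t *data, uint16_t len64, uint16_t dir_inode)
{
    uint16_t inode = alloc_inode();               /* step 1 */
    uint8_t  track = find_track_with_space();     /* step 2: consult the TAM */
    if (tam_get(track) & TAM_EMPTY)
        blank_track_buffer();                     /* step 3: empty track */
    else
        read_track(track);                        /* step 3: partial track */
    append_fragment(inode, data, len64);          /* step 4 */
    write_track(track);                           /* step 5 */
    update_tam(track);                            /* step 6 */
    add_directory_entry(dir_inode, inode, track); /* step 7 */
    flush_inode_zero();                           /* step 8 */
    return 0;
}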
Naturally, an optimisation may be to perform several file operations before completing writes, minimising the number of seeks and track overwrites. Some care must be exercised to ensure that an interruption doesn't leave the filesystem inconsistent or corrupted. Implementing such optimisations can also complicate the filesystem code, though the resulting performance improvements may justify the effort.

Deleting a file introduces several complications. It's necessary but not sufficient to remove the entry in the directory referring to the file; the inode then still exists on disk, occupying space in one or more tracks. Frequently, the whole point of deleting files is to free up that space, so we (eventually) need to go and rewrite those tracks to remove the inode's metadata, and then possibly update the TAM as a consequence. But in the short term, we can add the deleted inode to an "orphan list" maintained in inode zero, and postpone that relatively time-consuming task for when it's actually needed. If the disk becomes full, the orphan list can be scavenged to really free up the space.

If we happen to rewrite a track containing a fragment from an orphan inode, we can take the opportunity to reclaim the space it uses. Here we can identify when it was the only fragment for that inode, by noting that both the begin and end tracks of the orphan reference point to the track now being processed, and remove the orphan reference entirely. We can also identify when we hit the first or last fragments of an orphan, and update the corresponding reference using the linked-list in the track metadata.

More problematic is when a middle fragment is found; in that case, the quickest option which maintains track metadata integrity is to remove the file data (freeing the space), but leave the metadata entry in place with the 'last' field pointing just before the 'first' field; this is normally an error, as noted above. Edge cases to consider here are when the fragment was previously at the very beginning or end of the track, as calculating the 'last' entry as 'first' minus one would wrap to 255 if 'first' was 0, and conversely if 'first' were calculated from 'last'. It may be best to fix 'first' as 1 and 'last' as 0 to avoid mistakes.

It's also desirable to keep a list of "free inodes" which can be used for new files without incrementing the highest-used-inode. Clearly, a disk that was continuously used to create and delete files would eventually run out of inode numbers if no means to reclaim them were provided. Indeed, if maintained as contiguous ranges of unused inodes, the highest-used-inode field could simply be replaced by an entry in that list. Since the list can only be lengthened by deleting and reclaiming the space from a file, there will always be at least one free block into which inode zero can be extended to accommodate the list - something that is not necessarily true of the orphan list, but the latter problem resolves itself by triggering immediate scavenging of orphans.

Phew - anything else? Oh yes - error correction.

The TEAC drives explicitly specify that they can access 82 tracks per side, not just 80; attempting to seek below track 0 or above track 81 will be ignored. The specification of Microsoft's extra-capacity DMF format actually uses 82 tracks, suggesting that this is a widespread capability of PC-type drives. But I've deliberately ignored these two extra tracks per side when calculating the usable capacity of this filesystem - because we can use them to store 64-72KB of error-correction data, depending on whether we use the metadata zones of those tracks or only the file-data zones. That's enough to give every 64-byte block on the disk protection against at least one erased byte, though greater efficiency is achieved by grouping blocks and perhaps using a 16-bit variant of Reed-Solomon to operate on a whole-track basis. The standard 8-bit version of Reed-Solomon operates on blocks of under 256 bytes.

Reed-Solomon looks fairly complicated to implement. So that's a feature to not describe in detail here, and indeed not even to implement at first (experiment with it on a normal machine, not directly on the disk controller). We can instead indicate its presence on the disk by reserving inode number $FFFF for that purpose, and marking the tracks used for it as "full" rather than "nonexistent".

Meanwhile, error detection is achieved by separate 32-bit CRCs on the metadata and file-data zones of each track, as well as the error states built into the EFM decoder. For the purpose of the CRC, the "no data" EFM codes are simply skipped. Those codes will be an additional headache for Reed-Solomon, which assumes an alphabet size of 2^N if you want a straightforward implementation; we will need to transform the data before applying the code.
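
For illustration, a bitwise CRC-32 over one zone, skipping the out-of-band codes as described. The polynomial (the common reflected 0xEDB88320) is an assumption, as is holding the decoded zone as 16-bit values so that the out-of-band codes can sit above the byte range:
Code:
#include <stdint.h>
#include <stddef.h>

/* CRC-32 over a decoded zone; values above 0xFF are the out-of-band
   EFM codes, which are "not data" and therefore skipped. */
uint32_t zone_crc32(const uint16_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 0xFF)
            continue;                 /* padding/divider: not data */
        crc ^= (uint8_t)buf[i];
        for (int b = 0; b < 8; b++)   /* reflected 0xEDB88320 polynomial */
            crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
    }
    return ~crc;
}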


PostPosted: Sat Feb 02, 2019 1:06 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Some interesting ideas there! I notice you give names to files, and that might well be fine, but it was a small surprise: the idea of an inode that I'm familiar with, from Unix, is that the inode contains all metadata, but the name(s) of a file are in the directory entries.

It might be worth writing a high-level simulation of these ideas and seeing how much head movement there is and how much track re-writing, for some sample sorts of typical file activity.

Have you any idea how to encode and decode EFM?


PostPosted: Sat Feb 02, 2019 1:17 pm 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Maybe there's some terminology confusion - but the names are indeed in directory files, not directly associated with the target inode. However, I don't have a separate table of inodes to map inode numbers to their locations, so the latter are included in the directory entries. In essence the directory is an optimisation over the base filesystem, providing hints on where things should be found.

Encoding of EFM should not be difficult, and I outlined ideas on how to decode it sufficiently quickly in the previous post. It may be possible to prototype it using the C64 tape interface.


PostPosted: Sat Feb 02, 2019 1:18 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Oh, right, I was probably confused...


PostPosted: Sat Feb 02, 2019 1:53 pm 

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1466
Location: Scotland
Interesting thread so-far.

I'm not personally interested in the bit-twiddling part, but more about the file system part and I'm curious what you're going to code all this in.

I look at past systems like Apple DOS, ProDOS and the BBC Micro DFS and ADFS, and we're looking at 8-16K of 6502 code to implement the filesystem and (in the BBC Micro case) a few utilities in ROM (but not format, thank you, ADFS)

And to the filesystem ... After finding my notes it looks like I was about to implement what was effectively ProDOS on the BBC Micro - I was unhappy with the ADFS for various reasons and I guess that much as I liked the Beeb, Apple was my sort of home-calling, being the first "PC" I really got to use.

I have the advantage right now (more by accident than design) in that I can write my filesystem in C running on an ATmega with an SD card and present the interface to the 6502 via the Ruby operating system, er, BIOS - and since I created all the stubs and whatnot to minimally support BBC Basic, I may as well expand this. I'm going to end up with a Franken-Beeb at this rate - nothing near what I envisaged a few months back!

And I think it'll be usable when I move to the 816 system with a Pi host, although plans there are flexible (where rather than put a 2nd SD card on a Pi, I might just stick the entire filesystem inside a regular file on the Pi)

Some ProDOS features - unlimited files per directory, 16MB max. file size and 32MB max. disk size. (Remember: 1982 and 64K Apple IIs). Date/time stamping.

The weird thing is; I never really got into ProDOS... but since I have a couple of apples and beebs here, maybe I'll go play, er, spend some time and refresh the old grey cells... (a few minutes later, and I realise I've just gone and worked out that I could physically wire an Apple Disk II to my Ruby SBC with 9 pins + power, so will stop that line of thought right now!)

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Sat Feb 02, 2019 2:48 pm 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
The hardware I have in mind is a dedicated 12MHz '816 with a little bit of specialised logic to interface it to the drive, driven from a 24MHz master clock. I think it needs to be that fast to keep up with decoding EFM as it comes off the read head (through a FIFO, so I don't need to be cycle precise); it could be slower if I buffered it all before beginning the decode and accepted some more delays. Not yet quite sure how much ROM and RAM it'll need - hopefully less than 64KB total, but obviously an '816 will take to whatever is necessary.

Given that Acorn DFS does very ordinary things with a standard FDC and implements an exceedingly simple filesystem, I'm really not sure what it uses 16KB of ROM for. I haven't immediately found any annotated disassemblies.


PostPosted: Sat Feb 02, 2019 3:49 pm 

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1466
Location: Scotland
Chromatix wrote:
The hardware I have in mind is a dedicated 12MHz '816 with a little bit of specialised logic to interface it to the drive, driven from a 24MHz master clock. I think it needs to be that fast to keep up with decoding EFM as it comes off the read head (through a FIFO, so I don't need to be cycle precise); it could be slower if I buffered it all before beginning the decode and accepted some more delays. Not yet quite sure how much ROM and RAM it'll need - hopefully less than 64KB total, but obviously an '816 will take to whatever is necessary.

Given that Acorn DFS does very ordinary things with a standard FDC and implements an exceedingly simple filesystem, I'm really not sure what it uses 16KB of ROM for. I haven't immediately found any annotated disassemblies.


DFS is under 8K, ADFS is about 12K, but I always wondered why the format program wasn't included... Both of these use a hardware controller chip though. Apple DOS 3.3 is just under 10K in total, but this is everything from reading the disk a bit at a time through the track/sector code to the filesystem and hooks into BASIC.

Given the Apple scenario - I imagine what you're after might be just as easy with a 65C02 @ 12MHz - especially when the Apple II was 1MHz. (If it's part of a bigger '816 system, then that's understandable though)

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Sat Feb 02, 2019 8:19 pm 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Take a look a few posts back and you'll see that the '816, carefully coded, has a significant performance advantage in traversing the EFM decoding state machine - enough to make the difference between decoding it in realtime via a small FIFO, versus needing to implement a *really* big FIFO. So with an '816, the only reason I'd need a big RAM buffer is to support a "forensic read" to a host that can't accept the data as fast as it might arrive. This "forensic read" would allow the host to decode formats the drive firmware doesn't understand, or disks with particularly dodgy quality, or weird copy protection systems that confuse or exploit quirks of a standard FDC.


PostPosted: Tue Feb 05, 2019 5:03 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
One thing about EFM: it succeeds in having a decreased density of edges, but (surely) it demands more precision in the timing of edges. So the flutter of the disk rotation could be an issue. Also, I'd somewhat expect you'd need about twice the resolution in the read time-stamps as you need when writing - I could well be wrong about that. Possibly experiment is needed, or simulation - or maybe nothing more than a whiteboard is needed.

I confess to being quite interested in zoned recording: it's clear that some systems succeeded in putting more transitions around the greater length of the outer tracks. It feels like tweaking the clock rate of the writer and the reader is the easiest and most controllable way to do that, so the floppy continues to spin at a constant rate and no modification is needed to the drive. Would that give some tens of percent of extra density? It is unclear to me how to do this without a PLL, or a very fast upstream clock and some fancy logic.


PostPosted: Tue Feb 05, 2019 5:08 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Oh, and I notice XDF and similar schemes got improved density using mixed sector sizes, so they could have fewer sector boundaries but keep power of two sizes and still use most of the track.
http://www.os2museum.com/wp/the-xdf-diskette-format/

DMF did away with inter-sector gaps, I think.
https://en.wikipedia.org/wiki/Distribution_Media_Format


PostPosted: Tue Feb 05, 2019 5:54 pm 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
With the 24MHz master clock, the minimum flux transition time (for the innermost track) is 48 cycles, and the symbol rate (1.5MHz) is every 16 cycles. I think that's sufficient precision to work with. The suggested pre-compensation value is 125ns, which is 3 cycles - that can be taken as an estimate of how much the magnetic zones spread out naturally after being written.

The maximum transition gap in EFM is 11 symbols, or 176 clocks, which occurs only rarely outside the sync pattern (with 3 merging bits, I could guarantee it never occurs within the data; with only 2 I can only guarantee that two such long gaps won't occur consecutively). The drive specifies a speed accuracy of ±1.5%, which is 2.64 clocks at that length. So a track written at one extreme of speed tolerance and read at the opposite extreme might be wrong by 5 clocks either way on a single gap. This isn't enough, by itself, to cause a misread. Incidentally, the longest MFM gap in double-density is 192 cycles - though the required precision there is only 500kHz.
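
Checking those margins with the same sort of throwaway C (all figures are from the two paragraphs above):
Code:
#include <stdio.h>

int main(void)
{
    double symbol  = 24e6 / 1.5e6;      /* 16 master clocks per symbol */
    double min_gap = 3.0 * symbol;      /* shortest EFM gap: 48 clocks */
    double max_gap = 11.0 * symbol;     /* longest EFM gap: 176 clocks */
    double tol     = 0.015;             /* drive speed accuracy, 1.5% */
    printf("gaps %g-%g clocks; max-gap error %.2f (one end), %.2f (both)\n",
           min_gap, max_gap, max_gap * tol, 2.0 * max_gap * tol);
    /* gaps 48-176 clocks; max-gap error 2.64 (one end), 5.28 (both) */
    return 0;
}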

Zone recording is theoretically possible, and should be easy enough to write to the disk, but would complicate decoding. Why? Because I got down to 20 CPU cycles per decode state transition by assuming I could just quantise the delta-time value by 8 (by masking) and index directly on that. To quantise by other values, I'd have to indirect it through an extra lookup table (non-trivial without adding several cycles), or else accept both a significant increase in state machine size and the need to update the state machine for each zone. Which is not to say it's impossible, just inconvenient. I think it's a complication not obviously justified by the benefits at this stage.

Slightly off-topic: It turns out that one of the earlier "removable disk pack" drives (the RK05) for the PDP-8 and 11 has a nominal capacity of 2400KB, plus several nominally-spare tracks bringing it up to 2436KB. A slightly conservative version of the EFM format, using just 15KB per track, could emulate such a drive on a 3.5" floppy - representing 5 cylinders of the RK05 on 2 cylinders of the floppy - though slightly slower than the real RK05. But the chap on the PiDP-11 list building a scale replica of an RK05 to go with his scale-replica 11/70 console panel had already decided to mount a solid-state USB drive inside a replica disk-pack shell. Eh, purists…


PostPosted: Tue Feb 05, 2019 10:34 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
I was imagining the CPU would be speed-variable too, or at least fast enough to keep up with the fastest variation (of course!) If waiting for a sync costs too many cycles, and if there's already a FIFO to do a bit of rate-buffering, and a counter to provide the timestamps that go into the FIFO, perhaps a little logic to find syncs is justified?

Now, back to encodings - it looks to me like this is the same density as you're getting, but in a simpler code. Is that so?
Quote:
(2,7) RLL is rate-1/2 code, mapping n bits of data onto 2n bits on the disk, like MFM, but because the minimum run length is 50% longer (three bit times instead of two), the bits can be written faster, achieving 50% higher effective data density. The encoding is done in two, three or four bit groups.


Edit: from the same wiki page:
Quote:
Suppose a magnetic tape can contain up to 3,200 flux reversals per inch. A Modified Frequency Modulation or (1,3) RLL encoding stores each data bit as two bits on tape, but since there is guaranteed to be one 0 (non flux reversal) bit between any 1 (flux reversal) bits then it is possible to store 6,400 encoded bits per inch on the tape, or 3,200 data bits per inch. A (1,7) RLL encoding can also store 6,400 encoded bits per inch on the tape, but since it only takes 3 encoded bits to store 2 data bits this is 4,267 data bits per inch. A (2,7) RLL encoding takes 2 encoded bits to store each data bit, but since there is guaranteed to be two 0 bits between any 1 bits then it is possible to store 9,600 encoded bits per inch on the tape, or 4,800 data bits per inch.


PostPosted: Wed Feb 06, 2019 5:10 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Locating the sync pattern is the *easiest* part for the CPU - see the code I posted earlier to do just that. Decoding the actual data is where there might be trouble keeping up. And frankly, introducing a variable oscillator into this would be even more of a complication - easier to rebuild the tables in software, or store multiple tables to cover all zones at once.

Granted, (2,7) RLL looks like a simpler code with the same storage density, but it doesn't have any explicit sync pattern, nor any out-of-band words, and the mapping is not byte-aligned so could be awkward to process with a 65xx CPU. So it's *conceptually* simpler, but might actually slow down the software compared to EFM.


PostPosted: Wed Feb 06, 2019 10:31 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
AFAICT there is/was a convention for two or three kinds of sync or index marks, which as you'd expect are sequences which can't appear in RLL-encoded data - see for example the 36th page of
http://www.bitsavers.org/components/wes ... 60C31B.pdf

For sure I'd be thinking about a CPLD encoder/decoder, not a fully software solution. (But it must be the case that a software solution could be found.)

But understood, of course, if you're not biting on RLL and will stick with EFM!

