Adventures in FAT32 with 65c02

barnacle · Post by **barnacle** » Thu Mar 19, 2026 9:35 am

At the moment, I'm trying to mount an image of the file in loopback, so I can play with C algorithms and see the results immediately in the browser. However, while I can mount it, Linux insists it belongs to root and I'm not allowed to change permissions, which is a bit of a shame.

Neil

barnacle · Post by **barnacle** » Thu Mar 19, 2026 10:16 am

Just for the record: to copy a formatted _complete_ disk image to a file:

Code:

$ sudo dd if=/dev/sda of=cleanfmt bs=4096
125118+0 records in
125118+0 records out
512483328 bytes (512 MB, 489 MiB) copied, 53,8369 s, 9,5 MB/s

That includes all the data on the original disk: the MBR, the partitions (there are two, one large and formatted FAT32, one small Linux-type partition that is unformatted) and the file system on the FAT32 partition.

To mount that FAT32 partition, you have to know where it starts:

Code:

$ fdisk -lu cleanfmt 
Disk cleanfmt: 488,74 MiB, 512483328 bytes, 1000944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xc5980b8b

Device     Boot  Start    End Sectors  Size Id Type
cleanfmt1         2048 960511  958464  468M  b W95 FAT32
cleanfmt2       960512 999423   38912   19M 83 Linux

but fdisk reports the partition start in sectors, whereas mount's offset option wants bytes, so multiply the start sector by the sector size: 2048 * 512 = 1048576.

Then mount the partition in a previously prepared directory:

Code:

$ sudo mount -o loop,offset=1048576 cleanfmt /media/barnacle/t

At which point the _image_ of the disk has been mounted, and with luck I can access the raw data on the disk image while Linux sees the mounted partition.
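The sector-to-byte conversion above can also be done by reading the MBR of the image directly. The following is a minimal sketch, assuming a classic DOS/MBR disklabel (as fdisk reports for this image) with 512-byte sectors; the function and macro names are hypothetical and not part of the code in this thread.

```c
#include <stdint.h>

#define MBR_SECTOR_SIZE  512u   /* bytes per sector, as reported by fdisk */
#define MBR_PART_TABLE   446u   /* offset of the first partition entry in the MBR */
#define MBR_PART_ENTRY   16u    /* size of one partition table entry */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the byte offset of partition `index` (0-3) within the image,
 * or 0 if the index is out of range, the 0x55AA signature is missing,
 * or the slot is empty (start LBA of zero). */
static uint64_t partition_byte_offset(const uint8_t mbr[512], unsigned index)
{
    if (index > 3 || mbr[510] != 0x55 || mbr[511] != 0xAA)
        return 0;
    const uint8_t *entry = mbr + MBR_PART_TABLE + index * MBR_PART_ENTRY;
    uint32_t start_lba = le32(entry + 8);   /* first sector of the partition */
    return (uint64_t)start_lba * MBR_SECTOR_SIZE;
}
```

For the image above, the first entry holds a start LBA of 2048, so the function would return 1048576, the same value passed to mount.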

(cleanfmt is the image file I've mounted. I should keep that as a fast copy/restore and mount something different.)

Neil

SamCoVT · Post by **SamCoVT** » Thu Mar 19, 2026 10:08 pm

I'll recommend dhex as a good terminal-based Linux utility for doing binary diffs. Most distros have it available in their package manager, or you can download it and compile from source (it just needs a C compiler and libncurses-devel) from https://www.dettus.net/dhex/. It's a hex editor, but it has a diff mode if you give it two filenames. The main feature you will probably enjoy is the ability to specify the starting offset on the command line (so you can jump directly to the sectors of interest) using -oh [put hex value here] or -od [put decimal value here] before giving the two file names. I expect this to be useful because you'll be looking in the same spot (once you determine where that is) and could make a shell script to get the image off the CF card and then compare it to your known good filesystem, jumping right to the directory entries.

It's F10 to quit by default, although I had an issue on one of my linux boxes where F10 was already used by the window manager so I remapped that key. It lets you (forces you to) map the keys the first time you run it.

[Image: screenshot of the dhex hex editor showing a binary diff]

barnacle · Post by **barnacle** » Fri Mar 20, 2026 5:19 am

Thanks for the recommendation.

Neil

barnacle · Post by **barnacle** » Sun Mar 22, 2026 12:32 pm

Standards? We've heard of 'em. Currently working in the C domain under linux, which has proven handy to do some refactoring, and with the advantage that I can test code on a file image mounted so the linux file handlers can complain directly when I get it wrong.

Linux time, and the variant FAT32 wants, are similar but not identical.

Code:

void timeanddate (void)
{
	// put the current time and date into global mstime, msdate.
	// FAT32 requires hundredths of a second also, but we will set to zero
	time_t tadr;
	struct tm * tad;

	time (&tadr);					// get the linux epoch seconds count
	tad = localtime(&tadr);			// convert to time and day
	// we need to convert the tad values to ms compressed style.
	mstime = (tad->tm_sec/2) +		// we only count every second second
			(tad->tm_min << 5) +
			(tad->tm_hour << 11);
	msdate = tad->tm_mday +				// range 1-31
			((tad->tm_mon + 1) << 5) + 	// range 0-11 but we need 1-12
			((tad->tm_year - 80) << 9);	// epoch is 1900 but we need epoch 1980
}
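For testing the bit layout in isolation, the same packing can be written as pure functions that take a struct tm rather than writing globals. This is only a sketch; the function names are hypothetical, and the unpack helper is there just to show what is lost by storing seconds divided by two.

```c
#include <stdint.h>
#include <time.h>

/* Pack into the 16-bit FAT time word:
 * bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds divided by two. */
static uint16_t fat_pack_time(const struct tm *t)
{
    return (uint16_t)((t->tm_hour << 11) | (t->tm_min << 5) | (t->tm_sec / 2));
}

/* Pack into the 16-bit FAT date word:
 * bits 15-9 years since 1980, bits 8-5 month (1-12), bits 4-0 day (1-31).
 * struct tm counts years from 1900 and months from 0, hence the adjustments. */
static uint16_t fat_pack_date(const struct tm *t)
{
    return (uint16_t)(((t->tm_year - 80) << 9) |
                      ((t->tm_mon + 1) << 5) |
                      t->tm_mday);
}

/* Recover the seconds from a packed time word; always even. */
static unsigned fat_time_seconds(uint16_t packed)
{
    return (packed & 0x1Fu) * 2u;
}
```

With a struct tm describing 22 March 2026, 12:32:45, these give a date word of 0x5C76 and a time word of 0x6416; unpacking the seconds yields 44, not 45.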

Neil

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sun Mar 22, 2026 2:57 pm

barnacle wrote:

Standards? We've heard of 'em...

You and I and the rest of the 6502.org gang may have heard of ’em, but you’re dealing with a Microsoft thing. That being the situation, the definition of “standards” is subject to interpretation. Back in my military days, we used to joke there was the right way, the wrong way, and the Navy way. Substitute “Microsoft” for “Navy” at your pleasure.

Quote:

Code:

mstime = (tad->tm_sec/2) +      // we only count every second second

Case in point...we don’t need that stupid first second, just the second second. Now, just wait a second...imagine if this were music...

barnacle · Post by **barnacle** » Sun Mar 22, 2026 4:44 pm

It gets better. We _note_ that first second, as hundredths of a second counting 0-199, in a variable named by MS 'tenths'.
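As a sketch of how that field could be filled, assuming (as I understand the FAT specification) that it counts 10 ms units across the two-second window of the packed time word; the function name is hypothetical.

```c
#include <stdint.h>

/* Fine-resolution creation time: 10 ms units in the range 0-199.
 * `sec` is the wall-clock second (0-59) and `hundredths` is 0-99.
 * An odd second contributes 100, because the packed time word has
 * already discarded the low bit of the seconds. */
static uint8_t fat_crt_time_tenth(unsigned sec, unsigned hundredths)
{
    return (uint8_t)((sec & 1u) * 100u + hundredths);
}
```

The timeanddate() routine above sets this field to zero, which is valid but means an odd creation second reads back as the even second before it.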

The pellet with the poison's in the flagon with the dragon, the vessel with the pestle has the brew that is true.

So, after today I can create new files in my linux/c dev environment. There is either a delay, or a need to unmount and remount the image, before new files show up in the file manager, and likewise before changes made from the Linux side (e.g. writing to the new file through the mounted filesystem) show up in the raw image file. But it gets there, and Linux is sufficiently happy to open said file and let me edit the contents.

We progress.

Neil

barnacle · Post by **barnacle** » Tue Mar 24, 2026 6:16 am

In Which Our Hero Discovers His Secret Name May Be Claude

While looking into the issue around creating files and directories, a couple of issues raised their crocodilian snouts from the murky Slough of Despond, and started to look at our hero with hungry eyes.

Crocodile One: Allocating New Clusters
There is only one reason to allocate a new cluster: because you ran out of space using the previous one. If you're going to use it to e.g. append to a file, you're going to overwrite what might already be in it (effectively random data from the CF card; formatting doesn't clear the data, only the metadata). If it's because you need it for a new directory, then it _must_ be cleared to zero (by specification). If you need it because you ran out of space in a directory, and need to add another cluster, again, it needs to be zero.

Rather than scattering the zeroing code around the place, I've decided that it should live in the allocation routine as the last thing it does. This keeps things a lot tidier (and simplifies things when I return to assembly) but does have the disadvantage that every time a cluster is allocated there is a cluster-full of data written, which will slow down e.g. file transfers. If that becomes an issue, I'll think about it later; perhaps a flag to turn it off if required. But for now, a new cluster is always delivered zeroed.
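A minimal sketch of "allocate, then zero as the last step", using a toy in-memory volume rather than real sector I/O. Everything here (toy_volume, toy_alloc_cluster, the sizes) is hypothetical and only illustrates the idea; it is not the f_alloc() used later in the thread.

```c
#include <stdint.h>
#include <string.h>

#define CLUSTER_BYTES 4096u        /* assumption: 8 sectors of 512 bytes */
#define N_CLUSTERS    16u          /* tiny toy volume */
#define FAT_MASK      0x0FFFFFFFu  /* low 28 bits carry the FAT32 entry value */
#define FAT_EOC       0x0FFFFFFFu  /* end-of-chain marker */

typedef struct {
    uint32_t fat[N_CLUSTERS];                  /* entries 0 and 1 are reserved */
    uint8_t  data[N_CLUSTERS][CLUSTER_BYTES];  /* data[n] stands in for cluster n */
} toy_volume;

/* Find the first free cluster, mark it as end of chain, zero its data
 * and return its number. Returns 0 if no free cluster exists. */
static uint32_t toy_alloc_cluster(toy_volume *v)
{
    for (uint32_t c = 2; c < N_CLUSTERS; c++) {
        if ((v->fat[c] & FAT_MASK) == 0) {          /* 0 means free */
            v->fat[c] = FAT_EOC;
            memset(v->data[c], 0, CLUSTER_BYTES);   /* zeroing is the last step */
            return c;
        }
    }
    return 0;
}
```

The cost mentioned above is visible in the memset: on real hardware that is one sector write per sector in the cluster, for every allocation.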

Crocodile Two: Adding New Files and Directories
This is the one that's currently hurting my head as I search for an efficient way to do it...

When performing many file-related tasks you need to search a directory, to find out whether a file exists or not, and where its metadata lives. To manage this I use fs_find_first() and fs_find_next(). fs_find_first() is only setting the starting conditions and immediately calls fs_find_next() which does all the heavy lifting.

Directories in FAT32 are just files. When they're created (except for /) they contain two fake files '.' and '..' as the first two entries, and thereafter the cluster they occupy is zeroed. The purpose of fs_find_next() is to return a pointer into the cluster, either to the metadata of a file matching the requested name, or to an empty (all zero) metadata record. It ignores files that are marked as deleted, and ignores records that hold long file name data. If it gets to the end of a sector, it loads the next sector, possibly moving to a subsequent cluster. On its return, transient contains the sector containing the record, and pointers to the record are returned.

fs_find_file() uses these two to search for a named file, by sequentially comparing valid names either for a match or for the empty record.

And that's fine. The empty record is by specification the last entry in the directory. If you ask for a file and get a pointer to an empty record, the file didn't exist. This means you're pointing at the place where you might create a new file or directory (or simply that there are no more files to e.g. list in a directory listing).
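The search rules described above can be sketched against a directory held entirely in memory. This is an illustration only: dir_scan and its constants are hypothetical names, and the real routines work a sector at a time through transient rather than on one flat buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE    32u     /* bytes per directory record */
#define ATTR_OFFSET   11u     /* attribute byte within a record */
#define ATTR_LFN      0x0Fu   /* marks a long-file-name record */
#define NAME_DELETED  0xE5u   /* first name byte of a deleted record */
#define NAME_END      0x00u   /* first name byte of the empty (final) record */

/* Scan `count` records for the 11-byte 8.3 name `name`.
 * Returns the index of the matching record with *found = 1, or the index
 * of the first empty record with *found = 0. Returns -1 if the buffer
 * ends before either is seen. */
static int dir_scan(const uint8_t *dir, size_t count,
                    const char *name, int *found)
{
    *found = 0;
    for (size_t i = 0; i < count; i++) {
        const uint8_t *e = dir + i * ENTRY_SIZE;
        if (e[0] == NAME_END)
            return (int)i;                        /* end of directory */
        if (e[0] == NAME_DELETED)
            continue;                             /* skip deleted records */
        if ((e[ATTR_OFFSET] & 0x3Fu) == ATTR_LFN)
            continue;                             /* skip long-name records */
        if (memcmp(e, name, 11) == 0) {
            *found = 1;
            return (int)i;
        }
    }
    return -1;
}
```

The -1 return is the awkward case: the records ran out without either a match or an empty record turning up.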

But...

What if the last record is (a) not the file you're looking for; (b) not the zero record; and (c) the last record in the cluster chain? Where does the empty record - which must be returned - come from? Obviously it has to be part of a new cluster, which must be allocated, but where?

fs_find_file() neither knows nor cares about the directory cluster. It just keeps asking fs_find_next() for the next record. But fs_find_next() can't just allocate a new cluster if it gets to the end of the cluster, either; you wouldn't want, for example, to expand a file just because you tried to read it...

next_cluster_number knows how to find and return the number of the next cluster, but it requires multiple reads of the FAT sectors; something I'd rather not do on efficiency grounds.

The issue is with FAT: the FAT itself only records which clusters are in use and how they link together, and files don't know where their clusters are, beyond the first. I'm sure there's a way - I'm looking at using a structure to hold the cluster number (along with other essentials) and making a decision somewhere using that, but at the moment, all the magic eight-ball is saying is 'Reply hazy; try again later'.
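To make the cost of following a chain concrete, here is a sketch of the arithmetic for locating one FAT32 entry and interpreting it. It assumes 512-byte sectors and 4-byte FAT entries; the names are hypothetical, and returning 0 at the end of a chain is simply a convention chosen for this example.

```c
#include <stdint.h>

#define BYTES_PER_SECTOR 512u
#define FAT_ENTRY_MASK   0x0FFFFFFFu  /* top four bits of an entry are reserved */
#define FAT_EOC_MIN      0x0FFFFFF8u  /* values at or above this end the chain */

/* Which sector of the FAT (counted from the start of the FAT)
 * holds the entry for `cluster`? */
static uint32_t fat_sector_for(uint32_t cluster)
{
    return (cluster * 4u) / BYTES_PER_SECTOR;
}

/* Byte offset of that entry within its sector. */
static uint32_t fat_offset_for(uint32_t cluster)
{
    return (cluster * 4u) % BYTES_PER_SECTOR;
}

/* Decode the four little-endian bytes of an entry into the next cluster
 * number, or 0 if this entry marks the end of the chain. Bad-cluster and
 * free markers are not treated specially in this sketch. */
static uint32_t fat_next_from_entry(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                  ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24)) & FAT_ENTRY_MASK;
    return (v >= FAT_EOC_MIN) ? 0 : v;
}
```

Each step along a chain may land in a different FAT sector (128 entries fit in one 512-byte sector), which is why walking a chain can mean one sector read per link.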

Neil

barnacle · Post by **barnacle** » Tue Mar 24, 2026 9:02 pm

I believe this may hunt down a crocodile or two... I still need to test it, and before that, think about _how_ to set things up with that damn edge case.

Here's the idea: I have a DIR_ENTRY structure (which will probably be fixed ZP locations when I convert to assembly).

Code:

typedef struct {
	uint32_t	sector;			// lba address of the sector being scanned
	uint16_t	entry;			// byte offset into transient of the entry
	uint8_t		sec_in_cluster;	// which sector in this cluster are we in?
	uint32_t	cluster;		// NEXT cluster in the chain, or 0 if this is the last
} DIR_ENTRY;

This holds sufficient information to keep track of what's going on between successive calls to fs_find_next. As the directory is a simple file, it has the same structure: incrementing sectors in the cluster pool to the size of one cluster, and then if required, subsequent clusters linked through the FAT.

When I search for a file (either to use the file or to prove it doesn't exist so I can create it) I first call fs_find_first() to initialise the parameters in the DIR_ENTRY de. There's a bit of a sneaky there in that I set the pointer to the minus first (virtual) entry, because fs_find_next() automatically increments that, first thing.

Critically, we get the _next_ cluster number as we set these variables. That's disk heavy since it needs to reference and load the FAT sectors, so we don't want to make a habit of it, but it returns either the next cluster number, or a zero if we're currently in the last cluster of the chain (likely on a short directory).

Code:

void fs_find_first (DIR_ENTRY * de)
{
	// find the first entry in the cwd which is neither deleted nor a long file
	// name, by presetting de and then calling fs_find_next.
	// works in the current cwd; sets de members on exit

	// note: we cannot simply return the first file; it will always be there
	// for a subdirectory (as '.') but in root may have been deleted, in which
	// case we should scan past it.
	// as fs_find_next starts by moving to the 'next' record, we fake the minus
	// oneth entry and then call fs_find_next to do the check.

	de->sector = cluster_to_1st_sector(cwd);
	de->entry = (uint16_t)-32;		// wraps to 65504; the first += 32 brings it to 0
	de->sec_in_cluster = 0;
	de->cluster = next_cluster_number(cwd);	// next link in the chain, 0 if none
	fs_find_next(de);
}
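The listing above relies on a cluster-to-sector conversion that isn't shown. A sketch of the usual FAT32 arithmetic follows, assuming the geometry has already been read from the boot sector; fat32_geometry and the function names are hypothetical and may not match the real cluster_to_1st_sector().

```c
#include <stdint.h>

typedef struct {
    uint32_t partition_start;      /* LBA of the volume's boot sector */
    uint16_t reserved_sectors;     /* reserved sector count from the BPB */
    uint8_t  fat_count;            /* number of FAT copies, normally 2 */
    uint32_t sectors_per_fat;      /* size of one FAT in sectors */
    uint8_t  sectors_per_cluster;  /* cluster size in sectors */
} fat32_geometry;

/* LBA of the first sector of the data area, i.e. of cluster 2. */
static uint32_t first_data_sector(const fat32_geometry *g)
{
    return g->partition_start + g->reserved_sectors +
           (uint32_t)g->fat_count * g->sectors_per_fat;
}

/* LBA of the first sector of `cluster`. Cluster numbering starts at 2,
 * so two is subtracted before scaling by the cluster size. */
static uint32_t cluster_first_sector(const fat32_geometry *g, uint32_t cluster)
{
    return first_data_sector(g) + (cluster - 2u) * g->sectors_per_cluster;
}
```

With made-up geometry of partition start 2048, 32 reserved sectors, two FATs of 935 sectors and 8 sectors per cluster, cluster 2 starts at LBA 3950 and cluster 3 at LBA 3958.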

With that in place, and remembering that de is preserved between calls, we find the next record entry just by adding the record size (32) to de->entry. On the first call, that sets the pointer to zero, so we start with the first record. That handles the case that the root directory contains no '.' or '..' directories, and may have a valid filename starting at the first entry.
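The "minus first entry" trick depends on 16-bit unsigned wraparound, which C guarantees for unsigned types. A tiny sketch (step_entry is a hypothetical helper, not part of the thread's code):

```c
#include <stdint.h>

#define DIR_ENTRY_SIZE 32u

/* Advance an entry offset by one record, storing the result back into
 * 16 bits just as `de->entry += 32` does. */
static uint16_t step_entry(uint16_t entry)
{
    return (uint16_t)(entry + DIR_ENTRY_SIZE);
}
```

Storing -32 in a uint16_t gives 65504, and stepping that by 32 wraps to 0, so the first record examined is the one at offset zero.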

If that addition takes us past the end of the sector currently in transient, then we can simply increment the sector LBA until we get to the end of the cluster. At that point, we need to move to the next sector, which is in a different cluster. de->cluster will tell us whether there are more clusters in the chain (if it's non zero, in which case we can use it directly) or if it _is_ zero... we've arrived at the situation in the previous post: we're on the last record of the last cluster, and we need to allocate a new cluster.

Code:

void fs_find_next (DIR_ENTRY * de)
{
	// using existing data in de, alter de to reflect the next directory entry
	// which is neither deleted nor a long file name record. On return de
	// points either at a valid file, or at the first empty entry found (so
	// first byte of filename is 0x00)
	// Advances through successive sectors in a directory cluster, and if
	// necessary to subsequent clusters.
	// NOTE: we enter with de pointing at the last record found.

	read_sector(de->sector);
	while (1)
	{
		// move to next directory entry
		de->entry += 32;		// size of entry
		// increment the sector and cluster if necessary
		if (SECTOR_SIZE == de->entry)
		{
			// time for the next sector
			de->entry = 0;
			de->sector++;
			de->sec_in_cluster++;
			if (de->sec_in_cluster == sectors_per_cluster)
			{
				// we also need a new cluster
				de->sec_in_cluster = 0;
				if (0 != de->cluster)
				{
					// we are not looking at the end of the chain yet
					// so it's safe to use this next link
					de->sector = cluster_to_1st_sector(de->cluster);
					de->cluster = next_cluster_number(de->cluster);
				}
				else
				{
					// end of the chain: we need to allocate a new cluster
					// NOTE: the new cluster is not yet linked into the
					// directory's FAT chain (see the edit note below)
					de->sector = cluster_to_1st_sector(f_alloc());
					// this may be redundant since we know we're about to
					// crash out on the new zero entry, but still...
					de->cluster = 0;
				}
			}
			read_sector(de->sector);
		}
		// skip long file name records
		if (ATTR_LONG == transient[de->entry + DIR_ATTRIB])
		{
			continue;
		}
		// skip deleted entries
		if (0xe5 == transient[de->entry])
		{
			continue;
		}
		// anything else is either a valid entry or the zero record that
		// marks the end of the directory; either way de now points at it
		break;
	}
}

Hopefully this works. The disk-heavy FAT lookup only happens once per cluster, and inside the loop a new sector is only read when the entry pointer runs off the end of the current one, not for every record examined.

fs_find_first() and fs_find_next() are not intended as general purpose user-called routines, though they may be used if the user takes care not to demolish transient between calls.

Neil

edit: oops, forgot to link the new sector into the directory chain. Thinking...
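For reference, the missing linking step amounts to two FAT writes. The sketch below works on a toy in-memory FAT; the names are hypothetical and this is not necessarily how the fix will be done in the real code. It needs the number of the chain's current last cluster, which the DIR_ENTRY above does not keep, since de->cluster holds the next link rather than the current one.

```c
#include <stdint.h>

#define TOY_MASK    0x0FFFFFFFu   /* low 28 bits carry the entry value */
#define TOY_EOC     0x0FFFFFFFu   /* end-of-chain marker */
#define TOY_EOC_MIN 0x0FFFFFF8u   /* values at or above this end the chain */

/* Walk from `first` to the last cluster of its chain.
 * Assumes a well-formed chain that ends in an end-of-chain marker. */
static uint32_t toy_chain_tail(const uint32_t *fat, uint32_t first)
{
    uint32_t c = first;
    while ((fat[c] & TOY_MASK) < TOY_EOC_MIN)
        c = fat[c] & TOY_MASK;
    return c;
}

/* Append the cluster `fresh` to the chain whose last cluster is `tail`.
 * The reserved top four bits of each entry are preserved. */
static void toy_chain_append(uint32_t *fat, uint32_t tail, uint32_t fresh)
{
    fat[fresh] = (fat[fresh] & ~TOY_MASK) | TOY_EOC;  /* new end of chain */
    fat[tail]  = (fat[tail]  & ~TOY_MASK) | fresh;    /* old tail points on */
}
```

On a real volume each of those two assignments is a read-modify-write of a FAT sector, and the second FAT copy would normally be updated as well.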
