Adventures in FAT32 with 65c02

Programming the 6502 microprocessor and its relatives in assembly and other languages.
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

A minor bug meant that my code for fs_find_next failed to ignore a deleted file:

Code: Select all

$0240  E5 4E 54 49 54 4C 7E 31 20 20 20 10 00 3E 08 60 .NTITL~1   ..>.`       
It's now fixed...

Neil
jgharston
Posts: 181
Joined: 22 Feb 2004

Re: Adventures in FAT32 with 65c02

Post by jgharston »

barnacle wrote:
[*] crt_tenth - Microsoft's specification says: Millisecond stamp at file creation time. This field actually contains a count of tenths of a second. The granularity of the seconds part of DIR_CrtTime is 2 seconds so this field is a count of tenths of a second and its valid value range is 0-199 inclusive. which makes no sense at all. I _think_ it contains hundredths of a second. I ignore it.
It makes sense if you take it along with what you say later....
Quote:
Time has a problem; to get HMS it needs seventeen bits, so it reduces the resolution of the seconds count: bits 15-11 are the hours 0-23; bits 10-5 are the minutes 0-59, and bits 4-0 are two-second counts.
The TIME field counts in jumps of two seconds. The FINETIME field is the centisecond count between those 2-second jumps. So, 14:30:29.00 would be stored as TIME=14:30:28 and FINETIME=100. 14:30:29.95 would be stored as TIME=14:30:28 and FINETIME=195.
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

Oh, yeah, no problem with the actual count: it's hundredths of (two) seconds. The conflict is between 'milliseconds' and 'tenths of a second'... Not yet an issue; I have no clock, but I must tick :mrgreen:

Neil
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

A quick DIR example
Given the basic routines discussed above to navigate the file structure, it's remarkably easy to list the contents of a directory, even when the directory is over one cluster in size, as my test is.

Code: Select all

Neolithic Compact Flash 
                                        
ECU.C                                                                           
[TEST]                                                                          
EEPROM.C                                                                        
GRAPHICS.C                                                                      
I18N.C                                                                          
MAIN.C                                                                          
SERIAL.C                                                                        
SSD1309.C                                                                       
SYSMEM.C                                                                        
FAT_FI~1.PDF                                                                    
SLIDES~1.PDF                                                                    
UNIT10~1.PDF                                                                    
SLIDES~2.PDF                                                                    
UNIT10~2.PDF                                                                    
SRAM_B~1.PDF                                                                    
CF_44_~1.PDF                                                                    
CF_44_~2.PDF                                                                    
6502_C~1.PDF                                                                    
CF_BOA~1.PDF                                                                    
6502AN~1.PDF                                                                    
NEON65~1.ASM                                                                    
IDE.PDF                                                                         
FAT32F~1.PDF                                                                    
F.PDF                                                                           
HEXCON~1.PDF                                                                    
1.PDF                                                                           
6502_C~2.PDF                                                                    
Z80-DO~1.PDF                                                                    
W65C22.PDF                                                                      
IS62C256.PDF                                                                    
SYNERT~1.PDF                                                                    
SN74HC~1.PDF                                                                    
VIVALD~1.DEB                                                                    
814049.PDF                                                                      
APX823~1.PDF                                                                    
BU48XX~1.PDF                                                                    
DATASH~1.PDF                                                                    
DATASH~2.PDF                                                                    
6502-K~1.ZIP                                                                    
101DSE~1.PDF                                                                    
101D-T~1.PDF                                                                    
FCI-10~1.PDF                                                                    
101DSE~2.PDF                                                                    
NEWSLE~1.PDF                                                                    
ZG6502~1.PDF                                                                    
HIROSE~1.PDF                                                                    
693120~1.PDF                                                                    
JAEIS0~1.PDF                                                                    
MI20-2~1.PDF                                                                    
TS2083.PDF                                                                      
C20415~1.PDF                                                                    
101D-T~2.PDF                                                                    
CFS.PDF 
As you can see, this demonstrates the classic MSDOS fingerprint you get if you follow MS's procedures to save a long-file-name file, which end by generating an abbreviated 8.3 compliant filename and result in things like

Code: Select all

DATASH~1.PDF                                                                    
DATASH~2.PDF
This is not a problem (though it would be a pain if we were dealing with long file names); I shall stick with 8.3 filenames such as those shorter names at the head of the list.

It turns out that the same code I previously used to list the first line can be used with only a change to the output display routine:

Code: Select all

	LYA cluster_begin_lba
	jsr u32_tolba
	jsr fs_find_first
dir:
		lda (fs_dir_ptr)	; check the first character of name
		beq done			; quit if it's zero
;		lda fs_dir_ptr
;		sta put_ptr
;		lda fs_dir_ptr+1
;		sta put_ptr+1
;		jsr putmemline		; show the first half of the record
		LYA string
		jsr fname_to_str
		jsr crlf
		jsr fs_find_next
		bra dir				; until no more records
Ten lines of code...

Admittedly, outputting the file name in a readable format is a bit trickier.

Code: Select all

;-----------------------------------------------------------------------
; fname_to_str
;
; convert the 8.3 filename at [fs_dir_ptr] to a zero terminated string
; at [Y:A]
; Enclose a directory entry with [ ]
fname_to_str:
	sta put_ptr
	sty put_ptr+1			; borrow put_ptr
	ldy #0
	lda #'-'
fs_fts_00:
	; is it a directory entry?
	ldy #dir_attrib
	lda (fs_dir_ptr),y
	ldy #dir_name
	and #attr_dir
	beq fs_fts_01
	lda #'['
	sta (put_ptr);			; add [ to the string beginning
	jsr inc_put_ptr
fs_fts_01:
	lda (fs_dir_ptr),y	
	cmp #' '				; is it a space?
	beq fs_fts_03
	sta (put_ptr)			; no, copy it to our string
	jsr inc_put_ptr
fs_fts_02:
	iny						; and the read pointer
	cpy #8					; have we done eight characters?
	bne fs_fts_01
fs_fts_03:
	; reset y to point to extension
	ldy #8
	lda (fs_dir_ptr),y
	cmp #' '				; does it start with a space? Then no ext.
	beq fs_fts_10
	lda #'.'
	sta (put_ptr)
	jsr inc_put_ptr
	; now the same trick for the extension
fs_fts_05:
	lda (fs_dir_ptr),y
	cmp #' '
	beq fs_fts_10			; until we get to a space
	sta (put_ptr)
	jsr inc_put_ptr
fs_fts_06:
	iny
	cpy #11
	bne fs_fts_05
fs_fts_10:
	; was it a directory entry?
	ldy #dir_attrib
	lda (fs_dir_ptr),y
	and #attr_dir
	beq fs_fts_x
fs_fts_11:
	lda #']'
	sta (put_ptr)
	jsr inc_put_ptr
fs_fts_x:
	; terminate the string
	lda #0
	sta (put_ptr)
	LYA string
	jsr puts
	rts

inc_put_ptr:
	inc put_ptr
	bne ipp_x
		inc put_ptr+1
ipp_x:
	rts
This code builds a string with the filename, removing unneeded spaces and inserting a period if there is a prefix. If the entry is a directory, it is surrounded by '[ ]' (see TEST for example). I ended up incrementing put_ptr so often it saved space to make it a subroutine; I may include that in my bios.asm as a permanent object.

This is the listing of the root directory. We can see that by looking how it was called:

Code: Select all

	LYA cluster_begin_lba
	jsr u32_tolba
	jsr fs_find_first
The first two lines put the address of the first data cluster into lba; as we've seen, that's always the first cluster of the root directory. If we wanted to have a look in 'TEST' we would need to
  • search the root directory until we find the TEST directory. That would require (in general) a routine to convert a string 'test' to a valid 8.3 filename, and then a routine to match that filename against the currently observed directory entry, moving to successive entries (so fs_find_next) until we find a match.
  • returning with pointers to the desired entry in the transient area, read the cluster high and low words at offsets 20 and 26, and put them in lba
  • then start with fs_find_first
It's worth remembering that there is nothing explicit in a directory sector that says it is a directory, other than the marker in its directory listing of its parent, or (for the root) its position. If you try and use fs_find_first and fs_find_next when they aren't looking at a directory, you'll get lots of 'files' returned right up to the point that the file data happens to contain a zero in the first byte of the 'name'. This caused me some confusion...

Microsoft's warnings
As I've previously mentioned, the directory files are unordered. The file data includes not only long file names (which we don't care about) but also potentially many deleted files. When files are deleted on FAT systems, the space they occupy in the clusters remains unchanged until such a time that there are no more unused clusters. At that point, previously occupied file space is overwritten, but until that time it - and its directory entry - remains available for recovery.

But the lack of order is a potential issue. You can only search linearly for a file; you must check every entry to see if you've found it. A binary search, for example, is impossible.

So MS recommend limiting the size of a directory. They do not give a recommended size - though there is a hard maximum of 64k files per directory. I'd probably recommend keeping it to as few files and directories as possible, ideally a few tens or dozens.

Neil
teamtempest
Posts: 443
Joined: 08 Nov 2009
Location: Minnesota
Contact:

Re: Adventures in FAT32 with 65c02

Post by teamtempest »

Quote:
I ended up incrementing put_ptr so often it saved space to make it a subroutine; I may include that in my bios.asm as a permanent object
You've probably already noticed, but you can go a little further with this and store your output char as well:

Code: Select all

file_char_out:
  sta (put_ptr)
  inc put_ptr
  bne fcp1:
  inc put_ptr+1
fcp1:
  rts
If you used this for every filename character, it would do a useless increment after the last character. At least its use would be consistent for every character.

You could also add an entry point just before the start of this routine and load the accumulator with any particularly common character and fall through. Perhaps 'file_name_terminate' would load the accumulator with zero first. This wouldn't save a whole lot of space but would make pretty clear when reading the source code what was going on.
teamtempest
Posts: 443
Joined: 08 Nov 2009
Location: Minnesota
Contact:

Re: Adventures in FAT32 with 65c02

Post by teamtempest »

There's a lot in favor of "first get it to work", but as long as I'm in a "premature optimization" mood, in for a penny...

Code: Select all

check_directory:
  ldy #dir_attrib
  lda (fs_dir_ptr),y
  and #attr_dir
  beq fcp1:
  txa
file_char_out:
  sta (put_ptr)
  inc put_ptr
  bne fcp1:
  inc put_ptr+1
fcp1:
  rts
Then in the main line code replace the directory check code with:

Code: Select all

  ldx #'['
  jsr check_directory
  ldy #dir_name
to start and

Code: Select all

  ldx #']'
  jsr check_directory
to end.
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: Adventures in FAT32 with 65c02

Post by BigDumbDinosaur »

barnacle wrote:
As I've previously mentioned, the directory files are unordered. The file data includes not only long file names (which we don't care about) but also potentially many deleted files.
That is not unique to FAT filesystems.

Prior to the development of the ext3 filesystem as used with Linux and others, UNIX-like directory structures were analogous to what you are encountering.  Directory entries generally occur in the order in which they were created, hence they may be thought of as being “sorted” in chronological order.  Therefore, a search for a file is linear and the bigger the directory, the lengthier the search.

To some extent, the performance hits are mitigated by the fact that a UNIX filesystem directory entry is merely a filename paired with a binary, non-zero inode number—information about the file itself, e.g., its type, is stored in the inode, not the directory.  Prior to ext2, filenames are of a fixed length, padded if necessary with nulls.  The inode number precedes the filename, making it easy to test for zero and skip a deleted file.  So there is less grunt work to search the directory.  Even so, once a directory exceeds around 600 entries, search time markedly increases due to indirection in the filesystem.

Starting with the ext3 filesystem, it is possible to arrange directories into an htree structure, which is a mash-up of a B-tree and hashing—the choice is made during filesystem creation.  Using the htree structure, a search is significantly faster, even in a directory with thousands of filenames.

Quote:
When files are deleted on FAT systems, the space they occupy in the clusters remains unchanged until such a time that there are no more unused clusters. At that point, previously occupied file space is overwritten, but until that time it - and its directory entry - remains available for recovery.
That too differs from what is done in the UNIX/Linux world.  When a file is deleted, the inode number in its directory entry is set to zero, indicating the entry is of a deleted file, and the link count in the inode is decremented.  If the link count reaches zero, the inode will be zeroed out, and both it and the data blocks that were allocated to the file will be immediately released back to the filesystem for reuse.  As the UNIX/Linux environment is fully multitasking, odds are good the released inode and at least one data block will be re-allocated in short order, making the deleted file unrecoverable.
Quote:
So MS recommend limiting the size of a directory. They do not give a recommended size - though there is a hard maximum of 64k files per directory. I'd probably recommend keeping it to as few files and directories as possible, ideally a few tens or dozens.
64K directories entries would make for a painfully slow search if the desired file is at the far end of the directory, especially when you consider the monkey-motion required to navigate the FAT itself to find the relevant clusters.

In the UNIX world, there is theoretically no limit to how many entries can be in a directory.  The practical limit comes only when you run out of inodes.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

Our Hero finds the treasure!
Having walked the path parlous, we left our hero looking a little confusedly at the map to where the treasure was, being now able to decode it to find _all_ the treasures. But we haven't quite got him to any particular treasure yet, so it's probably time we did.

This is really a small diversion on our journey, because the routines here can be partly considered as file system owned - how to move around and access the file system - and partly belonging to the operating system: how to get, for example, a directory listing, or how to display the contents of a file. I haven't yet made final decisions as to which bits fit where.

So, we've established that we can use fs_find_first and fs_find_next in a simple loop to get sequential file names from the directory. Each time we stop, we are presented with an offset into the transient area - fs_dir_ptr - which points to the first character of a directory record. If that character isn't $05, $e5, or $00 then we are looking at a valid file entry, and remember that fs_find_next knows about deleted files and doesn't trouble us about them. So either we're looking at a valid file, or we've reached the end and there are no more files.

What we don't have is a way to isolate one of them; a search function. That's not actually very difficult, but we do need to note that the filenames in the directory entry are in 8.3 format, and we're probably going to want to start with a string of some kind - this is what I meant about the split between filesystem and OS files. So, as a temporary measure, let's build something to turn a simple zero terminated string into an 8.3 string.

Code: Select all

	bss
str_83:		ds 11		; eleven bytes in the finshed string

	code
str_to_83:
	sta put_ptr
	sty put_ptr+1
	; first, clear the string to spaces
	ldy #0
	lda #' '
str_t_01:
	sta str_83,y
	iny
	cpy #11
	bne str_t_01
	; now start copying
	ldy #0
str_t_02:
	lda (put_ptr)
	cmp #0				; end of string?
	beq str_t_x			; we're done
	cmp #'.'
	beq str_t_03		; done with the name if we find a period
	jsr toupper
	sta str_83,y
	jsr inc_put_ptr
	iny
	cpy #8				; or if we get to eight characters
	bne str_t_02
str_t_03:
	; either we had eight chars copies, or we found a '.'
	; in either case if the filename is legal, we're looking at a '.'
	jsr inc_put_ptr		; so skip it (even if it isn't!)
	ldy #8				; just in case, point to ext part of 8.3
str_t_04:	
	lda (put_ptr)
	cmp #0
	beq str_t_x
	jsr toupper
	sta str_83,y
	jsr inc_put_ptr
	iny
	cpy #11				; finished?
	bne str_t_04
str_t_x:
	rts					; it's a data block, not a string, so no term.
Note that this is not a particularly robust bit of code; it doesn't check for illegal characters, or names that are too long, or periods in the wrong place: it reads from the input string and writes to the output string after making the character uppercase, and if it hasn't found a '.' or a zero by character 9, it assumes that it's really a '.' and skips over it without checking. It then copies the remaining three characters, if they're there.
There are many ways for this to fail but if the input string is sane, it works.

Our example filename - known to be in the directory - is

Code: Select all

file_4:		db "graphics.c",0
We convert it to 8.3 format in a reserved string area:

Code: Select all

	LYA file_4
	jsr str_to_83			; convert the file string to 8.3 format
Then we run the same find_first/find_next loop as we did for the directory listing:

Code: Select all

	LYA cluster_begin_lba
	jsr u32_tolba
	jsr fs_find_first		; so find the directory listing
find_1:
		lda (fs_dir_ptr)	; check the first character of name
		beq find_x			; quit if it's zero
		jsr fs_match_83
		beq find_2			; found it!
		jsr fs_find_next
		bra find_1
find_2:
and instead of pretty-printing the name, we let fs_match_83 worry at it. That returns the zero flag set if the two strings match (that is, the converted output of str_to_83, and the 8.3 format filename at (fs_dir_ptr)).
If we found a match, then we can directly copy the cluster number of the start of the file to lba:

Code: Select all

	ldy # dir_frstcllo
	lda (fs_dir_ptr),y
	sta lba
	iny
	lda (fs_dir_ptr),y
	sta lba+1
	ldy # dir_frstcluhi
	lda (fs_dir_ptr),y
	sta lba+2
	iny
	lda (fs_dir_ptr),y
	sta lba+3
and from there it is but the work of a moment to get the sector from the cluster number, load that sector, and view it.

Code: Select all

	jsr clus_to_1st_sec
	jsr cf_set_lba
	jsr cf_read					; see the file
	SHOWTRANS
which displays the contents of the first sector of the file. As you can see, that's part of a genuine C file!

Code: Select all

0200  2F 2A 0A 20 2A 20 67 72 61 70 68 69 63 73 2E 63 /*. * graphics.c          
0210  0A 20 2A 0A 20 2A 20 20 43 72 65 61 74 65 64 20 . *. *  Created           
0220  6F 6E 3A 20 44 65 63 20 32 30 2C 20 32 30 32 30 on: Dec 20, 2020          
0230  0A 20 2A 20 20 20 20 20 20 41 75 74 68 6F 72 3A . *      Author:          
0240  20 62 61 72 6E 61 63 6C 65 0A 20 2A 0A 20 2A 20  barnacle. *. *           
0250  43 6F 6E 74 61 69 6E 73 20 74 68 65 20 66 6F 75 Contains the fou          
0260  72 20 73 74 61 72 74 2D 75 70 20 69 6D 61 67 65 r start-up image          
0270  20 6D 61 70 73 20 61 6E 64 20 74 68 65 20 74 65  maps and the te          
0280  78 74 20 66 6F 6E 74 0A 20 2A 2F 0A 0A 23 69 6E xt font. */..#in          
0290  63 6C 75 64 65 20 3C 73 73 64 31 33 30 39 2E 68 clude <ssd1309.h          
02A0  3E 0A 23 69 6E 63 6C 75 64 65 20 3C 69 31 38 6E >.#include <i18n          
02B0  2E 68 3E 0A 23 69 6E 63 6C 75 64 65 20 3C 67 72 .h>.#include <gr          
02C0  61 70 68 69 63 73 2E 68 3E 0A 23 69 6E 63 6C 75 aphics.h>.#inclu          
02D0  64 65 20 3C 65 6E 75 6D 73 2E 68 3E 0A 23 69 6E de <enums.h>.#in          
02E0  63 6C 75 64 65 20 3C 65 65 70 72 6F 6D 2E 68 3E clude <eeprom.h>          
02F0  0A 23 69 6E 63 6C 75 64 65 20 3C 73 74 64 69 6F .#include <stdio          
0300  2E 68 3E 0A 23 69 6E 63 6C 75 64 65 20 3C 73 74 .h>.#include <st          
0310  64 62 6F 6F 6C 2E 68 3E 0A 23 69 6E 63 6C 75 64 dbool.h>.#includ          
0320  65 20 3C 6D 61 69 6E 2E 68 3E 0A 0A 63 6F 6E 73 e <main.h>..cons          
0330  74 20 75 69 6E 74 38 5F 74 20 69 6D 61 67 65 32 t uint8_t image2          
0340  30 76 74 5B 34 38 30 5D 20 3D 20 7B 0A 0A 09 30 0vt[480] = {...0          
0350  78 30 30 2C 20 30 78 30 30 2C 20 30 78 30 30 2C x00, 0x00, 0x00,          
0360  20 30 78 30 30 2C 20 30 78 30 30 2C 20 30 78 30  0x00, 0x00, 0x0          
0370  30 2C 20 30 78 30 30 2C 20 30 78 30 30 2C 20 30 0, 0x00, 0x00, 0          
0380  78 30 30 2C 20 30 78 30 30 2C 20 30 78 30 30 2C x00, 0x00, 0x00,          
0390  20 30 78 30 30 2C 20 30 78 30 30 2C 20 30 78 30  0x00, 0x00, 0x0          
03A0  30 2C 20 30 78 30 30 2C 20 30 78 30 30 2C 0A 09 0, 0x00, 0x00,..          
03B0  30 78 30 30 2C 20 30 78 30 30 2C 20 30 78 30 30 0x00, 0x00, 0x00          
03C0  2C 20 30 78 30 30 2C 20 30 78 30 30 2C 20 30 78 , 0x00, 0x00, 0x          
03D0  30 30 2C 20 30 78 30 30 2C 20 30 78 30 30 2C 20 00, 0x00, 0x00,           
03E0  30 78 33 38 2C 20 30 78 30 30 2C 20 30 78 30 30 0x38, 0x00, 0x00          
03F0  2C 20 30 78 30 30 2C 20 30 78 33 66 2C 20 30 78 , 0x00, 0x3f, 0x
At this point, I'm not going to go into what you might have to do to actually move within the file. After today's easy bit, it's perhaps a bit trickier... and I think we need to get into file pointers and suchlike if we want to get into, for example if we want to open multiple files at the same time. This is clearly an operating system thing, not a file system thing.

And we haven't even thought about writing files yet... :shock:

Neil
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: Adventures in FAT32 with 65c02

Post by BigDumbDinosaur »

barnacle wrote:
...We convert it to 8.3 format in a reserved string area:

Code: Select all

	LYA file_4
	jsr str_to_83			; convert the file string to 8.3 format
I take it the LYA file_4 statement is referring to a macro.  I can’t recall from reading code in your previous posts where that was defined.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

Oops, missed that; I use it a lot.

Code: Select all

LYA			macro value		; load a Y:A with 16-bit value
			lda # lo value
			ldy # hi value
			endm
Neil
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

Some thoughts on writing files
These are just idle thoughts, subject to me changing my mind at a later date, to let me get my head around writing to a sector.

There are a number of steps we need to make a new file, and we want to try and minimise the possibility of breaking the file system if the power goes down while we're doing it.

Let's look at creating a file first. This, I feel, is an operating system owned process, not part of the file system. If that seems a bit weird, look how I've managed things earlier: BIOS calls get data in and out to and from the UART and the CF devices at the lowest level (CF calls are at the same level but are separated into a different file for clarity). One level higher are the file system calls. That understands FAT32 and knows (or finds out) how the disc is arranged, and can read numbered sectors, calculate sectors from clusters, clusters from the directory, clusters from the FAT chain and so on. Writing a sector will fit in this section.

The highest level is the operating system: the OS. It's job is to present a neutral view of the file system to the user, so that the user can say 'give me a listing of the directory' or 'print file xxxx.yyy' or 'create a directory' or 'open a file for reading' and so on.

There are a lot of options at that level, but the point is that the user never has to know anything about how files are stored on the disc. There's nothing stopping the user accessing the file directly, but in general I would hold that to be a bad thing because there's too great a chance of trashing the disc.

Here's Elm-chan's list of functions for his FATFS code (which can handle all the FAT variants, not just FAT32 as I do):
  • File Access
    • f_open - Open/Create a file
    • f_close - Close an open file
    • f_read - Read data from the file
    • f_write - Write data to the file
    • f_lseek - Move read/write pointer, Expand size
    • f_truncate - Truncate file size
    • f_sync - Flush cached data
    • f_forward - Forward data to the stream
    • f_expand - Allocate a contiguous block to the file
    • f_gets - Read a string
    • f_putc - Write a character
    • f_puts - Write a string
    • f_printf - Write a formatted string
    • f_tell - Get current read/write pointer
    • f_eof - Test for end-of-file
    • f_size - Get size
    • f_error - Test for an error
  • Directory Access
    • f_opendir - Open a directory
    • f_closedir - Close an open directory
    • f_readdir - Read a directory item
    • f_findfirst - Open a directory and read the first item matched
    • f_findnext - Read a next item matched
  • File and Directory Management
    • f_stat - Check existance of a file or sub-directory
    • f_unlink - Remove a file or sub-directory
    • f_rename - Rename/Move a file or sub-directory
    • f_chmod - Change attribute of a file or sub-directory
    • f_utime - Change timestamp of a file or sub-directory
    • f_mkdir - Create a sub-directory
    • f_chdir - Change current directory
    • f_chdrive - Change current drive
    • f_getcwd - Retrieve the current directory and drive
  • Volume Management and System Configuration
    • f_mount - Register/Unregister the work area of the volume
    • f_mkfs - Create an FAT volume on the logical drive
    • f_fdisk - Create partitions on the physical drive
    • f_getfree - Get free space on the volume
    • f_getlabel - Get volume label
    • f_setlabel - Set volume label
    • f_setcp - Set active code page
I do not propose to implement the functions in italics, at least not immediately and possibly not at all. Some of them I will implement in ways different from Elm-chan... we shall see. Some are impossible because of lack of a clock, for example.

Creating a file
A number of things have to happen. I propose to create a file of zero length, to which we can add data at a later stage with f_write or f_putc. The file will always be added to the current working directory, and it will not be possible to use a relative or absolute path (i.e. no ../../new_file or /new_file) so the filename will always be a simple string not requiring parsing. (It won't be possible to read or write except in the current directory, too).

To write a file, we need to
  • make sure it doesn't already exist (I don't like the idea of 'creating' a file that already exists)
  • find a place to put it - that is, an unused cluster, which we can find from the FAT (and a hidden clue in the system)
  • create a directory entry in the right place in the directory (i.e. the end) with the correct 8.3 filename, time and date stamps, attribute byte, and file size of zero
  • mark the cluster as used in the FAT
That requires at least two separate reads of sectors (probably more, if the directory is busy, and possibly multiple sectors) so we do need to keep an eye on getting things in the right order.

Neil
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

Having slept on it, the sequence should probably be:
  • Identify the next free cluster from the FSInfo sector
  • Mark it as in use in the FAT
  • Replicate that in the second copy of the FAT
  • Update the FSInfo sector
  • Generate the directory entry linking to the cluster
    • Mark it as an empty file (file length = 0)
    • Time and date stamp it
    • Set the attribute byte
And after that, we can see about actually writing data to it.

Also, creating a new directory follows the same process - it's just a cluster, after all - but further requires the generation of two file references in that new directory: '.' and '..' which point to the directory and to its parent directory.

Neil
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: Adventures in FAT32 with 65c02

Post by BigDumbDinosaur »

barnacle wrote:
Having slept on it, the sequence should probably be:

  • Identify the next free cluster from the FSInfo sector
  • Mark it as in use in the FAT
  • Replicate that in the second copy of the FAT
  • Update the FSInfo sector
  • Generate the directory entry linking to the cluster
    • Mark it as an empty file (file length = 0)
    • Time and date stamp it
    • Set the attribute byte
Looks to be a good procedure.  However, I’d make a small alteration to the workflow:

  1. Identify the next free cluster from the FSInfo sector.
  2. Mark it as in use in the primary FAT.
  3. Update the FSInfo sector.
  4. Generate the directory entry linking to the cluster:
    1. Mark it as an empty file (file length = 0).
    2. Time and date stamp it.
    3. Set the attribute byte.
  5. Duplicate primary FAT changes in the backup FAT.


My rationale for this change is as filesystem operations are never atomic, steps that directly affect filesystem structures are vulnerable to the effects of a crash (or a program bug).  FAT redundancy will be of value only if an error that results in filesystem inconsistency doesn’t affect both FATs.  Updating the backup FAT as the final step minimizes the chances of both FATs reflecting the same error.

Should the machine barf during new file creation and not get beyond “cataloging” the file (steps 4a–4c), the backup FAT will indicate the filesystem’s state before the crash.  Your filesystem repair tool could then analyze the directory state to determine if all cataloging steps were completed.  From that, the filesystem repair tool can decide to sync the backup FAT with the primary FAT (cataloging completed, directory entry valid) or revert the primary FAT to the state of the backup FAT (cataloging not completed, directory entry invalid).
x86?  We ain't got no x86.  We don't NEED no stinking x86!
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

MS point out that modern solid state systems are so reliable that the second FAT is rarely if ever used (indeed, there's an option not to have one, though all their systems do).

The sequence is then an issue of how we deal with errors that we can't do much about... how do you know your write didn't work until you come to read it? I'm certainly not going to do a file system check (chkdisc) every time I power up, so my logic is that both FATs should be changed as close together as possible (i.e. read FAT1 sector, change allocation entry, write FAT1 sector, write FAT2 sector from the same transient area).

So to my thinking: if the machine has a wobble during a file creation, setting the FAT will indicate that the cluster is allocated, even if it dies immediately afterwards. That will result in a file that isn't written, but has a cluster allocated. If it gets as far as writing the directory entry, then it will end with the data we expect (because it will be a single write of the directory sector once the metadata is affixed) and, as designed, a zero length file.

To be honest - it's a bloody fragile system with too many pointers to pointers going on... And I think anything that uses FAT2 isn't going to be running on the 65c02 - more likely a temporary mount on the linux box.

(Annoying: at the moment, my C version is returning the right data for the first free cluster, but my assembly version isn't. So no update today :mrgreen: )

Neil
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Adventures in FAT32 with 65c02

Post by barnacle »

I can confirm by inspection, though, that the FAT2 on this linux-formatted disc image does appear to be a good copy of FAT1. So Linux is formatting to spec, and adding files and updating both the FATs and the link in the FSInfo sector.

The link in FSInfo appears to point to the last FAT record of the last file written ($0FFFFFFF) and the next record is the one we want. So we might as a first step reserve our space:
  • Update the FSInfo link by incrementing it and immediately writing it back
  • Read the appropriate FAT sector in to transient
  • Update the FAT1 link in transient by adding an $0FFFFFFF end-of-cluster marker at the location indicated by the incremented FSInfo link
  • Write that modified transient sector back to FAT1
  • Copy the same sector (from transient) to FAT2, ensuring that FAT2 mirrors FAT1
  • Only then, create a directory entry and populate it
You'd probably want to disable interrupts during this process, too, just on general safety principles, and accept that your software clock is going to lose time...

This gives perhaps the minimum time that the two FATs are out of sync, and if the system falls over after the FATs are written, you end up with a non-recoverable unused cluster - allocated, but with no directory entry. Attempting to write again would just take us to the new right place, the cluster following.

(In all this I am assuming an uncluttered disc; I'm not planning (at the moment) on using the space that would be released by deleted files. When we run out of clusters, we're full. Buy a new disc. Or defragment it on an MS system.)

Neil
Post Reply