So I'm reading up on the Block instruction set (https://www.complang.tuwien.ac.at/forth ... locks.html) to see if it should be included in Tali Forth. After re-reading the description of BLOCK and BUFFER four times, I finally realized that it isn't my lack of sleep or some cut-and-paste problem that is confusing -- the commands are really, truly a mess (gforth says "screw it" and just has BUFFER call BLOCK). Also, everybody admits that all of this is of limited use these days, while mumbling something about "backwards compatibility".
Nuts to that. I'm wondering if there might be a better use: How about recycling these commands not for 1k blocks, but for 64k segments that would nicely correspond to the banks of the 65816?
(Terminology for what follows: A block is a unit on a "persistent mass storage device" (e.g. a hard or flash drive); a buffer is the corresponding unit of RAM that caches a block; a bank is a 64k segment of memory on the 65816. When you're on night shift, this kind of stuff really, really makes your brain hurt.)
Here is the idea: We divide our HD (Flash, whatever) into 64k blocks and use those as the units we work on, just as traditional Block does with 1k segments. This means we can have things like
Code:
EDIT ( u -- )
which starts the editor on block u -- except that now we have 1024 lines of 64 characters to work with instead of 16 lines (64k is not divisible by 80, because life sucks). We can reuse most of the instructions such as LOAD, UPDATE and FLUSH this way. Use of the buffers is transparent -- we think in blocks and work with blocks. We do keep BLOCK to return the address of the first byte of the buffer where the block is located, but that is pretty much it.
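To make the line arithmetic concrete, here is a quick sketch in Python (not Forth, just a model to check the numbers). The names LINE_WIDTH and line_offset are made up for illustration and are not Tali Forth words.

```python
# Geometry of a 64k block viewed as fixed-width editor lines.
# All names here are illustrative assumptions, not existing Forth words.
BLOCK_SIZE = 64 * 1024      # 65536 bytes, the size of one 65816 bank
LINE_WIDTH = 64             # characters per line, as in classic 1k screens

LINES_PER_BLOCK = BLOCK_SIZE // LINE_WIDTH


def line_offset(line):
    """Return the byte offset of a line within the block's buffer."""
    if not 0 <= line < LINES_PER_BLOCK:
        raise ValueError("line out of range")
    return line * LINE_WIDTH


# 64 divides the block evenly; 80 would leave a partial line at the end.
REMAINDER_64 = BLOCK_SIZE % 64
REMAINDER_80 = BLOCK_SIZE % 80
```

An editor would add line_offset(n) to whatever address BLOCK returns to find the start of line n.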
In hardware, we can have (say) 512k of RAM for the buffers, or 8 x 64k. These correspond nicely to the 65816's banks. What happens under the hood with LOAD for example is that the system finds an unassigned buffer and loads the block's content into the corresponding bank. Then Forth starts reading from that buffer, just as it would with the old system.
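A rough model of the buffer-to-bank mapping, again in Python rather than Forth. Which banks actually hold the buffers is not decided above, so FIRST_BUFFER_BANK is purely an assumption, as are the function names.

```python
# Mapping buffers to 65816 banks. A sketch under assumptions:
# FIRST_BUFFER_BANK is a made-up value; the real memory map is undecided.
BANK_SIZE = 64 * 1024
RAM_SIZE = 512 * 1024
NUM_BUFFERS = RAM_SIZE // BANK_SIZE   # one buffer per bank

FIRST_BUFFER_BANK = 0x01  # assumption: bank 0 is left to Forth itself


def buffer_base(buffer_index):
    """Return the 24-bit address of a buffer's first byte.

    The bank number sits in bits 16-23 of a 65816 address.
    """
    if not 0 <= buffer_index < NUM_BUFFERS:
        raise ValueError("no such buffer")
    return (FIRST_BUFFER_BANK + buffer_index) << 16


def find_unassigned(assigned):
    """Return the index of the first free buffer, or None if all are in use.

    'assigned' holds one entry per buffer: a block number, or None if free.
    """
    for index, block_number in enumerate(assigned):
        if block_number is None:
            return index
    return None
```

With these assumptions, LOAD would call something like find_unassigned, read the block into the bank at buffer_base(index), and start interpreting from there.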
Doing the math, using one byte for the block numbers gives 16 MB of hard drive space (256 * 64k). If we use 16 bits (one cell) for the numbers, we're at 4 GB. On the computer, each 512k RAM chip (say, AS6C4008) gives us eight buffers. The mechanisms for dealing with what amounts to a primitive version of virtual memory or a cache are well explored, and all the old stuff with unassigned, assigned-clean and assigned-dirty still applies. The blocks are still just raw data, not fancy formatted files. We're wasting memory like popcorn, of course, but this is the 21st century, and hard drive space is cheap.
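For what it's worth, the three buffer states can be modeled in a few lines. This is a Python sketch, not a proposal for the actual Forth code: the class name, the dict standing in for the drive, and the "always evict buffer 0" policy are all assumptions chosen to keep it short.

```python
BLOCK_SIZE = 64 * 1024

# Capacity for the two block-number widths mentioned above.
CAPACITY_8BIT = 2**8 * BLOCK_SIZE     # 16 MB
CAPACITY_16BIT = 2**16 * BLOCK_SIZE   # 4 GB

UNASSIGNED, CLEAN, DIRTY = "unassigned", "assigned-clean", "assigned-dirty"


class BufferCache:
    """Toy model of BLOCK / UPDATE / FLUSH over 64k buffers (hypothetical)."""

    def __init__(self, disk, num_buffers=8):
        self.disk = disk  # dict: block number -> bytes, stands in for the drive
        self.state = [UNASSIGNED] * num_buffers
        self.block_no = [None] * num_buffers
        self.data = [bytearray(BLOCK_SIZE) for _ in range(num_buffers)]
        self.current = None  # buffer most recently returned by block()

    def _write_back(self, i):
        # Only dirty buffers need to go back to mass storage.
        if self.state[i] == DIRTY:
            self.disk[self.block_no[i]] = bytes(self.data[i])
            self.state[i] = CLEAN

    def block(self, u):
        """Make block u resident and return the index of its buffer."""
        if u in self.block_no:
            i = self.block_no.index(u)
        else:
            if UNASSIGNED in self.state:
                i = self.state.index(UNASSIGNED)
            else:
                i = 0  # simplistic policy (assumption): always evict buffer 0
                self._write_back(i)
            raw = self.disk.get(u, b"")
            # Pad or trim so the buffer always stays exactly one bank long.
            self.data[i][:] = raw.ljust(BLOCK_SIZE, b"\x00")[:BLOCK_SIZE]
            self.block_no[i] = u
            self.state[i] = CLEAN
        self.current = i
        return i

    def update(self):
        """Mark the current buffer as modified."""
        if self.current is not None:
            self.state[self.current] = DIRTY

    def flush(self):
        """Write back all dirty buffers, then unassign every buffer."""
        for i in range(len(self.state)):
            self._write_back(i)
            self.state[i] = UNASSIGNED
            self.block_no[i] = None
        self.current = None
```

A real implementation would want a smarter replacement policy than evicting buffer 0 every time, but the state transitions are the same as with 1k blocks.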
This way, however, in most cases a block will be a complete program (email body, whatever), so most of the stringing along of blocks would be done away with. We can also add nice details like using block 0 as an automatic index -- every time a buffer is saved back to its block, the first line is copied to the line in block 0 with the same number (so the first line of block 13 goes to line 13 in block 0). Our editors will have to deal with scrolling, but that should be doable. ERASE-BLOCK might be a nice addition to the command set, as might COPY-BLOCK ( u1 u2 -- ), but that's the next step.
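The block 0 index could work roughly like this. Again a Python sketch with made-up names; it assumes 64-character lines and pads a short first line with spaces.

```python
BLOCK_SIZE = 64 * 1024
LINE_WIDTH = 64  # assumption: same 64-character lines the editor uses
LINES_PER_BLOCK = BLOCK_SIZE // LINE_WIDTH


def update_index(index_block, block_no, block_data):
    """Copy the first line of block_data to line block_no of index_block.

    index_block is a bytearray holding block 0. Since block 0 has only
    1024 lines, this sketch can index blocks 1 to 1023 and no further.
    """
    if not 0 < block_no < LINES_PER_BLOCK:
        raise ValueError("block number has no line in the index")
    # Pad so the index block never changes size, even for short input.
    first_line = bytes(block_data[:LINE_WIDTH]).ljust(LINE_WIDTH, b" ")
    start = block_no * LINE_WIDTH
    index_block[start:start + LINE_WIDTH] = first_line
```

The save routine would call this just before writing a buffer back, and then mark block 0 itself as dirty.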
Finally, I'd suggest simply dropping BUFFER completely -- the difference from BLOCK is trivial, and the name is easy to confuse with BUFFER: anyway. Any thoughts?