Posted: Fri Jan 02, 2015 3:17 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
So I'm reading up on the Block word set (https://www.complang.tuwien.ac.at/forth ... locks.html) to see if it should be included in Tali Forth. After re-reading the description of BLOCK and BUFFER four times, I finally realized that it isn't my lack of sleep or some cut-and-paste problem that is confusing -- the commands are really, truly a mess (gforth says "screw it" and just has BUFFER call BLOCK). Also, everybody admits this all is of limited use these days, while mumbling something about "backwards compatibility".

Nuts to that. I'm wondering if there might be a better use: How about recycling these commands not for 1k blocks, but for 64k segments that would nicely correspond to the banks of the 65816?

(Terminology for what follows: A block is a unit on a "persistent mass storage device" (e.g. hard or flash drive); a buffer is the corresponding unit of RAM that caches a block; a bank is a segment of memory on the 65816. When you're on night shift, this kind of stuff really, really makes your brain hurt.)

Here is the idea: We divide our HD (Flash, whatever) into 64k blocks and use those as the units we work on, just as traditional Block does with 1k segments. This means we can have things like
Code:
EDIT ( u -- )
which starts the editor on block u -- except that now we have 1024 lines with 64 characters to work with and not 16 (64k is not divisible by 80, because life sucks). We can reuse most of the instructions such as LOAD, UPDATE and FLUSH this way. Use of the buffers is transparent -- we think in blocks and work with blocks. We do keep BLOCK to return the address of the first byte of the buffer where the block is located, but that is pretty much it.

In hardware, we can have (say) 512k of RAM for the buffers, or 8 x 64k. These correspond nicely to the 65816's banks. What happens under the hood with LOAD for example is that the system finds an unassigned buffer and loads the block's content into the corresponding bank. Then Forth starts reading from that buffer, just as it would with the old system.
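The "find an unassigned buffer" step could look something like the following. This is only a rough sketch under my own assumptions: a hosted Forth with cells wider than 16 bits (so that -1 can mark a free bank without colliding with a block number), made-up word names, and no eviction or actual disk transfer.

```forth
\ Hypothetical bookkeeping for 8 bank-sized buffers (names are not from Tali Forth).
8 CONSTANT #BANKS
CREATE BANK-TABLE  #BANKS CELLS ALLOT      \ block number held by each bank, -1 = unassigned

: BANK-ENTRY  ( bank# -- addr )  CELLS BANK-TABLE + ;
: CLEAR-BANKS  ( -- )  #BANKS 0 DO  -1 I BANK-ENTRY !  LOOP ;
CLEAR-BANKS

\ Return the bank that holds blk, or -1 if the block is not buffered.
: FIND-BANK  ( blk -- bank# | -1 )
   #BANKS 0 DO
      DUP I BANK-ENTRY @ = IF  DROP I UNLOOP EXIT  THEN
   LOOP  DROP -1 ;

\ Return the bank already holding blk, else claim the first free bank.
\ Returns -1 when every bank is in use (eviction is left out of this sketch).
: ASSIGN-BANK  ( blk -- bank# | -1 )
   DUP FIND-BANK DUP 0< IF
      DROP  -1 FIND-BANK               \ searching for -1 finds a free bank
      DUP 0< IF  NIP EXIT  THEN
      TUCK BANK-ENTRY !                \ record blk in the free bank
   ELSE  NIP  THEN ;

13 ASSIGN-BANK .    \ 0  (first free bank)
42 ASSIGN-BANK .    \ 1
13 ASSIGN-BANK .    \ 0  (already buffered, no reload needed)
```

A real LOAD would then copy the block from the drive into the bank and start interpreting from there; writing back dirty banks before reuse is the part that still follows the old assigned-clean/assigned-dirty rules.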

Doing the math, using one byte for the block numbers gives 16 MB of hard drive space (256 * 64k). If we use 16 bit (one cell) for the numbers, we're at 4 GB. On the computer, each 512k RAM chip (say, AS6C4008) gives us eight buffers. The mechanisms for dealing with what amounts to a primitive version of virtual memory or a cache are well explored, and all the old stuff with unassigned, assigned-clean and assigned-dirty still applies. The blocks are still just raw data, not fancy formatted files. We're wasting memory like popcorn, of course, but this is the 21st century, and hard drive space is cheap.
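The numbers can be checked at a Forth prompt. A quick sketch, assuming a hosted Forth with cells of at least 32 bits (a 16-bit system would need double-cell arithmetic for the larger products); the word names are made up, and sizes are kept in units of 1k so the totals stay small.

```forth
\ Hypothetical helper names; all sizes in units of 1k (1024 bytes).
64 CONSTANT KB/BLOCK                            \ proposed block size: 64k, one 65816 bank

: DRIVE-KB     ( #blocks -- kb )         KB/BLOCK * ;           \ addressable storage
: KB>MB        ( kb -- mb )              1024 / ;
: LINES/BLOCK  ( chars/line -- lines )   KB/BLOCK 1024 * SWAP / ;

256   DRIVE-KB KB>MB .     \ 8-bit block numbers:  16 (MB)
65536 DRIVE-KB KB>MB .     \ 16-bit block numbers: 4096 (MB), i.e. 4 GB
64 LINES/BLOCK .           \ 1024 lines of 64 characters
512 KB/BLOCK / .           \ 8 buffers per 512k RAM chip
```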

This way, however, in most cases a block will be a complete program (email body, whatever), so most of the stringing along of blocks would be done away with. We can also add nice details like using block 0 as an automatic index -- every time a buffer is saved back to its block, the first line is copied to the line in block 0 with the same number (so the first line of block 13 goes to line 13 in block 0). Our editors will have to deal with scrolling, but that should be doable. ERASE-BLOCK might be a nice addition to the command set, as might COPY-BLOCK ( u1 u2 -- ), but that's the next step.

Finally, I'd suggest simply dropping BUFFER completely -- the difference to BLOCK is trivial, and it is confusing because of BUFFER: anyway. Any thoughts?


Posted: Fri Jan 02, 2015 4:33 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1928
Location: Sacramento, CA, USA
My first thought was "wasting memory", but then you reminded me about the cost of storage nowadays, and it clicked. I don't have any advice for you, except to spend a bit of time pursuing your idea, to see if it leads you someplace comfortable or someplace awkward. It occurs to me that hard-coded block sizes limit your flexibility, whether it's 1k or 64k, or whatever. Is there a way that you can make block size a run-time (or compile-time at least) option of some sort?
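For the compile-time case, something along these lines might be enough. This is just a sketch: B/BUF is the traditional name for bytes per buffer, while C/L, L/SCR and the two helper words are names I am assuming here, not anything taken from Tali Forth.

```forth
\ Change this one line and recompile to pick a block size.
8192 CONSTANT B/BUF                    \ bytes per block buffer (1024, 4096, 8192, 65536 ...)
64   CONSTANT C/L                      \ characters per line
B/BUF C/L / CONSTANT L/SCR             \ lines per block, derived from the two above

: LINE>OFFSET  ( line# -- offset )  C/L * ;      \ byte offset of a line within a block
: LINE-OK?     ( line# -- flag )    L/SCR U< ;   \ true if the line lies inside the block

L/SCR .              \ 128 lines with the values above
3 LINE>OFFSET .      \ 192
```

Everything else (editor, LOAD, the buffer code) would then refer to B/BUF and L/SCR rather than to literal numbers.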

Best wishes,

Mike B.


Posted: Fri Jan 02, 2015 7:11 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
I worked with blocks for a few months in 1990 (when I started on the automated test equipment) and absolutely hated it, so I took the time to change the source code into a text file, by hand, since the metacompiler I was using could handle text files too. (The original 6502 Forth kernel that came with it was in block files.) That's not to say blocks don't have their advantages, but they are severely outweighed by their disadvantages. Actually, the only advantage I can think of for blocks is you can have code in one block that says, "If such-and-such is true, then load block XYZ; otherwise, load this other range." You wouldn't want to have to load a full 64K at a time to keep that, though. 2K or 4K or 8K blocks might be good. Such a capability could be built into a Forth text editor too, though; then you could specify some kind of labels, rather than screen numbers or line numbers which change as you insert and delete screens or lines.

Quote:
except that now we have 1024 lines with 64 characters to work with and not 16 (64k is not divisible by 80, because life sucks).

One of the things I hated was the miserably short lines that did not leave room for real comments. You could use shadow screens, but they're a pain, too. I would go for lines of 128 characters, and possibly go from 16 lines to 32 or 64. Then a block would be 4K or 8K. My DOS screen can show 132 characters by 60 lines (although I usually have it in 43-line mode), so with 1K blocks, I could conceivably see a triad plus their shadow screens all at once, with room left over for menu bars, minimizing bar, dividing lines between screens, etc. I made the input buffer on my workbench computer to be 128 characters, which leaves 126 after you take off the <CR> and <LF> that come at the end of a line. I probably should have made it enough to handle an entire line on the PC, but it's not reduced enough to be any big deal.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Sun Jan 04, 2015 11:22 am

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
Thanks for the feedback -- and after sleeping on it for a while, 8k sounds a lot more sane (the assembler in Forth I am working on will be less than 8k with the comments stripped out). In the meantime, I have also gotten sidetracked by realizing just how much NVRAM is out there (like the DS1265Y/AB, 1 MByte in a DIP package, 70ns) which could be used to similar effect but would be a lot more flexible. Those chips are rather expensive, though.

Another part of the problem could be solved with a better editor that respects line breaks for text, so you don't waste the last part of the line as whitespace.

One way or another, this is all starting to make the 65816 more interesting. I probably should go back and read up just how much hassle the memory access stuff actually was ...

Thanks again!


Posted: Sun Jan 04, 2015 11:45 pm

Joined: Sun Jul 28, 2013 12:59 am
Posts: 235
scotws wrote:
Another part of the problem could be solved with a better editor that respects line breaks for text, so you don't waste the last part of the line as whitespace.

Why not go all the way? Store your source pre-parsed with an explicit "comment" token, much as Applesloth basic did. Your compiler becomes a lot simpler and faster (no having to search through the dictionary to discover that a token isn't a word, then try to parse it as an integer, for example), you have a compact storage representation for code, and it looks okay in your editor.

Of course, this way lies ColorForth. Something to consider, at least.


Posted: Mon Jan 05, 2015 5:58 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
Quote:
gotten sidetracked by realizing just how much NVRAM is out there (like the DS1265Y/AB, 1 MByte in a DIP package, 70ns) which could be used to similar effect but would be a lot more flexible. Those chips are rather expensive, though.

My 4Mx8 5V 10ns SRAM module takes less board space, is seven times as fast, and gives four times as much memory, for less money. You would need a backup circuit which can be pretty simple or you can use a Maxim or Dallas IC for it. [Edit: Unfortunately about the time I wrote that, Cypress changed the SRAM chip I use for the module such that the low-power data-retention mode is no longer very low-power.] It has eight chip-select inputs, one for each 512Kx8 SRAM IC, so you need to decode that outside the module. The module lets you write-protect half (2Mx8) at a time. It's shown on the front page of my website, with a link to the data sheet.

For mass storage though, an SPI flash memory might be appropriate. I've used the 25VF032 which has the same 4Mx8 quantity of memory in a tiny SO-8 package for a couple of bux. Unfortunately it does not exist in a 5V variation, but SPI is easy to translate voltages for since all signal lines are unidirectional. Let me recommend our SPI-10 hobbyist-friendly connector and pinout for tiny SPI modules. I make these half-postage-stamp-size flash PCBs available too. See the front page of my website.

I would say source code formatting needs to be left alone, with white space as written, although you could do some simple data compression. Source code has a lot of consecutive spaces, so if you have ten spaces in a row for example, you could probably record that with two bytes instead of ten.
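To make the two-bytes-for-ten-spaces idea concrete, here is one way it might be done. This is only a sketch with assumptions of my own: the byte value 1 is used as a run marker and is assumed never to occur in source text, runs are capped at 255, runs shorter than three spaces are copied as they are, and all word names are invented for the example.

```forth
1 CONSTANT RUN-MARK                  \ assumed: byte value 1 never occurs in source text
VARIABLE OUT-PTR                     \ next free address in the output buffer

: PUT-BYTE  ( c -- )  OUT-PTR @ C!  1 OUT-PTR +! ;

\ Count the spaces at the start of the string, up to 255.
: SPACES-AT  ( addr u -- n )
   255 MIN  0                        ( addr max n )
   BEGIN  2DUP >  WHILE              \ n still below max?
      2 PICK OVER + C@ BL =  WHILE   \ and still looking at a space?
      1+
   REPEAT THEN
   NIP NIP ;

\ Copy src to dst, replacing runs of 3..255 spaces by RUN-MARK and a count byte.
: PACK-SPACES  ( src u dst -- dst len )
   DUP OUT-PTR !  >R
   BEGIN  DUP  WHILE                 \ while input remains
      2DUP SPACES-AT                 ( src u n )
      DUP 2 > IF
         RUN-MARK PUT-BYTE  DUP PUT-BYTE       \ run: two bytes, skip n
      ELSE
         DROP  OVER C@ PUT-BYTE  1             \ literal: one byte, skip 1
      THEN
      /STRING
   REPEAT
   2DROP  R>  OUT-PTR @ OVER - ;

\ Example: "ab", ten spaces, "cd" (14 bytes) packs into 6 bytes.
CREATE SRC 14 ALLOT   SRC 14 BL FILL
CHAR a SRC C!   CHAR b SRC 1+ C!   CHAR c SRC 12 + C!   CHAR d SRC 13 + C!
CREATE DST 32 ALLOT
SRC 14 DST PACK-SPACES NIP .         \ 6
```

Unpacking is the mirror image: copy bytes until RUN-MARK shows up, then emit as many spaces as the following count byte says.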



Posted: Sat Sep 22, 2018 12:22 am

Joined: Fri May 05, 2017 9:27 pm
Posts: 858
scotws wrote:
Finally, I'd suggest simply dropping BUFFER completely -- the difference to BLOCK is trivial, and it is confusing because of BUFFER: anyway. Any thoughts?

What's confusing? Where the Forth-83 standard states that BUFFER might not transfer the block from mass storage, I implemented it so that it never transfers the block from mass storage, so it does what BLOCK does, but without reading mass storage.
It probably wouldn't make any difference in a block buffer system designed to access a RAM bank, but back in the day, BUFFER was useful. The best way to implement COPY, the editor word that copied the contents of one block to another, was to use both BLOCK and BUFFER in its definition:
Code:
: COPY  ( BLK1 BLK2 -- )
   SWAP BLOCK SWAP BUFFER
   B/BUF CMOVE UPDATE ;

If the destination block is not in a buffer, the use of BUFFER in COPY avoids reading a block into memory that would just be overwritten anyway, saving time. Also, this version of COPY works even if the system is configured to have only one block buffer, because BUFFER reassigns the buffer to the destination block without overwriting its contents. If BUFFER were just an alias for BLOCK, it wouldn't work, since reading the destination block would wipe out the source data.
On my ITC system, if I redefined BUFFER as:
Code:
CODE BUFFER  ( BLK# -- ADR )
   -2 ALLOT
   ' BLOCK @ ,  END-CODE

making it an alias for BLOCK, and removed the code in BLOCK that allows efficient support of BUFFER, it would save maybe a dozen bytes or so. BUFFER is actually defined on my system as:
Code:
CODE BUFFER  ( BLK# -- ADR )
   -2 ALLOT
   ' BLOCK @ 1+ ,  END-CODE

If you were wondering, my Forth's BLOCK starts out as a code definition. As long as the requested block is the most recently used block, BLOCK stays at code level to replace the block number with the address of the appropriate buffer, then jumps to NEXT. If the requested block is not the most recently used one, BLOCK transitions to high level to deal with it.


Posted: Sat Sep 22, 2018 3:11 am

Joined: Sat Aug 21, 2010 7:52 am
Posts: 231
Location: Arlington VA
JimBoyd wrote:
If you were wondering, my Forth's BLOCK starts out as a code definition. As long as the requested block is the most recently used block, BLOCK stays at code level to replace the block number with the address of the appropriate buffer, then jumps to NEXT. If the requested block is not the most recently used one, BLOCK transitions to high level to deal with it.

I really like this idea. PETTIL has exactly one 1K block buffer that floats below the symbol table in upper memory, because the design is constrained by "what the live hardware that I have can do," and that means 32K max RAM. The user variable `PREV` keeps track of which block is in that buffer, and `BLOCK` won't reload if the requested block number equals `PREV`. A pretty simple setup, but it's working out great, considering cassette tape is the persistent storage medium (again, hardware = design constraint).
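In high-level Forth the single-buffer check boils down to roughly this. It is a simulation sketch, not PETTIL's actual code: the tape read is replaced by a stand-in that fills the buffer and counts calls, -1 in `PREV` meaning "nothing loaded" is my assumption, and write-back of an updated buffer is left out.

```forth
1024 CONSTANT B/BUF
CREATE THE-BUFFER  B/BUF ALLOT       \ the one and only block buffer
VARIABLE PREV     -1 PREV !          \ block number in the buffer; -1 = none (assumed)
VARIABLE #READS    0 #READS !        \ counts simulated device reads

\ Stand-in for the real (slow) device read: fills the buffer with the
\ low byte of the block number so the result can be inspected.
: READ-BLOCK  ( blk addr -- )
   SWAP 255 AND  B/BUF SWAP FILL  1 #READS +! ;

\ Like BLOCK, but for one buffer: reload only when a different block is requested.
: CACHED-BLOCK  ( blk -- addr )
   DUP PREV @ <> IF
      DUP THE-BUFFER READ-BLOCK
      DUP PREV !
   THEN
   DROP THE-BUFFER ;

5 CACHED-BLOCK DROP   5 CACHED-BLOCK DROP   6 CACHED-BLOCK DROP
#READS @ .            \ 2  (the second request for block 5 cost nothing)
```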


Posted: Mon Sep 24, 2018 6:25 pm

Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
BLOCKs aren't particularly confusing, but their relationship to SCREENs can be, since a SCREEN can be made up of several BLOCKs.

A BLOCK is simple. You divide the mass storage up into several BLOCKs (based on the block size).

1 BLOCK fetches BLOCK #1 into memory. 2 BLOCK fetches block number 2. At the same time, it is possible that while fetching a BLOCK, other, updated BLOCKs will be written to disk.

If a BLOCK is already in memory (I believe), it will NOT be fetched; rather, you will get the BLOCK currently in memory.

BUFFER is just like BLOCK, except that it is not required to actually load the data from disk (it can, which is why many just make BUFFER an alias for BLOCK, but it does not have to).

After that, you have the update command (which I forget) to tell the system that a BLOCK/BUFFER is dirty so it can be written later, and then there's the other words that flush the buffers and reset the buffers (I don't have them on the tip of my tongue either).

Personally, if your BLOCK/BUFFER commands do not hit "permanent storage", then you should not reuse the names. Rather, you should just come up with your own names.


Posted: Mon Sep 24, 2018 8:40 pm

Joined: Fri May 05, 2017 9:27 pm
Posts: 858
whartung wrote:
BLOCKs aren't particularly confusing, but their relationship to SCREENs can be, since a SCREEN can be made up of several BLOCKs.

A BLOCK is simple. You divide the mass storage up into several BLOCKs (based on the block size).

1 BLOCK fetches BLOCK #1 into memory. 2 BLOCK fetches block number 2. At the same time, it is possible that while fetching a BLOCK, other, updated BLOCKs will be written to disk.

If a BLOCK is already in memory (I believe), it will NOT be fetched; rather, you will get the BLOCK currently in memory.

Correct.
Quote:
BUFFER is just like BLOCK, except that it is not required to actually load the data from disk (it can, which is why many just make BUFFER an alias for BLOCK, but it does not have to).

In my opinion, it is best if BUFFER does not load the data from disk so that COPY, by being defined with BLOCK and BUFFER, can copy one block of mass storage to another even if there is only one block buffer available.
Quote:
After that, you have the update command (which I forget) to tell the system that a BLOCK/BUFFER is dirty so it can be written later, and then there's the other words that flush the buffers and reset the buffers (I don't have them on the tip of my tongue either).

From the Forth-83 standard required word set:
UPDATE marks the currently valid block buffer as modified.
SAVE-BUFFERS writes all updated blocks to mass storage.
FLUSH performs the function of SAVE-BUFFERS, then unassigns all block buffers.
From the controlled reference word set:
EMPTY-BUFFERS unassigns all block buffers. Updated blocks are not written to mass storage.
Personally, I define FLUSH as:
Code:
: FLUSH  ( -- )  SAVE-BUFFERS EMPTY-BUFFERS ;

Quote:
Personally, if your BLOCK/BUFFER commands do not hit "permanent storage", then you should not reuse the names. Rather, you should just come up with your own names.

My BLOCK/BUFFER commands may not always hit "permanent storage":
Code:
block access for this Forth on a Commodore 64:
Block number:                       Storage device:     
    0 -  4095  (    0 - 0FFF )     drive 8
 4096 -  8191  ( 1000 - 1FFF )     drive 9
 8192 - 12287  ( 2000 - 2FFF )     drive 10
12288 - 16383  ( 3000 - 3FFF )     drive 11
16384 - 20479  ( 4000 - 4FFF )     drive 12
20480 - 65535  ( 5000 - FFFF )     Ram Expansion Unit ( REU ) (can be revectored to other ram)

On the Commodore 64, devices 8 and up are disk drives.
Obviously, the Commodore disk drives can't hold anywhere near 4096 blocks ( at least not the 1541, 1571, and 1581 ) and an attempt to access a block higher than a drive can access will abort with an error message. It is, however, a convenient way to allow the use of multiple drives and the REU.
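The table amounts to a simple split of the block number. The sketch below only illustrates that mapping and is not the code from my system: the word names, the use of -1 to stand for the REU, and the idea that each device numbers its own blocks from 0 are assumptions made for the example.

```forth
20480 CONSTANT REU-BASE              \ first block number routed to the REU

\ Device for a block: drives 8..12 take 4096 blocks each, -1 stands for the REU.
: BLOCK>DEVICE  ( blk -- dev | -1 )
   DUP REU-BASE U< IF  12 RSHIFT 8 +  ELSE  DROP -1  THEN ;

\ Block number relative to the start of its device (assumed numbering).
: BLOCK>LOCAL  ( blk -- n )
   DUP REU-BASE U< IF  4095 AND  ELSE  REU-BASE -  THEN ;

0     BLOCK>DEVICE .    \ 8
4096  BLOCK>DEVICE .    \ 9
20479 BLOCK>DEVICE .    \ 12
20480 BLOCK>DEVICE .    \ -1 (REU)
4097  BLOCK>LOCAL  .    \ 1
```

Using U<, RSHIFT and AND keeps the words correct on a 16-bit Forth too, where block numbers above 32767 would look negative to signed comparisons.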
I do agree that if the BLOCK/BUFFER commands never hit "permanent storage" then different names should be used.

Cheers,
Jim

