dawnFORTH: Yet another crude Forth for the 65C02.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Thu Dec 25, 2025 8:42 am

electricdawn wrote:

Well, it says the lineage continues for Postscript, but the black icon of Forth indicates that it is supposed to be dead. >(

Forth isn’t dead, but as one member around here once wisecracked, it “got voted off the island.”

JimBoyd · Post by **JimBoyd** » Fri Dec 26, 2025 9:05 pm

electricdawn wrote:

I know Pascal-type strings... I worked with Turbo Pascal way before I learned C in the '80s...

I just think that C-type strings are WAY easier to work with. And are not limited in length. I really do not understand why Charles Moore chose Pascal-type strings over C-type.

Forth's memory moving words take a source, destination, and length. They will work with a Forth string represented as an address and length (just need to tuck the destination under the length) as well as other data even if that data has null bytes.
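A minimal sketch of that, assuming a standard Forth with `MOVE` and an interpretive `S"`. `DEST` and `COPY-STRING` are hypothetical names used only for illustration:

```forth
CREATE DEST 32 ALLOT              \ hypothetical destination buffer

: COPY-STRING ( c-addr u dest -- )
   SWAP MOVE ;                    \ tuck dest under the length: ( c-addr dest u )

S" hello" DEST COPY-STRING        \ copies 5 bytes; no terminator is needed
```

The same word copies binary data unchanged, because the length, not a sentinel byte, decides where the copy stops.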

gilhad · Post by **gilhad** » Fri Dec 26, 2025 10:06 pm

Back in my Pascal days I used a lot of "strings" that contained zero bytes inside ... default values, data, headers, nil pointers, graphics ... easy with len+address or Pascal style, problem with C-style

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sat Dec 27, 2025 4:56 am

gilhad wrote:

Back in my Pascal days I used a lot of "strings" that contained zero bytes inside ... default values, data, headers, nil pointers, graphics ... easy with len+address or Pascal style, problem with C-style

Your comment reminds me that some of the older dot-matrix printers’ ESCape sequences used null bytes as parameters, e.g., PRINT#4,CHR$(27)CHR$(1)CHR$(0)...kind of tough to do with C-style strings. The Pascal method is generally more useful, since it isn’t a language-specific construct and only costs one extra byte if maximum permissible length is 64K. The C-style’s only real advantage is there is theoretically no limit to string length—I’ve never run into a situation in which that was useful.
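For comparison, here is a sketch of the same escape sequence held as an address/length pair in Forth. `ESC-SEQ`, `ESC-LEN` and `SEND-ESC` are hypothetical names, and the sketch assumes `TYPE` is routed to the printer, which depends entirely on the system:

```forth
CREATE ESC-SEQ  27 C,  1 C,  0 C,     \ ESC, 1, NUL: the bytes from the PRINT# example
3 CONSTANT ESC-LEN                    \ length is kept separately from the data

: SEND-ESC ( -- )
   ESC-SEQ ESC-LEN TYPE ;             \ all three bytes go out, including the NUL
```

A routine that looks for a zero terminator would stop after the second byte here.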

electricdawn · Post by **electricdawn** » Sat Dec 27, 2025 12:21 pm

I gotta say, now that I'm thinking of it... I think I DO like Forth's buffer/length strings (up to 64KB!). But I also like to have a null-byte at the end for easier checking.

But how do you deal with FIND only accepting counted strings? It pretty much precludes using PARSE-WORD or PARSE unless your FIND does some shady tricks (like mine is doing). Do I have to implement SEARCH-WORDLIST? I do have two separate wordlists: the one returned by FORTH-WORDLIST, which represents the internal wordlist (most likely in ROM), and the one returned by GET-CURRENT, which resides in RAM and holds the additional and user-defined words.
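One possible way around that, sketched under the assumption that the optional Search-Order words `SEARCH-WORDLIST`, `GET-CURRENT` and `FORTH-WORDLIST` are available. `LOOKUP` is a hypothetical name, not part of dawnFORTH or the standard. It takes the address/length pair that a parsing word returns, so no counted string has to be built:

```forth
\ Search the RAM wordlist first, then fall back to the built-in one.
: LOOKUP ( c-addr u -- 0 | xt 1 | xt -1 )
   2DUP GET-CURRENT SEARCH-WORDLIST   ( c-addr u 0 | c-addr u xt flag )
   ?DUP IF
      2SWAP 2DROP                     \ found: discard the saved string
   ELSE
      FORTH-WORDLIST SEARCH-WORDLIST  \ not found: try the second wordlist
   THEN ;
```

`S" DUP" LOOKUP` should leave the xt of DUP and a non-zero flag. The search order used here (compilation wordlist first) is a design choice of the sketch, not something the post specifies.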

Would still involve a lot of change to my code base...

electricdawn · Post by **electricdawn** » Sun Dec 28, 2025 9:52 am

Small update to "system.fth".

Code:

: U/MOD	>R 0 R> UM/MOD ;		( u1 u2	-- u3 u4 ) \ 0 zero-extends u1; S>D would sign-extend it
: U/		U/MOD SWAP DROP ;		( u1 u2 -- u3 )

: lALIGNED  CELL C@ NEGATE AND ;	( addr -- a-addr )
: ALIGNED   CELL+ 1- lALIGNED ;		( addr -- a-addr )

: NAME>STRING ( nt -- c-addr1 u )
	6 + W@ COUNT ;

: > ( n1 n2 -- flag )
	OVER 0< OVER 0< SWAP - DUP
	0 = IF DROP SWAP U< ELSE
	1 = IF 2DROP FALSE ELSE
		   2DROP TRUE 
	THEN THEN
;

: SAVE-INPUT ( -- xn ... x1 n )
	SOURCE OVER + ALIGNED
	>IN W@ ROT + lALIGNED
	2DUP - CELL C@ U/ >R
	?DO I @ CELL C@ +LOOP
	SOURCE-ID
	SOURCE >IN W@
	R> 4 +
;

electricdawn · Post by **electricdawn** » Sun Dec 28, 2025 1:08 pm

More bug fixes. Please re-download.
- Fixed... annoyance in "ACCEPT", where ACCEPT would use the input buffer pointer instead of a temporary one, leading to the input buffer pointer (thus SOURCE) being overwritten and subsequent input getting stored into whatever buffer address you provided to ACCEPT.
- Fixed bug in "int_header" that would produce a random code address when called by ":NONAME".
- Fixed bug in "int_insertXT" that would produce an incorrect XT when paired with ":NONAME".
- Fixed bug in ">" that would produce incorrect flags with certain values.
- Fixed bug in "ALIGNED" that would incorrectly align an already aligned address, producing "a-addr CELL+".
- Changed "SAVE-INPUT" so it will compress multiple characters into one cell (depending on cell size).
- Added "NAME>STRING".
- Added "U/MOD" and "U/".
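
The ALIGNED fix in the list above can be illustrated with a sketch for a fixed, power-of-two cell size. `MY-ALIGNED` is a hypothetical name, and the host's `1 CELLS` stands in for dawnFORTH's `CELL C@`:

```forth
: MY-ALIGNED ( addr -- a-addr )
   1 CELLS 1- +                   \ round up by cell size minus one ...
   1 CELLS NEGATE AND ;           \ ... then clear the low address bits
```

Adding the full cell size before masking would turn an already aligned address into "a-addr CELL+", the symptom described in the list. Whether that was the actual cause in dawnFORTH is not stated.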

electricdawn · Post by **electricdawn** » Sun Dec 28, 2025 8:08 pm

Ok, I'm going to take a risk.

Right now my data stack resides right below the top-of-memory. It grows downward towards HERE. As long as these don't collide everyone's happy. That's a BIG data stack...

Do I really need it?

I already made some changes to my code (this hasn't been uploaded yet!) so that the data stack now resides right below the return stack, in the zero page, which is already occupied by my variables. I calculated the maximum size of my data stack and it is 147 bytes.

Doesn't sound like much, does it?

I've been reading a lot of Garth's pages lately, specifically - http://wilsonminesco.com/stacks/enoughstackspace.html - and a dangerous thought started creeping into my mind. Maybe, just maybe I don't need that huge data stack and I can squeeze it right below the 6502 return stack.

Well, I changed my code and tried it. It actually loaded and compiled the entire "system.fth" file without a hitch. Ok, this is not complicated code, but it worked. I checked the data stack pointer (points at $FF when stack is empty) and it never dropped below $E0. That is with a cell size of two bytes. That still leaves plenty of room before I reach the highest variable in my zero page "tempShdw" which ends at $6C.
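The numbers above can be checked with a few lines of Forth. The constant names are hypothetical; the values are the ones quoted in the post:

```forth
HEX
FF CONSTANT SP-EMPTY    \ stack pointer when the stack is empty
6C CONSTANT VARS-END    \ last byte of "tempShdw"
E0 CONSTANT SP-LOWEST   \ lowest stack pointer value observed
DECIMAL

SP-EMPTY VARS-END  - .  \ bytes available to the data stack: 147
SP-EMPTY SP-LOWEST - .  \ bytes used while loading "system.fth": 31
```

So loading "system.fth" touched roughly a fifth of the 147 bytes available.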

Ok, I'm looking for input here. Do you think this is viable, or should I forget about it and leave the data stack pointer right below top-of-memory? It would make range-checking easier, since I only have to check a byte against another instead of two bytes. Also my code would get smaller, since it's all in zero page memory. And data stack movements are a huge part of Forth, so it would probably also be slightly faster.

SamCoVT · Post by **SamCoVT** » Sun Dec 28, 2025 9:00 pm

electricdawn wrote:

Ok, I'm looking for input here. Do you think this is viable, or should I forget about it and leave the data stack pointer right below top-of-memory? It would make range-checking easier, since I only have to check a byte against another instead of two bytes. Also my code would get smaller, since it's all in zero page memory. And data stack movements are a huge part of Forth, so it would probably also be slightly faster.

I think you've hit on a lot of the reasons why this is probably a good idea. I use a 33-word (for 16-bit words) data stack in zero page and that has been plenty even for more advanced software. I have never used it all.

I do arrange my variables in zero page so that the most important ones are at a lower address and the non-critical temporary variables are higher. That way, in the event I do have a lot on the data stack, it will overwrite the non-critical variables first. In practice, this was being overly cautious as I've never filled the data stack (with the possible exception of a runaway loop or other self-inflicted crash).

electricdawn · Post by **electricdawn** » Mon Dec 29, 2025 9:45 am

I'm starting to question my life choices...

Well, not really...

I'm continuing to read Garth's treatise on stacks, and it is really an eye-opener. I specifically like his structured approach to assembly programming with macros. As soon as I've figured out how to do this with vasm I will try it myself.

But... there's a huge problem with my approach that makes it unsuitable for building a stack the way Garth does. You can probably guess it by now:

Variable cell sizes.

Garth builds his stack around the idea that register X is the absolute stack pointer, and he uses ABSOLUTE addresses as OFFSETS to his ABSOLUTE stack pointer to get to the various cells on the stack. The idea is brilliant in itself, as you pretty much don't need any temporary variables. The stack IS your temporary variables as it's all in the ZP. But it completely fails when you have to deal with variable cell sizes.

Garth knows his cell size is, for example, two bytes. So to access, say, the third cell on his stack, he goes:

Code:

lda  4,X    ; Get the low byte of third cell.
ldy  5,X    ; Get the high byte of third cell.
<dosomethingoranother>

This approach wouldn't work with variable cell sizes, because you just don't KNOW the correct offset right away. You need to check CELL first and then calculate the offset, which also would make the direct access indexed by X not work anymore.
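To make the cost concrete, here is a sketch of the offset calculation in Forth rather than assembly. `CELL-SIZE` is a hypothetical stand-in for dawnFORTH's CELL, and `CELL-OFFSET` is a made-up helper:

```forth
CREATE CELL-SIZE  2 C,            \ hypothetical stand-in for dawnFORTH's CELL

: CELL-OFFSET ( n -- u )          \ byte offset of the nth cell, n = 0 for top of stack
   CELL-SIZE C@ * ;

2 CELL-OFFSET .                   \ 4, the constant in "lda 4,X" above
5 CELL-SIZE C!
2 CELL-OFFSET .                   \ 10
```

With a fixed cell size the assembler folds that multiplication into the instruction's operand. With a variable cell size it has to happen at run time on every access.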

So... I'm questioning my choice of having a variable cell size.

The original idea behind this was to allow for both speed and precision. The tradeoff being you can choose either one of them, but not both. But you could CHOOSE!

Reading Garth's treatise, I'm no longer sure this approach is valid. I could use his approach, but choose a cell size of, say, four bytes, making it still fairly fast and "precise enough(tm)" for most work you'll probably do. And it would probably still be faster than using my approach with even just a two byte cell size.

*sigh* I'm confused.

BigEd · Post by **BigEd** » Mon Dec 29, 2025 10:47 am

Interesting one! You could trade off memory for speed, by keeping a stack with 4-byte cells but have some of them contain smaller values, perhaps? (Is it a mixed-size system in that way?)

Or, at cost of complexity, perhaps keep pointers-to-cells in the stack, presumably two bytes, and they will point into two or more pools where the actual cells are kept. Which introduces compaction and/or garbage collection, as extra problems to solve.

Just thinking aloud...

electricdawn · Post by **electricdawn** » Mon Dec 29, 2025 12:11 pm

dawnFORTH uses a truly variable cell size stack. That means that all calculations necessary to access cells revolve around what's in CELL. You can pretty much arbitrarily change cell size to be, say, five bytes by just executing:

Code:

5 CELL C!

All other operations will follow this new size. Everything. Stack operations, math, logic, all of it.

This is a cool concept, in theory, but it does incur (probably) massive speed (and size, due to the added complexity) penalties. That's why I'm currently lost on where to take this.

BigEd · Post by **BigEd** » Mon Dec 29, 2025 12:26 pm

Where is the cell size held? In the first byte of the cell maybe??

electricdawn · Post by **electricdawn** » Mon Dec 29, 2025 3:24 pm

In CELL, and CELL only. The size of a cell on stack CHANGES with a new value in CELL. All the routines that deal with cells adapt to it.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Mon Dec 29, 2025 5:33 pm

electricdawn wrote:

I'm starting to question my life choices...

Uniformity of data size usually wins out over flexibility, mostly to avoid the hoop-jumping required to figure out data size on each access to a numeric variable. Language compilers, of course, offer several numeric data types that vary in size, but any given type is of a fixed size and is a compile-time characteristic.

The choice comes down to just how desperate you are to support a feature, and just how often you use that feature. I’m not a Forth user, so I can’t opine on how useful a variable cell size might be. However, I am mindful of an old adage which says just because you can do something doesn’t mean you should do it.
