FORTH - empty stack with TOS in register(s)

gilhad · Post by **gilhad** » Sun Aug 17, 2025 9:53 am

Problem: FORTH implementation of stack with TOS in register(s) and empty stack/underflow

I think something similar has surely been discussed somewhere on this site, but I did not find it.

I am writing my own FORTH for the ATmega2560 and I have to decide how to implement the stack. I want to keep TOS in register(s) for better speed, but I am not sure how to manage the state when the stack is empty (as opposed to a stack with one value, held just in registers, or with more values, held in registers plus the normal stack). I am able to write the code, but I do not know which approach I want/can/should choose, and I would like to read about how others solved such problems and why. (I know the ATmega2560 is not a 65*02, but my question is about FORTH and philosophy; examples on 6502 are welcome and readable for me.)

I may keep a flag saying whether the stack is empty or not, but that means updating it every time, again and again, at a speed penalty.

I may use an "off by one" slot "outside of the stack", which is filled with the "nonexistent previous value" when the first push is done and which is "popped in" when the last pop is done, which means somehow faking all stack checks.

I may implement a circular stack and just pretend that the start is in the middle of something, so that no underflow/overflow is even possible, just undefined values on underflow and loss of part of the history on overflow. (And there would be the problem of 3-byte values, which are not atomic, wrapping around.)
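To make the circular-stack option concrete, here is a minimal sketch in Python. The class name `RingStack` and the tiny size are assumptions for illustration only, and it models whole cells, so it sidesteps the 3-byte wrap-around problem rather than solving it.

```python
class RingStack:
    """Circular data stack: the index wraps, so it can never leave the buffer."""

    def __init__(self, size=8):
        self.cells = [0] * size   # contents are "undefined" until written
        self.size = size
        self.ptr = 0              # index of the current top cell

    def push(self, value):
        self.ptr = (self.ptr - 1) % self.size   # wrap instead of overflowing
        self.cells[self.ptr] = value

    def pop(self):
        value = self.cells[self.ptr]
        self.ptr = (self.ptr + 1) % self.size   # wrap instead of underflowing
        return value


s = RingStack(size=4)
for v in (1, 2, 3, 4, 5):     # the fifth push overwrites the oldest value (1)
    s.push(v)
popped = [s.pop() for _ in range(5)]   # the fifth pop returns a stale value
```

Nothing traps here: overflow silently loses the oldest entry and underflow silently returns whatever is left in the buffer, which is exactly the trade-off described above.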

Or there may be other approaches, and there may also be other "edges and flaws" (good and bad sides) which I am not aware of, or where I do not fully know how good/bad they are in reality.

More background:

I am building my own 8-bit computer around the HD6309 (and maybe more variants with 65*02 and Z80 too) and I want it to be able to use VGA + keyboard, so I am building an extra graphic card with an ATmega2560 to provide this as simplified I/O. I am developing the HW for this "graphic card" mainly here: https://github.com/githubgilhad/MegaHomeFORTH

(Image: graphic card in development/testing)

and it can also work as SBC on its own

(Image: works also as SBC)

and I found it extremely useful to use FORTH for interactive testing and "poking legs" / manipulating I/O pins.

I first implemented something inspired by JonesForth, but written in a mix of C/C++/.ino/asm and with a lot of debugging tools inside (memory dumps, range checking ...). It somehow works, but I have a constant struggle with data sizes (cell = 16 bits, address = 24 bits, double = 32 bits; C uses __memx pointers, but C++ can use only uint32_t ...).

Also, I am trying to keep as many words as possible in FLASH (~ROM) to save as much RAM as possible.

On the ATmega, access to FLASH (~ROM) needs different instructions than access to RAM, and a program can be executed only from FLASH. So I am trying to write a new, better implementation of FORTH, mainly in assembler, where everything will be 24 bits / 3 bytes in size and routines/macros will manage all the important parts where I know what I want to do but C/C++ is not convenient for it: https://github.com/githubgilhad/memxFORTH-asm. I want this one to run fast, as I hope to use it as part of the firmware for the graphic card, which could be enhanced at runtime (the 6309/6502/Z80 would send some routines, both graphics and others, to the "ATmega2560 coprocessor" to run in parallel). Generating the VGA signal takes about 90% of the time, and "normal work" can use only the blanking intervals and borders, so speed is valued.

And later I will probably use FORTH for the main (6309/6502/Z80) processor too. It will of course be written anew for that processor, but probably on the proven principles from its many predecessors

(I mean the variant for atmega328, the memxFORTH-core, the MegaHomeFORTH, the testing variant for PC, the memxFORTH-asm variant ... will continue with 6309 variant, 6502 variant and so ...)

barrym95838 · Post by **barrym95838** » Sun Aug 17, 2025 3:46 pm

So, you're going to have TOS in a register and NOS etc. in memory. It might not be easy to conceptualize what a stack with one or zero (or less than zero) entries looks like, but it's surprisingly simple in code. Say that you have a register R0 that holds TOS if present and another register R1 that points to NOS if present. If you consider NOS, you should be able to tell if R1 points to nonsense if it points one or more cells past your designated stack area and you should be able to tell if R0 holds nonsense if R1 points two or more cells past. I could try to make an ASCII art diagram, but I'm challenged for time at the moment. This is only slightly more complicated than pointing R1 directly at TOS (in memory), and comes with some advantages (or not), depending on your architecture.
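The R0/R1 rule above can be written down as a small classification. This is only a sketch in Python under stated assumptions: the stack area is modelled as cell indices 0..15, the stack grows downward, and the names `STACK_CELLS`, `PAST_END`, `depth` and `describe` are hypothetical.

```python
STACK_CELLS = 16            # hypothetical size of the in-memory stack area
PAST_END = STACK_CELLS      # first cell index past the area; the stack grows downward


def depth(r1):
    """Item count implied by R1, the index of the cell that would hold NOS."""
    return PAST_END + 1 - r1


def describe(r1):
    d = depth(r1)
    if d < 0:
        return "underflow"
    if d == 0:
        return "empty: R0 holds nonsense"            # R1 is two cells past the area
    if d == 1:
        return "one item: R0 valid, R1 points at nonsense"   # R1 is one cell past
    return "R0 valid, R1 points at a valid NOS"
```

With two items on the stack R1 is `PAST_END - 1` (the last real cell), with one item it is `PAST_END`, and with none it is `PAST_END + 1`, so no separate "empty" flag has to be maintained.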

P.S. I was going to refer you to Dr. Brad's excellent articles that touch on this subject and more, but they seem to have fallen by the wayside.

GARTHWILSON · Post by **GARTHWILSON** » Sun Aug 17, 2025 4:12 pm

gilhad wrote:

Problem: FORTH implementation of stack with TOS in register(s) and empty stack/underflow

I think something similar has surely been discussed somewhere on this site, but I did not find it.

I am writing my own FORTH for the ATmega2560 and I have to decide how to implement the stack. I want to keep TOS in register(s) for better speed, but I am not sure how to manage the state when the stack is empty (as opposed to a stack with one value, held just in registers, or with more values, held in registers plus the normal stack). I am able to write the code, but I do not know which approach I want/can/should choose, and I would like to read about how others solved such problems and why. (I know the ATmega2560 is not a 65*02, but my question is about FORTH and philosophy; examples on 6502 are welcome and readable for me.)

This is specifically a 65xx forum, and we need to keep that focus; so please include some 65xx content. When we discussed this (TOS in register) years ago for the 65816 (whose registers can be set to 16-bit width when in native mode), I went through some exercises to see if having TOS in a register would be beneficial, and although there are places where it would be, there are others where it actually adds overhead. So I'd have to actually write an entire kernel and run programs on both kernels to find out if it was a wash or not. I'm not familiar with ATMegas, but I wonder if there's an economical way to keep a flag that would indicate a zero or negative number of data-stack cells. In my '02 and '816 Forth, stack depth is only checked for underflow when execution is returned to the interpreter, not while running compiled code, and it's just not a problem. While X is used as the data-stack pointer, it will point to a memory location whose contents are not significant if the data stack is empty. The depth can of course be tested, with the word DEPTH; but it's seldom necessary.
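As an illustration of checking only when control returns to the interpreter, here is a toy sketch in Python. It is not taken from any real kernel: the names `CELLS`, `GUARD`, `S0`, `WORDS` and `interpret` are hypothetical, and the guard cells merely stand in for "memory whose contents are not significant".

```python
CELLS = 8                      # hypothetical stack size, in cells
GUARD = 4                      # slack past the base so unchecked pops stay inside the list
mem = [0] * (CELLS + GUARD)
S0 = CELLS                     # pointer value of an empty stack (stack grows downward)
sp = S0


def push(value):               # no checks: this is the fast path
    global sp
    sp -= 1
    mem[sp] = value


def pop():                     # no checks here either
    global sp
    value = mem[sp]
    sp += 1
    return value


def add():
    push(pop() + pop())


WORDS = {"+": add, "DROP": pop}


def interpret(line):
    """Run one input line; the depth is tested only between words."""
    global sp
    for token in line.split():
        if token in WORDS:
            WORDS[token]()
        else:
            push(int(token))
        if sp > S0:            # pointer ended up past the empty position
            sp = S0            # reset the stack, as an interpreter would on abort
            return "stack underflow"
    return "ok"


status_ok = interpret("1 2 +")        # leaves 3 on the stack
top = mem[sp]
status_bad = interpret("DROP DROP")   # the second DROP underflows
```

A check of this kind catches only net underflow: a word that pops too much and then pushes again can finish with a legal pointer, which is part of why the primitives themselves stay cheap.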

barrym95838 · Post by **barrym95838** » Sun Aug 17, 2025 5:31 pm

GARTHWILSON wrote:

When we discussed this (TOS in register) years ago for the 65816 (whose registers can be set to 16-bit width when in native mode), I went through some exercises to see if having TOS in a register would be beneficial, and although there are places where it would be, there are others where it actually adds overhead. So I'd have to actually write an entire kernel and run programs on both kernels to find out if it was a wash or not.

Charlie's NMOS DTC Pettil kernel maintains a separate TOS, and I don't think I've seen anything more efficient for its primitive target.

BigEd · Post by **BigEd** » Sun Aug 17, 2025 6:46 pm

I think Garth is saying that a correct program won't access an empty stack, so it's not a problem.

If there were a need to check that the stack items being accessed are valid, then it feels to me that a stack pointer allows for that. Even if TOS and even if NOS is in registers, there would still be a bigger actual stack somewhere, and it would have a stack pointer. It's just that there are two or three values for that stack pointer which are slightly special, because they mean there's nothing in the actual stack, and also that there's something or nothing in the TOS and maybe NOS.

There's a nice table and explanation in Dr Brad's linked page, so thanks for that link!

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sun Aug 17, 2025 8:45 pm

barrym95838 wrote:

P.S. I was going to refer you to Dr. Brad's excellent articles that touch on this subject and more, but they seem to have fallen by the wayside.

Are you referring to this page?

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sun Aug 17, 2025 9:01 pm

gilhad wrote:

Problem: FORTH implementation of stack with TOS in register(s) and empty stack/underflow...I am writing my own FORTH for atmega2560 and I have to decide, how to implement stack, where I want to keep TOS in register(s) for better speed, but I am not sure, how to manage the state, when the stack is empty.

I’m not familiar with the 2560, so I can’t offer any advice on using it. I do know that in 6502 Forth kernels, the X-register often acts as a de facto stack pointer by virtue of its use in accessing a zero-page data stack. Obviously, if something needs to determine if the stack is empty it can do so by comparing .X to a TOS value determined at the time the kernel’s source code is being developed.

leepivonka · Post by **leepivonka** » Sun Aug 17, 2025 9:09 pm

An easy way to implement TOS in register:
Use the stack pointer as if TOS was on the stack.
Empty-stack & Depth work the same way as TOS on stack.
Replace references to TOS on stack with TOS in register.
Remember to update TOS when the stack ptr is changed.

Parameter TOS & NOS on stack =====================================================

n entry array Parm[n]

Empty_Stack:  ptr = n

Depth:	Push(n-ptr)

Push:	ptr -= 1
	Parm[ptr] = value

Pop:	value = Parm[ptr]
	ptr += 1

Drop:	ptr += 1

Dup:	ptr -= 1
	Parm[ptr] = Parm[ptr+1]

Minus:	Parm[ptr+1] = Parm[ptr+1] - Parm[ptr]
	ptr += 1

Parameter TOS in register, NOS on stack ============================================

n+1 entry array Parm[0..n]  (Parm[0] never used; Parm[n] only receives the undefined TOS spilled by the first push)
Use TOS in register instead of Parm[ptr]

Empty_Stack:  ptr = n

Depth:	Push(n-ptr)

Push:	ptr -= 1
	Parm[ptr+1] = TOS
	TOS = value

Pop:	value = TOS
	TOS = Parm[ptr+1]
	ptr += 1

Drop:	TOS = Parm[ptr+1]
	ptr += 1

Dup:	ptr -= 1
	Parm[ptr+1] = TOS

Minus:	TOS = Parm[ptr+1] - TOS
	ptr += 1
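For anyone who wants to step through the TOS-in-register listing, here is a direct transcription into runnable Python. It is only a sketch: the class name `TosStack` and the size `N` are hypothetical, and the extra cell at index `N` is my reading of where the first push spills the undefined register contents.

```python
N = 8                          # hypothetical stack size in cells


class TosStack:
    def __init__(self):
        # One extra cell: parm[N] only ever receives the meaningless TOS
        # spilled by the first push; parm[0] is never used.
        self.parm = [None] * (N + 1)
        self.ptr = N           # empty stack
        self.tos = None        # register contents are undefined while empty

    def depth(self):
        return N - self.ptr

    def push(self, value):
        self.ptr -= 1
        self.parm[self.ptr + 1] = self.tos   # spill old TOS to memory
        self.tos = value

    def pop(self):
        value = self.tos
        self.tos = self.parm[self.ptr + 1]   # refill TOS from memory
        self.ptr += 1
        return value

    def minus(self):
        self.tos = self.parm[self.ptr + 1] - self.tos   # NOS - TOS
        self.ptr += 1


s = TosStack()
s.push(7)
s.push(3)
s.minus()                      # 7 - 3, result stays in the TOS register
```

Note that `depth()` and the empty test are identical to the all-in-memory version; only the place where the top value lives has changed.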

Parameter TOS & NOS in register ==================================================

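n+2 entry array Parm[0..n+1]  (Parm[0] & Parm[1] never used; Parm[n] & Parm[n+1] only receive the undefined register contents spilled by the first two pushes)
Use TOS in register instead of Parm[ptr]
Use NOS in register instead of Parm[ptr+1]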

Empty_Stack:  ptr = n

Depth:	Push(n-ptr)

Push:	ptr -= 1
	Parm[ptr+2]= NOS
	NOS = TOS
	TOS = value

Pop:	value = TOS
	TOS = NOS
	NOS = Parm[ptr+2]
	ptr += 1

Drop:	TOS = NOS
	NOS = Parm[ptr+2]
	ptr += 1

Dup:	ptr -= 1
	Parm[ptr+2] = NOS
	NOS = TOS

Minus:	TOS = NOS - TOS
	NOS = Parm[ptr+2]
	ptr += 1

Parameter stack without ptr =====================================================

4 level stack: TOS, NOS, 3rd, 4th (HP RPN calculator style)

Empty_Stack: does not apply

Depth:	does not apply

Push:	4th = 3rd
	3rd = NOS
	NOS = TOS
	TOS = value

Pop:	value = TOS
	TOS = NOS
	NOS = 3rd
	3rd = 4th

Drop:	TOS = NOS
	NOS = 3rd
	3rd = 4th

Dup:	4th = 3rd
	3rd = NOS
	NOS = TOS

Minus:	TOS = NOS - TOS
	NOS = 3rd
	3rd = 4th
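The pointerless four-level variant can be tried out with the following Python sketch. The class name `Rpn4` and the zero initial values are assumptions; the register moves follow the listing above.

```python
class Rpn4:
    """Four fixed registers and no stack pointer, as in the listing above."""

    def __init__(self):
        self.tos = self.nos = self.third = self.fourth = 0

    def push(self, value):
        # Everything moves up one level; the old 4th is lost.
        self.fourth, self.third, self.nos, self.tos = (
            self.third, self.nos, self.tos, value)

    def pop(self):
        value = self.tos
        # Everything moves down one level; 4th keeps its value (it replicates).
        self.tos, self.nos, self.third = self.nos, self.third, self.fourth
        return value

    def minus(self):
        self.tos, self.nos, self.third = (
            self.nos - self.tos, self.third, self.fourth)


r = Rpn4()
for v in (1, 2, 3, 4, 5):      # the fifth push drops the oldest value (1)
    r.push(v)
popped = [r.pop() for _ in range(6)]
```

Since 4th is copied down and never cleared, popping past the four stored values keeps returning the last 4th value, so "empty" and "depth" really have no meaning here.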

gilhad · Post by **gilhad** » Sun Aug 17, 2025 11:19 pm

Thank you all very much; your answers helped me find what I was looking for (both in these answers and in the linked articles, the mentioned discussions, etc.). I will need some time to read everything and think it through, and then do some tests to see how it applies to my situation.

I can "move bytes around" technically, but my problem was why to prefer one way over another and what to consider about it.

Publications by Bradford J. Rodriguez have many answers inside - here the Part 1: Design Decisions in the Forth Kernel is especially what I should read now

The visualisation of an empty stack with TOS (NOS...) in registers and the stack pointer "outside of the RAM stack area" is now clear in my head and makes perfect sense.

As GARTHWILSON wrote somewhere, I realise that I basically need to check for stack underflow only in the INTERPRETER when testing new words, and the check there can easily be more expensive, as it waits for a slow human anyway.

Also, I can simply use a lot of CANARY values as a precaution (like pushing 10 values such as DEAD BEAF C0FFEE BABE and then simply checking that they are intact on the stack after my tests).
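A minimal sketch of the canary idea in Python, where a plain list stands in for the data stack. The helper name `run_with_canaries` and the two test words are hypothetical; the canary values are the ones mentioned above.

```python
CANARIES = [0xDEAD, 0xBEAF, 0xC0FFEE, 0xBABE]   # values from the post above


def run_with_canaries(stack, test):
    """Plant canaries under the test's working area and report whether they survive."""
    base = len(stack)
    stack.extend(CANARIES)
    try:
        test(stack)
    except IndexError:
        return False           # popped past even the canaries (list ran empty)
    return stack[base:base + len(CANARIES)] == CANARIES


def balanced(stack):           # pushes two values, consumes two, leaves one result
    stack.append(1)
    stack.append(2)
    stack.append(stack.pop() + stack.pop())


def unbalanced(stack):         # consumes a value it never pushed
    stack.pop()
    stack.append(0)


ok_result = run_with_canaries([], balanced)
bad_result = run_with_canaries([], unbalanced)
```

The second test ends with the same depth it started with, so a depth check alone would not notice it; the overwritten canary does.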

GARTHWILSON wrote:

When we discussed this (TOS in register) years ago for the 65816 (whose registers can be set to 16-bit width when in native mode), I went through some exercises to see if having TOS in a register would be beneficial, and although there are places where it would be, there are others where it actually adds overhead. So I'd have to actually write an entire kernel and run programs on both kernels to find out if it was a wash or not.

Yes, this is what I am looking for - some discussion about why yes, why not, and which way. I just cannot find the discussions easily, and there are too many good discussions here, so pointing me to them is really welcome.

BDD: the ATmega is not important for me here; the reasons behind the different stack implementations are what I am looking for. I can translate the meaning into code somehow, be it for the ATmega now, the HD6309 later, or the 6502 after that. The code will not be the same, but the meaning will be. The results of measuring the implementations will differ, but that is OK, because I will know what I am doing and why: make the stack work fast and effectively with the tools I have at hand.

I currently have about 20 tabs open which I need to read (and I have already read some more), and all of it is relevant for me (very happy), while just before asking here I could not find a relevant source on my own.

So I am glad you all helped me here so much.

BigEd · Post by **BigEd** » Mon Aug 18, 2025 5:59 am

A couple of threads and posts within threads which might be of interest:

Post subject: Re: How text input works with ANSI Forth (with Gforth tricks
Post subject: Re: Using the Y-reg as the IP ptr
Checking for Data Stack under- and overflow on the 65816
Why oh why can't we use the 65816 stack as the Data Stack?

barrym95838 · Post by **barrym95838** » Mon Aug 18, 2025 7:00 am

BigDumbDinosaur wrote:

barrym95838 wrote:

P.S. I was going to refer you to Dr. Brad's excellent articles that touch on this subject and more, but they seem to have fallen by the wayside.

Are you referring to this page?

For reasons above my paygrade, my browser tells me I can't get there from here at present:

BigEd · Post by **BigEd** » Mon Aug 18, 2025 7:02 am

I think that's just for you. Try https://browserling.com/

GARTHWILSON · Post by **GARTHWILSON** » Mon Aug 18, 2025 7:16 am

Interesting. It doesn't work for me either, including on browserling.

BigEd · Post by **BigEd** » Mon Aug 18, 2025 7:32 am

Well that’s something I really can’t explain.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Mon Aug 18, 2025 8:00 am

I just re-checked it and the page loaded, both with my regular browser (Seamonkey) and through Browserling.
