Reentrant Programming: Small versus Big Software Stacks

barrym95838 · Post by **barrym95838** » Fri Jan 26, 2018 6:27 am

Martin_H wrote:

The second is a 16 bit stack pointer in page zero that grows downward from ram top: https://github.com/Martin-H1/Lisp65/blo ... gstack.asm

You might end up encountering some difficulties assembling those stz (_sp),y instructions

Mike B.

GARTHWILSON · Post by **GARTHWILSON** » Fri Jan 26, 2018 6:28 am

Druzyek wrote:

Quote:

You can save X on the stack too, but then take that into account when formulating the number to add the index to.

Would you do that by having the macro assign a temporary value that is added to the hard coded base address and replace all push and pull arguments with macros that increase or decrease that value? It seems like it would be tough to keep adding and subtracting to the base yourself, especially if you go back and add a push or pull and have to redo the following ones.

I haven't thought about how to have a macro do that. I guess I'd like to get more code done with it to identify a pattern so the macro truly clarifies things. However, the following is from the parameter-passing chapter of the 6502 stacks treatise, about 30% of the way down the long page:

- If you want to access that output data that's on the stack before pulling it off, perhaps even in arbritrary order, remember that X is still valid from the subroutine using it for indexing into the stack, as shown below. If you've done TSX again, adjust your indexed numbers to account for the fact that the return address is no longer there, and in the case below, that there was a PLA after the TSX such that now after the subroutine return, the outputs are at 101,X to 104,X. If you have not done TSX again, the old indexed numbers will still be valid.
  
  The example above calls the UM_STAR subroutine below which takes Bruce Clark's improvement on my commented bug fix on the UM* multiplication in fig-Forth, and it modifies it for I/O on the hardware stack. The structures are per my structure macro article and source code linked there. They assemble exactly the same thing you would by hand, but make the conditions and branches in the source code clearer.
  Code: Select all
```
UM_STAR: LDA #0                 ; Unsigned, mixed-precision (16-bit by 16-bit input, 32-bit output)
         PHA                    ; multiply.  Add a variable byte to the stack, initializing it as 0.
         TSX                    ; Now 101,X holds that new variable, 102,X and 103,X hold the return
         LSR $107,X             ; address, and 104,X to 107,X holds the inputs and later the outputs.
         ROR $106,X
         FOR_Y  16, DOWN_TO, 0  ; Loop 16x.  The DEY, BNE in NEXT_Y below will drop through on 0.
             IF_CARRY_SET
                 CLC
                 PHA            ; Note that the PHA (and PLA below) doesn't affect the indexing.
                    LDA $101,X
                    ADC $104,X
                    STA $101,X
                 PLA
                 ADC $105,X
             END_IF
             ROR
             ROR $101,X
             ROR $107,X
             ROR $106,X
         NEXT_Y
         STA $105,X
         PLA                    ; Retrieve the variable byte we added at the top, cleaning up the stack.
         STA $104,X             ; Again note that the PLA changed S but not X, so the 104 is still 104.
         RTS
 ;------------------
```
  (Note: In many situations, it will be advantageous to give names to the items on the stack, using EQUates, so it's more clear what 101,X is, what 102,X is, 103,X, and so on. (But don't forget to still put the ,X after them.) More on that later.

Druzyek · Post by **Druzyek** » Fri Jan 26, 2018 8:12 am

Quote:

I haven't thought about how to have a macro do that.

I'm not sure if the ca65 macros, which are really great, could do it either. I could not get it to count how many arguments I needed in an argument and then push that many arguments, regardless of where the function was called. Instead, I pushed the count of arguments to the stack so my loop takes a few extra cycles. I do think it would be very doable with a custom macro system. I wonder if I could pass a source through a macro program without assembling then change some things and run it back through. That would be extremely powerful without having to write a new macro system.

Quote:

Note that the PHA (and PLA below) doesn't affect the indexing

This is really slick. I think what I'm afraid of with this kind of stack stuff is needing nested X and Y loops that also need to use X to index into the stack and the value of X for something else. It seems like you would have a lot of switching overhead.

Would you tend to use X indexing for everything until you absolutely need the extra speed? What do you like about doing it this way over zero page with no X index?

GARTHWILSON · Post by **GARTHWILSON** » Fri Jan 26, 2018 8:59 am

Something I liked about the 2500AD assembler I used at work from 1986 to about 1990 was that it allowed you to say in a macro definition in essence "if there's a fourth macro parameter, do this with it;" or "If there's a fifth one, do the following with it." The C32 assembler I use now requires you to use the same number of parameters in the macro invocation that the macro was written to use, although I suppose you could pad unused ones with 0's or whatever value would guarantee you're not using it (although that doesn't help readability).

I have never used a macro that I myself did not write though. They're very quick to write, assuming you know what you want. They can also have any amount of conditional assembly you could want. As much as I've used and written about them, I think there's still a lot more that macros could be used to do that I haven't thought of yet, or at least haven't thought of how to do, like parsing a string. Of course, it would depend on the particular assembler's macro capabilities, how powerful its macro language is. (Most assemblers are pretty similar in this respect though.) This is all part of extending the power and capabilities of assembly language. I don't think we're anywhere near having maxed it out yet. We just need to get more creative and resourceful and share our developments to build upon.

Much of what's been talked about in the preceding posts was about passing parameters on the hardware stack, C-style. Sometimes that will be appropriate; but it does come with penalties, as I describe in the 6502 stacks treatise. Virtually all of my own parameter-passing by stacks goes on in the ZP data stack, using X as the stack pointer. This is in Forth though, so I very seldom need X for anything else. When I do, I save and restore it. As they say, you can get 95% of max performance with only 5% of your code in assembly, provided it's the right 5%, the critical 5%. In Forth it's very easy to dip into assembly when you need the performance or the tightest control of the machine. In those cases, I have to remember to save and restore X if I use it for something besides the data-stack pointer.

Druzyek · Post by **Druzyek** » Sat Jan 27, 2018 1:56 am

Quote:

The C32 assembler I use now requires you to use the same number of parameters in the macro invocation that the macro was written to use, although I suppose you could pad unused ones with 0's or whatever value would guarantee you're not using it (although that doesn't help readability).

I swear no one is paying me to shill for the ca65 assembler, I am just really impressed. You can supply less than the number of arguments the macro was written to use and compile conditionally based on whether an argument is blank or how many arguemtns it receives: http://www.cc65.org/doc/ca65-12.html

Quote:

This is all part of extending the power and capabilities of assembly language. I don't think we're anywhere near having maxed it out yet. We just need to get more creative and resourceful and share our developments to build upon.

Could be. Maybe I am premature in saying you can't do what I want to with macros. There are two problems I have not been able to see how macros could solve without being designed specifically to do it: