Why is pulling from the stack slower than pushing?
-
jeffythedragonslayer
- Posts: 114
- Joined: 03 Oct 2021
Why is pulling from the stack slower than pushing?
Why is pulling from the stack on the 65816 always one cycle slower than pushing that same register to the stack?
I tend to think of these operations as pseudo-inverses of each other. Pushing and then pulling, or pulling and then pushing, generally restores the stack to the state it was in before, but only pushing and then pulling preserves the register's value; pulling and then pushing a register discards its old value.
But these operations still seem quite symmetric - what is the extra cycle spent doing during a pull?
Re: Why is pulling from the stack slower than pushing?
jeffythedragonslayer wrote:
what is the extra cycle spent doing during a pull?
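For a pull, S has to be incremented before the read can occur, and that increment takes a cycle of its own. (For a push, the decrement of S happens after the write has occurred, and the decrement can be overlapped with the fetch of the next opcode, making it appear to consume zero cycles.)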
-- Jeff
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
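For reference, the timing difference under discussion, as given in the WDC 65C816 opcode tables. The counts are for 8-bit registers unless noted; each A, X or Y count grows by one cycle when that register is 16 bits wide. The ';' comment syntax assumes a conventional assembler:
Code:
PHA    ; 3 cycles
PLA    ; 4 cycles
PHX    ; 3 cycles
PLX    ; 4 cycles
PHY    ; 3 cycles
PLY    ; 4 cycles
PHP    ; 3 cycles (always 8 bits)
PLP    ; 4 cycles
PHB    ; 3 cycles (always 8 bits)
PLB    ; 4 cycles
PHD    ; 4 cycles (always 16 bits)
PLD    ; 5 cycles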
- commodorejohn
- Posts: 299
- Joined: 21 Jan 2016
- Location: Placerville, CA
- Contact:
Re: Why is pulling from the stack slower than pushing?
That explains a bit; always wondered about that myself.
-
jeffythedragonslayer
- Posts: 114
- Joined: 03 Oct 2021
Re: Why is pulling from the stack slower than pushing?
Very fascinating. I would be interested in seeing which parts of the processor can be engaged simultaneously during a single clock cycle. Since doing a lot of adjacent pulls is so slow, I'm finding it useful to create stack unwinding macros that increment the stack pointer back to where it was at the beginning of the subroutine.
Re: Why is pulling from the stack slower than pushing?
jeffythedragonslayer wrote:
I'm finding it useful to create stack unwinding macros that increment the stack pointer back to where it was at the beginning of the subroutine.
Are you talking about exception handling in assembly language? Something like this?
Code:
\ let a subroutine in the nesting
\ return to this subroutine's caller
\ if there is an error
LABEL SOMEROUTINE
...
TSX
STX ERR_RETURN
JSR SOMEWHERE
...
LABEL SOMEWHERE
...
JSR SOMEWHERE_ELSE
...
LABEL SOMEWHERE_ELSE
...
BCC NOERROR
\ oops, there was an error
LDX ERR_RETURN
TXS
LABEL NOERROR
RTS
\ The main routine.
...
JSR SOMEROUTINE
...
-
jeffythedragonslayer
- Posts: 114
- Joined: 03 Oct 2021
Re: Why is pulling from the stack slower than pushing?
Not really - I just wanted to speed up restoring the stack pointer.
Re: Why is pulling from the stack slower than pushing?
You can use TSX and then save X somewhere, so then you can use TXS to restore the previous stack pointer.
Possibly you can even store the previous stack pointer on the stack, after having put the things you want on there.
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Why is pulling from the stack slower than pushing?
BigEd wrote:
Possibly you can even store the previous stack pointer on the stack, after having put the things you want on there.
Yes, but if you've changed the stack pointer how would you get back the old one that was written to the stack, a stack whose location was changed after saving the entry stack pointer?
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: Why is pulling from the stack slower than pushing?
I think it's fairly normal, isn't it, to have a stack frame and to push the previous value?
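A sketch of how that could look, assuming a 65816 in native mode with 16-bit index registers (on a 65C02 the same sequence works with the 8-bit S). The label and the three pushed values are made up for illustration, and the ';' comments assume a conventional assembler:
Code:
SOMEROUTINE:
    TSX          ; X = stack pointer at entry
    PHA          ; push or allocate whatever the routine needs
    PHA
    PHA
    PHX          ; last of all, push the entry stack pointer
    ...          ; body: pushes and pulls must balance
    PLX          ; S is back where PHX left it, so this pulls the entry value
    TXS          ; discard everything pushed since entry in one step
    RTS
This only works if S at the PLX is the same as it was just after the PHX. If S has been moved arbitrarily in between, the saved value can no longer be found by pulling.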
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Why is pulling from the stack slower than pushing?
BigEd wrote:
I think it's fairly normal, isn't it, to have a stack frame and to push the previous value?
Push the previous value to where?
If I arbitrarily change the stack pointer after pushing something, I no longer have the original stack pointer to tell me where to go to pull what I pushed. The only way in which one could use the stack to store the current stack pointer before changing it would be to treat the stack as memory, not a stack. That can be readily done with the 65C816 and some stack pointer arithmetic, but isn’t nearly as simple with the 65(C)02.
Re: Why is pulling from the stack slower than pushing?
Well, one of us is confused!
-
jeffythedragonslayer
- Posts: 114
- Joined: 03 Oct 2021
Re: Why is pulling from the stack slower than pushing?
I don't arbitrarily change the stack pointer. I use the stack for automatic variables like in C, and I can tell at edit-time how many bytes I have allocated, so stack unwinding becomes simply incrementing the stack pointer up that many bytes.
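A minimal sketch of that kind of unwinding, assuming native mode with a 16-bit accumulator and 8 bytes of locals. LOCALS_SIZE is a made-up name, and this is not necessarily how the macros mentioned here are written:
Code:
LOCALS_SIZE = 8
    TSC                 ; 2 cycles: A = S
    CLC                 ; 2 cycles
    ADC #LOCALS_SIZE    ; 3 cycles with a 16-bit accumulator
    TCS                 ; 2 cycles: S = A, the locals are discarded
That is 9 cycles whatever the size of the allocation, against 20 cycles for four 16-bit PLAs, at the cost of clobbering A and the flags.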
Re: Why is pulling from the stack slower than pushing?
Had you considered saving and restoring it with TSX and TXS?
-
jeffythedragonslayer
- Posts: 114
- Joined: 03 Oct 2021
Re: Why is pulling from the stack slower than pushing?
Yes that is how one of my stack unwinding macros works.
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Why is pulling from the stack slower than pushing?
jeffythedragonslayer wrote:
Yes that is how one of my stack unwinding macros works.
That's adequate, as long as the function expects a fixed number of elements in the stack frame, and no recursion is involved. Get either one of those into the picture and you will need a more advanced means of processing the stack.