6502.org Forum

All times are UTC




Posted: Fri Sep 03, 2021 4:05 pm
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Hi guys, I recently came across this pattern in the Acorn DFS 0.90 ROM (http://mdfs.net/Info/Comp/BBC/DFS/DFS090.zip), and wondered whether any of you have come across or used something like it before, and what you all thought of it.

The idea is to create a routine that you can "JSR" into at the start of one of your own routines, and have it save all the registers and arrange for them to be restored automatically when your routine returns, without your routine having to explicitly pull them from the stack.

I initially thought the purpose was to save code size, as I guess it reduces 10 bytes of prologue/epilogue to just 3 bytes of prologue, and also allows tail calls at the end of routines as there's no longer any epilogue needed. But I'm not sure, because that's not a huge amount of space saving. It also strikes me that this is a kind of continuation, and potentially knowing the exact stack shape might allow it to arrange a kind of deferred return later on, e.g. after a latent operation has finished.
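
As a rough illustration of the size argument (do_work and both routine labels are hypothetical, not names from the ROM), a conventional NMOS 6502 save/restore costs 5 bytes on entry and 5 on exit, while the helper costs a single 3-byte JSR:

```asm
; conventional version: 5 + 5 bytes of save/restore around the body
routine_a:
    PHA
    TXA
    PHA
    TYA
    PHA
    JSR do_work        ; hypothetical body
    PLA
    TAY
    PLA
    TAX
    PLA
    RTS

; helper version: 3 bytes of prologue, no epilogue
routine_b:
    JSR saveregs
    JMP do_work        ; tail call: do_work's RTS lands in restoreregs,
                       ; which restores Y, X, A and returns to our caller
```

The JMP is safe here because routine_b has no epilogue of its own left to run.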

I'm not convinced though, so if anyone has any better ideas of the value, do let me know!

The code below is a reformatting of the disassembled code, with my own label names and comments to try to explain what's going on.

In a nutshell though, on entry we have a return address on the stack as usual - which is usually right at the start of one of the ROM's API entry points. And above that is the return address of the caller of that routine. What this code does is first push all the registers to save their values, and then it shuffles the stack frame around so that there are two return frames on the stack. The deepest frame restores the registers and returns to the caller's caller; and the shallow one restores the registers and returns to the caller itself. Between the two is inserted the address of the new epilogue code.

So after rearranging the stack, this fragment's epilogue runs, restoring the registers and returning to the immediate caller, but leaving this epilogue's own address on the stack. When the caller returns, it runs this epilogue again, which restores the registers from the deeper stack frame and returns to the caller's caller.
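
Working through the listing by hand, the stack just before execution falls into restoreregs for the first time should look like this, deepest entry first (same notation as the comments in the listing):

```asm
;     ...
;     callercaller_hi
;     callercaller_lo
;     A                  \
;     X                   | deeper frame, consumed when the caller's RTS
;     Y                  /  re-enters restoreregs
;     restoreregs_hi     \  address of restoreregs minus 1, as RTS expects
;     restoreregs_lo     /
;     caller_hi
;     caller_lo
;     A                  \
;     X                   | shallow frame, consumed by the fall-through
;     Y                  /  <- top of stack
```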

Code:
saveregs:
                               ; stack contents:
                               ;     ...
                               ;     callercaller_hi
                               ;     callercaller_lo
                               ;     caller_hi
                               ;     caller_lo
    PHA                        ;     A
    TXA
    PHA                        ;     X
    TYA
    PHA                        ;     Y
    LDA #>(restoreregs-1)
    PHA                        ;     restoreregs_hi
    LDA #<(restoreregs-1)
    PHA                        ;     restoreregs_lo

    ; duplicate A,X,Y and the caller address
    LDY #&05
loop1:
    TSX
    LDA &0107,X
    PHA
    DEY
    BNE loop1
                               ; now also another copy each of:
                               ;     caller_hi
                               ;     caller_lo
                               ;     A
                               ;     X
                               ;     Y

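    ; copy A,X,Y,restoreregs_hi,restoreregs_lo,caller_hi,caller_lo,A,X,Y up two positions,
    ; overwriting the original (deeper) copy of caller_hi, caller_lo, and leaving
    ; stale copies of X, Y still pushed at the top of the stack.
    ; X still holds the stack pointer from loop1's last TSX, taken before the
    ; final PHA, hence &0109/&010B rather than &010A/&010C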
    LDY #&0A
loop2:
    LDA &0109,X
    STA &010B,X
    DEX
    DEY
    BNE loop2

    PLA                        ; discard extra copies of X and Y at the end of the stack
    PLA

    ; When we fall through here, it restores Y,X,A and returns to the caller
    ; When this gets called a second time via the caller's RTS, it restores Y,X,A and returns to the caller's caller
restoreregs:
    PLA
    TAY
    PLA
    TAX
    PLA
    RTS


Posted: Fri Sep 03, 2021 7:26 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Wow, the code in the zip file is hard to look at! Without more description of its goal, I'd just say there's a place for similar things, but for most subroutines it would be considered a ton of unnecessary overhead.

You'll be interested in the topic at viewtopic.php?f=2&t=4562, "Structure: Best Practices? Subroutines vs. Main Program."

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Fri Sep 03, 2021 7:55 pm
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
GARTHWILSON wrote:
Wow, the code in the zip file is hard to look at! Without more description of its goal, I'd just say there's a place for similar things, but for most subroutines it would be considered a ton of unnecessary overhead.

Yes, the zip is mostly just a disassembly, with very few annotations. It's the main disc filesystem driver for the BBC Micro, so it implements a wide range of OS APIs, handles how the filesystem is stored on the disc (directories, files, etc) and also deals directly with the disc controller hardware.

As you said, it is a lot of overhead, if the only purpose is to save a few bytes in each subroutine.

Quote:
You'll be interested in the topic at viewtopic.php?f=2&t=4562, "Structure: Best Practices? Subroutines vs. Main Program."

Ah yes that looks very interesting, thank you!


Posted: Fri Sep 03, 2021 8:15 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
gfoot wrote:
the zip is mostly just a disassembly

Oh ok, that would explain the unreadable nature.

Quote:
As you said, it is a lot of overhead, if the only purpose is to save a few bytes in each subroutine.

Just the overhead there is longer than the majority of my entire subroutines.



Posted: Sat Sep 04, 2021 10:54 pm
Joined: Tue Feb 24, 2015 11:07 pm
Posts: 81
I quite like the idea: a "JSR saveregs" at the start of a subroutine so I don't have to think about preserving registers (and maybe even the carry flag) at the moment, then come back and optimize later. It's a shame that the approach precludes using the registers for return values.
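
One possible workaround, sketched here as an assumption rather than something shown in the disassembly above: the routine can overwrite the saved copy of A that restoreregs will later pull. With nothing else pushed since JSR saveregs returned, the stack holds restoreregs_lo, restoreregs_hi, Y, X, A from the top, so the saved A sits at &0105,X after a TSX. get_value is a hypothetical routine name.

```asm
get_value:                 ; hypothetical routine
    JSR saveregs
    LDA #&2A               ; the value to hand back
    TSX                    ; X = stack pointer (X is restored later anyway)
    STA &0105,X            ; replace the saved A in the stacked frame
    RTS                    ; restoreregs pulls Y, X, then the new A
```

The carry flag also survives the return, since PLA, TAY, TAX and RTS leave it alone, so it stays usable as a return value too.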

I especially like this in the linked discussion:
sark02 wrote:
I expect a subroutine to clobber A, X and Y unless it's an explicit feature of the subroutine to not do that. If you write a subroutine that promises not to clobber Y then, if it later evolves to do so, it's its responsibility to save/restore Y to maintain the promise.


I really struggle to settle on a policy for registers. At the moment, I use A and Y for parameters and expect them to be clobbered. It seems to me that parameters to subroutines are not often used after the JSR and can be saved by the caller when they are. Even though I don't use Forth, I use X as the data stack pointer, keeping it in "save_x" (in ZP) on the rare occasion when code needs X such as for the x,indirect addressing mode (indirect pointer in ZP at X). Extra parameters are then on the data stack at X+1, X+2, etc. X is mostly preserved naturally by the code by balanced DEX and INX, though branches can sometimes cause bugs here. The carry flag is useful as a return value if the caller will branch based on the return value. It's a relief to remember that leaf subroutines don't have to worry about being re-entrant.
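
A minimal sketch of the save_x idea described above, assuming save_x and ptr are zero-page labels defined elsewhere (ptr is a hypothetical two-byte pointer, not a name from this post):

```asm
    STX save_x             ; park the data stack pointer in zero page
    LDX #0
    LDA (ptr,X)            ; (zp,X) needs X as its index, here zero
    LDX save_x             ; data stack pointer back in X
```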


Posted: Sun Sep 05, 2021 7:26 am
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
unclouded wrote:
I really struggle to settle on a policy for registers. At the moment, I use A and Y for parameters and expect them to be clobbered.

In my own code I don't have a general policy; I adapt to the needs of what I'm writing.

Some subroutines are intended to have no side-effects, to make them ready to drop in - especially things like text and number output routines which are useful to add and remove during debugging.

Apart from those, I evolve local policies depending on what the code does. A lot of my subroutines are only called in one or a few places, so it's simple to make the caller responsible for saving anything it cares about which is also clobbered (no need to save things that aren't used in the called routine). But if all callers would need to save the same registers, and it would make the code more readable, I may save them in the callee instead.

Quote:
The carry flag is useful as a return value if the caller will branch based on the return value. It's a relief to remember that leaf subroutines don't have to worry about being re-entrant.

Flags for binary returns are really useful. I never save the flags through a subroutine call. I often use A for returned values, but not if the subroutine needs to save other registers on the stack: my recent code has been for the BBC Micro, where PLX/PLY are not available, so restoring X or Y has to go through A. In those cases, returning in X or Y is more straightforward. Again, these are local policies, which are fine when routines are only used internally in a few places.
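
For instance, a routine that reports its result in carry and leaves A, X and Y untouched might look like this (is_digit, not_digit and reject are hypothetical labels, and the convention of carry set meaning "no" is an assumption):

```asm
; C=0 if A holds an ASCII digit, C=1 otherwise
is_digit:
    CMP #&30               ; below "0"? CMP leaves C=0 in that case
    BCC not_digit
    CMP #&3A               ; C=0 for "0".."9", C=1 for anything above "9"
    RTS
not_digit:
    SEC
    RTS

; the caller branches directly on the returned flag
    JSR is_digit
    BCS reject
```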

I've considered suffixing labels with details of which registers a subroutine clobbers, but not gone that far yet.


Posted: Sun Sep 05, 2021 5:27 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
In my VTL02 interpreter I use X for a global ZP pointer and Y for a global statement offset pointer. Right before or right after interpreting a statement they're fair game, because moving on to the next statement (either program or command line) re-initializes them. Doing things this way is compact, but a little bit bug-prone. IIRC, all of my bugs while developing VTL02 involved not taking proper care of Y.

I have streamlined VTL02C as a back-burner project (smaller and slightly faster), but it's currently buggy, and I'm willing to bet it has something to do with Y ... I should have thoroughly tested my incremental changes more often, but hindsight's 20/20 ...

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Sun Sep 05, 2021 7:37 pm
Joined: Sat Dec 01, 2018 1:53 pm
Posts: 730
Location: Tokyo, Japan
I can't say that the idea of spending so much to automate pushing and pulling of just three registers looks that useful to me, given that on most 8-bit CPUs you're constantly in a state of "what's my state" when programming anyway. But I suppose there are circumstances where that sort of thing might be useful, at least as a temporary thing. Actually, come to think of it, certain debugging situations do come to mind.

But what triggered my interest here was:
gfoot wrote:
It also strikes me that this is a kind of continuation, and potentially knowing the exact stack shape might allow it to arrange a kind of deferred return later on, e.g. after a latent operation has finished.

I've been experimenting a bit with this sort of thing in the last little while. I've not ported it over to 6502 yet, but you can see some ideas in some 6800 code I consed up. I'm not entirely sure of the utility of this yet, but I do use continuations in a somewhat different form in my monitor, where I have a return specifically set up to go to a different location than where the "JSR" (not actually that instruction) was done, and I also store the stack pointer at a certain point to allow an unwind to that point from any arbitrary depth without having to know the depth.
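
On the 6502, the "store the stack pointer and unwind to it" part could be sketched roughly like this. It is not taken from the 6800 code or the monitor mentioned above; saved_sp, command_loop, run_command and abort are hypothetical names.

```asm
command_loop:
    TSX
    STX saved_sp           ; remember the stack depth at this point
    JSR run_command        ; may nest subroutine calls to any depth
    JMP command_loop

abort:                     ; reached with JMP from any depth inside run_command
    LDX saved_sp
    TXS                    ; drop everything pushed since the mark
    JMP command_loop
```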

_________________
Curt J. Sampson - github.com/0cjs

