
All times are UTC




Posted: Sun Mar 01, 2015 9:11 pm

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Although I've not posted recently regarding my M65C02A processor core, I am still working on it. I've implemented the Forth VM instructions and IP/W module as discussed on another thread. Testing of the core's Forth VM-specific instructions has not been completed.

I've been distracted from that task by the task of documenting the overall architecture of my M65C02A core with all of its bells and whistles added. Along the way, I decided that I wanted the core to support the 65816 MVP/MVN instructions in an interruptible manner, which will require saving the instruction state in a recoverable manner on the internal microprogram stack. In addition, I wanted the core to provide improved support for languages such as C and Pascal, which rely on stack frames.

To understand this issue better, I've undertaken to port Ron Mak's Pascal compiler to the M65C02A. In the process, I've come to the conclusion that, to effectively support the stack frame required by Pascal and C, the stack pointer relative addressing mode I implemented (taken from the 65816) would be better implemented as a base pointer (BP) relative addressing mode.

In other words, I think that marking the stack on entry into a subroutine by pushing the current BP and copying the stack pointer into a BP would provide the necessary addressing infrastructure to more efficiently support C and Pascal on the core. I am thinking that the X register is the natural choice for the BP.

The issue, as I see it, with the stack-relative addressing mode is that of keeping track of the offsets to parameters and local variables as the system stack grows during execution. The solution implemented by most processors is to dedicate a register which can be used to mark the stack. Parameters and external variables (a la Pascal) are referenced with positive offsets from this base register. Local variables and temporary storage are accessed with negative offsets from the base register. With the stack being a natural place to allocate temporary storage during execution on the 6502/65C02, I think that the utility of the stack-relative addressing mode will be limited.

Thus, I am thinking that ORA/AND/EOR/ADC/STA/LDA/CMP/SBC s,S might be better implemented as ORA/AND/EOR/ADC/STA/LDA/CMP/SBC b,B, where B is the X register. With X functioning as the base pointer, temporaries can still be used without affecting the (constant) offsets to the function's parameters and local variables. In addition, the post-indexed stack-relative indirect addressing mode would become the post-indexed base-relative indirect addressing mode: ORA/AND/EOR/ADC/STA/LDA/CMP/SBC (b,B),Y. This addressing mode would be somewhat unique for the 6502/65C02 architecture in that both index registers would be used in its implementation. When coupled with the register override prefix instructions, I see that a base-relative addressing mode, even one limited to a signed 8-bit offset (a la the Bcc instructions), would provide improved access to parameters and variables on the stack (or even on the M65C02A's auxiliary stack).
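
A minimal sketch of the address arithmetic behind a signed 8-bit base-relative operand, written in Python rather than assembly so it can be run anywhere. The function names and the wrap-around at $FFFF are assumptions made for illustration; they are not taken from the M65C02A documentation.

```python
def sign_extend8(offset_byte):
    """Interpret an 8-bit operand byte as a signed value in -128..127."""
    offset_byte &= 0xFF
    return offset_byte - 0x100 if offset_byte & 0x80 else offset_byte


def base_relative_address(base, offset_byte):
    """Effective address of a hypothetical 'b,B' operand: base + signed offset.

    Assumes a 16-bit address space that wraps at $FFFF.
    """
    return (base + sign_extend8(offset_byte)) & 0xFFFF


# With the base pointer at $01F0, operand byte $06 reaches a parameter
# above the base and operand byte $FE (-2) reaches a local below it.
parameter_address = base_relative_address(0x01F0, 0x06)
local_address = base_relative_address(0x01F0, 0xFE)
```

The same operand byte therefore covers 127 bytes above the base and 128 bytes below it, which is the trade-off against the 0..255 unsigned range of the 65816's s,S mode.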

I am unfamiliar with the implementation of C on the 6502, and so I am looking for comments on the basic concept from those familiar with a 6502/65C02/65816 C implementation. My familiarity with C on an 8-bit microprocessor is limited to Dunfield's Micro-C and Keil's C implementations for the 8051 architecture.

_________________
Michael A.


Posted: Mon Mar 02, 2015 1:48 am

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3346
Location: Ontario, Canada
Quote:
I've been distracted from that task by the task of documenting the overall architecture of my M65C02A core with all of its bells and whistles added.
Ah yes, the documentation -- this can be some of the hardest work of all. :| Glad you're making the effort, though, Michael! :D

Quote:
I am looking for comments of the basic concept and from those familiar with a 6502/65C02/65816 C implementation.
I am not such a person! But can I ask why a well-placed TSX instruction doesn't answer the base-pointer problem? Of course I realize it'd be faster if a TSX weren't needed, but is the benefit worth pursuing? (Maybe I'm missing something.)

cheers,
Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Mon Mar 02, 2015 3:18 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8141
Location: Midwestern USA
Upon entry into a subroutine in which local stack storage will be used, the stack pointer can be an implicit base pointer. For example, if the sub needs to use 10 bytes of stack space for a local stack frame, the procedure on the 65C816 is:

Code:
         rep #%00110000        ;16 bit registers
         sec
         tsc                   ;current stack pointer
         sbc #$000a            ;create 10 byte local stack frame
         tcs                   ;change stack pointer

The base of the stack frame is at SP+1, and any byte in the stack frame can be addressed with LDA <byte_num>,S, where <byte_num> is 1-based.
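
As a rough check of the arithmetic above, here is a small Python model of the TSC / SBC / TCS sequence and of 1-based stack-relative addressing. The function names and the starting stack pointer value are made up for the illustration.

```python
def allocate_frame(sp, size):
    """Model TSC / SBC #size / TCS: lower the 16-bit stack pointer by size bytes."""
    return (sp - size) & 0xFFFF


def stack_relative_address(sp, byte_num):
    """Address reached by 'LDA byte_num,S', where byte_num is 1-based."""
    if byte_num < 1:
        raise ValueError("byte_num is 1-based")
    return (sp + byte_num) & 0xFFFF


# Assumed entry value: S = $01FF, i.e. $01FF is the first unused stack byte.
sp_after = allocate_frame(0x01FF, 10)
first_byte = stack_relative_address(sp_after, 1)
last_byte = stack_relative_address(sp_after, 10)
```

With those assumed values the frame occupies $01F6 through $01FF, and S is left pointing at $01F5, the next unused byte.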

If you wish to maintain an absolute pointer to the base of the stack frame and read or write elements via direct addressing instead of stack pointer relative, add the following to the above code:

Code:
         tax                   ;adjusted stack pointer
         inx                   ;points to bottom of local stack frame
         ---
         sep #%00100000        ;8 bit accumulator
         lda $00,x             ;reads 1st byte in stack frame
         inc $02,x             ;increments 3rd byte in stack frame
         etc.

If necessary, define an extra two bytes at the top of the local stack frame to stash .X if it is temporarily needed for something else.

Is this what you are thinking?

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Mar 02, 2015 3:56 am

Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
I think some higher-level languages like 'C' will make use of zero page storage as a substitute for registers, since the '02 isn't register oriented. The frame pointer, stack pointer, etc would be located in zero page memory and the higher-level language implemented using a kind of an inline virtual machine. So assuming the frame pointer is in zero page memory, then stack elements would be accessed using (zp),y mode.

Slightly worse code for accessing parameters and locals relative to a frame pointer in zero page memory would be something like:

Code:
         rep #%00110000        ;16 bit registers
         lda framePointerInZp  ;stack the current frame pointer
         pha
         tsc
         sta framePointerInZp  ;set frame pointer = stack pointer
         sec
         sbc #$000a            ;create 10 byte local stack frame
         tcs

         ;access parameters / vars on stack
         ldy #$0000            ;index to first byte of stack frame
         lda (framePointerInZp),y ;get 16 bit word frame relative
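
To make the (zp),y access concrete, here is a minimal Python model of how the effective address is formed from a pointer held in zero page. It follows the 6502/65C02 rule that the pointer's high byte is fetched from within zero page; the memory array, the pointer location $20 and the frame pointer value are assumptions for the example only.

```python
def indirect_indexed_address(memory, zp_addr, y):
    """Effective address of 'LDA (zp),Y' in 6502 style.

    The 16-bit pointer is read little-endian from zero page, then Y is added.
    """
    lo = memory[zp_addr & 0xFF]
    hi = memory[(zp_addr + 1) & 0xFF]
    return (((hi << 8) | lo) + y) & 0xFFFF


memory = bytearray(0x10000)
FRAME_POINTER_ZP = 0x20          # hypothetical zero-page location
memory[FRAME_POINTER_ZP] = 0xF5  # frame pointer = $01F5, low byte first
memory[FRAME_POINTER_ZP + 1] = 0x01

address_y0 = indirect_indexed_address(memory, FRAME_POINTER_ZP, 0)
address_y4 = indirect_indexed_address(memory, FRAME_POINTER_ZP, 4)
```

Because Y is unsigned, this mode only reaches upward from the frame pointer, which is one reason the frame pointer is usually set to the lowest address of interest.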

_________________
http://www.finitron.ca


Posted: Mon Mar 02, 2015 8:38 am

Joined: Sat Mar 27, 2010 7:50 pm
Posts: 149
Location: Chexbres, VD, Switzerland
Personally, I came to the conclusion that the only reasonable way to implement "stack frames" on a 6502-family processor is to allocate the frames at link time rather than at run time, and to avoid effective stack addressing as much as possible by using fixed locations wherever that is possible. You could also automatically place those overlapping fixed locations in ZP as often as possible. I have a pretty clear idea of how this should be done, and it should result in reasonably efficient code; however, I currently lack the time/skill to implement this in the backend of a well-known open source compiler, since those are quite complex.

Note that it might be different on the '816. Personally I'm no '816 expert, but I'd say you should make good use of the relocatable zero page, and make it point to your stack frame, keeping the X register indexing the top of the stack frame. Whenever you push N bytes, you decrease the X register by N, and if X underflows you should make the ZP pointer point to a new page and adjust X accordingly. This would definitely not be as efficient as link-time stack allocation, but might be simpler and a reasonable tradeoff between efficiency and simplicity.


Posted: Mon Mar 02, 2015 2:25 pm

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Thanks everyone for your responses. I think everyone is in general agreement that a stack frame is required. However, the question that I was hoping would be addressed centers around whether accessing the stack frame would be best done using the stack-relative addressing mode of the 65816, or a different addressing mode which I am calling the base pointer relative addressing mode.

In the M65C02A core, I have implemented the stack relative addressing mode originally defined in the 65816 with the exception that the stack offset is 0 based rather than 1 based. Since the stack relative addressing mode is not a 6502/65C02 addressing mode, I opted to compensate for the -1 offset in the stack pointer by adding 1 to the address calculation; the extra addition is free in the M65C02A because it is provided as part of the microcode field controlling the address generator. In addition, a 0 offset for the top element of the stack just appeals to me. :)

It appears to me that re-entrant functions must use storage on the stack rather than in zero page. With the paucity of registers in the 6502/65C02 architecture, I think that the stack pointer will not remain fixed as temporary storage is pushed and popped from the stack. If the routine uses zero page or other memory for this type of temporary storage, then implementing re-entrant functions would become more difficult.

Thus, I am interested in reactions to my proposed base pointer relative addressing mode where a single byte following the instruction is specified to be a signed offset. Positive offsets would be used to reference the function's parameters on the stack, and negative offsets would reference local variables. Use of the stack pointer to dynamically create temporary storage would not have an effect on the pre-calculated offsets to the parameters or the local variables.

An example program using base pointer relative addressing is provided below:

Code:
;   Assume the stack frame is defined as: (all values are two bytes)
;
;       Pn                  ; BP + 2 * (N - 1) + 6, Parameter #N
;       
;       P1                  ; BP + 6, Parameter #1
;       SL                  ; BP + 4, Static Link
;       Return Address      ; BP + 2,
;       DL                  ; BP + 0, Dynamic Link = Previous BP
;       L1                  ; BP - 2, Local Variable #1
;   

Function_prolog:
        SIZ PHX             ; push current 16-bit BP (assume all stack entries 16-bit)
        SIZ TSX             ; capture SP in X as new BP (including current page)
       
        PHW #0              ; allocate storage on stack and initialize to 0

Function_code:
        SIZ LDA 6,B         ; Load Parameter #1
        CLC
        SIZ ADC 8,B         ; Add Parameter #2
        SIZ STA -2,B        ; Store Result in Local #1
        SIZ LDA 10,B        ; Load Parameter #3
        SIZ TAY             ; Transfer to Y as an index
        SIZ LDA (-2,B),Y    ; Load Value from array pointed to by Local #1
        SIZ STA 12,B        ; Store Return Value

Function_epilog:
        SIZ TXS             ; remove all locals and temporary storage from stack
        SIZ PLX             ; remove prev BP from stack
        RTS                 ; exit function
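
The frame layout in the comment block above can be captured in a few lines of Python, which also shows how many two-byte parameters stay within reach of a signed 8-bit offset. The helper names are hypothetical, and extending the local-variable rule beyond Local #1 is an assumption; the listing only defines L1 at BP - 2.

```python
WORD = 2  # every stack entry is two bytes, as assumed in the listing


def parameter_offset(n):
    """BP-relative offset of parameter #n (1-based): 2 * (n - 1) + 6."""
    if n < 1:
        raise ValueError("parameters are numbered from 1")
    return WORD * (n - 1) + 6


def local_offset(n):
    """BP-relative offset of local #n (1-based); local #1 sits at BP - 2.

    Assumes further locals continue downward in two-byte steps.
    """
    if n < 1:
        raise ValueError("locals are numbered from 1")
    return -WORD * n


def fits_signed_byte(offset):
    """True if the offset can be encoded as a signed 8-bit operand."""
    return -128 <= offset <= 127


offsets = [parameter_offset(n) for n in (1, 2, 3)]
```

Under these assumptions parameter #61 (offset 126) is the last one reachable with a single signed byte, and locals reach down to BP - 128.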

_________________
Michael A.


Posted: Mon Mar 02, 2015 3:12 pm

Joined: Sun Nov 08, 2009 1:56 am
Posts: 387
Location: Minnesota
As BDD and Rob point out, it's already possible to adjust the stack pointer in a way that allocates local variable space that allows access to them using unsigned offsets. If the accumulated size of putting that sort of code in every routine became a concern, I might make it a subroutine with an argument of how many bytes I wanted.

But if you're set on signed offsets, you've already slightly adjusted the behavior of stack-relative addressing in your core. Might you go all the way and make signed offsets the normal behavior for stack-relative addressing? That would lose access to the upper half of the unsigned range, but how often would that really be a problem? Or you might make the offsets signed or unsigned depending on a processor flag bit or opcode.


Posted: Mon Mar 02, 2015 4:22 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1925
Location: Sacramento, CA, USA
I love the idea of signed offsets, but I'm a bit eccentric, so I shouldn't try to speak for everyone.

Mike B.


Posted: Mon Mar 02, 2015 5:38 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8141
Location: Midwestern USA
barrym95838 wrote:
I love the idea of signed offsets, but I'm a bit eccentric, so I shouldn't try to speak for everyone.

Mike B.

Pardon me for being dense (after all, I'm just a big dumb dinosaur), but I'm not seeing any benefit to signed offsets. The stack pointer always points to the first unused location on the stack, which is below the active part of the stack. So any stack addressing that would be going on would be using positive offsets. You can't use stack space below the stack pointer without decrementing the latter. Otherwise, subroutine calls and interrupt processing would step on whatever is being stored at or below the stack pointer.
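
The hazard described here can be demonstrated with a tiny Python model of a 6502 interrupt entry, which pushes PCH, PCL and P starting at the byte S points to. The memory layout, marker values and function name are assumptions for the sketch.

```python
STACK_PAGE = 0x0100


def interrupt_push(memory, s, pc, status):
    """Model 6502 interrupt entry: push PCH, PCL, then P; return the new S."""
    for byte in ((pc >> 8) & 0xFF, pc & 0xFF, status & 0xFF):
        memory[STACK_PAGE + s] = byte
        s = (s - 1) & 0xFF
    return s


memory = bytearray(0x0200)
s = 0xF0
memory[STACK_PAGE + s + 1] = 0x55  # data above S: inside the active stack
memory[STACK_PAGE + s] = 0xAA      # data at S: not protected by the pointer

s_after = interrupt_push(memory, s, pc=0x1234, status=0x30)
```

After the simulated interrupt the byte above S is intact, while the byte stored at S has been replaced by the high byte of the return address.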

I think the 65C816's method of stack pointer relative addressing makes perfect sense, as it is based entirely upon a positive offset, making the pointer arithmetic very simple to implement. Yes, adding a stack frame base register would make stack frame manipulation a little more convenient, but not that much more.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Mar 02, 2015 10:39 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8427
Location: Southern California
Hopefully this won't be repeating things. If it is, maybe a different way of putting it will give further ideas and clarity.

MichaelM wrote:
It appears to me that re-entrant functions must use storage on the stack rather than in zero page.

As you know, Forth uses a virtual stack in ZP for data, to avoid certain overhead and problems that crop up with trying to mix data and return addresses on the hardware stack. It works super well; and being a stack, it automatically accommodates re-entrancy. Although it takes up precious ZP space, it also reduces the need to have so many ZP variables; so I think it pays for itself in that respect.

Quote:
With the paucity of registers in the 6502/65C02 architecture, I think that the stack pointer will not remain fixed as temporary storage is pushed and popped from the stack. If the routine uses zero page or other memory for this type of temporary storage, then implementing re-entrant functions would become more difficult.

A virtual stack in ZP actually makes it easier. But if you want it all on the hardware stack, what you do, even on the '02, is to adjust S to accommodate all the needed temporary storage, either by pushing dummy bytes (like with the appropriate number of PHA's before the TSX) which will later get used for temporary variables, or if you need a lot of them, it may be more efficient to do TSX, TXA, SEC, SBC #nn, TAX, TXS. (BDD showed an example of the latter for the '816 further up.) Then the new S is already in X, ready for indexing. (If you want to index with Y instead, add a TAY.) Now interrupts won't interfere with the locals on the stack. A bonus is that subsequent changes in S will not affect the operand used in instructions that index into the stack, because the changes in S are not reflected in X unless you do TSX again. For example, suppose you need to preserve both X and A as you get S into Y on the '02:
Code:
        PHA
           PHX
              TSX
              TXA
              TAY
           PLX
        PLA

        LDA  105,Y          ; Access a particular parameter
        <do_stuff>
        <push n bytes onto the stack>
           LDA  105,Y       ; Access the same parameter again as 3 lines up.
           <do_stuff>
        <pull n bytes back off the stack>
        STA  105,Y          ; Access that same parameter again.

If you only do TSX first instead of the first seven lines above, the 105,Y becomes 103,X. The stack depth and the position of the variable are the same in both cases, but the index value difference will need to be compensated in the operand. There can be subsequent pushes and pulls in the subroutine without affecting that 105,Y to access the particular parameter in the stack. This BTW is on the '02, not even the '816 which has TSC and TXY instructions.
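
The 105,Y versus 103,X relationship can be checked with a short Python model of the 8-bit stack pointer. It assumes the operands are hexadecimal (that is, $0105,Y and $0103,X, both landing in the stack page) and an arbitrary entry value for S; the function names are invented for the sketch.

```python
def push(s, count=1):
    """Each push decrements the 8-bit stack pointer."""
    return (s - count) & 0xFF


def pull(s, count=1):
    """Each pull increments the 8-bit stack pointer."""
    return (s + count) & 0xFF


def capture_index_preserving_a_and_x(s):
    """Model PHA / PHX / TSX / TXA / TAY / PLX / PLA; return (S afterwards, Y)."""
    s = push(s, 2)   # PHA, PHX
    y = s            # TSX, TXA, TAY copy the lowered S into Y
    s = pull(s, 2)   # PLX, PLA
    return s, y


def indexed_address(operand, index):
    """Absolute indexed addressing: 16-bit operand plus 8-bit index."""
    return (operand + index) & 0xFFFF


s_entry = 0xF0
s_now, y = capture_index_preserving_a_and_x(s_entry)
via_y = indexed_address(0x0105, y)        # seven-instruction version
via_x = indexed_address(0x0103, s_entry)  # plain TSX version

s_now = push(s_now, 3)                    # later pushes move S only
via_y_after_pushes = indexed_address(0x0105, y)
```

Both forms reach the same byte, and the address obtained through Y does not move when S changes afterwards, because Y is a snapshot rather than a live copy of S.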

Quote:
Thus, I am interested in reactions to my proposed base pointer relative addressing mode where a single byte following the instruction is specified to be a signed offset. Positive offsets would be used to reference the function's parameters on the stack, and negative offsets would reference local variables.

I can see where that may have value, as long as the BP is not tied directly to S, being loaded with perhaps a TSB instruction. (Woops-- that mnemonic is taken for "Test and Set Bits." You'll need another one.) Without re-reading again, it seems related to my desire to have a second hardware stack, where instructions for stack operations would have a bit telling which hardware stack to operate on. The two hardware stack areas could overlap, as long as certain rules are observed. The only change needed to do the signed stack offset on the 816's stack-relative addressing modes is to have a 16-bit "Sr" operand so it wraps (which it would do since the stack is always in bank 0), but there would again be the problem with interrupts or even subroutines stepping on the local variables. Another way to implement the signed-offset idea with what's already in place is to do it in macros that would adjust the numbers and assemble all positive ones and do the same job.

Quote:
Use of the stack pointer to dynamically create temporary storage would not have an effect on the pre-calculated offsets to the parameters or the local variables.

It doesn't anyway, as shown (possibly poorly) above.

Bregalad wrote:
Note that it might be different on the '816. Personally I'm no '816 expert, but I'd say you should make good use of the relocatable zero-page, and make it point to your stack-frame, keeping X register indexing the top of the stack frame. Whenever you push N bytes, you decrease the X register by N, and if X underflows you should make the ZP pointer point to a new page and adjust X accordingly.

Note that the 816's direct-page register is 16-bit, and the direct page does not have to start at a page boundary. If you want to increment or decrement it by some arbitrary number like 4 or 7, you can.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Tue Mar 03, 2015 1:00 am

Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Quote:
Thanks everyone for your responses. I think everyone is in general agreement that a stack frame is required. However, the question that I was hoping would be addressed centers around whether accessing the stack frame would be best done using the stack-relative addressing mode of the 65816, or a different addressing mode which I am calling the base pointer relative addressing mode.


Base pointer addressing mode is one of the most used modes in other processors. It's definitely the way to access information on the stack, so I would say use a base pointer relative mode. But it requires another register. The samples above are the 6502's way of getting around the fact it doesn't have a base pointer register. Using base pointer addressing is starting to get outside the solution domain of the 6502; it is using the '02 in a way that was never intended. One could also make an argument for other registers such as a global pointer register or a thread register, but once again these are available in other processors, which are more powerful and are available to solve different problems.

_________________
http://www.finitron.ca


Posted: Tue Mar 03, 2015 2:25 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1925
Location: Sacramento, CA, USA
BigDumbDinosaur wrote:
barrym95838 wrote:
I love the idea of signed offsets, but I'm a bit eccentric, so I shouldn't try to speak for everyone.

Mike B.

Pardon me for being dense (after all, I'm just a big dumb dinosaur), but I'm not seeing any benefit to signed offsets. The stack pointer always points to the first unused location on the stack, which is below the active part of the stack. So any stack addressing that would be going on would be using positive offsets. You can't use stack space below the stack pointer without decrementing the latter. Otherwise, subroutine calls and interrupt processing would step on whatever is being stored at or below the stack pointer.

I think the 65C816's method of stack pointer relative addressing makes perfect sense, as it is based entirely upon a positive offset, making the pointer arithmetic very simple to implement. Yes, adding a stack frame base register would make stack frame manipulation a little more convenient, but not that much more.

Yeah, I was certainly thinking of X, Y, PC (and A?!?!) as negative-offset-capable index registers, as well as any other auxiliary registers MichaelM would like to add to his core. Doing so with S would be fraught with danger, but could conceivably be of some use to an expert who likes to live on the edge. If I was going to implement it, I would allow all registers except P to participate (in the interest of orthogonality) and let the programmer decide what was useful or not.

Mike B.


Posted: Tue Mar 03, 2015 7:38 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8141
Location: Midwestern USA
Bregalad wrote:
Note that it might be different on the '816. Personally I'm no '816 expert, but I'd say you should make good use of the relocatable zero-page, and make it point to your stack-frame, keeping X register indexing the top of the stack frame. Whenever you push N bytes, you decrease the X register by N, and if X underflows you should make the ZP pointer point to a new page and adjust X accordingly.

However, there are potential booby traps in pointing direct page to the stack. Firstly, you have to be careful that stack activity outside of your routine doesn't scribble all over direct page, which could happen when the '816 is interrupted, or when a function higher up does something. What you have to do to avoid such goings-on is to lower the stack pointer enough to provide local storage and then load the adjusted stack pointer plus one into DP. You may also want to save the entry value of DP for later restoration, viz:

Code:
         rep #%00110000        ;16 bit everything
         sec
         tsc                   ;get current stack pointer
         sbc #$0020            ;make room for 32 local bytes
         tcs                   ;set new stack pointer
         phd                   ;save DP for later...note that...
                               ;DP will be pushed below the...
                               ;local storage
         inc a                 ;bottom of local storage
         tcd                   ;relocate direct page to it
;
;   —————————————————————————————————————————————————————————————————————
;   At this point, any direct page accesses will be reads & writes to the
;   stack, not the real direct page.  You will not be able to read or
;   write data that is on the real direct page.
;   —————————————————————————————————————————————————————————————————————
;
         lda $02               ;loads from SP + 5, not $02 in direct page
         tsb $06               ;sets bits at SP + 9
         lda ($00,x)           ;indirect load via the pointer at SP + 3 + X
         ...etc...
;
;   ————————————————————————————————————————————————————————————————
;   When the routine no longer needs the relocated DP, the following
;   code will put things back the way they were.
;   ————————————————————————————————————————————————————————————————
;
         pld                   ;restore old DP value
         clc
         tsc                   ;current stack pointer
         adc #$0020            ;discard local storage
         tcs
         ...etc...

The other booby trap isn't as obvious, but is even more insidious. Suppose your interrupt service routine needs access to the real direct page because, say, a buffer index used by a device driver happens to be there. The interrupt from the device arrives, but when the ISR goes to access the index, it instead reads either from the stack, depending on how far up the direct page address range the index is located, or from somewhere else in absolute RAM. This is because even though you may have only allocated 32 bytes on the stack for your ephemeral direct page, as shown above, direct page actually extends from SP+3 to SP+258.
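
The SP+3 to SP+258 window can be verified by stepping through the prologue in a small Python model. The entry value of the stack pointer and the two "real" direct page addresses tried at the end are assumptions picked for the illustration.

```python
def relocate_direct_page(sp_entry, local_bytes):
    """Follow the prologue above and return (final SP, new DP)."""
    a = (sp_entry - local_bytes) & 0xFFFF  # TSC / SBC #local_bytes
    sp = a                                  # TCS
    sp = (sp - 2) & 0xFFFF                  # PHD pushes the two-byte DP
    dp = (a + 1) & 0xFFFF                   # INC A / TCD
    return sp, dp


def direct_page_address(dp, operand):
    """Address reached by a direct page operand once DP has been relocated."""
    return (dp + (operand & 0xFF)) & 0xFFFF


sp, dp = relocate_direct_page(0x01FF, 0x20)
window = (direct_page_address(dp, 0x00), direct_page_address(dp, 0xFF))

# An ISR that still believes DP is $0000 and reads "direct page" $10 or $80:
isr_low = direct_page_address(dp, 0x10)   # lands inside the 32 local bytes
isr_high = direct_page_address(dp, 0x80)  # lands beyond the stack page
```

With S = $01FF on entry, DP ends up at $01E0 and the window runs from $01E0 to $02DF, so a low direct page address hits the local storage and a high one hits unrelated RAM above the stack.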

Therefore, your ISR must save DP on the stack along with other MPU registers and then reload DP with the real direct page starting address. At the completion of the ISR and before RTI, DP must be restored from the stack to what was set by the foreground task. Otherwise, at best, you are going to have a major mess on your hands. Most likely the machine will crash.

GARTHWILSON wrote:
Note that the 816's direct-page register is 16-bit, and the direct page does not have to start at a page boundary. If you want to increment or decrement it by some arbitrary number like 4 or 7, you can.

Also note that if direct page does not start on an even page boundary, that is, DP has been set to $xxYY, where YY is not $00, an extra Ø2 cycle will be used for all direct page loads and stores, essentially erasing the performance advantage of direct page. This will not affect the ability to use direct page indirect instructions, such as LDA (<dp>) or STA [<dp>],Y, just the speed at which they are executed.
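
A rough Python sketch of that cycle penalty for LDA follows. The base counts used here (3 cycles for LDA direct page, 4 for LDA absolute, plus one for a 16-bit accumulator) are quoted from memory of the W65C816S data sheet and should be treated as assumptions to check against it.

```python
LDA_DIRECT_BASE_CYCLES = 3    # assumed data sheet value, 8-bit accumulator
LDA_ABSOLUTE_BASE_CYCLES = 4  # assumed data sheet value, 8-bit accumulator


def lda_direct_cycles(dp, wide_accumulator=False):
    """Cycles for 'LDA dp': one extra when the low byte of DP is not $00."""
    cycles = LDA_DIRECT_BASE_CYCLES
    if wide_accumulator:
        cycles += 1
    if dp & 0xFF:
        cycles += 1
    return cycles


def lda_absolute_cycles(wide_accumulator=False):
    """Cycles for 'LDA abs', which does not depend on DP at all."""
    return LDA_ABSOLUTE_BASE_CYCLES + (1 if wide_accumulator else 0)


aligned = lda_direct_cycles(0x0200)
unaligned = lda_direct_cycles(0x01E0)
```

Under those assumed counts an unaligned direct page load costs the same as an absolute load, though the direct page form is still one byte shorter.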

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Tue Mar 03, 2015 1:29 pm

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
barrym95838 wrote:
Yeah, I was certainly thinking of X, Y, PC (and A?!?!) as negative-offset-capable index registers, as well as any other auxiliary registers MichaelM would like to add to his core. Doing so with S would be fraught with danger, but could conceivably be of some use to an expert who likes to live on the edge. If I was going to implement it, I would allow all registers except P to participate (in the interest of orthogonality) and let the programmer decide what was useful or not.

In the question being discussed, for a set of opcodes representing an addressing mode not present in the 6502/65C02, I was looking to use X as the base pointer instead of S. With X as the base pointer, the offset contained in the instruction would be treated as a signed number.

Big Dumb Dinosaur wrote:
Pardon me for being dense (after all, I'm just a big dumb dinosaur),
Not hardly. :)
Big Dumb Dinosaur wrote:
but I'm not seeing any benefit to signed offsets.
It's more than likely me who is seeing bogeymen where there are none. Consider a routine with some parameters placed on the stack by the caller, and some local variables allocated on the stack on entry to the function. I clearly see that at this point all of the (positive) offsets from the stack pointer are known. However, let's assume that one of the local variables requires the evaluation of a complex expression which will require one or more temporary storage locations to hold the partial results of the expression.

If this evaluation of the expression was being done by an assembly language programmer, I would agree that prior to the start of the evaluation of the expression, the programmer would allocate additional space on the stack for each of the expression's partial results. Further, I agree that with this type of preallocation of stack space, all of the offsets from the top of stack, i.e. S, are known. In which case, the stack relative addressing mode of the 65816 should work just fine.

However, in a compiler or interpreter for languages such as C or Pascal, the allocation of temporary registers (or stack variables) proceeds blindly from the innermost operation and partial result to the outermost operation and final result. With a register-rich processor, a register allocation routine would allocate registers for temporary and partial results until the available registers are exhausted. At which time, some of the temporary values held in registers would be "spilled" onto the stack. When this happens, all of the offsets that the compiler has been using to access the function's parameters and local variables must be adjusted for the number of elements pushed onto the stack.
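
The bookkeeping this implies for a code generator can be sketched in Python. The class below is a hypothetical helper, not part of any compiler mentioned in this thread: it tracks how many bytes are currently spilled and converts an entry-time offset into the offset that must actually be emitted for stack-pointer-relative access.

```python
class SpillTracker:
    """Tracks bytes pushed since function entry, for S-relative code generation."""

    def __init__(self):
        self.spilled = 0

    def push(self, size=2):
        """Record that a temporary of the given size was spilled to the stack."""
        self.spilled += size

    def pop(self, size=2):
        """Record that a spilled temporary was pulled back off the stack."""
        if size > self.spilled:
            raise ValueError("pop without matching push")
        self.spilled -= size

    def sp_offset(self, entry_offset):
        """Offset from S to a variable whose offset at entry was entry_offset."""
        return entry_offset + self.spilled


tracker = SpillTracker()
before = tracker.sp_offset(5)
tracker.push()
tracker.push()
during = tracker.sp_offset(5)
tracker.pop()
after_one_pop = tracker.sp_offset(5)
```

With a base pointer none of this adjustment is needed: the offset emitted for a given variable is the same at every point in the function, whatever has been spilled.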

Perhaps a compiler with some lookahead or a "smarter" expression evaluator would be able to pre-determine the number of temporary variables that would be required for each expression found in the source code. However, it is more likely that optimization is performed after the expression has been evaluated using the reliable, brute force approach which relies on spilling partial results onto the stack whenever necessary. With this type of expression evaluator, I believe that it is more advantageous to mark the stack using a base pointer soon after entry to the function. Local variables would be negative offsets from this base, and parameters would be positive offsets. (Note that it is possible to use only positive offsets by simply waiting to mark the stack until after the local variables had been allocated.)

All of the techniques discussed in the responses are valid approaches. The intent of my question was to solicit responses as to whether accessing the stack frame for a language like C or Pascal would best be performed using a base pointer or the stack pointer. Since the addressing mode I posed the question about is not a native 6502/65C02 addressing mode, I'm not that concerned about its definition with respect to the standard addressing modes of the 6502/65C02. From the responses posted, there appear to be three types: (1) use existing instructions and addressing modes; (2) use stack-relative addressing; and (3) use base pointer addressing.

I think that I'm going to pursue using the third type with X as the base pointer. Thus, column 3 instructions will use a signed offset from the X register to access variables. To support this mechanism, the function prolog must push the current 16-bit value of X and then load the 16-bit value of S into X. The function epilog must reverse these operations.

The M65C02A core provides a 3 register stack for A, X, and Y. Further, the top-of-stack register for X also provides the programmer a second HW stack. The OSX prefix instruction is required to access this feature. When the auxiliary stack is being actively used, the next-on-stack register of the X register stack can function as the base pointer; the OAX XCH instruction sequence will swap the TOS and NOS registers of the X register stack. These instructions and the register stack were discussed in another M65C02A-related thread. The M65C02A instruction set table is provided in this post.

_________________
Michael A.


Last edited by MichaelM on Sun Nov 15, 2015 2:35 pm, edited 4 times in total.

Posted: Tue Mar 03, 2015 3:22 pm

Joined: Sun Nov 08, 2009 1:56 am
Posts: 387
Location: Minnesota
Quote:
Perhaps a compiler with some lookahead or a "smarter" expression evaluator would be able to pre-determine the number of temporary variables that would be required for each expression found in the source code.


OT:

I dunno how smart that would have to be. I tend to parse expressions into Reverse Polish Notation before trying to evaluate them. Although I've never had occasion to do it, it seems to me that it wouldn't be that hard to scan the RPN form just to see what the maximum number of pushes outstanding during evaluation will be. Of course a compiler would have to also track the maximum number of pushes found in ALL expressions appearing in a function to be able to pre-allocate the appropriate amount of stack space at function entry.

But in principle it doesn't look especially difficult.
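
A minimal Python sketch of that scan, assuming a whitespace-tokenized RPN expression that contains only operands and binary operators. The operator set and function names are chosen for the example; unary operators and function calls would need their own stack-effect rules.

```python
BINARY_OPERATORS = {"+", "-", "*", "/"}


def max_rpn_depth(tokens):
    """Return the maximum number of values outstanding while evaluating RPN."""
    depth = 0
    max_depth = 0
    for token in tokens:
        if token in BINARY_OPERATORS:
            if depth < 2:
                raise ValueError("malformed RPN: operator lacks operands")
            depth -= 1  # pops two operands, pushes one result
        else:
            depth += 1  # operand is pushed
            max_depth = max(max_depth, depth)
    if depth != 1:
        raise ValueError("malformed RPN: expression leaves %d values" % depth)
    return max_depth


def frame_temporaries(expressions):
    """Worst case over every expression in a function, for pre-allocation."""
    return max((max_rpn_depth(tokens) for tokens in expressions), default=0)


balanced = max_rpn_depth("a b + c d + *".split())
right_heavy = max_rpn_depth("a b c d + + +".split())
worst_case = frame_temporaries([
    "a b +".split(),
    "a b + c d + *".split(),
    "a b c d + + +".split(),
])
```

The result counts evaluation-stack slots, so a compiler would multiply it by the slot size to get the number of bytes to reserve at function entry.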

