Adding a new command is surprisingly easy. Here's how it goes...
There are a minimum of three places that you will need to change; in most cases, four. I'll use the example of the USR statement mentioned above.
- Step one - enumerate the instruction
Add a new enumeration value to the ENUM table at the start of the program. The only thing to remember here is that the six comparison operators must remain in the same order, and as a group. This provides a value, starting at $80, which will be used both by the tokenisation routine, and as a reference when a line containing our new instruction is executed.

Code:
    ; token enums
            bss
            org 0
    enum_cnt set $80
            ENUM LET
            ENUM REM
            ENUM PRINT
            ENUM LIST
            ENUM NEW
            ENUM RUN
            ENUM INPUT
            ENUM IF
            ENUM ELSE
            ENUM ENDIF
            ENUM FOR
            ENUM TO
            ENUM NEXT
            ENUM WHILE
            ENUM ENDWHILE
            ENUM GOTO
            ENUM GOSUB
            ENUM RETURN
            ENUM USR        ; <--- our new entry
            ENUM END
            ; these comparisons _must_ be in this order
            ENUM LTE
            ENUM GTE
            ENUM EQUAL
            ENUM NEQUAL
            ENUM LT
            ENUM GT
            ENUM POKE
            ENUM LAST_KW

- Step two - provide an asciiz name
It is critical that the name of the instruction is entered in the token table in the same order as the enumeration since this links the text which will be tokenised with the enum which will replace it in program memory. Note that the text here does not need to match the enumeration - see, for example, poke - but it will probably confuse other readers if it doesn't.

Code:
    tokens:
            ; must match the enum order above
            ; where tokens share starting characters, the longer must
            ; be defined first (e.g. >= before >)
            db "let", 0
            db "'", 0
            db "print", 0
            db "list", 0
            db "new", 0
            db "run", 0
            db "input", 0
            db "if", 0
            db "else", 0
            db "endif", 0
            db "for", 0
            db "to", 0
            db "next", 0
            db "while", 0
            db "wend", 0
            db "goto", 0
            db "gosub", 0
            db "return",0
            db "usr", 0     ; <-- our new instruction's name
            db "end", 0
            db "<=", 0
            db ">=", 0
            db "=", 0
            db "!=", 0
            db "<", 0
            db ">", 0
            db "!", 0       ; poke
            db 0            ; end of the list

- Step three - add a call in execute()
Execute(), elided here for brevity, is called on each line by the interpreter. The structure of a line in memory consists of the line number (two bytes); the length of the line (one byte); the enum value of the instruction (one byte); any further associated text; and a final CR character (one byte). A global variable mem_ptr keeps track of where the interpreter is looking, and is passed to execute() each cycle. A long sequence of comparisons is used to find the next instruction to execute.
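As a rough model of that line layout, and of how the enum order ties a stored token back to its name, here is a small C sketch. It is not taken from the interpreter: the `TOK_` names, the four-entry table, and `list_line()` are made up for illustration, the line number is assumed to be stored low byte first, and the length byte is ignored because its exact meaning isn't stated above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the token enum: values start at 0x80 and the
 * name table below is kept in exactly the same order. */
enum { TOK_LET = 0x80, TOK_REM, TOK_PRINT, TOK_USR, TOK_LAST_KW };

static const char *const token_names[] = { "let", "'", "print", "usr" };

/* Decode one stored line into 'out' as "<number> <keyword><text>".
 * Assumed layout: number (2 bytes, low byte first), length (1 byte,
 * ignored here), token (1 byte), text, CR.
 * Returns the offset of the byte just past the CR. */
static size_t list_line(const uint8_t *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);

    unsigned number = line[0] | ((unsigned)line[1] << 8);
    uint8_t token = line[3];
    const char *name = "?";                    /* unknown token */
    if (token >= TOK_LET && token < TOK_LAST_KW)
        name = token_names[token - TOK_LET];   /* enum order == table order */

    snprintf(out, out_size, "%u %s", number, name);
    size_t n = strlen(out);

    size_t i = 4;                              /* first byte after the token */
    while (line[i] != '\r') {                  /* copy the text up to the CR */
        if (n + 1 < out_size) {
            out[n++] = (char)line[i];
            out[n] = '\0';
        }
        i++;
    }
    return i + 1;
}
```

Because the name is found purely by position (`token - TOK_LET`), an entry added to the enum without a matching entry at the same position in the name table shifts every later keyword by one.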
An instruction must return - in Y:A - the start of the next line to execute, or zero. There are a few options here... in all cases, though, Y:A contains the current value of mem_ptr.
- GOSUB, USR, and POKE show the general case. An initial comparison with the enum, and a skip around if it isn't for us; a call to the appropriate subroutine; and a final jump to ex_99 which sorts out updating mem_ptr and returning to the main loop.
- RETURN shows the method used when a word should always return zero, to mark either the end of the program, or of a control structure. In these cases, a jump to ex_131 will handle that.
- GOTO demonstrates that it's not always necessary to have a separate routine to implement a statement. In this case it:
  - reads the character following the GOTO token and skips any white space
  - calls expression() to evaluate its target
  - calls find_line() to get the address of the line at that target number
  - and returns that value with the final jump to ex_99
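The three patterns in the list above can be pictured in C, the language of the pseudo-code comments in the listing. This is only a sketch: the `TOK_` values and the three `_stub` helpers are placeholders that stand in for the real routines, and the extra `base` parameter exists only so the stub has something to search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token values, for illustration only. */
enum { TOK_GOTO = 0x80, TOK_GOSUB, TOK_RETURN, TOK_USR };

/* Placeholder for gosub(), usr(), poke(): returns "the next line". */
static const char *handler_stub(const char *mem_ptr) { return mem_ptr + 1; }

/* Placeholder for expression(): reads a single digit. */
static int expression_stub(const char *mem_ptr) { return *mem_ptr - '0'; }

/* Placeholder for find_line(): treats the number as an offset. */
static const char *find_line_stub(int number, const char *base)
{
    return base + number;
}

static const char *execute(int token, const char *mem_ptr, const char *base)
{
    assert(mem_ptr != NULL && base != NULL);

    const char *where = mem_ptr;          /* starts as the current mem_ptr */
    switch (token) {
    case TOK_GOSUB:
    case TOK_USR:                         /* general case: call a routine   */
        where = handler_stub(mem_ptr);    /* and replace 'where' (REPTOS)   */
        break;
    case TOK_RETURN:                      /* always returns zero            */
        where = NULL;
        break;
    case TOK_GOTO:                        /* handled inline, no own routine */
        where = find_line_stub(expression_stub(mem_ptr), base);
        break;
    default:                              /* not ours: leave 'where' alone  */
        break;
    }
    return where;
}
```

A new statement normally joins the first group: one more `case` that calls its routine and stores the result.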
The REPTOS macro simply replaces the current top-of-stack - i.e. the initial mem_ptr value - with Y:A returned from a routine.

Code:
    ;char * execute (char * where)
    ;{
    ;       /* Execute a single program line, pointed to by 'where'
    ;        * Return the next line to execute, or zero.
    ;        */
    ;
    execute:
            phx
            phy
            pha                     ; save where on stack
    <...>
    ex_10:
            ; case GOTO:
            cpx #GOTO
            bne ex_11
            ; GetChar();
            jsr getchar
            ; SkipWhite();
            jsr skipwhite
            ; where = find_line (Expression());
            jsr expression
            jsr find_line
            REPTOS
            ; break;
            bra ex_99
    ex_11:
            ; case GOSUB:
            cpx #GOSUB
            bne ex_12
            ; where = gosub (mem_ptr);
            jsr gosub
            REPTOS
            ; break;
            bra ex_99
    ex_12:
            ; case RETURN:
            cpx #RETURN
            bne ex_121
            ; where = NULL;
            ; break;
            bra ex_131
    ex_121:
            ; case USR:
            cpx #USR
            bne ex_122
            jsr usr
            REPTOS
            bra ex_99
    ex_122:
            ; case POKE:
            cpx #POKE
            bne ex_13
            ; where = poke (mem_ptr);
            jsr poke
            REPTOS
            ; break;
            bra ex_99
    <...>

- Step four - add the executable function
Usr() is a statement which executes a machine code subroutine at a specified address. The address is specified in decimal, and may be any valid expression. The routine does not return data.
The x register must be preserved across any call. The address of the instruction itself is passed in, so we subtract four from that to get the start of the current line. By convention, that's the last thing that goes on the stack, so any local variables required for recursive routines - see for, if, and while - must be allocated stack locations before pushing this.

With the stack set up correctly, we can start decoding our instruction. We already know what it is, or we wouldn't be here, but it may - as in this case - have further data associated with it which we need to handle.

Code:
    ;-----------------------------------------------------------------------
    ; usr - call a subroutine at the specified address
    usr:
            phx
            sec
            sbc #4
            bcs usr_01
            dey             ; where = where - 4
    usr_01:
            phy
            pha             ; save where
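The `where - 4` step is a 16-bit subtraction done with 8-bit registers: subtract from the low byte in A, then decrement the high byte in Y only if that borrowed. A small C model of the same arithmetic, with `struct ya` and `sub4()` as made-up names for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the Y:A register pair: Y is the high byte, A the low byte. */
struct ya { uint8_t y, a; };

/* Subtract 4 from the 16-bit value held in Y:A, the 8-bit way. */
static void sub4(struct ya *r)
{
    assert(r != NULL);

    int borrow = r->a < 4;           /* sbc #4 leaves carry clear on borrow */
    r->a = (uint8_t)(r->a - 4);      /* low byte wraps around if it borrows */
    if (borrow)
        r->y = (uint8_t)(r->y - 1);  /* dey: pass the borrow to the high byte */
}
```

For example, $1202 becomes $11FE (a borrow), while $1234 becomes $1230 (no borrow, Y untouched).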
Getchar() and skipwhite() set up the interpreter pointers so that mem_ptr is pointing at the text following the instruction. In this case, this is an expression indicating the target address. Expression() understands that and returns the evaluated expression in Y:A.

We need a mechanism to make our call to the target. Because I want to keep this code runnable from EEPROM (and because I hate self-modifying code) I chose to re-use the maths variables. They're safe to use any time they're not being used by one of the arithmetic routines, and they don't hold data for later use. Conveniently, they're adjacent in memory.

Code:
            jsr getchar
            jsr skipwhite
            jsr expression  ; the target?
So we build a jsr and rts block there, and call it.

Finally, we need to know where the next line is. We grab the start of this line from the stack, call find_next_line() which leaves the required address in Y:A, restore X, and return to execute().

Code:
            sta maths1+1
            sty maths2      ; save as indirect call address
            lda #$20        ; jsr instruction
            sta maths1
            lda #$60        ; rts instruction
            sta maths2+1
            ; maths variables are now:
            ;       jsr (expression)
            ;       rts
            jsr maths1      ; do the usr call

Non-recursive variables can be allocated using the bss section; by convention, immediately before the routine that uses them though a few are shared by multiple routines (such as the maths variables).

Code:
            pla
            ply
            jsr find_next_line
            plx
            rts
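For anyone who wants to check the byte layout of the jsr/rts block that usr builds in the maths variables, here is the same construction modelled in C. It assumes, as the listing implies, that maths1 and maths2 are two adjacent 16-bit variables; `build_call_block()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill four adjacent bytes with "JSR target ; RTS", as usr does. */
static void build_call_block(uint8_t block[4], uint16_t target)
{
    assert(block != NULL);

    block[0] = 0x20;                      /* maths1     : JSR opcode       */
    block[1] = (uint8_t)(target & 0xFF);  /* maths1 + 1 : low byte (A)     */
    block[2] = (uint8_t)(target >> 8);    /* maths2     : high byte (Y)    */
    block[3] = 0x60;                      /* maths2 + 1 : RTS opcode       */
}
```

With a target of $1000 the block holds $20 $00 $10 $60, so `jsr maths1` calls the target and the trailing rts brings control back to usr.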
Using this method does not provide a mechanism to return any value from a statement. If you wish to write a function that does return a value, you have two choices. The simplest is just to have the routine write its return value into a predefined variable, but that lacks clarity and flexibility. The other way is to make it recognisable to factor(), the lowest level of evaluation.
This is getting a bit advanced, but the idea is that factor() is extended to recognise your new function, and provide an evaluation there. I've done it for @ (peek) in the current code (fac_2:)... you would need to add something similar for your new function, perhaps just calling your existing routine, and returning the value in Y:A.
Code:
    ;-----------------------------------------------------------------------
    factor:         ; return the value of a factor, including
                    ; recursive brackets
            phx
            ;int16_t factor;        ; we define this later
            ;if (Look == '(')
            lda look
            cmp #'('
            bne fac_1
            ;Match ('(');
            jsr match               ; a is already '('
            ;factor = Expression ();
            jsr expression          ; ooh, recursion!
            phy
            pha                     ; into 'factor' on stack
            ;Match (')');
            lda #')'
            jsr match
            pla
            ply
            bra fac_x
            ;else if (isalpha(Look))
    fac_1:
            lda look
            jsr isalpha
            bcc fac_2
            ;factor = vars[GetName() - 'A'];
            jsr getname
            sec
            sbc #'A'
            asl a
            tax
            lda vars,x              ; vars are stored low byte first
            ldy vars+1,x
            bra fac_x
            ;else if (Look == '@')
    fac_2:
            lda look
            cmp #'@'                ; peek
            bne fac_3               ; nope, proceed as normal
            jsr getchar
            jsr skipwhite
            jsr expression          ; get the address
            jsr star_int16          ; get the 16 bit value at that address
            ldy #0                  ; but we only want the lower 8 bits
            bra fac_x
    fac_3:
            ;factor = GetNum();
            jsr getnum
    fac_x
            ;return factor;         it's in Y:A already :)
            plx
            rts
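To show the shape of such an extension without touching the real code, here is a cut-down factor() in C with one extra branch. Everything in it is an assumption for illustration: the `?` prefix (returning the 16-bit value at an address) is an invented function, `memory[]` stands in for the target's address space, and expression() is reduced to addition only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

static const char *src;          /* input cursor, stands in for look/mem_ptr */
static uint8_t memory[256];      /* hypothetical stand-in for target memory  */
static int16_t vars[26];         /* variables A..Z                           */

static int16_t expression(void);

static void skip_white(void) { while (*src == ' ') src++; }

static int16_t factor(void)
{
    int16_t value = 0;
    skip_white();
    if (*src == '(') {                          /* bracketed sub-expression */
        src++;
        value = expression();
        skip_white();
        if (*src == ')') src++;
    } else if (isalpha((unsigned char)*src)) {  /* variable */
        value = vars[toupper((unsigned char)*src) - 'A'];
        src++;
    } else if (*src == '@') {                   /* peek: low 8 bits only */
        src++;
        value = memory[(uint8_t)expression()];
    } else if (*src == '?') {                   /* the invented new function */
        src++;
        uint8_t addr = (uint8_t)expression();
        value = (int16_t)(memory[addr] | (memory[(uint8_t)(addr + 1)] << 8));
    } else {                                    /* plain decimal number */
        while (isdigit((unsigned char)*src))
            value = (int16_t)(value * 10 + (*src++ - '0'));
    }
    return value;
}

/* Minimal expression(): factors joined by '+', enough to drive factor(). */
static int16_t expression(void)
{
    int16_t value = factor();
    for (skip_white(); *src == '+'; skip_white()) {
        src++;
        value = (int16_t)(value + factor());
    }
    return value;
}

static int16_t evaluate(const char *text)
{
    assert(text != NULL);
    src = text;
    return expression();
}
```

The new branch sits before the final "plain number" case, just as fac_2 sits before fac_3, so anything it doesn't recognise still falls through to the number parser.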
Enjoy!
Neil