Neolithic Tiny Basic

barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: Neolithic Tiny Basic

Post by barnacle »

Adding a new command to Neolithic Tiny Basic

Adding a new command is surprisingly easy. Here's how it goes...

There are a minimum of three places that you will need to change; in most cases, four. I'll use the example of the USR statement mentioned above.
  • Step one - enumerate the instruction
    Add a new enumeration value to the ENUM table at the start of the program. The only thing to remember here is that the six comparison operators must remain in the same order, and as a group. This provides a value, starting at $80, which will be used both by the tokenisation routine, and as a reference when a line containing our new instruction is executed.

    Code: Select all

    ; token enums
    	bss
    	org 0
    enum_cnt	set $80 
    	ENUM	LET
    	ENUM	REM
    	ENUM	PRINT
    	ENUM	LIST
    	ENUM	NEW
    	ENUM	RUN
    	ENUM	INPUT
    	ENUM	IF
    	ENUM	ELSE
    	ENUM	ENDIF
    	ENUM	FOR
    	ENUM	TO
    	ENUM	NEXT
    	ENUM	WHILE
    	ENUM	ENDWHILE
    	ENUM	GOTO
    	ENUM	GOSUB
    	ENUM	RETURN
    	ENUM	USR        ; <--- our new entry
    	ENUM	END
    ; these comparisons _must_ be in this order
    	ENUM	LTE
    	ENUM	GTE
    	ENUM	EQUAL
    	ENUM	NEQUAL
    	ENUM	LT
    	ENUM	GT
    	ENUM	POKE
    	ENUM	LAST_KW
    
  • Step two - provide an asciiz name
    It is critical that the name of the instruction is entered in the token table in the same order as the enumeration since this links the text which will be tokenised with the enum which will replace it in program memory. Note that the text here does not need to match the enumeration - see, for example, poke - but it will probably confuse other readers if it doesn't.

    Code: Select all

    tokens:		; must match the enum order above
    			; where tokens share starting characters, the longer must
    			; be defined first (e.g. >= before >)
    	db "let",	0
    	db "'", 	0
    	db "print",	0
    	db "list",	0
    	db "new",	0
    	db "run",	0
    	db "input",	0
    	db "if",	0
    	db "else",	0
    	db "endif",	0
    	db "for",	0
    	db "to",	0
    	db "next",	0
    	db "while",	0
    	db "wend",	0
    	db "goto",	0
    	db "gosub",	0
    	db "return",0
    	db "usr",	0        ; <-- our new instruction's name
    	db "end",	0
    	db "<=",	0
    	db ">=",	0
    	db "=",		0
    	db "!=",	0
    	db "<",		0
    	db ">",		0
    	db "!",		0		; poke
    	db 0				; end of the list
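
The link between table order and token value can be sketched in C, the language the listing's own comments use as a reference. This is only a model: `match_token` and `FIRST_TOKEN` are hypothetical names, and it assumes each table entry takes the next value after $80 and that the first matching entry wins. It does not check word boundaries or letter case, which the real tokeniser may well do.

```c
#include <stddef.h>
#include <string.h>

/* Token text in the same order as the assembly table above;
 * entry i is assumed to get the value 0x80 + i. */
static const char *const token_text[] = {
    "let", "'", "print", "list", "new", "run", "input", "if", "else",
    "endif", "for", "to", "next", "while", "wend", "goto", "gosub",
    "return", "usr", "end", "<=", ">=", "=", "!=", "<", ">", "!"
};

enum { FIRST_TOKEN = 0x80 };

/* Hypothetical helper: return the token value whose text starts at src,
 * or 0 if nothing matches. The first match wins, which is why "<=" has
 * to come before "<", "!=" before "!", and "endif" before "end". */
static int match_token(const char *src, size_t *len_out)
{
    size_t count = sizeof token_text / sizeof token_text[0];
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(token_text[i]);
        if (strncmp(src, token_text[i], len) == 0) {
            if (len_out != NULL)
                *len_out = len;      /* characters consumed by the keyword */
            return FIRST_TOKEN + (int)i;
        }
    }
    return 0;
}
```

With this table, "usr" is entry 18 and so would tokenise to $92, matching its position in the enum list.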
    
  • Step three - add a call in execute()
    Execute(), elided here for brevity, is called on each line by the interpreter. The structure of a line in memory consists of the line number (two bytes); the length of the line (one byte); the enum value of the instruction (one byte); any further associated text; and a final CR character (one byte). A global variable mem_ptr keeps track of where the interpreter is looking, and is passed to execute() each cycle. A long sequence of comparisons is used to find the next instruction to execute.
    An instruction must return - in Y:A - the start of the next line to execute, or zero. There are a few options here... in all cases, though, Y:A contains the current value of mem_ptr.
    • GOSUB, USR, and POKE show the general case. An initial comparison with the enum, and a skip around if it isn't for us; a call to the appropriate subroutine; and a final jump to ex_99 which sorts out updating mem_ptr and returning to the main loop.
    • RETURN shows the method used when a word should always return zero, to mark either the end of the program, or of a control structure. In these cases, a jump to ex_131 will handle that.
    • GOTO demonstrates that it's not always necessary to have a separate routine to implement a statement. In this case it
      - reads the character following the GOTO token and skips any white space
      - calls expression() to evaluate its target
      - calls find_line() to get the address of the line at that target number
      - and returns that value with the final jump to ex_99
    Note that an insertion into this chain of test-and-skip may require earlier 'bra' instructions to be replaced by 'jmp'.
    The REPTOS macro simply replaces the current top-of-stack - i.e. the initial mem_ptr value - with Y:A returned from a routine.
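
The in-memory line layout described at the start of this step can be pictured with a small C sketch. `view_line` and `struct line_view` are hypothetical names, the low-byte-first line number is an assumption (the listing only states that for variables), and the sketch makes no claim about what the length byte counts; it simply reads it back.

```c
#include <stddef.h>
#include <stdint.h>

/* Field offsets within one stored line, in the order given in the text. */
enum {
    LINE_NUMBER = 0,   /* two bytes; low byte first is an assumption */
    LINE_LENGTH = 2,   /* one byte */
    LINE_TOKEN  = 3,   /* one byte, the enum value ($80 or above) */
    LINE_TEXT   = 4,   /* any further text, closed by a CR */
    LINE_CR     = 0x0D
};

struct line_view {
    uint16_t number;
    uint8_t length;
    uint8_t token;
    const uint8_t *text;   /* first byte after the token */
    size_t text_len;       /* bytes before the closing CR */
};

/* Hypothetical decoder; 'line' must point at a complete stored line. */
static struct line_view view_line(const uint8_t *line)
{
    struct line_view v;
    v.number = (uint16_t)(line[LINE_NUMBER] | (line[LINE_NUMBER + 1] << 8));
    v.length = line[LINE_LENGTH];
    v.token = line[LINE_TOKEN];
    v.text = line + LINE_TEXT;
    v.text_len = 0;
    while (v.text[v.text_len] != LINE_CR)
        v.text_len++;
    return v;
}
```

Under these assumptions, a stored "1000 usr 4096" would begin with the bytes $E8 $03, then the length byte, then $92 for USR, then the text and a final $0D.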

    Code: Select all

    ;char * execute (char * where)
    ;{
    ;	/* Execute a single program line, pointed to by 'where'
    ;	 * Return the next line to execute, or zero.
    ;	 */
    ;
    execute:
    	phx
    	phy
    	pha			; save where on stack
       <...>
    ex_10:
    ;			case GOTO:
    			cpx #GOTO
    			bne ex_11
    ;				GetChar();
    				jsr getchar
    ;				SkipWhite();
    				jsr skipwhite
    ;				where = find_line (Expression());
    				jsr expression
    				jsr find_line
    				REPTOS
    ;				break;
    				bra ex_99
    ex_11:
    ;			case GOSUB:
    			cpx #GOSUB
    			bne ex_12
    ;				where = gosub (mem_ptr);
    				jsr gosub
    				REPTOS
    ;				break;
    				bra ex_99
    ex_12:
    ;			case RETURN:
    			cpx #RETURN
    			bne ex_121
    ;				where = NULL;
    ;				break;
    				bra ex_131
    ex_121:
    ;			case USR:
    			cpx #USR
    			bne ex_122
    				jsr usr
    				REPTOS
    				bra ex_99
    ex_122:
    ;			case POKE:
    			cpx #POKE
    			bne ex_13
    ;				where = poke (mem_ptr);
    				jsr poke
    				REPTOS
    ;				break;
    				bra ex_99
    	   <...>
    
  • Step four - add the executable function
    Usr() is a statement which executes a machine code subroutine at a specified address. The address is specified in decimal, and may be any valid expression. The routine does not return data.

    The x register must be preserved across any call. The address of the instruction itself is passed in, so we subtract four from that to get the start of the current line. By convention, that's the last thing that goes on the stack, so any local variables required for recursive routines - see for, if, and while - must be allocated stack locations before pushing this.

    Code: Select all

    ;-----------------------------------------------------------------------
    ; usr - call a subroutine at the specified address
    usr:	
    	phx
    	sec
    	sbc #4
    	bcs usr_01
    	dey					; where = where - 4
    usr_01:	
    	phy
    	pha					; save where
    
    With the stack set up correctly, we can start decoding our instruction. We already know what it is, or we wouldn't be here, but it may - as in this case - have further data associated with it which we need to handle.
    Getchar() and skipwhite() set up the interpreter pointers so that mem_ptr is pointing at the text following the instruction. In this case, this is an expression indicating the target address. Expression() understands that and returns the evaluated expression in Y:A.

    Code: Select all

    	jsr getchar
    	jsr skipwhite
    	jsr expression		; the target?
    
    We need a mechanism to make our call to the target. Because I want to keep this code runnable from EEPROM (and because I hate self-modifying code) I chose to re-use the maths variables. They're safe to use any time they're not being used by one of the arithmetic routines, and they don't hold data for later use. Conveniently, they're adjacent in memory.
    So we build a jsr and rts block there, and call it.

    Code: Select all

    	
    	sta maths1+1	
    	sty maths2			; save as indirect call address
    	lda #$20			; jsr instruction
    	sta maths1
    	lda #$60			; rts instruction
    	sta maths2+1		; maths variables are now:
    						; jsr (expression)
    						; rts
    	jsr maths1			; do the usr call
    
    Finally, we need to know where the next line is. We grab the start of this line from the stack, call find_next_line() which leaves the required address in Y:A, restore X, and return to execute().

    Code: Select all

    	pla
    	ply
    	jsr find_next_line
    	plx
    	rts
    
    Non-recursive variables can be allocated using the bss section, by convention immediately before the routine that uses them, though a few are shared by multiple routines (such as the maths variables).
The subroutine called by usr() can be anywhere in memory and can access any memory location. It's dangerous. It's up to the user to make sure it's placed somewhere safe (top of memory is a good bet, but there is nothing to stop Basic text entry overwriting it!). There are around sixty locations free at the top of zero page (check the listing; it's subject to change); page two is used for the string variable, and a basic statement won't use more than the first 80 bytes, so a lot of that page is available.
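
As an illustration of the four-byte block that usr() builds in the maths variables, here is a minimal C model. `build_usr_stub` is a hypothetical name, and it assumes maths1 and maths2 are two bytes each and adjacent, as the stores in the listing imply. The opcodes are the standard 6502 ones: $20 for JSR absolute (operand low byte first) and $60 for RTS.

```c
#include <stdint.h>

enum { OP_JSR = 0x20, OP_RTS = 0x60 };

/* Hypothetical model of the bytes at maths1 .. maths2+1 after usr()
 * has stored the evaluated target address from Y:A. */
static void build_usr_stub(uint8_t stub[4], uint16_t target)
{
    stub[0] = OP_JSR;                    /* maths1:   jsr opcode       */
    stub[1] = (uint8_t)(target & 0xFF);  /* maths1+1: low byte (A)     */
    stub[2] = (uint8_t)(target >> 8);    /* maths2:   high byte (Y)    */
    stub[3] = OP_RTS;                    /* maths2+1: rts back to usr  */
}
```

For `usr 4096` the block would hold $20 $00 $10 $60, so `jsr maths1` calls the routine at $1000 and then returns to usr() through the trailing rts.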

Using this method does not provide a mechanism to return any value from a statement. If you wish to write a function that does return a value, you have two choices. The simplest is just to have the routine write its return value into a predefined variable, but that lacks clarity and flexibility. The other way is to make it recognisable to factor(), the lowest level of evaluation.

This is getting a bit advanced, but the idea is that factor() is extended to recognise your new function, and provide an evaluation there. I've done it for @ (peek) in the current code (fac_2:)... you would need to add something similar for your new function, perhaps just calling your existing routine, and returning the value in Y:A.

Code: Select all

;-----------------------------------------------------------------------	
factor:						; return the value of a factor, including
							; recursive brackets
	phx
	;int16_t factor;		; we define this later
	
	;if (Look == '(')
	lda look
	cmp #'('
	bne fac_1
		;Match ('(');
		jsr match			; a is already '('
		;factor = Expression ();
		jsr expression		; ooh, recursion!
		phy
		pha					; into 'factor' on stack	
		;Match (')');
		lda #')'
		jsr match
		pla
		ply
		bra fac_x
	;else if (isalpha(Look))
fac_1:
	lda look
	jsr isalpha
	bcc fac_2
		;factor = vars[GetName() - 'A'];
		jsr getname
		sec
		sbc #'A'
		asl a
		tax
		lda vars,x			; vars are stored low byte first
		ldy vars+1,x
		bra fac_x
		;else if (Look == '@')
fac_2:
		lda look
		cmp #'@'		; peek
		bne fac_3		; nope, proceed as normal
		jsr getchar
		jsr skipwhite
		jsr expression	; get the address
		jsr star_int16	; get the 16 bit value at that address
		ldy #0			; but we only want the lower 8 bits
		bra fac_x
fac_3:
		;factor = GetNum();
		jsr getnum
fac_x:
	;return factor;	it's in Y:A already :)
	plx
	rts
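
The shape of factor() is easier to see in the C that the comments hint at. The sketch below is a simplified stand-in, not the interpreter's code: the helper names are hypothetical, expression() here only handles + and -, and `memory` is a plain array standing in for the 6502 address space. It does follow the listing in letting @ hand everything after it to expression() and keeping only the low 8 bits of the result.

```c
#include <ctype.h>
#include <stdint.h>

static const char *src;         /* text being parsed */
static char look;               /* current lookahead character */
static int16_t vars[26];        /* variables A..Z */
static uint8_t memory[65536];   /* stand-in for the 6502 address space */

static int16_t expression(void);

static void get_char(void)   { look = *src ? *src++ : '\0'; }
static void skip_white(void) { while (look == ' ') get_char(); }
static void match(char c)    { if (look == c) get_char(); skip_white(); }

static int16_t get_num(void)
{
    int16_t n = 0;
    while (isdigit((unsigned char)look)) {
        n = (int16_t)(n * 10 + (look - '0'));
        get_char();
    }
    skip_white();
    return n;
}

static int16_t factor(void)
{
    int16_t value;
    if (look == '(') {                           /* recursive brackets */
        match('(');
        value = expression();
        match(')');
    } else if (isalpha((unsigned char)look)) {   /* variable */
        value = vars[toupper((unsigned char)look) - 'A'];
        get_char();
        skip_white();
    } else if (look == '@') {                    /* peek */
        get_char();
        skip_white();
        value = memory[(uint16_t)expression()];  /* low 8 bits only */
    /* a new function would get its own else-if branch here */
    } else {
        value = get_num();
    }
    return value;
}

static int16_t expression(void)
{
    int16_t value = factor();
    while (look == '+' || look == '-') {
        char op = look;
        get_char();
        skip_white();
        int16_t rhs = factor();
        value = (int16_t)(op == '+' ? value + rhs : value - rhs);
    }
    return value;
}

/* Hypothetical entry point: evaluate one expression held in 'text'. */
static int16_t evaluate(const char *text)
{
    src = text;
    get_char();
    skip_white();
    return expression();
}
```

Because @ passes the rest of the text to expression(), `@3+1` in this model peeks address 4 rather than adding one to the byte at address 3.
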
In drafting this screed, I've realised that it should be possible to factor out the bits where four is subtracted from Y:A at the start of each statement. I'll have a look at that and then post another 'final' version later, with USR included.

Enjoy!

Neil
BB8
Posts: 57
Joined: 01 Nov 2020
Location: Tatooine

Re: Neolithic Tiny Basic

Post by BB8 »

Well... I don't know if USR was a good idea: now you have to "feed" it! :P
You need to give the user a way to launch their own routines, which have to be put in memory. And unless you want to enter them with an endless string of lines, each one with a "!" (poke) statement, you need two new instructions: READ and DATA... :) :mrgreen:

Re: Neolithic Tiny Basic

Post by barnacle »

Yeah, I know... and a third, to reset the read? Or maybe a line that looks like

Code: Select all

1000 data 12, 24, 35, 99...
and read with a line number starts at the start of the line, and without a line number continues?

I dunno. Going to need to think about it... am I allowed to blame Ed?

Neil :mrgreen:
drogon
Posts: 1671
Joined: 14 Feb 2018
Location: Scotland
Contact:

Re: Neolithic Tiny Basic

Post by drogon »

barnacle wrote:
Yeah, I know... and a third, to reset the read? Or maybe a line that looks like

Code: Select all

1000 data 12, 24, 35, 99...
and read with a line number starts at the start of the line, and without a line number continues?

I dunno. Going to need to think about it... am I allowed to blame Ed?

Neil :mrgreen:
In the olden days... Well, let's start with Apple Integer BASIC - that doesn't have READ/DATA, so what happened was that people simply appended the binary to the end of the Basic tokenised program code, poked the extended length into the ZP locations that had the program end and SAVEd the program. Then at run time the code would relocate itself, or be copied or run in-situ - whatever. Sometimes it was inside a REM statement at the start of the program, so if you knew the length was under 200 characters, then REM ****** ... for 200 characters, then you copied the binary into the REM statement and SAVEd it.

That was fiddly, error prone and needed the fix-up every time the program was changed. It also made LIST go bonkers.

Latterly, the READ/DATA/POKE was used in e.g. Applesoft. Very wasteful of program space, but ah well. (Apple also provided a binary load/run mechanism which helped in some places).

My own TB doesn't have DATA/READ but does have CALL, and for a little experiment I'm looking at, I'm encoding strings in ASCII (actually using a base 16 code, but using @ through O for 0-15 just to make the Basic decode easier), then a GOSUB to poke them into RAM. The "bonus" I have is that I can CHAIN programs, so one program is the poker to load up the machine code, then the 2nd is the one that can call it. Still fiddly, still needs something to generate lines like

Code: Select all

10A$="@@AABB" : GOSUB 2
... but that's easy on my desktop.
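
The @-through-O encoding works because those sixteen characters are ASCII $40 to $4F, so subtracting '@' gives the nibble directly. A minimal C sketch of the decode side follows; `decode_at_hex` is a hypothetical name, and putting the high nibble first in each pair is an assumption rather than something the post states.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode pairs of '@'..'O' characters into bytes. Returns the number of
 * bytes written, or 0 if the string has an odd length, contains a
 * character outside '@'..'O', or does not fit in 'out'. */
static size_t decode_at_hex(const char *text, uint8_t *out, size_t out_size)
{
    size_t n = 0;
    while (text[0] != '\0') {
        if (text[1] == '\0' || n >= out_size)
            return 0;
        if (text[0] < '@' || text[0] > 'O' || text[1] < '@' || text[1] > 'O')
            return 0;
        /* high nibble first (assumed), then low nibble */
        out[n++] = (uint8_t)(((text[0] - '@') << 4) | (text[1] - '@'));
        text += 2;
    }
    return n;
}
```

Under that assumption, the string "@@AABB" from the example line would decode to the three bytes $00, $11, $22.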

-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Neolithic Tiny Basic

Post by barrym95838 »

drogon wrote:
Well, let's start with Apple Integer BASIC - that doesn't have READ/DATA, so what happened was that people simply appended the binary to the end of the Basic tokenised program code, poked the extended length into the ZP locations that had the program end and SAVEd the program. Then at run time the code would relocate itself, or be copied or run in-situ - whatever.
YES! I think that Bob Bishop pioneered that strategy with his wonderful and chimeric "APPLE-VISION" in 1978. Many nerds back in the day scratched their heads, did some investigation, then applauded.

"Dancing Demon" on the TRS-80 Model I was iconic, but came a year later and made no effort to hide the fact that it was 100% Z-80 machine language.
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England
Contact:

Re: Neolithic Tiny Basic

Post by BigEd »

Thanks for the USR! And the recipe for adding extensions.

(If you do add DATA and READ, then RESTORE is the means in BBC Basic for placing the pointer on a specific line, or to the first DATA line if there's no argument.)

Re: Neolithic Tiny Basic

Post by barnacle »

The usr was the sort of thing you wanted, then? I was unsure about the format...

I was thinking of data starting a line of comma delimited values; read xxx moving the read pointer to a given line and returning the first value, and read with no parameter just grabbing the next data, moving to successive lines as required. If it fits...
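
To make that proposal concrete, here is a rough C model of the two forms of read. Nothing like this exists in the posted code: `struct data_line`, `struct reader`, `read_from` and `read_next` are all hypothetical, and only the data lines of a program are modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct data_line {
    uint16_t number;
    const char *values;   /* text after the keyword, e.g. "12, 24, 35" */
};

struct reader {
    const struct data_line *lines;
    size_t count;
    size_t index;         /* current data line; == count when exhausted */
    const char *cursor;   /* next unread character in that line */
};

/* "read" with no parameter: fetch the next value, moving on to
 * successive data lines as required. Returns false when none is left. */
static bool read_next(struct reader *r, int16_t *value)
{
    while (r->index < r->count) {
        while (*r->cursor == ' ' || *r->cursor == ',')
            r->cursor++;
        if (*r->cursor != '\0') {
            char *end;
            long parsed = strtol(r->cursor, &end, 10);
            if (end == r->cursor)
                return false;             /* not a number */
            r->cursor = end;
            *value = (int16_t)parsed;
            return true;
        }
        r->index++;                       /* line used up, try the next */
        if (r->index < r->count)
            r->cursor = r->lines[r->index].values;
    }
    return false;
}

/* "read xxx": move the read pointer to line xxx and return its first value. */
static bool read_from(struct reader *r, uint16_t number, int16_t *value)
{
    for (size_t i = 0; i < r->count; i++) {
        if (r->lines[i].number == number) {
            r->index = i;
            r->cursor = r->lines[i].values;
            return read_next(r, value);
        }
    }
    return false;
}
```

In this model a read with a line number doubles as the reset, so no third keyword is needed.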

The bad news about factoring the 'where -= 4' out is that while it mostly makes things work, and certainly makes them smaller, it's broken 'while' so I'm scratching my head at the moment; possibly I broke something while I was modifying the code.

Neil

Re: Neolithic Tiny Basic

Post by BB8 »

USR is usually used as a Right-Hand Side of an expression, like V = USR(dest).
But knowing how your code is structured, I see that it would not be easy to have it that way: you have no provision for functions yet.

As for the launch of internal routines, you could have a pointer in ZP that you fill with the address passed by the basic instruction. That pointer is used by an (isolated) JMP (indirect), and you can reach that JMP (indirect) with a JSR.
The optional result could be inserted in the same pointer, as a 16bit return value.

Re: Neolithic Tiny Basic

Post by BigEd »

That's true, USR is usually a function. Not sure what it usually returns though. (In BBC Basic it returns several register values packed into an integer.) (BBC Basic also offers CALL, which is a command. It takes parameters in a more sophisticated way.)

Re: Neolithic Tiny Basic

Post by barrym95838 »

In the 8-bit MS BASICs I know, USR() is a function accepting a float and returning a float. Its vector needs to be POKEd in before use, and the default vector points to an ILLEGAL QUANTITY ERROR or similar. Michael J Mahon used it with some superior coding to replace the stock SQR() with a custom routine that is much faster and more accurate. Fast, small, accurate ... pick any two ...
Last edited by barrym95838 on Thu Mar 27, 2025 8:18 pm, edited 1 time in total.

Re: Neolithic Tiny Basic

Post by barnacle »

Well, I'm definitely not doing FP maths in this one :mrgreen:

I'd seen USR as a function; it may come to that. But that's one more special case for factor() and that will already need one to make READ work... lemme get while working again, first.

Neil

Re: Neolithic Tiny Basic

Post by barnacle »

(It doesn't need to be a function as a usr function can fill any variable with two saves...)

Re: Neolithic Tiny Basic

Post by drogon »

barnacle wrote:
Well, I'm definitely not doing FP maths in this one :mrgreen:

I'd seen USR as a function; it may come to that. But that's one more special case for factor() and that will already need one to make READ work... lemme get while working again, first.

Neil
Anything that has a USR() and/or CALL (and & in Applesoft) will need to know some internal details to make the most of it. These are the things traditionally hard to find out in the early days, but were published in various magazines (dead tree technology) before The Internet took over (but also by then the world had moved on from Basics, in general)

So data that a machine code routine might need - i.e. how to access the argument(s) and how to return a result at the minimum - and maybe the addresses of internal math routines to save space if the machine code needed to do math, for example. Stuff like that... Then the documentation gets longer and longer, and where do you draw the line...

-Gordon

Re: Neolithic Tiny Basic

Post by barnacle »

I believe Pterry Pratchett pointed out that any device whose documentation weighs more than it does should be avoided. Fortunately, electrons don't weigh a great deal...

The locations of the variables are well-defined, and there's a variable that stores the highest used memory. The lack of dynamically allocated variables helps a lot. The complete code is of course available, as is a listing. There's also a routine which returns the address of the start of a line, given the number of that line, so you could be creative... but sooner or later, if you're not careful, your 'simple' tiny basic turns into an IDE or a complete OS... not my aim. 4k and running from eeprom is the limit (and there, ideally, room for the vectors so it can live at $f000).

Neil

Re: Neolithic Tiny Basic

Post by barnacle »

OK, I fixed 'while' (I think!); it helps no end if I follow my own rule about the sanctity of X.

So here's tonight's version, with USR (but no DATA or READ) and occupying a mere $edf (3807) bytes. It is most definitely not fully tested yet, particularly with respect to nested code, but a quick check hints it's working.

Neil
Attachments
tiny.asm
(74.9 KiB) Downloaded 206 times