Neolithic Tiny Basic

barnacle · Post by **barnacle** » Thu Mar 27, 2025 6:55 am

Adding a new command to Neolithic Tiny Basic

Adding a new command is surprisingly easy. Here's how it goes...

There are a minimum of three places that you will need to change; in most cases, four. I'll use the example of the USR statement mentioned above.

Step one - enumerate the instruction
Add a new enumeration value to the ENUM table at the start of the program. The only thing to remember here is that the six comparison operators must remain in the same order, and as a group. Each entry is given the next value in sequence, starting at $80, and that value is used both by the tokenisation routine and as a reference when a line containing our new instruction is executed.

```
; token enums
	bss
	org 0
enum_cnt	set $80
	ENUM	LET
	ENUM	REM
	ENUM	PRINT
	ENUM	LIST
	ENUM	NEW
	ENUM	RUN
	ENUM	INPUT
	ENUM	IF
	ENUM	ELSE
	ENUM	ENDIF
	ENUM	FOR
	ENUM	TO
	ENUM	NEXT
	ENUM	WHILE
	ENUM	ENDWHILE
	ENUM	GOTO
	ENUM	GOSUB
	ENUM	RETURN
	ENUM	USR        ; <--- our new entry
	ENUM	END
; these comparisons _must_ be in this order
	ENUM	LTE
	ENUM	GTE
	ENUM	EQUAL
	ENUM	NEQUAL
	ENUM	LT
	ENUM	GT
	ENUM	POKE
	ENUM	LAST_KW
```

Step two - provide an asciiz name
It is critical that the name of the instruction is entered in the token table in the same order as the enumeration, since this links the text which will be tokenised with the enum which will replace it in program memory. Note that the text here does not need to match the enumeration - see, for example, POKE, whose text is "!" - but it will probably confuse other readers if it doesn't.

```
tokens:		; must match the enum order above
			; where tokens share starting characters, the longer must
			; be defined first (e.g. >= before >)
	db "let",	0
	db "'", 	0
	db "print",	0
	db "list",	0
	db "new",	0
	db "run",	0
	db "input",	0
	db "if",	0
	db "else",	0
	db "endif",	0
	db "for",	0
	db "to",	0
	db "next",	0
	db "while",	0
	db "wend",	0
	db "goto",	0
	db "gosub",	0
	db "return",0
	db "usr",	0        ; <-- our new instruction's name
	db "end",	0
	db "<=",	0
	db ">=",	0
	db "=",		0
	db "!=",	0
	db "<",		0
	db ">",		0
	db "!",		0		; poke
	db 0				; end of the list
```
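For anyone who wants to see why the table order matters, here is a rough C model of the lookup. It is only a sketch: the table is a shortened, hypothetical copy of tokens:, the name match_token is made up, and it assumes the tokeniser does a case-sensitive, first-match-wins prefix comparison, which is what the "longer must be defined first" comment implies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened copy of the token table: an ordered list of
 * names ended by an empty string, in the same order as the enums. */
static const char *const token_names[] = {
    "let", "'", "print", "usr", "<=", ">=", "=", "!=", "<", ">", "!", ""
};

#define FIRST_TOKEN 0x80   /* enum_cnt starts at $80 */

/* Return the token value whose name is a prefix of text, or 0 if none.
 * The first match wins, which is why "<=" has to come before "<". */
int match_token(const char *text)
{
    for (int i = 0; token_names[i][0] != '\0'; i++) {
        size_t len = strlen(token_names[i]);
        if (strncmp(text, token_names[i], len) == 0)
            return FIRST_TOKEN + i;   /* position in table == enum value */
    }
    return 0;
}
```

With this shortened table, "usr 4096" matches the fourth entry and so becomes $83. If "<" were listed before "<=", the text "<=5" would be tokenised as LT followed by a stray "=".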

Step three - add a call in execute()
Execute(), elided here for brevity, is called on each line by the interpreter. The structure of a line in memory consists of the line number (two bytes); the length of the line (one byte); the enum value of the instruction (one byte); any further associated text; and a final CR character (one byte). A global variable mem_ptr keeps track of where the interpreter is looking, and is passed to execute() each cycle. A long sequence of comparisons is used to find the next instruction to execute.
An instruction must return - in Y:A - the start of the next line to execute, or zero. There are a few options here... in all cases, though, Y:A contains the current value of mem_ptr.
- GOSUB, USR, and POKE show the general case. An initial comparison with the enum, and a skip around if it isn't for us; a call to the appropriate subroutine; and a final jump to ex_99 which sorts out updating mem_ptr and returning to the main loop.
- RETURN shows the method used when a word should always return zero, to mark either the end of the program, or of a control structure. In these cases, a jump to ex_131 will handle that.
- GOTO demonstrates that it's not always necessary to have a separate routine to implement a statement. In this case it
  - reads the character following the GOTO token and skips any white space
  - calls expression() to evaluate its target
  - calls find_line() to get the address of the line at that target number
  - and returns that value with the final jump to ex_99
Note that an insertion into this chain of test-and-skip may require earlier 'bra' instructions to be replaced by 'jmp', since the extra code can push their targets beyond the reach of a relative branch.
The REPTOS macro simply replaces the current top-of-stack - i.e. the initial mem_ptr value - with Y:A returned from a routine.
```
;char * execute (char * where)
;{
;	/* Execute a single program line, pointed to by 'where'
;	 * Return the next line to execute, or zero.
;	 */
;
execute:
	phx
	phy
	pha			; save where on stack
   <...>
ex_10:
;			case GOTO:
			cpx #GOTO
			bne ex_11
;				GetChar();
				jsr getchar
;				SkipWhite();
				jsr skipwhite
;				where = find_line (Expression());
				jsr expression
				jsr find_line
				REPTOS
;				break;
				bra ex_99
ex_11:
;			case GOSUB:
			cpx #GOSUB
			bne ex_12
;				where = gosub (mem_ptr);
				jsr gosub
				REPTOS
;				break;
				bra ex_99
ex_12:
;			case RETURN:
			cpx #RETURN
			bne ex_121
;				where = NULL;
;				break;
				bra ex_131
ex_121:
;			case USR:
			cpx #USR
			bne ex_122
				jsr usr
				REPTOS
				bra ex_99
ex_122:
;			case POKE:
			cpx #POKE
			bne ex_13
;				where = poke (mem_ptr);
				jsr poke
				REPTOS
;				break;
				bra ex_99
	   <...>
```

Step four - add the executable function
Usr() is a statement which executes a machine code subroutine at a specified address. The address is specified in decimal, and may be any valid expression. The routine does not return data.

The x register must be preserved across any call. The address of the instruction itself is passed in, so we subtract four from that to get the start of the current line. By convention, that's the last thing that goes on the stack, so any local variables required for recursive routines - see for, if, and while - must be allocated stack locations before pushing this.
```
;-----------------------------------------------------------------------
; usr - call a subroutine at the specified address
usr:	
	phx
	sec
	sbc #4
	bcs usr_01
	dey					; where = where - 4
usr_01:	
	phy
	pha					; save where
```
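The sec/sbc/bcs/dey sequence above is just a 16-bit subtraction done a byte at a time. A small C model, with a made-up function name, assuming as elsewhere in the listing that A holds the low byte and Y the high byte:

```c
#include <stdint.h>

/* Model of: sec / sbc #4 / bcs usr_01 / dey
 * lo stands in for A, hi for Y. */
uint16_t where_minus_4(uint8_t hi, uint8_t lo)
{
    uint8_t new_lo = (uint8_t)(lo - 4);   /* sbc #4 on the low byte */
    if (lo < 4)                           /* carry clear: a borrow occurred */
        hi--;                             /* dey takes the borrow from the high byte */
    return (uint16_t)((hi << 8) | new_lo);
}
```

So a pointer of $1002 becomes $0FFE, while $1034 becomes $1030 with the high byte untouched.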
With the stack set up correctly, we can start decoding our instruction. We already know what it is, or we wouldn't be here, but it may - as in this case - have further data associated with it which we need to handle.
Getchar() and skipwhite() set up the interpreter pointers so that mem_ptr is pointing at the text following the instruction. In this case, this is an expression indicating the target address. Expression() understands that and returns the evaluated expression in Y:A.
```
	jsr getchar
	jsr skipwhite
	jsr expression		; the target?
```
We need a mechanism to make our call to the target. Because I want to keep this code runnable from EEPROM (and because I hate self-modifying code) I chose to re-use the maths variables. They're safe to use any time they're not being used by one of the arithmetic routines, and they don't hold data for later use. Conveniently, they're adjacent in memory.
So we build a jsr and rts block there, and call it.
```
	
	sta maths1+1	
	sty maths2			; save as indirect call address
	lda #$20			; jsr instruction
	sta maths1
	lda #$60			; rts instruction
	sta maths2+1		; maths variables are now:
						; jsr (expression)
						; rts
	jsr maths1			; do the usr call
```
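To make the byte layout explicit, here is a C sketch of what ends up in the maths variables. The function and buffer names are invented for illustration, and the four-byte buffer assumes maths1 and maths2 are two adjacent 16-bit variables, as the stores above suggest.

```c
#include <stdint.h>

enum { OP_JSR = 0x20, OP_RTS = 0x60 };  /* 6502 opcodes: JSR absolute, RTS */

/* Fill a four-byte buffer (standing in for maths1 followed by maths2)
 * with the sequence "jsr target / rts". */
void build_usr_stub(uint8_t stub[4], uint16_t target)
{
    stub[0] = OP_JSR;                     /* sta maths1   */
    stub[1] = (uint8_t)(target & 0xFF);   /* sta maths1+1 (low byte, from A)  */
    stub[2] = (uint8_t)(target >> 8);     /* sty maths2   (high byte, from Y) */
    stub[3] = OP_RTS;                     /* sta maths2+1 */
}
```

For a target of $1234 the buffer holds 20 34 12 60, so the jsr maths1 runs the user routine and its rts comes straight back through the stub.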
Finally, we need to know where the next line is. We grab the start of this line from the stack, call find_next_line() which leaves the required address in Y:A, restore X, and return to execute().
```
	pla
	ply
	jsr find_next_line
	plx
	rts
```
Non-recursive variables can be allocated using the bss section; by convention, this is done immediately before the routine that uses them, though a few (such as the maths variables) are shared by multiple routines.

The subroutine called by usr() can be anywhere in memory and can access any memory location. It's dangerous. It's up to the user to make sure it's placed somewhere safe (top of memory is a good bet, but there is nothing to stop Basic text entry overwriting it!). There are around sixty locations free at the top of zero page (check the listing; it's subject to change); page two is used for the string variable, and a Basic statement won't use more than the first 80 bytes, so a lot of that page is available.

Using this method does not provide a mechanism to return any value from a statement. If you wish to write a function that does return a value, you have two choices. The simplest is just to have the routine write its return value into a predefined variable, but that lacks clarity and flexibility. The other way is to make it recognisable to factor(), the lowest level of evaluation.

This is getting a bit advanced, but the idea is that factor() is extended to recognise your new function, and provide an evaluation there. I've done it for @ (peek) in the current code (fac_2:)... you would need to add something similar for your new function, perhaps just calling your existing routine, and returning the value in Y:A.

```
;-----------------------------------------------------------------------	
factor:						; return the value of a factor, including
							; recursive brackets
	phx
	;int16_t factor;		; we define this later
	
	;if (Look == '(')
	lda look
	cmp #'('
	bne fac_1
		;Match ('(');
		jsr match			; a is already '('
		;factor = Expression ();
		jsr expression		; ooh, recursion!
		phy
		pha					; into 'factor' on stack	
		;Match (')');
		lda #')'
		jsr match
		pla
		ply
		bra fac_x
	;else if (isalpha(Look))
fac_1:
	lda look
	jsr isalpha
	bcc fac_2
		;factor = vars[GetName() - 'A'];
		jsr getname
		sec
		sbc #'A'
		asl a
		tax
		lda vars,x			; vars are stored low byte first
		ldy vars+1,x
		bra fac_x
		;else if (Look == '@')
fac_2:
		lda look
		cmp #'@'		; peek
		bne fac_3		; nope, proceed as normal
		jsr getchar
		jsr skipwhite
		jsr expression	; get the address
		jsr star_int16	; get the 16 bit value at that address
		ldy #0			; but we only want the lower 8 bits
		bra fac_x
fac_3:
		;factor = GetNum();
		jsr getnum
fac_x:
	;return factor;	it's in Y:A already :)
	plx
	rts
```
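As a rough illustration of where a value-returning function would slot in, here is a cut-down C model of factor(). It is a sketch, not the interpreter: the names memory, vars, src and evaluate are invented, variable names are assumed to be folded to upper case, and expression() here only handles + and -. The '@' branch mirrors the listing in that it calls expression() for its address and keeps only the low 8 bits; a new function would be one more else-if of the same shape.

```c
#include <ctype.h>
#include <stdint.h>

uint8_t memory[65536];          /* stand-in for the 64K address space */
int16_t vars[26];               /* A..Z */
static const char *src;         /* stand-in for mem_ptr / look */

static int16_t expression(void);

static void skip_white(void) { while (*src == ' ') src++; }

static int16_t factor(void)
{
    int16_t value = 0;
    skip_white();
    if (*src == '(') {                            /* bracketed sub-expression */
        src++;
        value = expression();
        skip_white();
        if (*src == ')') src++;
    } else if (isalpha((unsigned char)*src)) {    /* variable */
        value = vars[toupper((unsigned char)*src) - 'A'];
        src++;
    } else if (*src == '@') {                     /* peek: the function-style case */
        src++;
        value = memory[(uint16_t)expression()];   /* low 8 bits only */
    } else {                                      /* decimal number */
        while (isdigit((unsigned char)*src))
            value = (int16_t)(value * 10 + (*src++ - '0'));
    }
    return value;
}

static int16_t expression(void)                   /* simplified: only + and - */
{
    int16_t value = factor();
    for (;;) {
        skip_white();
        if (*src == '+')      { src++; value = (int16_t)(value + factor()); }
        else if (*src == '-') { src++; value = (int16_t)(value - factor()); }
        else return value;
    }
}

/* Evaluate a text expression from the start. */
int16_t evaluate(const char *text) { src = text; return expression(); }
```

Because the '@' branch hands everything after it to expression(), "@100+1" in this model peeks location 101 rather than adding one to the contents of location 100; brackets around the whole peek are not enough to change that, since the argument still swallows the rest of the bracketed text.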

In drafting this screed, I've realised that it should be possible to factor out the bits where four is subtracted from Y:A at the start of each statement. I'll have a look at that and then post another 'final' version later, with USR included.

Enjoy!

Neil

BB8 · Post by **BB8** » Thu Mar 27, 2025 7:37 am

Well... I don't know if USR was a good idea: now you have to "feed" it!

You need to give the user a way to launch their own routines, which have to be put in memory. And unless you want to enter them with an endless string of lines, each one with a "!" (poke) statement, you need two new instructions: READ and DATA...

barnacle · Post by **barnacle** » Thu Mar 27, 2025 8:04 am

Yeah, I know... and a third, to reset the read? Or maybe a line that looks like

```
1000 data 12, 24, 35, 99...
```

and read with a line number starts at the start of the line, and without a line number continues?

I dunno. Going to need to think about it... am I allowed to blame Ed?

Neil

drogon · Post by **drogon** » Thu Mar 27, 2025 9:05 am

barnacle wrote:

Yeah, I know... and a third, to reset the read? Or maybe a line that looks like

```
1000 data 12, 24, 35, 99...
```

and read with a line number starts at the start of the line, and without a line number continues?

I dunno. Going to need to think about it... am I allowed to blame Ed?

Neil

In the olden days... Well, let's start with Apple Integer BASIC - that doesn't have READ/DATA, so what happened was that people simply appended the binary to the end of the Basic tokenised program code, poked the extended length into the ZP locations that had the program end and SAVEd the program. Then at run time the code would relocate itself, or be copied or run in-situ - whatever. Sometimes it was inside a REM statement at the start of the program, so if you knew the length was under 200 characters, then REM ****** ... for 200 characters, then you copied the binary into the REM statement and SAVEd it.

That was fiddly, error-prone and needed the fix-up every time the program was changed. It also made LIST go bonkers.

Latterly, the READ/DATA/POKE was used in e.g. Applesoft. Very wasteful of program space, but ah well. (Apple also provided a binary load/run mechanism which helped in some places).

My own TB doesn't have DATA/READ but does have CALL, and for a little experiment I'm looking at, I'm encoding strings in ASCII (actually using a base 16 code, but using @ through O for 0-15 just to make the Basic decode easier), then a GOSUB to poke them into RAM. The "bonus" I have is that I can CHAIN programs, so one program is the poker to load up the machine code, then the 2nd is the one that can call it. Still fiddly, still needs something to generate those lines of

```
10A$="@@AABB" : GOSUB 2
```

lines... but that's easy on my desktop.
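A desktop-side sketch of that encoding in C, for anyone wanting to generate such lines. The function names are invented, and it assumes two characters per byte with the high nibble first, which the post does not actually specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode bytes as pairs of characters '@'..'O' (values 0..15).
 * out must have room for 2*len + 1 characters. */
void encode_at_hex(const uint8_t *in, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        *out++ = (char)('@' + (in[i] >> 4));     /* high nibble first (assumed) */
        *out++ = (char)('@' + (in[i] & 0x0F));
    }
    *out = '\0';
}

/* Decode character pairs back into bytes; stops at the first character
 * outside '@'..'O'. Returns the number of bytes written. */
size_t decode_at_hex(const char *text, uint8_t *out, size_t out_size)
{
    size_t n = 0;
    while (text[0] >= '@' && text[0] <= 'O' &&
           text[1] >= '@' && text[1] <= 'O' && n < out_size) {
        out[n++] = (uint8_t)(((text[0] - '@') << 4) | (text[1] - '@'));
        text += 2;
    }
    return n;
}
```

Under those assumptions the string "@@AABB" decodes to the three bytes $00, $11, $22, and each digit is recovered by subtracting the code for '@' from the character.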

-Gordon

barrym95838 · Post by **barrym95838** » Thu Mar 27, 2025 2:30 pm

drogon wrote:

Well, let's start with Apple Integer BASIC - that doesn't have READ/DATA, so what happened was that people simply appended the binary to the end of the Basic tokenised program code, poked the extended length into the ZP locations that had the program end and SAVEd the program. Then at run time the code would relocate itself, or be copied or run in-situ - whatever.

YES! I think that Bob Bishop pioneered that strategy with his wonderful and chimeric "APPLE-VISION" in 1978. Many nerds back in the day scratched their heads, did some investigation, then applauded.

"Dancing Demon" on the TRS-80 Model I was iconic, but came a year later and made no effort to hide the fact that it was 100% Z-80 machine language.

BigEd · Post by **BigEd** » Thu Mar 27, 2025 4:50 pm

Thanks for the USR! And the recipe for adding extensions.

(If you do add DATA and READ, then RESTORE is the means in BBC Basic for placing the pointer on a specific line, or to the first DATA line if there's no argument.)

barnacle · Post by **barnacle** » Thu Mar 27, 2025 5:16 pm

The usr was the sort of thing you wanted, then? I was unsure about the format...

I was thinking of data starting a line of comma delimited values; read xxx moving the read pointer to a given line and returning the first value, and read with no parameter just grabbing the next data, moving to successive lines as required. If it fits...

The bad news about factoring the 'where -= 4' out is that while it mostly makes things work, and certainly makes them smaller, it's broken 'while' so I'm scratching my head at the moment; possibly I broke something while I was modifying the code.

Neil

BB8 · Post by **BB8** » Thu Mar 27, 2025 5:39 pm

USR is usually used as a Right-Hand Side of an expression, like V = USR(dest).
But knowing how your code is structured, I see that it would not be easy to have it that way: you have no provision for functions yet.

As for the launch of internal routines, you could have a pointer in ZP that you fill with the address passed by the Basic instruction. That pointer is used by an (isolated) JMP (indirect), and you can reach that JMP (indirect) with a JSR.
The optional result could be inserted in the same pointer, as a 16bit return value.

BigEd · Post by **BigEd** » Thu Mar 27, 2025 5:55 pm

That's true, USR is usually a function. Not sure what it usually returns though. (In BBC Basic it returns several register values packed into an integer.) (BBC Basic also offers CALL, which is a command. It takes parameters in a more sophisticated way)

barrym95838 · Post by **barrym95838** » Thu Mar 27, 2025 6:29 pm

In the 8-bit MS BASICs I know, USR() is a function accepting a float and returning a float. Its vector needs to be POKEd in before use, and the default vector points to an ILLEGAL QUANTITY ERROR or similar. Michael J Mahon used it with some superior coding to replace the stock SQR() with a custom routine that is much faster and more accurate. Fast, small, accurate ... pick any two ...

barnacle · Post by **barnacle** » Thu Mar 27, 2025 6:33 pm

Well, I'm definitely not doing FP maths in this one

I'd seen USR as a function; it may come to that. But that's one more special case for factor() and that will already need one to make READ work... lemme get while working again, first.

Neil

barnacle · Post by **barnacle** » Thu Mar 27, 2025 6:36 pm

(It doesn't need to be a function as a usr function can fill any variable with two saves...)

drogon · Post by **drogon** » Thu Mar 27, 2025 7:03 pm

barnacle wrote:

Well, I'm definitely not doing FP maths in this one

I'd seen USR as a function; it may come to that. But that's one more special case for factor() and that will already need one to make READ work... lemme get while working again, first.

Neil

Anything that has a USR() and/or CALL (and & in Applesoft) will need to know some internal details to make the most of it. These are the things traditionally hard to find out in the early days, but were published in various magazines (dead tree technology) before The Internet took over (but also by then the world had moved on from Basics, in general)

So data that a machine code routine might need - i.e. how to access the argument(s) and how to return a result at the minimum - and maybe the addresses of internal math routines to save space if the machine code needed to do math for example. Stuff like that.... Then the documentation gets longer and longer and where do you draw the line...

-Gordon

barnacle · Post by **barnacle** » Thu Mar 27, 2025 9:13 pm

I believe Pterry Pratchett pointed out that any device whose documentation weighs more than it does should be avoided. Fortunately, electrons don't weigh a great deal...

The locations of the variables are well-defined, and there's a variable that stores the highest used memory. The lack of dynamically allocated variables helps a lot. The complete code is of course available as a listing. There's also a routine which returns the address of the start of a line, given the number of that line, so you could be creative... but sooner or later, if you're not careful, your 'simple' tiny basic turns into an IDE or a complete OS... not my aim. 4k and running from EEPROM is the limit (and there, ideally, room for the vectors so it can live at $f000).

Neil

barnacle · Post by **barnacle** » Thu Mar 27, 2025 10:01 pm

OK, I fixed 'while' (I think!); it helps no end if I follow my own rule about the sanctity of X.

So here's tonight's version, with USR (but no DATA or READ) and occupying a mere $edf (3807) bytes. It is most definitely not fully tested yet, particularly with respect to nested code, but a quick check hints it's working.

Neil
