More detailed design within the expression parser:
I realised that there are essentially two parsing contexts to consider, and that the sets of lexical tokens I should expect in each are completely disjoint. This will simplify error handling, since I don't have to figure out which context I'm in after the fact, and I think the stack management is also made more robust this way. One criticism of the shunting-yard algorithm is that it allows expressions to be parsed with the tokens in the "wrong" order, producing syntactically nonsensical results, but this measure neatly fixes that while also simplifying the code a little.
After pushing a Stop operator token, the parser starts in the "Operand" context, since a single operand is itself a valid expression, while a single operator is not. The Operand context allows a numeric literal, a variable reference, a unary operator (negate or NOT), or an open bracket. The latter two push operator tokens on the stack but then remain in Operand context; literals and variables are actual operands and cause the parser to shift to "Operator" context. If the token at the cursor is not recognised in Operand context, a syntax error is thrown. Stacked operators are not executed from Operand context, since unary operators always have higher precedence than binary operators and their operands are not yet available.
In "Operator" context, the parser looks for infix operators and close brackets, and causes the immediate execution of equal-or-higher precedence operators already on the stack. This, for example, causes unary operators to be executed as soon as their operands are available. If the token at the cursor is not recognised in Operator context, the end of the expression is inferred, all stacked operators are executed, and control is returned to the statement parser with the cursor suitably advanced. Mismatched brackets would be recognised by attempted execution of a stacked bracket token outside of a close-bracket context, or encountering the Stop token on the stack while processing a close-bracket.
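The two-context loop described above can be sketched as follows. This is a minimal illustrative model, not the actual implementation: the token names, precedence values, and operator set are assumptions, and it works on a pre-split token list rather than raw source text.

```python
# Illustrative precedences; Stop is lowest so it is never popped early,
# and the open bracket is below all binary operators so only a close
# bracket can remove it.
PREC = {"stop": 0, "(": 1, "=": 2, "+": 3, "-": 3, "*": 4, "neg": 5, "not": 5}

def apply_op(op, vals):
    """Execute one stacked operator against the value stack."""
    if op == "neg":
        vals.append(-vals.pop())
    elif op == "not":
        vals.append(~vals.pop())
    else:
        b, a = vals.pop(), vals.pop()
        vals.append({"+": a + b, "-": a - b, "*": a * b,
                     "=": -1 if a == b else 0}[op])

def parse(tokens):
    """Shunting-yard evaluation with disjoint Operand/Operator contexts."""
    ops, vals = ["stop"], []          # Stop token pushed first
    i, operand_ctx = 0, True          # a valid expression starts with an operand
    while True:
        tok = tokens[i] if i < len(tokens) else None
        if operand_ctx:
            if isinstance(tok, int):          # literal: an actual operand,
                vals.append(tok)              # so shift to Operator context
                operand_ctx = False
            elif tok in ("neg", "not", "("):  # push and stay in Operand context;
                ops.append(tok)               # nothing is executed from here
            else:
                raise SyntaxError("operand expected")
            i += 1
        else:
            if tok in ("+", "-", "*", "="):
                # execute stacked equal-or-higher precedence operators first,
                # which runs unary operators as soon as their operand exists
                while PREC[ops[-1]] >= PREC[tok]:
                    apply_op(ops.pop(), vals)
                ops.append(tok)
                operand_ctx = True
                i += 1
            elif tok == ")":
                while ops[-1] != "(":
                    if ops[-1] == "stop":     # Stop reached: no matching bracket
                        raise SyntaxError("mismatched bracket")
                    apply_op(ops.pop(), vals)
                ops.pop()                     # discard the open bracket
                i += 1
            else:
                # unrecognised token: end of expression inferred
                while ops[-1] != "stop":
                    if ops[-1] == "(":        # bracket executed outside ')'
                        raise SyntaxError("mismatched bracket")
                    apply_op(ops.pop(), vals)
                return vals.pop()
```

For example, `parse(["neg", 3, "+", 4])` evaluates -3 + 4: the negate is pushed in Operand context, then executed from Operator context as soon as `+` arrives, because its precedence is higher.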
Several operators will be implemented as combinations of other operators to reduce code size and the number of unique operator tokens. In particular:
- Negate and NOT can use the same code implementing a subtraction from zero, entered with the Carry flag set and cleared respectively: SBC computes A - M - (1 - C), so with Carry clear the result is 0 - M - 1, which is the one's complement. This is a 6502-specific optimisation that would not be obvious for code ported from some other 8-bit CPU.
- Subtract can be implemented by pushing Add and Negate operators in that order, so that the top operand is negated before being added to its partner.
- Unequal can be implemented by pushing NOT and Equal operators in that order, so that the Equal condition is tested and then the result inverted.
- Similarly, >= is NOT <, and <= is NOT >. This equivalence is only valid for integers, and would break for IEEE-754 floating point (where any comparison involving NaN is false), but is a valid optimisation for us.
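The equivalences above can be checked mechanically. This sketch simulates the 6502 SBC behaviour for 8-bit two's-complement values and spot-checks the operator combinations; the helper name is illustrative, not from the actual interpreter.

```python
def sbc_from_zero(x, carry):
    """Simulate 6502 SBC with A = 0: computes A - M - (1 - C), 8 bits."""
    return (0 - x - (1 - carry)) & 0xFF

for x in range(256):
    assert sbc_from_zero(x, carry=1) == (-x) & 0xFF   # Carry set: negate
    assert sbc_from_zero(x, carry=0) == (~x) & 0xFF   # Carry clear: NOT

# Subtract as Add + Negate, Unequal as NOT + Equal, and the
# integer-only comparison inversions:
for a in range(-5, 6):
    for b in range(-5, 6):
        assert a - b == a + (-b)
        assert (a != b) == (not (a == b))
        assert (a >= b) == (not (a < b))
        assert (a <= b) == (not (a > b))
```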
Obviously, these are all optimisations for code size, and many will result in lower execution speed. That's the kind of tradeoff needed to keep the code within 2KB with a reasonably complete feature set.
PEEK will likely be a unary operator, since it fits neatly into the above scheme. I may use the BBC BASIC indirection operators, ? and !, instead of the PEEK keyword, since they are easier to write recognition code for. However, implementing the indirection operators instead of POKE may be more complicated, since I have a very restricted concept of an lvalue. It might work if ? and ! are implemented as statement keywords, essentially duplicating the assignment statement.
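The statement-keyword idea for the indirection operators might look like this. Everything here is speculative: the dispatch structure and helper names are assumptions, and the expression parser is replaced by a toy stand-in that only accepts a decimal literal.

```python
import re

memory = bytearray(256)

def eval_expr(text):
    """Toy stand-in for the real expression parser: one decimal literal.
    Returns the value and the remaining, unconsumed text."""
    m = re.match(r"\s*(\d+)\s*", text)
    if not m:
        raise SyntaxError("operand expected")
    return int(m.group(1)), text[m.end():]

def statement(line):
    """Statement dispatch: '?addr = value' duplicates the shape of an
    assignment statement, with an address expression as the 'lvalue'."""
    if line.startswith("?"):
        addr, rest = eval_expr(line[1:])      # address expression after '?'
        if not rest.startswith("="):
            raise SyntaxError("= expected")
        value, _ = eval_expr(rest[1:])        # right-hand side expression
        memory[addr] = value & 0xFF
```

For example, `statement("?10 = 200")` stores 200 at address 10, exactly as an assignment statement would store into a variable.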