6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sun Nov 10, 2024 10:50 am

All times are UTC




Post new topic Reply to topic  [ 22 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: BASIC JIT Compiler?
PostPosted: Thu May 26, 2016 7:12 pm 
Offline
User avatar

Joined: Mon May 04, 2015 10:55 am
Posts: 26
Location: UK
Does a BASIC JIT compiler exist for the 6502? If not, I might create one.

It would be a nice way to write code that is both performant, and easy to write.


Last edited by DavidBuchanan on Sat May 28, 2016 8:02 pm, edited 1 time in total.

Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Thu May 26, 2016 9:33 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10976
Location: England
I think that would be a new thing - good luck, and do keep us posted!
(Some early BASICs did not tokenise, so that had to repeat that work each time they visited each line at runtime. But most BASICs do tokenise at least the keywords. Some will also tokenise line numbers, so a GOTO does not need to convert decimal to binary each time. I'm not aware of one which tokenises constants or variable addresses, but those would be possible tactics too, somewhat short of JIT but helping execution speed.)


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Fri May 27, 2016 4:54 am 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
BigEd wrote:
I'm not aware of one which tokenises constants or variable addresses, but those would be possible tactics too, somewhat short of JIT but helping execution speed.

Business BASICs, such as BBx and Thoroughbred, not only tokenize verbs, they generate a symbol table that points to the address of every declared variable, as well as variables generated at runtime. Even embedded strings (e.g., PRINT "This is a test.") are organized in this fashion. Also in the symbol table are the addresses of subroutines and branch targets. Hence the interpreter avoids the considerable overhead of having to scan the program text to find, for example, a GOSUB target at runtime.

The generation of the symbol table occurs as each program statement is entered and "compiled." Furthermore, the symbol table is stored with the program when it is SAVEd, which means the table doesn't have to be regenerated each time the program is run. A side-effect of this methodology is that programs can be chained and variables passed from one program to the next, especially since variables are stored separately from the program itself.

It's all very different than Microsoft BASIC, or other microcomputer BASIC implementations, and results in excellent performance, even on relatively pokey hardware.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Fri May 27, 2016 8:32 pm 
Offline

Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
Historically, 6502s never had a lot of bandwidth available to do actual JIT.

Also, 6502 suffers from being a poor target for high level languages, especially in terms of memory space. With the constrained memory space, that's a bit of a double whammy.

That said, there was BASIC XL on the Atari, back in the day. It had the BEST BASIC statement ever: "FAST".

"FAST" was essentially a post processor. You added FAST as the first line of your program, and it would scan the program and make some calculations that it needs for runtime, but it only does it once rather than during processing.

Specifically, one of the things that it did was that in the original Atari BASIC, when you had something like:

Code:
10 FOR I = 1 to 10
20 PRINT "HELLO"
30 NEXT I


When the interpreter hits the FOR I, it pushes the line number of the statement, then when it hits NEXT I, it grab that line number, 10 in this case, then it searches the program for line 10. It doesn't "know" where line 10 is, so it goes and looks for it. It does this every. single. time. It does this by going to the start of the program, and just traversing the linked list that makes up lines of code, until it finds the appropriate line number.

You can see for a large program, if you have a simple FOR loop in the last lines of the program, that the NEXT has to scan the entirety of the code for each iteration. So, for performance, you would put common code at the top of the program.

The FAST statement calculated those destinations. It would sweep through the tokens and "link" the destinations for all of the FOR/NEXT, GOSUB, and GOTO statements. It was a noticeable improvement in performance.

Avalon Hill had a dungeon game, Telengard, for the Atari. We managed to break in to it and added the FAST statement. The game was borderline unplayable after that. We had to put in strategic delay loops (like FOR I = 1 TO 5000 : NEXT I) to make it work for us.

The FAST statement took a noticeable amount of time to run, there was a distinct pause at the beginning of the program.

I think a better approach is to do something like FAST, a static analysis phase that may be able to make some assertions and other computations, but not necessarily creating binary code. Just a phase that tries to eliminate lookups and indirection. Try and resolve as many of those as possible.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Fri May 27, 2016 9:07 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10976
Location: England
There's a book - link below - explaining the construction of Atari Basic with a full listing. It's quite a gentle introduction to the principles of what they did. It does a more thorough job of tokenising lines than I'm used to, but I don't fully understand why the FOR structure (which is stacked) includes the line number, rather than the address of the tokenised line in memory. Surely at runtime the program is fixed, and when you hit a FOR you know exactly what its address is. So there should be no need to search for the line...
https://archive.org/details/ataribooks- ... ource-book


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Sat May 28, 2016 9:16 am 
Offline

Joined: Tue Jul 24, 2012 2:27 am
Posts: 679
What benefit would a JIT give you? Variable types are explicit in the source code (A = float, A% = int, A$ = string), so there's not much to introspect & optimize that you couldn't get in a AOT compiler.

_________________
WFDis Interactive 6502 Disassembler
AcheronVM: A Reconfigurable 16-bit Virtual CPU for the 6502 Microprocessor


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Sat May 28, 2016 6:56 pm 
Offline
User avatar

Joined: Mon May 04, 2015 10:55 am
Posts: 26
Location: UK
White Flame wrote:
What benefit would a JIT give you? Variable types are explicit in the source code (A = float, A% = int, A$ = string), so there's not much to introspect & optimize that you couldn't get in a AOT compiler.


I used JIT in a rather loose sense. My plan is simply to compile into memory when the program is started, and then jump into the compiled binary - so the compilation is actually AOT.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Mon Jun 06, 2016 9:13 pm 
Offline

Joined: Wed Nov 18, 2015 8:36 am
Posts: 102
Location: UK
Hi. I haven't been able to look on the forum for a while, but this is a fascinating thread for me as I have been working on this very problem for the past few months. Check my site out on hackaday (https://hackaday.io/project/5789-6502-homebrew-computer), but this is a brief summary of my set up:

I have designed and built a BASIC like interpreted language which makes use of extensive tokenisation to help with the performance of the runtime. The language itself is inspired by Atari and BBC BASIC, with a key 'modern' feature that line numbers are not used to transfer control, only as unique IDs in to keep program lines in memory ordered. The particular optimisations I have done are:
- Integer only 16 bit math (actually, this is probably the biggest single factor in performance for my interpreter)
- All variables are tokenised on entry (similar to Atari BASIC)
- Procedures are tokenised too. On first call of a procedure, the address is cached so that subsequent calls are direct jumps
- For, repeat and while loops push the statement address on to the stack not the line number. Therefore when it is time to loop, it is a direct jump back.

I tried out some benchmarks (documented on my hackaday site) of PCW tests from the early 1980's and my interpreter is always the fastest by some margin. I can't be too boastful on this achievement - my 65C02 is clocked at 2.68Mhz plus as mentioned there is no floating point support. However some of the benchmarks are not testing maths speed but things like looping and calling subroutines and the above strategies I used have also made my interpreter a fair bit quicker (even accounting for the higher clock speed of my hardware).

I've also dumped a fairly recent version of the entire source code (although I hope to do a refresh in the next week or so). The language, plus kernel fits in to 16KB with about 1KB spare (which is actually quite plenty to add a few more features!).


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Mon Jun 06, 2016 9:17 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10976
Location: England
Has anyone considered some kind of threaded rewrite of expressions? I'm thinking the original expression source could be followed by a short series of subroutine calls or similar, which have the effect of evaluating the expression.

Or, more importantly surely, anyone done any profiling of which parts of Basic interpretation takes most time?


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Mon Jun 06, 2016 9:39 pm 
Offline

Joined: Wed Nov 18, 2015 8:36 am
Posts: 102
Location: UK
My experience in the old days is with the Oric-1 (licensed version of Microsoft BASIC) and Atari BASIC. As was mentioned by someone earlier, one of the biggest performance hits is transfer of control e.g. GOTO, GOSUB, FOR..NEXT. These versions of BASIC searched for the target line number from the beginning and thus for best performance one needed to put often used subroutines near the beginning of the program to minimise the search time.

Another optimisation I did was to tokenise all numeric constants. IIRC, Atari BASIC maintained the ASCII (ATASCII to be pedantic) entered at edit time and then did a conversion from ASCII to numeric in real time during execution. To speed up Atari BASIC, the tip was to use variables set to a constant rather than the constant e.g. Z1=1,Z2=2 etc. Using this technique, the statement A=Z1+Z2 was much quicker than A=1+2 at runtime.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Wed Jun 08, 2016 6:55 pm 
Offline

Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
The reason that transfer of control internally relies on line numbers, etc. is because those are effectively static. Actual addresses etc are not. As the program changes, all of those computed references are obsoleted and must be recomputed. Also, don't forget, many BASICs allowed the program to halted and then CONTinued from where they left off, even if you changed the code. So it's a very real scenario that a running program is stopped, changed, and then continued.

The Atari BASIC XL FAST command demonstrated how expensive the pre-calculation of these values was with a noticeable delay at the start of programs.

Atari BASIC retained the ATASCII version of lines that it found had errors. If the lines were syntactically correct, they were tokenized on entry. This was in order to facilitate the loading of text programs (vs loading binary images of tokenized programs) that perhaps had errors in them, allowing the program to be mass loaded and then edited and corrected rather than failing the entire process.

Referencing a variable has the advantage of taking less space in the code, since the variable reference is simply 1 byte (there can only be 128 unique variables in an Atari BASIC program). I don't know if using variables is actually faster, however. The evaluator used a stack, and the arguments to the operators are pushed on that stack. So any numeric constants would be copied on to that stack (thus copying 6 bytes). For a variable, a reference to the variable is pushed, not its value. A reference is smaller, but you also have the indirection to look up the value. So, one may be faster than the other, or it's just a wash.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Wed Jun 08, 2016 6:59 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10976
Location: England
Yes, for sure, when code is changed there will be a need to recompute some of the precompiled information. (Maybe not all of it.) I think it's almost a truism that you get a tradeoff between immediacy and runtime performance. In fact there have been times when interpreted languages are favoured precisely because you can get early results sooner after making a change, even if a long run will take longer than a compiled version.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Wed Jun 08, 2016 7:54 pm 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8538
Location: Southern California
whartung wrote:
The reason that transfer of control internally relies on line numbers, etc. is because those are effectively static. Actual addresses etc are not. As the program changes, all of those computed references are obsoleted and must be recomputed. Also, don't forget, many BASICs allowed the program to halted and then CONTinued from where they left off, even if you changed the code. So it's a very real scenario that a running program is stopped, changed, and then continued.

The Atari BASIC XL FAST command demonstrated how expensive the pre-calculation of these values was with a noticeable delay at the start of programs.

I have not looked into the innards of any BAISCs, but the following might be relevant, about the two programmable calculator families I've used whose user-language programs are interpreted. My first programmable anything was a TI-58c calculator which I used heavily for a year or two and then traded up for a TI-59 and printer. These basically held only one program in memory at a time, starting at address 0. You could use labels for jumping, but that was slower because every time they're referenced, the calc had to search for the labels starting from address 0. To improve the execution speed, you could write your code with jumps to absolute program addresses. This made program development much more tedious though.

A few years later I stepped up to the HP-41 calculator. It allowed having gobs of programs in main memory at the same time (plus more in program files in extended memory), so referencing absolute addresses would have been out of the question. If you resize or delete a program, the ones after it in memory would get scooted up or down in memory, changing all the addresses. Addresses may also change if you edit the program you're working on. The solution was to go with labels only. As I understand it however, each jump to a label (except indirects) also keeps an address field which is hidden from the user. When the program is run and there's a jump to a label, the calculator looks where the address field points, and looks to make sure that yes, that's the label. If so, time is saved. But if it's not, it goes searching. When it finds it, it records the address back at the jump instruction, to save time in the future. I think this happens for both local and global labels. The difference in speed is definitely noticeable.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Wed Jun 08, 2016 7:59 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10976
Location: England
It's a good point that some compilations could be done at run time - indeed it starts to look a bit like JIT. But if the compiled information is to be held inline next to the relevant source, you'd need to allocate the space for it ahead of time. Perhaps just set aside a couple of bytes for a pointer to the compiled information.


Top
 Profile  
Reply with quote  
 Post subject: Re: BASIC JIT Compiler?
PostPosted: Wed Jun 08, 2016 9:54 pm 
Offline

Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
GARTHWILSON wrote:
A few years later I stepped up to the HP-41 calculator. … When the program is run and there's a jump to a label, the calculator looks where the address field points, and looks to make sure that yes, that's the label. If so, time is saved. But if it's not, it goes searching. When it finds it, it records the address back at the jump instruction, to save time in the future.


That's a clever memoization technique.

BigEd wrote:
It's a good point that some compilations could be done at run time - indeed it starts to look a bit like JIT. But if the compiled information is to be held inline next to the relevant source, you'd need to allocate the space for it ahead of time. Perhaps just set aside a couple of bytes for a pointer to the compiled information.


Specifically in Atari BASIC, the arguments to the GOTO/GOSUB routines were numeric literals. All literals in Atari BASIC are 6 byte floats.

BASIC XL was backward compatible with Atari BASIC, so it could load binary BASIC save files. But, as I recall, internally, after FAST was run, it would change the 6 byte float used for GOTO/GOSUB in to something else. Since you were limited to a max BASIC line number of 32767, it would shove the referenced line number in to 2 of the bytes, then it could use the other 4 for status information (such as the new direct address, etc.). So, BASIC XL already had the space available to add information for its internal shenanigans, while still remain backward compatible.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 22 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: John West and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: