All times are UTC

PostPosted: Mon Oct 05, 2015 8:48 am 

Joined: Mon Mar 25, 2013 9:26 pm
Posts: 183
Location: Germany
Hey Mike,

thanks for the flowers :-)
Maybe I'll take the time to record the talk again in English at home, this time with a working copy-paste demo of the VTL language. I'm still wondering about the problem with pasting the code into the terminal. I did exactly the same thing an hour earlier for testing and it worked without any problem. The terminal can insert a short pause after each character to give the machine time to read it, and it is set to 5 ms, which should be quite long enough.

Mario.

_________________
How should I know what I think, until I hear what I've said.


PostPosted: Mon Oct 05, 2015 9:43 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Perhaps you just forgot to set your & system variable back to a NULL program before pasting, and the editor was taking too long to process the line, causing a few bits from the first character of the following line to get dropped. The editor can scan through existing lines and append rather quickly, but it slows down considerably when it has to insert a line anywhere in the middle, due to its naive but very compact insertion algorithm.

Are you using interrupts and buffers to send and receive terminal characters, or just brute-forcing them one at a time?

Mike B.


PostPosted: Wed Oct 07, 2015 4:56 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I'm pleased to report that, with Mike's assistance, I've got his latest VTL02B version running on a Beeb - or at least, on an in-browser emulation of a Beeb. Here it is with Klaus' earlier, shorter prime number program:

Attachment: jsbeeb-vtl-klaus-primes-1.png [ 119.69 KiB ]


You can run the interpreter in your browser using this link. (The keyboard mapping is idiosyncratic, to help people play games - use the menu thing to choose between two modes. I tend to mash about until I find the character I need.)

(Edit: this emulation doesn't support copy and paste at present, nor, I think, load and save, so there's an incentive to try only short VTL programs. Also, it doesn't react well to Escape or ^C; that's something I should be able to address.)


PostPosted: Thu Oct 08, 2015 2:54 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Congratulations on getting it working, and on the longest URL I've ever seen! Ctrl-C seems to work as expected over here, but ESC causes an instant crash.

Could it be that the Beeb's OS doesn't tolerate resetting the hardware stack at every OK prompt, a strategy that VTL02 uses to abort out of a few different sticky situations? I admit that the Apple 1 and Apple 2 are the only platforms on which I did extensive testing, and they didn't seem to mind at all, but they could be a lot simpler than the Beeb in that area.

Mike B.

[Edit: Oh, I see ... Ctrl-C only seems to work at the command prompt, not during a newline print check.]


PostPosted: Thu Oct 08, 2015 10:12 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I think I originally commented out the stack reset, but I see it's still there in the present version. I don't think it should cause trouble - running in user code we know we're not in the middle of an interrupt or an OS call. We can't RTS to Basic anyway because we overwrote its zero page allocation. (We can restart into Basic anyhow.)

I haven't touched the code which aims to do abort handling or checking for a current keypress - the OS provides a flag to say that ESC has been pressed, which is probably the most appropriate approach on a Beeb.

Would it be best to ensure that VTL only sees input characters within the range it expects? That is, printable characters. I suppose there's return, escape, erase and line kill to deal with too.

As a nice side-effect of not taking over the machine entirely, the cursor keys and Copy key act in a Beeb-native way, which makes it easier to edit an existing program - if it's visible on-screen.

I'll post some Beeb resources in a new thread - could be more generally useful. Here we are: viewtopic.php?f=3&t=3482


PostPosted: Sat Oct 17, 2015 1:13 pm

Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
I added Kowalski simulator specific I/O and syntax to VTL02B.
Attachment: vtl02b_for_Kowalski.zip [11.2 KiB]
It was really easy to make the necessary changes due to the clear structure and mostly compatible syntax of the source file. Thank you very much Mike!

Tomorrow I will put the original VTL02B and the Kowalski version on my GitHub account. A version for the Kingswood AS65 will follow.

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


PostPosted: Sat Oct 17, 2015 3:24 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Thank you very much for your efforts, Klaus. Is there any other feature we can add without breaking the 1K barrier? I'm starting to doubt it, but I've definitely been wrong before.

Mike B.


PostPosted: Sun Oct 18, 2015 7:44 am

Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
Hi Mike,
let me start by saying that with the added logical operators and with direct memory access there is nothing really required to be added to VTL functionality. Everything that I can think of doing with 16-bit unsigned integers can be done with VTL, although it may require some creative programming.

So almost everything I can think of adding to VTL has to do with improving its run-time performance, and it will probably fill 2K of code space. I also haven't looked at whether each change is possible or how large it would be. So here goes.

1. Change the operator decode from a chain to a tree style.
    VTL is currently decoding operators one at a time with the last possible operator being the default operator. One could order the operators by their ASCII value and start in the middle, then in the middle of both halves and so on. This also means, that there is not just one default operator but many depending on which side of the tree the operator is. Bringing it back to a single default operator would require additional compares and branches.

2. Add a statement separator to allow multiple statements in a line.
    This would allow for tighter loops, as the last statement (#=) in a line could go back to the same line number without having to comb through the program for a different one. VTL would also have to comb through fewer lines in overall tighter code to find the target of a #= statement.

3. Store constants in the program as 16-bit binaries.
    As you said before, a lot of run time is devoted to converting the constants in the program over and over again. The unused bit 7 of the preceding operator could be used to signal whether the following byte or bytes are a variable or a binary constant.

4. Add a go to cache or other means to improve #= performance.
    I am rather vague here as I have no idea how this could be achieved. The problem is, that VTL allows fuzzy line numbers even in variables (#=! for example) and the program continues with the next higher valid line number.

5. Add operator(s) to replace a multiply in a go to statement
    A simple compare against 0 or >= 1 should outperform the currently required multiply. I call them the "then" and "else" operators, and they would even work on a bit map, as the possible input is not limited to 0 or 1. A limitation, of course, would be that the condition has to be evaluated on the left side of the operator and the line number must be on the right side. In a "then" statement the result would be identical to a multiply by 1 or 0, while "else" would reverse the result (like an xor 1 prior to the compare).

6. Add shift operators.
    Now this is just a very low priority issue as there is a performance gain over shifting by multiplying or dividing, but I think shifts will rarely be needed. It may just supplement the logical operators.

7. Revisit some of the space saving changes.
    The add routine as an example:
    VTL02A
    Code:
    add
        clc             ;2
        lda  0,x        ;4 var[x] += var[x+2]
        adc  2,x        ;4
        sta  0,x        ;4

        lda  1,x        ;4
        adc  3,x        ;4
        sta  1,x        ;4
        rts             ;6
                        ;---
                        ;32 cycles, 14 bytes
                        ;===
    VTL02B
    Code:
    plus:
        clc             ;2 var[x] += var[x+2]
        dex             ;2
        jsr  plus2      ;6
        inx             ;2
    plus2:
        lda  1,x        ;4*2
        adc  3,x        ;4*2
        sta  1,x        ;4*2
        rts             ;6*2
                        ;---
                        ;48 cycles, 13 bytes
                        ;===
    So you actually saved 1 byte (7%) but increased the cycle count by 16 cycles (50%). Tuning for space takes a toll, sometimes a very big one. Of course this is not a representative sample, and the overall VTL code is still very fast. This is just a reminder that code density is not everything.
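
The byte and cycle totals in the two listings can be cross-checked mechanically. The following is a minimal Python sketch, not part of either VTL02 source. It assumes the usual 6502 timings for the instructions shown (zero-page,X addressing, no extra penalties), and the names COST and tally are made up for this illustration.

```python
# (cycles, bytes) for each instruction form that appears in the two listings.
COST = {
    "clc": (2, 1),
    "dex": (2, 1),
    "inx": (2, 1),
    "jsr": (6, 3),
    "rts": (6, 1),
    "lda zp,x": (4, 2),
    "adc zp,x": (4, 2),
    "sta zp,x": (4, 2),
}


def tally(listing, trace):
    """Return (cycles, bytes): bytes from the static listing, cycles from the executed trace."""
    size = sum(COST[op][1] for op in listing)
    cycles = sum(COST[op][0] for op in trace)
    return cycles, size


byte_add = ["lda zp,x", "adc zp,x", "sta zp,x"]

# VTL02A: straight-line code, so the listing and the executed trace are identical.
vtl02a = ["clc"] + byte_add + byte_add + ["rts"]
a_cycles, a_bytes = tally(vtl02a, vtl02a)

# VTL02B: the three-instruction body is stored once but executed twice.
vtl02b_listing = ["clc", "dex", "jsr", "inx"] + byte_add + ["rts"]
vtl02b_trace = ["clc", "dex", "jsr"] + byte_add + ["rts", "inx"] + byte_add + ["rts"]
b_cycles, b_bytes = tally(vtl02b_listing, vtl02b_trace)

bytes_saved_pct = round(100 * (a_bytes - b_bytes) / a_bytes)
cycles_added_pct = round(100 * (b_cycles - a_cycles) / a_cycles)
```

The point of separating the listing from the trace is that the subroutine trick shrinks the former while lengthening the latter.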

Have a nice and quiet Sunday!

cheers, Klaus

P.S.: I just ran my prime number program for 1000 primes on my emulator (~2MHz) and compared the performance. VTL02A = 72 seconds, VTL02B = 85 seconds

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


PostPosted: Sun Oct 18, 2015 4:07 pm

Joined: Mon Mar 25, 2013 9:26 pm
Posts: 183
Location: Germany
barrym95838 wrote:
I found your video here. I must say that you are a very handsome geek! It's a pity that I don't speak more than a dozen words of German, but I enjoyed it as much as I could, under the circumstances. It's also a pity that your copy/paste was dropping characters at the beginning of each line, spoiling your VTL02 presentation a bit.

Mike B.


I re-recorded my talk at home today with translated slides. It has become a one-hour video: https://www.youtube.com/watch?v=-lN57MKi0bo
The aspect ratio is broken for some reason, although the original video is OK. My microphone is not the best, so the volume could be a little higher.

Mario.

_________________
How should I know what I think, until I hear what I've said.


PostPosted: Sun Oct 18, 2015 9:06 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
@Klaus:

Quote:
1. Change the operator decode from a chain to a tree style.
VTL is currently decoding operators one at a time with the last possible operator being the default operator. One could order the operators by their ASCII value and start in the middle, then in the middle of both halves and so on. This also means, that there is not just one default operator but many depending on which side of the tree the operator is. Bringing it back to a single default operator would require additional compares and branches.

2. Add a statement separator to allow multiple statements in a line.
This would allow for tighter loops, as the last statement (#=) in a line could go back to the same line number without having to comb through the program for a different one. VTL would also have to comb through fewer lines in overall tighter code to find the target of a #= statement.

I like your ideas, and I can cautiously state that I could try implementing both of these within the 1KB limit. I know that it might be a silly notion, but I really like the <1KB "feature", and would like to keep it for narcissistic reasons. Regarding #= ... you may have noticed that a statement like 50 #=50 will not loop, because the interpreter only branches if # actually changes.
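
To illustrate the rule that # only causes a branch when its value actually changes, here is a small Python model of the line sequencing. It is a sketch under stated assumptions, not the VTL02 code: each line is reduced to the value it assigns to # (or None for no assignment), a target of 0 is assumed to mean "no branch" (as the multiply-by-zero idiom implies), and a target with no matching line continues at the next higher line number. The function name run and its max_steps guard are hypothetical.

```python
def run(program, max_steps=20):
    """Return the sequence of line numbers executed.

    program maps a line number to the value that line assigns to '#',
    or to None if the line does not assign to '#'.
    """
    lines = sorted(program)
    trace = []
    i = 0
    while i < len(lines) and len(trace) < max_steps:
        current = lines[i]
        trace.append(current)
        target = program[current]
        # Branch only if '#' really changed; zero is assumed to mean "fall through".
        if target is not None and target != 0 and target != current:
            # Continue at the first line whose number is >= target.
            i = next((k for k, n in enumerate(lines) if n >= target), len(lines))
        else:
            i += 1
    return trace
```

In this model a line like 50 #=50 simply falls through to the next line, while 20 #=10 loops until the step limit stops it.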

Quote:
3. Store constants in the program as 16-bit binaries.
As you said before, a lot of run time is devoted to converting the constants in the program over and over again. The unused bit 7 of the preceding operator could be used to signal whether the following byte or bytes are a variable or a binary constant.

This could provide a performance boost, but would require some extra effort during program (line) insertion, because a binary two-byte constant can be represented by one to five ASCII characters, requiring some adjustments to the remainder of the line, as well as slightly complicating the program listing feature. It is certainly doable, but its benefits could also be enjoyed by assigning the constants to variables at run-time initialization, as you did with your revised prime number generator program.
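
The length change that complicates line insertion can be seen in a small Python sketch of the bit-7 idea. This is only an illustration under assumptions: the functions tokenize and detokenize are hypothetical, constants are stored low byte first, only the statement text (without its line number) is handled, and a constant must be preceded by some character to carry the flag.

```python
import re


def tokenize(statement: str) -> bytes:
    """Replace each decimal constant by two binary bytes, flagged in bit 7 of the preceding character."""
    out = bytearray()
    pos = 0
    for m in re.finditer(r"\d+", statement):
        if m.start() == 0:
            raise ValueError("constant needs a preceding character to carry the flag")
        out += statement[pos:m.start()].encode("ascii")
        out[-1] |= 0x80                      # mark: a binary constant follows
        value = int(m.group()) & 0xFFFF      # 16-bit unsigned wrap-around
        out += bytes([value & 0xFF, value >> 8])
        pos = m.end()
    out += statement[pos:].encode("ascii")
    return bytes(out)


def detokenize(data: bytes) -> str:
    """Rebuild a listable text form from the tokenized bytes."""
    text = []
    i = 0
    while i < len(data):
        byte = data[i]
        text.append(chr(byte & 0x7F))
        i += 1
        if byte & 0x80:                      # the next two bytes are a constant
            text.append(str(data[i] | (data[i + 1] << 8)))
            i += 2
    return "".join(text)
```

With this encoding "#=A>B*1000" shrinks from 10 to 8 bytes while "A=5" grows from 3 to 4, which is the adjustment to the rest of the line mentioned above. Leading zeros in a constant would not survive the round trip through the listing.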

Quote:
4. Add a go to cache or other means to improve #= performance.
I am rather vague here as I have no idea how this could be achieved. The problem is, that VTL allows fuzzy line numbers even in variables (#=! for example) and the program continues with the next higher valid line number.

This would be awesome, and I feel (without actually knowing) that backwards branches are where the interpreter spends most of its time, especially in larger programs. I don't think that I have the talent to pull it off, though.
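
One possible shape for such a cache, sketched in Python rather than 6502 code and not taken from any VTL02 version: remember the result of each search keyed by the requested target value, so that fuzzy targets are handled the same way as exact ones, and throw the cache away whenever the program is edited. The class Program, its scans counter and the dictionary cache are all hypothetical, and a real implementation would have to pay for the cache out of very limited memory.

```python
class Program:
    def __init__(self, lines):
        self.lines = sorted(lines)   # list of (line_number, text) pairs
        self.cache = {}              # target value -> index of the line to continue at
        self.scans = 0               # lines examined by the linear search

    def find(self, target):
        """Index of the first line whose number is >= target (len(lines) if none)."""
        if target in self.cache:
            return self.cache[target]
        index = len(self.lines)
        for i, (number, _) in enumerate(self.lines):
            self.scans += 1
            if number >= target:
                index = i
                break
        self.cache[target] = index
        return index

    def insert(self, number, text):
        """Add or replace a line; any edit invalidates every cached target."""
        kept = [line for line in self.lines if line[0] != number]
        self.lines = sorted(kept + [(number, text)])
        self.cache.clear()
```

A repeated backward branch to the same target then costs one lookup instead of a scan from the start of the program.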

Quote:
5. Add operator(s) to replace a multiply in a go to statement
A simple compare against 0 or >= 1 should outperform the currently required multiply. I call them the "then" and "else" operators, and they would even work on a bit map, as the possible input is not limited to 0 or 1. A limitation, of course, would be that the condition has to be evaluated on the left side of the operator and the line number must be on the right side. In a "then" statement the result would be identical to a multiply by 1 or 0, while "else" would reverse the result (like an xor 1 prior to the compare).

Another good point, although I would be concerned about breaking existing programs by doing something like making "true" as 65535 and using OR [edit: I mean AND] instead of *. I do think that I can play with my multiply and make it much faster if one of the multiplicands is a one or a zero, and that should help performance without sacrificing backwards-compatibility.
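
As an illustration of why a multiply can be cheap when one operand is a one or a zero, here is a generic shift-and-add multiply in Python that stops as soon as no multiplier bits remain. It is a sketch of the general technique only, not the mul: routine from VTL02; the function name mul16 and the returned iteration count are assumptions made for the example.

```python
def mul16(a, b):
    """Shift-and-add product modulo 2**16; returns (product, loop iterations)."""
    product = 0
    iterations = 0
    while b:                                  # early exit: stop when the multiplier is used up
        if b & 1:
            product = (product + a) & 0xFFFF
        a = (a << 1) & 0xFFFF
        b >>= 1
        iterations += 1
    return product, iterations
```

The saving depends on which operand drives the loop: mul16(1000, 1) needs one pass, but mul16(1, 1000) needs ten. In a branch such as #=A>B*1000 the 0 or 1 is the left operand, so a routine built this way would want to loop over that side.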

Quote:
6. Add shift operators.
Now this is just a very low priority issue as there is a performance gain over shifting by multiplying or dividing, but I think shifts will rarely be needed. It may just supplement the logical operators.

I'll have to think about that.

Quote:
7. Revisit some of the space saving changes.
The add routine as an example:
[ ... snip ... ]
Tuning for space takes a toll, sometimes a very big toll. Of course this is not a representative sample. The overall VTL code is still very fast. This is just a reminder, that code density is not everything.

You're correct, of course. I often suffer from a one-track mind.

Quote:
P.S.: I just ran my prime number program for 1000 primes on my emulator (~2MHz) and compared the performance. VTL02A = 72 seconds, VTL02B = 85 seconds

I don't have an easy way to prove it, but I think that the new space-skipping feature is the significant factor in the measured slow-down, not the other changes. It would take a bit of effort to prove it.

Thank you for your help.

@Mario:

Liked! 8)

Mike B.

P.S. In light of Klaus' performance report, I'm going to in-line all instances of space-skipping, add six bytes to mul: to improve performance for small multipliers, add one byte back to plus:, and take a long hard look at find: ... my gut tells me that speed improvements in these four areas will provide the highest overall performance benefits. I'll report back when I complete the experiment.


Last edited by barrym95838 on Tue Oct 20, 2015 3:55 am, edited 1 time in total.

PostPosted: Mon Oct 19, 2015 11:03 am

Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
Quote:
Quote:
P.S.: I just ran my prime number program for 1000 primes on my emulator (~2MHz) and compared the performance. VTL02A = 72 seconds, VTL02B = 85 seconds

I don't have an easy way to prove it, but I think that the new space-skipping feature is the significant factor in the measured slow-down, not the other changes. It would take a bit of effort to prove it.
I tested with every jsr getbyte replaced by lda (at),y and every jsr skpbyte replaced by iny lda (at),y, which completely disables the space-skipping feature. It took 76 seconds to run the same program as above, so most of the performance loss comes from this feature.
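
The three timings quoted in this thread give a rough split of the slowdown. A short Python calculation, using only those figures (the variable names are made up):

```python
vtl02a = 72           # seconds, VTL02A
vtl02b = 85           # seconds, VTL02B
vtl02b_no_skip = 76   # seconds, VTL02B with space skipping disabled

total_slowdown = vtl02b - vtl02a             # seconds lost from A to B
due_to_skipping = vtl02b - vtl02b_no_skip    # seconds recovered by disabling the feature
other_changes = vtl02b_no_skip - vtl02a      # seconds still unaccounted for
skip_share_pct = round(100 * due_to_skipping / total_slowdown)
```

That is 9 of the 13 lost seconds, or roughly 69%, with the remaining 4 seconds coming from the other changes.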

Quote:
Regarding #= ... you may have noticed that a statement like 50 #=50 will not loop, because the interpreter only branches if # actually changes.
(scratching head) No, I haven't noticed but yes, it is obvious and is a good reason to not have multiple statement lines.

Quote:
Quote:
5. Add operator(s) to replace a multiply in a go to statement
A simple compare against 0 or >= 1 should outperform the currently required multiply. I call them the "then" and "else" operators, and they would even work on a bit map, as the possible input is not limited to 0 or 1. A limitation, of course, would be that the condition has to be evaluated on the left side of the operator and the line number must be on the right side. In a "then" statement the result would be identical to a multiply by 1 or 0, while "else" would reverse the result (like an xor 1 prior to the compare).

Another good point, although I would be concerned about breaking existing programs by doing something like making "true" as 65535 and using OR instead of *. I do think that I can play with my multiply and make it much faster if one of the multiplicands is a one or a zero, and that should help performance without sacrificing backwards-compatibility.
No change to existing behavior would be required. A demonstration is in order. Let's say [ is then and ] is else:
Code:
just performance
now  100 #=A>B*1000
then 100 #=A>B[1000

test for zero
now  200 #=A=0*1000
then 200 #=A]1000

test for not equal
now  300 #=A=99=0*1000
then 300 #=A=99]1000

test bit is set
now  400 #=A&128>1*1000
then 400 #=A&128[1000

test bit is clear
now  500 #=A&64=0*1000
then 500 #=A&64]1000
All existing code would still work as there is no change to true = 1. However, then and else accept true >=1. On the other hand, making existing code run faster by optimizing multiply by one or zero is of course also a good move.

Quote:
add one byte back to plus:
Don't forget minus:.

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


PostPosted: Mon Oct 19, 2015 3:34 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Would
#=#-1
Do the trick?

Would it be easier or better to speed up loops not by speeding up go to, but by removing it: using a for..next or repeat..until (or maybe a while..done) - the interpreter keeps a stack of loop-tops instead of looking for line numbers?


PostPosted: Mon Oct 19, 2015 4:05 pm

Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
BigEd wrote:
Would
#=#-1
Do the trick?
Yes, but at the same time it causes VTL to start searching for the line from the beginning, so no performance gain.

BigEd wrote:
Would it be easier or better to speed up loops not by speeding up go to, but by removing it: using a for..next or repeat..until (or maybe a while..done) - the interpreter keeps a stack of loop-tops instead of looking for line numbers?
Maybe a variable could represent the stack. Another variable might represent the physical address rather than the line number of the currently executing VTL line. However, a go to cannot be avoided completely, as a gosub is still needed and there is no proper conditional structure like if then else.

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


PostPosted: Tue Oct 20, 2015 3:46 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Klaus2m5 wrote:
No change to existing behavior would be required. A demonstration is in order. Let's say [ is then and ] is else:
Code:
just performance
now  100 #=A>B*1000
then 100 #=A>B[1000

test for zero
now  200 #=A=0*1000
then 200 #=A]1000

test for not equal
now  300 #=A=99=0*1000
then 300 #=A=99]1000

test bit is set
now  400 #=A&128>1*1000
then 400 #=A&128[1000

test bit is clear
now  500 #=A&64=0*1000
then 500 #=A&64]1000
All existing code would still work as there is no change to true = 1. However, then and else accept true >=1.

Ah, got it!

A[B gives 0 if A=0, and B otherwise.
A]B gives B if A=0, and 0 otherwise.
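
These two definitions are easy to check against the multiply forms with a few lines of Python. This is a sketch with made-up helper names (vtl_then, vtl_else, ge, eq); it assumes that VTL's > yields 1 for "greater than or equal" and 0 otherwise, that = yields 1 or 0, and that expressions are evaluated strictly left to right. The zero test is written with the else form here, since A[1000 would branch on a non-zero A.

```python
def vtl_then(a, b):
    """A[B : 0 if A is zero, B otherwise."""
    return b if a != 0 else 0


def vtl_else(a, b):
    """A]B : B if A is zero, 0 otherwise."""
    return b if a == 0 else 0


def ge(a, b):
    """Assumed meaning of VTL's '>' operator: 1 if a >= b, else 0."""
    return 1 if a >= b else 0


def eq(a, b):
    """VTL's '=' used as a comparison: 1 if equal, else 0."""
    return 1 if a == b else 0


def check(a, b):
    """Compare each multiply form with its then/else replacement for one pair of values."""
    return [
        ge(a, b) * 1000 == vtl_then(ge(a, b), 1000),          # just performance
        eq(a, 0) * 1000 == vtl_else(a, 1000),                  # test for zero
        eq(eq(a, 99), 0) * 1000 == vtl_else(eq(a, 99), 1000),  # test for not equal
        ge(a & 128, 1) * 1000 == vtl_then(a & 128, 1000),      # test bit is set
        eq(a & 64, 0) * 1000 == vtl_else(a & 64, 1000),        # test bit is clear
    ]
```

Because then and else only test for zero, any non-zero left operand counts as true, which is why the bit tests no longer need the >1 or =0 step.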

Very nice, and certainly worth a serious attempt!

Quote:
Don't forget minus:.

mul: and plus: are used heavily by the interpreter for internal operations ... minus: and the others, not. I don't think that making minus: and the others 33% faster would have any measurable effect on performance, so I'm going to leave them in their shortened forms until it becomes obvious that I can afford to restore them to full speed without breaking the 1KB limit. find: needs some help, and I get the feeling that it's going to cost me some bytes.

Thanks again,

Mike B.


PostPosted: Wed Oct 21, 2015 8:46 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Well, I added THEN and ELSE ( [ and ] ), rearranged some stuff to shave a few cycles here and there, and I threw some effort into speeding up * and find with a few additional instructions. I also used Klaus' technique to avoid double new-lines during program entry. It weighs in at 1021 bytes in Apple ][ trim, but I haven't had a chance to test it yet. I was unable to in-line the space-skipping feature occurrences, because it was gobbling up too many bytes, and probably not saving much in the way of cycles anyway.

I'll post my updated source after I get it debugged to my satisfaction, which for me means a successful short-range sensor scan in my Super-StarTrek port (still incomplete).

Mike B.

