6502.org Forum  Projects  Code  Documents  Tools  Forum

Posted: Sun Jul 28, 2019 12:06 pm
Joined: Tue Mar 21, 2017 6:57 pm
Posts: 81
barrym95838 wrote:
Well, here's a suggestion that might work:
Give it a go and let us know if it crashes! If you decide that you really like it, it shouldn't be super difficult to make a "native" Gigatron version.

Thanks! I just tried the following (it's a bit simpler and it gives me more stack space):
Code:
bang     = $82      ; {!}  return line number
         [...etc...]
nulstk   = $007f    ; [Gigatron] v6502 stack in page 0
         [...snip...]
simple:
    asl             ; form simple variable address
    adc  #$40       ; [Gigatron] sp..'_' -> $80..$fe
    bne  oper8d     ; [Gigatron] (always taken)

With that it looks like we have a new interactive language on the Gigatron:
Attachment: Screenshot 2019-07-28 at 13.51.40.png [ 72.73 KiB ]

User program space is from $700 to $7ff, and the input buffer is 34 bytes. So not a whole lot yet. Both can be improved by rearranging screen memory. That wins a couple of KB, but that's something for after summer.
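
The address arithmetic in the modified simple routine above can be checked with a small model. This is only a Python sketch of the two instructions shown (asl, then adc #$40 with carry clear, since the input character is below $80); the function name gigatron_var_addr is made up for illustration and is not part of VTL02.

```python
def gigatron_var_addr(ch: str) -> int:
    """Model 'asl' followed by 'adc #$40' on an 8-bit accumulator."""
    a = ord(ch)
    if not 0x20 <= a <= 0x5F:
        raise ValueError("only space..'_' name simple variables in this sketch")
    a = (a << 1) & 0xFF    # asl: bit 7 of the input is 0, so carry ends up clear
    a = (a + 0x40) & 0xFF  # adc #$40 with carry clear
    return a


if __name__ == "__main__":
    for ch in (" ", "!", "@", "A", "_"):
        print(repr(ch), hex(gigatron_var_addr(ch)))
```

Every result is even and nonzero, which is consistent with the bne being marked "always taken", and "!" lands on $82, matching the bang equate.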


Posted: Sun Jul 28, 2019 4:32 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Congratulations! I raise my bottle of sparkling water to your success!

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Wed Jul 31, 2019 10:31 am
Joined: Tue Mar 21, 2017 6:57 pm
Posts: 81
Everybody likes a good shootout. Finding primes under 1000:
Code:
VTL02 on v6502     TinyBASIC on vCPU
--------------     -----------------
10 N=7             10N=7
20 M=4             20M=4
30 D=5             30D=5
40 E=2             40E=2
50 X=N/D           50IFN%D=0GOTO100
55 #=%=0*100
60 D=D+E           60D=D+E
70 E=6-E           70E=6-E
80 #=N>(D*D)*50    80IFD*D<=NGOTO50
90 ?=" ";          90?" ";N;
95 ?=N
100 N=N+M          100N=N+M
110 M=6-M          110M=6-M
120 #=N<999*30     120IFN<999GOTO30
#=1                RUN
--------------     -----------------
Elapsed:           Elapsed:
1m18.5s            1m16.0s

The TTL system is clocked at its standard 6.25 MHz and set to the fast VGA "mode 3" (only 25% of scanlines are drawn). vCPU is the 16-bit virtual CPU optimised for Gigatron. v6502 is the 8-bit virtual CPU interpreting 6502 opcodes.

VTL02C is indeed quite speedy! v6502 and TinyBASIC nicely balance out their inefficiencies. This bodes well for the next phase of running MS-BASIC on v6502...
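
For readers who do not speak VTL02 or TinyBASIC, here is a line-by-line Python transliteration of the shootout program. It is a sketch for following the control flow only; the function name shootout_primes is an illustrative choice, and it collects the numbers in a list instead of printing them. The comments refer to the line numbers in the listing above.

```python
def shootout_primes(limit: int = 999) -> list[int]:
    """Trial division over 6k+/-1 candidates, mirroring lines 10-120."""
    primes = []
    n, m = 7, 4                   # lines 10-20: first candidate and step
    while True:
        d, e = 5, 2               # lines 30-40: first divisor and step
        is_prime = True
        while True:
            if n % d == 0:        # lines 50-55: remainder zero, skip to 100
                is_prime = False
                break
            d += e                # line 60
            e = 6 - e             # line 70: divisor step alternates 2, 4
            if not d * d <= n:    # line 80: keep dividing while D*D <= N
                break
        if is_prime:
            primes.append(n)      # lines 90-95: print N
        n += m                    # line 100
        m = 6 - m                 # line 110: candidate step alternates 4, 2
        if not n < limit:         # line 120
            break
    return primes


if __name__ == "__main__":
    print(shootout_primes()[:10])
```

Note that 2, 3 and 5 are never tested, so the program lists the primes from 7 upward.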


Posted: Wed Jul 31, 2019 10:56 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Unexpectedly close!


Posted: Wed Jul 31, 2019 12:04 pm
Joined: Tue Mar 21, 2017 6:57 pm
Posts: 81
BigEd wrote:
Unexpectedly close!
Absolutely, and it gets better: I missed the conditional operators specific to VTL02: [ and ]. At first, just replacing * with [ in lines 55, 80 and 120 made no difference to the run time. But then also simplifying line 55 from
Code:
55 #=%=0[100
to
Code:
55 #=%]100
won a few seconds, for a new total of 1m14.9s, with VTL taking the lead.

Code:
VTL02 on v6502     TinyBASIC on vCPU
--------------     -----------------
10 N=7             10N=7
20 M=4             20M=4
30 D=5             30D=5
40 E=2             40E=2
50 X=N/D           50IFN%D=0GOTO100
55 #=%]100
60 D=D+E           60D=D+E
70 E=6-E           70E=6-E
80 #=N>(D*D)[50    80IFD*D<=NGOTO50
90 ?=" ";          90?" ";N;
95 ?=N
100 N=N+M          100N=N+M
110 M=6-M          110M=6-M
120 #=N<999[30     120IFN<999GOTO30
#=1                RUN
--------------     -----------------
Elapsed:           Elapsed:
1m14.9s            1m16.0s
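
The three spellings of line 55 can be compared with a small Python model. The semantics of [ and ] used here are inferred from the rewrites in this post and from the THEN/ELSE naming mentioned later in the thread, not taken from the VTL02C source, so treat them as an assumption: a[b yields b when a is nonzero, a]b yields b when a is zero, and both yield 0 otherwise. The helper names are made up.

```python
def vtl_eq(a: int, b: int) -> int:
    """VTL '=' as an operator: 1 if equal, else 0."""
    return 1 if a == b else 0


def vtl_mul(a: int, b: int) -> int:
    """16-bit unsigned multiply."""
    return (a * b) & 0xFFFF


def vtl_then(a: int, b: int) -> int:
    """Assumed meaning of '[': b if a is nonzero, else 0."""
    return b if a != 0 else 0


def vtl_else(a: int, b: int) -> int:
    """Assumed meaning of ']': b if a is zero, else 0."""
    return b if a == 0 else 0


if __name__ == "__main__":
    for remainder in range(7):
        original = vtl_mul(vtl_eq(remainder, 0), 100)   # 55 #=%=0*100
        swapped = vtl_then(vtl_eq(remainder, 0), 100)   # 55 #=%=0[100
        shortest = vtl_else(remainder, 100)             # 55 #=%]100
        print(remainder, original, swapped, shortest)
```

Under that assumption all three forms give the same branch target (100 when the remainder is zero, 0 otherwise), and the last one saves an operator evaluation per pass.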


Posted: Wed Jul 31, 2019 7:12 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Yeah, I spent six bytes for a little check inside the multiply loop to finish early for small left-factors like 0 and 1, so it's probably not much slower than [ and ] for your use case. Large left-factors suffer a tiny penalty (128 additional cycles worst case), but I felt that it was a reasonable trade-off.
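
To illustrate the kind of early exit described here, below is a generic shift-and-add multiply in Python that stops as soon as no bits remain in the left factor. It is a sketch of the general technique only; the actual VTL02C loop is not shown in this thread, cycle counts are not modelled, and the name mul16_early_exit is invented for the example.

```python
def mul16_early_exit(left: int, right: int) -> tuple[int, int]:
    """Return (product mod 65536, number of loop iterations)."""
    left &= 0xFFFF
    right &= 0xFFFF
    product = 0
    steps = 0
    while left:                                  # early exit: nothing left to add
        if left & 1:
            product = (product + right) & 0xFFFF
        left >>= 1
        right = (right << 1) & 0xFFFF
        steps += 1
    return product, steps


if __name__ == "__main__":
    for pair in ((0, 100), (1, 100), (300, 300), (0xFFFF, 1)):
        print(pair, mul16_early_exit(*pair))
```

A left factor of 0 or 1, which is what a comparison result feeds into the multiply idiom, finishes in zero or one iteration, while a full 16-bit left factor still takes all sixteen.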

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Sun May 23, 2021 9:23 pm
Joined: Sat Sep 29, 2012 10:15 pm
Posts: 904
Good news! I just got VTL02 (Mike's version C) working on Daryl's SBC-4 (10MHz W65C816).

I am somewhat surprised that the prime-number shootout code ran in just over 3 seconds.

Thank you, Mike!
Attachment: temp.png [ 597.36 KiB ]


Code is here: repo at Gitlab (no microsoft for me).

Porting notes:

It appears that the main stumbling block to porting is VTL02's zero page usage. As I understand it, it needs a solid block of 128 bytes at $80, and the routine called simple must be adjusted to match. Mike has instructions for relocating the zero-page block.

May I suggest making the code slightly more portable:
Code:
BASE = $0
at   = BASE+$00   ; {@}* internal pointer / mem byte
; VTL02C standard user variable space
;                     {A B C .. X Y Z [ \ ] ^ _}
; VTL02C system variable space
space    = BASE+$40      ;
bang     = BASE+$42      ; {!}  return line number
quote    = BASE+$44      ; {"}  user ml subroutine vector
...

One must still adjust simple, but it makes the job a little easier, and clearly communicates the need for all the variables to stay together as a block (the labeling led me to think otherwise until I wised up)...
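
The BASE-relative layout suggested above can be modelled in a few lines of Python. The offset formula is inferred from the four equates shown (at, space, bang, quote); the rest of the table is elided in the post, so extending the pattern to every variable is an assumption, and var_offset and var_addr are hypothetical names.

```python
def var_offset(ch: str) -> int:
    """Offset of a variable inside the 128-byte block: two bytes per name."""
    code = ord(ch)
    if not 0x20 <= code <= 0x5F:
        raise ValueError("variable names run from space to '_' in this sketch")
    return (code << 1) & 0x7F


def var_addr(ch: str, base: int = 0x00) -> int:
    """Absolute address of a variable for a given BASE."""
    return base + var_offset(ch)


if __name__ == "__main__":
    for base in (0x00, 0x80):
        print(hex(base), [hex(var_addr(c, base)) for c in '@A_ !"'])
```

With this pattern "@" through "_" fill the first 64 bytes and space through "?" fill the second 64, which is one way to see why the 128 bytes have to stay together: the interpreter computes the address from the character code instead of looking it up.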

The difference in assembler syntaxes is infuriating. Luckily there was only a handful of replacements for low and high address syntax.

Overall, porting is trivial. It took me a few hours to rediscover that the OS uses the top of zero page, while tracing the weird stack corruption that ensued. It took me longer than it should have to notice that I can't just move the vars in $F0-$FF down arbitrarily (more strange bugs). If I had been at the top of my game, I could've done this in 10 minutes.

_________________
In theory, there is no difference between theory and practice. In practice, there is. ...Jan van de Snepscheut


Posted: Tue May 25, 2021 4:33 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Congratulations! I have an optimized NMOS version C (somewhere) that has identical features but is a bit smaller and faster, but the death of my desktop PC and my huge backlog of distractions have prevented me from debugging it properly ... I know it was still buggy when I stalled on it, but it sure was a tight little 8-bit ball of twine. A 16-bit '802/'816 version would definitely be faster, but I don't see it being any smaller, due to all of the annoying REP# and SEP# housekeeping. I would likely do a 'c02 version if I found the time, but a native 16-bit version isn't likely, at least not from me ... the '816 just doesn't really flip my switch.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Tue May 25, 2021 4:55 am
Joined: Sat Sep 29, 2012 10:15 pm
Posts: 904
I may psych myself into doing an '816 port. I think the trick with '816 is to keep the indices in 8-bit mode, and only diddle the accumulator size here. The big win is the memory move takes almost no code, and 16-bit variable access and arithmetic, of course. I suspect it will be smaller as well as faster. I am actually running in '816 mode, with 8-bit registers, so experimenting should be fairly simple.

The '816 allows the zero page to be anywhere in the low 64K bank... This begs for VTL-OS, with multiple VTL-02 programs, each with its own zpage and data... These can be pre-emptively switched, or coercively multitasked via the interpreter loop.

_________________
In theory, there is no difference between theory and practice. In practice, there is. ...Jan van de Snepscheut


Posted: Wed May 26, 2021 6:17 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
enso wrote:
The big win is the memory move takes almost no code, and 16-bit variable access and arithmetic, of course.

The code at skp2: deals directly with deleting and inserting program lines, and has been 151 bytes of NMOS for a long time. I'll nominate you for employee of the month if you can figure out a way to make it more than a few bytes smaller on any 65xx.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Wed May 26, 2021 3:11 pm
Joined: Sat Sep 29, 2012 10:15 pm
Posts: 904
I re-coded most of the operators with a 16-bit accumulator, which shaved off about 100 bytes. It's true, the 4-byte hit for switching into and out of 16-bit mode adds up. I think I can temporarily use Y for character dispatch and stay in 16-bit mode, saving 4 bytes per operator.

I choked on the really clever < = > code, it is all tangled there, in the final case that handles all three. I think I can subtract the two values in advance of the final 3 operators, handle the = case, then rotate the sign bit to low. For < I can AND 1 (negative is true). For > I need to flip the result. I didn't quite understand how you did all 3 in something like 10 bytes and went to bed frustrated. If you have a minute, a quick lesson on efficiently converting positive and negative values to 0 and 1 would be helpful.
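
As a way of thinking about the question, here is a Python model of a 16-bit SEC/SBC that derives all three results from the carry and zero flags. It is not the VTL02C routine (which is not reproduced in this thread). It assumes unsigned 16-bit values, and it treats > as "greater than or equal", which is what the shootout listing earlier in the thread suggests. All names are illustrative.

```python
def sub16_flags(a: int, b: int) -> tuple[int, int]:
    """Model a 16-bit SEC / SBC computing a - b; return (carry, zero)."""
    total = (a & 0xFFFF) + (~b & 0xFFFF) + 1
    carry = (total >> 16) & 1                 # 1 means no borrow: a >= b unsigned
    zero = 1 if (total & 0xFFFF) == 0 else 0
    return carry, zero


def vtl_lt(a: int, b: int) -> int:
    return 1 - sub16_flags(a, b)[0]           # borrow means a < b


def vtl_eq(a: int, b: int) -> int:
    return sub16_flags(a, b)[1]


def vtl_ge(a: int, b: int) -> int:
    return sub16_flags(a, b)[0]               # assumed meaning of VTL '>'


def sign_bit_lt(a: int, b: int) -> int:
    """The 'rotate the sign bit to low' idea, for comparison."""
    return (((a - b) & 0xFFFF) >> 15) & 1


if __name__ == "__main__":
    for a, b in ((1, 2), (2, 1), (49, 49), (40000, 1)):
        print(a, b, vtl_lt(a, b), vtl_eq(a, b), vtl_ge(a, b), sign_bit_lt(a, b))
```

The last pair shows why the carry is the safer source for unsigned values: 40000 - 1 has bit 15 set, so the sign-bit approach reports "less than" even though 40000 is the larger number.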

I was also considering constructing a jump table for operator dispatch, for characters $20-$3F. That's 64 bytes to cover almost all cases, much faster and probably close to the size of the current dispatcher. Actually $20-$2F contain most operators, especially if < = > are handled separately as a last resort.
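
The jump-table idea can be sketched in Python as a 32-entry table indexed by character code minus $20 (32 entries of 2 bytes each would be the 64 bytes mentioned). The operator set below is only an illustrative subset, the handlers ignore details such as the remainder variable, and every name here is hypothetical.

```python
FIRST, LAST = 0x20, 0x3F


def unknown(a: int, b: int) -> int:
    raise ValueError("no handler for this operator")


def build_table(handlers: dict) -> list:
    """One slot per character in $20-$3F; unused slots fall back to 'unknown'."""
    table = [unknown] * (LAST - FIRST + 1)
    for ch, fn in handlers.items():
        table[ord(ch) - FIRST] = fn
    return table


def dispatch(table: list, op: str, a: int, b: int) -> int:
    """Index the table directly instead of comparing against each operator."""
    code = ord(op)
    if FIRST <= code <= LAST:
        return table[code - FIRST](a, b)
    return unknown(a, b)


TABLE = build_table({
    "+": lambda a, b: (a + b) & 0xFFFF,
    "-": lambda a, b: (a - b) & 0xFFFF,
    "*": lambda a, b: (a * b) & 0xFFFF,
    "/": lambda a, b: a // b,
})

if __name__ == "__main__":
    print(dispatch(TABLE, "+", 6, 4), dispatch(TABLE, "-", 3, 5))
```

Operators outside the table range would still need the fallback path, which matches the suggestion of treating the remaining cases as a last resort.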

As for moving memory, the MVP instruction should do most of it, and I am looking to recover 100+ bytes right there.

I have a feeling that the '816 version should fit into 768 bytes!

Do you have the original 6800 code perchance? All I could find was a bitmap of a printout that was barely comprehensible. Not that my 6800-foo is any good.

_________________
In theory, there is no difference between theory and practice. In practice, there is. ...Jan van de Snepscheut


Posted: Thu May 27, 2021 5:13 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
enso wrote:
As for moving memory, the MVP instruction should do most of it, and I am looking to recover 100+ bytes right there.

It looks as though you would use MVP for inserting a line of code, but MVN for deleting a line of code. There would be some performance gain as program size increases, especially if the line being inserted or deleted is near the beginning of program text.
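
The reason for the two directions is overlap: opening a gap moves text toward higher addresses and must copy from the end downward, while closing a gap moves text toward lower addresses and must copy from the start upward. The Python model below mimics that behaviour on a bytearray. It is a sketch, not '816 code; the function names simply borrow the mnemonics, and the real instructions take the byte count minus one in the accumulator plus bank operands, which are left out here.

```python
def mvn(mem: bytearray, src: int, dst: int, count: int) -> None:
    """Ascending copy (MVN-like): safe when dst < src, e.g. deleting a line."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def mvp(mem: bytearray, src_end: int, dst_end: int, count: int) -> None:
    """Descending copy (MVP-like): safe when dst > src, e.g. inserting a line.
    Both addresses point at the last byte of their block."""
    for i in range(count):
        mem[dst_end - i] = mem[src_end - i]


if __name__ == "__main__":
    text = bytearray(b"0123456789")
    mvp(text, src_end=7, dst_end=9, count=6)   # open a 2-byte gap at offset 2
    print(text)
    text = bytearray(b"0123456789")
    mvn(text, src=4, dst=2, count=6)           # close a 2-byte gap at offset 2
    print(text)
```

In both cases only the bytes above the edit point move, so an edit near the beginning of the program text moves the most data.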

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Thu May 27, 2021 6:42 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
enso wrote:
I choked on the really clever < = > code, it is all tangled there, in the final case that handles all three. I think I can subtract the two values in advance of the final 3 operators, handle the = case, then rotate the sign bit to low. For < I can AND 1 (negative is true). For > I need to flip the result. I didn't quite understand how you did all 3 in something like 10 bytes and went to bed frustrated. If you have a minute, a quick lesson on efficiently converting positive and negative values to 0 and 1 would be helpful.

Bruce helped me with that, so it's a bit above my pay grade to help you completely understand it. The corresponding part from version A is all mine, but it has been over nine years since I visited that, so ...

Quote:
Do you have the original 6800 code perchance? All I could find was a bitmap of a printout that was barely comprehensible. Not that my 6800-foo is any good.

I found this in my old stuff. My Altair to SWTPC port came before the 6502 version, so it doesn't have the bit-wise operators or THEN/ELSE/PEEK/POKE. I'm not 100% sure that this one is totally bug-free, because I was optimizing for size and juggling different revisions before I abandoned it (looks like I got it down to 709 bytes), but here ya go ...

Attachment: VTL2a.LST (2).txt [33.96 KiB]


Carl D. Warren's 6809 Cookbook (or at least the old paper-and-ink edition I have) contains a full listing of a rather crude 8/16-bit port, but I don't know how available or useful that would be to you.

Enjoy, and happy coding!

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Thu Oct 20, 2022 9:01 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
I've been puttering around with a 65c02 version ... let's call it VTLC02. I've taken advantage of new techniques I've learned in the last ten years and the "fancy new" instructions and addressing modes of the 'c02 to shave a couple dozen bytes off of VTL02C, but it's still buggy, and I have limited spare time as usual. I'll post it when I get it working in the Kowalski simulator, since I've been trying to move my 65xx stuff into a more generic environment. A laptop crash here caused a big setback, but I'm slowly swimming back upstream.

Question for the interested members: Should I see if I can squeeze another feature like multi-statement lines or a faster run-time line FINDer with the space that I saved switching to CMOS? I want backward software compatibility and < 1024 ROMable bytes for the entire IDE, but the rest is still undecided. Error handling beyond a "?" message is almost certainly off the table, though ...

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Thu Oct 20, 2022 11:16 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
A speedup from finding lines faster sounds good!

