All times are UTC




Posted: Sat Jan 14, 2023 7:41 pm
Joined: Fri Aug 03, 2018 8:52 am
Posts: 745
Location: Germany
While i was pondering what new project to work on for my 65816 SBC i randomly remembered that SWEET16 is a thing and wondered if anyone ported it to the 65816 yet...
around 2 seconds later my brain actually turned on and i realized that a 16-bit VM like SWEET16 makes very little sense on an already 16-bit CPU.
so then i thought, why not take the base idea of SWEET16 (a RISC-like VM) and make it 32-bit?
and because i'm not good with names i just started calling it "SWEET32" even though it's only inspired by SWEET16, and not actually compatible with it...

also, i actually lost ALL the work i did on this last year and only because Notepad++ keeps 2 backups of every file you ever opened was i able to recover it completely. so thanks Notepad++!

anyways, back to the VM.
SWEET32 (or SW32 for short) is a very RISC-like architecture:
  • 15x 32-bit Registers named R1-R15, with R0 being a constant 0
  • a 32-bit Program Counter (functionally only 24-bit)
  • an 8-bit Status Register called SR, holding the ALU flags "Zero", "Negative", and "Carry", in addition to 4 user flags F4-F7
  • full 24-bit Addressing (though Load/Store Instructions can also be limited to the current bank)

Most Instructions are 2 bytes large, except for the "Load Immediate" Instructions, which are either 4 or 6 bytes in size.
Speaking of Instructions, here's all of them, with the formatting explained:
Ra = Source Register A
Rb = Source Register B
Re = Destination Register
Rx = Source/Destination Register combo
Code:
<------------------------------------------------->
Branch on Clear             - BNv k
Branch on Set               - Bv k

Branches if the specified bit "v" in the SR is cleared (BNv) or set (Bv).
"k" is the address to branch to, encoded as an 8-bit signed offset multiplied by 2, giving it a -128 to +127 Word range.

<------------------------------------------------->
Jump and Link               - JAL Re, Ra

Jumps to the Address in the Source Register "Ra" and stores the address of the following instruction in the Destination Register "Re".
using R0 as the Destination Register makes JAL function like a regular Jump,
using R0 as the Source Register won't modify the PC, basically just loading the Address of the next Instruction into a Register

<------------------------------------------------->
Add                         - ADR Re, Ra, Rb    ; Flags: Z C N
Subtract                    - SBR Re, Ra, Rb    ; Flags: Z C N
Logic AND                   - ANR Re, Ra, Rb    ; Flags: Z - N
Logic OR                    - ORR Re, Ra, Rb    ; Flags: Z - N
Logic XOR                   - XOR Re, Ra, Rb    ; Flags: Z - N
Logic Shift Left            - SFL Re, Ra        ; Flags: Z C N
Logic Shift Right           - SFR Re, Ra        ; Flags: Z C 0 (N Flag is always cleared)
Rotate Left                 - RLR Re, Ra        ; Flags: Z C N
Rotate Right                - RRR Re, Ra        ; Flags: Z C N

Arithmetic/Logic Instructions all function pretty much the same: Re = Ra <operation> Rb
the Shifts and Rotates are almost the same as the 65xx versions, except that the Source and Destination Registers can be different.
but i'll go into some detail about the flags!
the Zero Flag (Z) is functionally identical to the 65xx Z Flag, if the result of an operation is 0, it's set, otherwise cleared
the Negative Flag (N) is also like the 65xx N Flag, it just copies the MSB of the result into itself
The Carry Flag (C) in the Shift and Rotate Instructions works the same as in the 65xx Instructions, but for Add and Subtract it's slightly different.
Specifically Add and Subtract don't use the Carry Flag as an input, only as output. Also Subtract sets the Carry when a Borrow occurs, which is the opposite of how SBC works on the 65xx

<------------------------------------------------->
Add Immediate               - ADI Rx, k         ; Flags: Z C N

Takes the 8-bit constant "k", sign extends it to 32-bits, and adds it to the combined Source/Destination Register.
Flags are updated exactly like the ADR Instruction would.

<------------------------------------------------->
Load Byte (Signed)          - LB Re, Ra
Load Byte (Unsigned)        - LBU Re, Ra
Load Word (Signed)          - LW Re, Ra
Load Word (Unsigned)        - LWU Re, Ra
Load Long                   - LL Re, Ra

The Source Register contains the Address, and the value read from that Address gets loaded into the Destination Register

<------------------------------------------------->
Store Byte                  - SB Ra, Rb
Store Word                  - SW Ra, Rb
Store Long                  - SL Ra, Rb

Similar to the Loads, Source register A contains the Address, and Source Register B the value to write to Memory

<------------------------------------------------->
Set Bit                     - SET Rx, k
Clear Bit                   - CLR Rx, k

These have 2 different ways they can work, if "Rx" is R0, the Instructions function like REP/SEP from the 65816,
taking the 8-bit constant "k" and using it as a mask to select which bits in the SR should be set/cleared.
If "Rx" is any register besides R0, the now 5-bit constant "k" selects which bit (0-31) to set/clear in the specified Register.
btw these Instructions are the only ones that can modify the User Flags F4, F5, F6, and F7.

<------------------------------------------------->
Load Word Imm. (Signed)     - LWI Re, k
Load Word Imm. (Unsigned)   - LWIU Re, k
Load Long Immediate         - LLI Re, k

Also simple, LWI takes a 16-bit immediate value, sign extends it to 32-bits, and loads it into the Destination Register
LWIU does the same, except it zero extends the 16-bit value instead. LLI just loads a full 32-bit immediate value into the Destination Register

<------------------------------------------------->
Return to 65816 Mode        - EXIT

This Instruction Exits the VM and resumes regular 65816 program execution after the sw32_execute function
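
To make the flag rules and the two SET/CLR modes from the listing above a bit more concrete, here is a small Python model. It is only a sketch and not code from the SWEET32 repo: all function names are made up for illustration, the SR bit positions are a guess, and the Carry for Add is assumed to be the plain carry out of bit 31 (the post only says Add doesn't take a carry input).

```python
MASK32 = 0xFFFFFFFF

# Assumed SR bit positions (Z = bit 0, C = bit 1, N = bit 2). The post only
# names the flags; this layout is a guess for illustration.
FLAG_Z, FLAG_C, FLAG_N = 0x01, 0x02, 0x04


def sign_extend(value, bits):
    """Sign-extend a `bits`-wide value, returned as an unsigned 32-bit number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK32


def zn_flags(result):
    """Z is set for a zero result, N copies bit 31 of the result."""
    flags = FLAG_Z if result == 0 else 0
    if result & 0x80000000:
        flags |= FLAG_N
    return flags


def adr(ra, rb):
    """ADR: Re = Ra + Rb, no carry input. C is assumed to be the carry out of bit 31."""
    total = (ra & MASK32) + (rb & MASK32)
    result = total & MASK32
    return result, zn_flags(result) | (FLAG_C if total > MASK32 else 0)


def sbr(ra, rb):
    """SBR: Re = Ra - Rb, no carry input. C is set when a borrow occurs."""
    ra &= MASK32
    rb &= MASK32
    result = (ra - rb) & MASK32
    return result, zn_flags(result) | (FLAG_C if ra < rb else 0)


def adi(rx, k):
    """ADI: add the sign-extended 8-bit constant k, flags as for ADR."""
    return adr(rx, sign_extend(k, 8))


def set_clr(regs, sr, rx, k, set_bit):
    """SET (set_bit=True) or CLR (set_bit=False). Returns the new SR value.

    regs is a 16-entry list; index 0 stands for R0 and is never written.
    """
    if rx == 0:
        mask = k & 0xFF  # R0 selected: k is an 8-bit mask over the SR
        return (sr | mask) if set_bit else (sr & ~mask & 0xFF)
    bit = 1 << (k & 31)  # any other register: k picks one bit (0-31)
    regs[rx] = (regs[rx] | bit) if set_bit else (regs[rx] & ~bit & MASK32)
    return sr
```

In this model `sbr(1, 2)` gives `0xFFFFFFFF` with C and N set, which is the "Carry means Borrow" behaviour described above, the reverse of what SBC does on the 65xx.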

next up, the functions that the VM requires to work:

there are 2 main functions, (both are called with JSL, and expect 8-bit A, and 16-bit X/Y):

"sw32_init" - clears all Registers, and sets the Control Byte to the value in A
Currently the Control byte's only used bit is bit 7, which determines the address width for Load/Store Instructions.
if bit 7 is cleared Load/Store Instructions are limited to 16-bit addressing, if set they are 24-bit instead.

"sw32_execute" - executes SW32 code, it starts at the address given by the X and Y Registers. X = Low Word, Y = High Word (High Byte is ignored)
The function only returns when an EXIT instruction is executed.
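
As a rough illustration of the address handling described for these two functions, here is a Python sketch. The function names are hypothetical, and passing the "current bank" in as a parameter is an assumption, since the post doesn't say which bank the VM uses when bit 7 of the Control Byte is cleared.

```python
def entry_address(x, y):
    """24-bit start address for sw32_execute: X = low word, Y = high word.

    The high byte of Y is ignored, so only its low byte supplies the bank.
    """
    return ((y & 0x00FF) << 16) | (x & 0xFFFF)


def effective_address(reg_value, control_byte, current_bank):
    """Address used by a Load/Store, depending on bit 7 of the Control Byte.

    current_bank is an assumption of this sketch (see the note above).
    """
    if control_byte & 0x80:
        # bit 7 set: full 24-bit addressing, top 8 bits of the register dropped
        return reg_value & 0xFFFFFF
    # bit 7 cleared: 16-bit addressing inside the current bank
    return ((current_bank & 0xFF) << 16) | (reg_value & 0xFFFF)
```

So with X = $8000 and Y = $1201 the model starts execution at $018000, and a register holding $12345678 addresses $345678 in 24-bit mode or $5678 in the current bank in 16-bit mode.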

"sw32_print" - prints out the contents of all Registers in a nice and tidy format:
Code:
R1:  $00000000
R2:  $00000000
R3:  $00000000
R4:  $00000000
R5:  $00000000
R6:  $00000000
R7:  $00000000
R8:  $00000000
R9:  $00000000
R10: $00000000
R11: $00000000
R12: $00000000  PC: $00000000
R13: $00000000
R14: $00000000      7654NCZ
R15: $00000000  SR: 0000000

in order for sw32_print to work it needs a user-provided function to print a single ASCII Character to whatever output the user might have.
said function has to be called "sw32_print_char", has to return with RTL, and has to assume A is 8 bits wide and X/Y are 16 bits wide.
in addition to sw32_print, 2 extra functions for printing hexadecimal values are also given (JSL/RTL, 8-bit A, 16-bit X/Y):
  • sw32_print_h8 - Prints the 8-bit value in A
  • sw32_print_h32 - Prints the 32-bit value that was pushed to the stack before calling the function (push the high word first)
There is no "sw32_print_h16" function, so if needed the user has to implement their own.


As a test i wrote a small bubble sort implementation:
Code:
sw32_sort:
   LLI R15, array_ptr            ; Get the Address of the Array to be sorted
   LWI R14, element_count-1      ; Get the amount of elements in the Array (minus 1)
   
   @outer_loop:
      CLR SR, F7               ; Clear the "swapped" Flag
      
      ADR R13, R15, R0         ; Save a Copy of the Array Address into R13
      ADR R12, R14, R0         ; Save a Copy of the Element count into R12
      @inner_loop:
         LL R1, R13            ; Load a Value from the Array
         ADI R13, 4            ; Increment the Pointer to the next element
         LL R2, R13            ; And Get a second Value from the Array+1
         SBR R0, R2, R1         ; Compare them (R2 - R1), a Borrow sets the Carry
         BNC @no_swap      ; If R1 > R2, swap them
            SL R13, R1         ; Store R1 to Array+1
            ADI R13, -4
            SL R13, R2         ; Store R2 to Array
            ADI R13, 4         ; Move the Array Pointer back to where it was before
            SET SR, F7         ; And Set the "swapped" Flag
         @no_swap:
         ADI R12, -1            ; Decrement the Element counter
      BNZ @inner_loop
   BF7 @outer_loop
EXIT

the entire function takes up 44 Bytes (plus the 1.3kB for the VM to run) and on my 20MHz SBC took just below 2 minutes to sort an array of 1000 random 32-bit values.
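
For anyone who wants to check the control flow without running the VM, here is a Python model of what the listing above does. It is a sketch, not part of the repo: the function name is made up, and the early return for fewer than two elements is an addition of the model (the SW32 code assumes at least two elements, otherwise the counter in R12 would wrap around).

```python
MASK32 = 0xFFFFFFFF

# Size check: LLI is 6 bytes, LWI is 4 bytes, the other 17 instructions are 2 bytes each.
CODE_SIZE = 6 + 4 + 17 * 2


def sw32_sort_model(values):
    """Sort 32-bit values the way the SW32 listing does. Returns (data, passes)."""
    data = [v & MASK32 for v in values]
    if len(data) < 2:
        return data, 0                       # guard added by this model only
    passes = 0
    while True:
        swapped = False                      # CLR SR, F7
        for i in range(len(data) - 1):       # R12 counts element_count-1 compares
            a, b = data[i], data[i + 1]      # LL R1, R13 / LL R2, R13
            if b < a:                        # SBR R0, R2, R1 borrows, so C is set
                data[i], data[i + 1] = b, a  # the two SL stores
                swapped = True               # SET SR, F7
        passes += 1
        if not swapped:                      # BF7 @outer_loop falls through
            return data, passes
```

Two things the model makes visible: the compare is unsigned (the branch looks at the Borrow), and every pass runs over the whole array, since the inner loop never gets shorter.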

I Uploaded everything to github, so experimenting with it should be pretty simple: https://github.com/ProxyPlayerHD/SWEET32-65816
Overall i really doubt this will ever be used for anything serious, but it would be interesting to see if a compiler made for the VM would result in more compact code compared to the native 65816 (though it would very likely run much much slower due to the emulation overhead)

anyways, this was quite fun to write and i'm pretty proud of it! so tell me your thoughts below.


Posted: Sat Jan 14, 2023 10:46 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
That's great - I do like extensions-in-kind.


Posted: Sun Jan 15, 2023 7:07 am
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1398
Location: Scotland
Looks very interesting, thanks.

It's something I did make a start on myself when I was looking into some 32-bit code for the '816 - I already had malloc/free routines in sweet16 for the 65C02 using my own re-write of sweet16 for the C02 but I abandoned it when I decided to push on with the bytecode VM that I'd need to run BCPL. I'll have a deeper look at this this week if I have time though (but I'm on a training course of sorts for some of this week which will take most of my time).

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Sun Jan 15, 2023 7:26 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8143
Location: Midwestern USA
Proxy wrote:
While i was pondering what new project to work on for my 65816 SBC i randomly remembered that SWEET16 is a thing and wondered if anyone ported it to the 65816 yet...so then i thought, why not take the base idea of SWEET16 (a RISC-like VM) and make it 32-bit...because i'm not good with names i just started calling it "SWEET32" even though it's only inspired by SWEET16...

When Woz concocted the SWEET16 interpreter, he not only was referring to the 16-bit pseudo-processor that SWEET16 implemented, he was tongue-in-cheek acknowledging “sweet 16” as mentioned in 1950s and 1960s rock ’n roll tunes, e.g., “sweet 16 and never been kissed.” Somehow, that doesn’t seem to translate well to SWEET32—“sweet 32 and never been...” :D

So when are you going to give this a try? It should be no more difficult than SWEET16, possibly easier, in fact, due to the 65C816’s more-numerous addressing modes and the ease with which the stack can be used as a scratch-pad. I do 64-bit arithmetic in some of my programs and it doesn't seem much more difficult than doing 32-bit math. Parsing the SWEET32 language should be somewhat easier as well.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Sun Jan 15, 2023 10:51 am
Joined: Fri Aug 03, 2018 8:52 am
Posts: 745
Location: Germany
BigDumbDinosaur wrote:
So when are you going to give this a try?

what exactly do you mean by that? i did already write all the code and debugged most of it, it's on the github linked at the very bottom of the post.
you can download the source code and/or the premade library, and write SWEET32 code right now using cc65's "ca65" assembler (for your POC for example) :D

