PostPosted: Mon Mar 04, 2019 2:23 am 
Hi!

whartung wrote:
I'm going to get all pedantic here. These are all well and good, but just to note -- none of this code is licensed (beyond sole copyright of the owners). And folks, in theory, "can't use any of it" if they were looking for code to use.


To make it clear, any code I posted of less than 50 lines is public domain (CC0).


PostPosted: Wed Mar 20, 2019 9:39 pm 
Okay, on the Base64 problem:

A central part of the problem is converting a group of 3 8-bit bytes into 4 6-bit words. So here's a hint as to how to do that efficiently - shift left 6 by shifting right 2.
Code:
; Input is at in1/2/3
; Output to out1/2/3/4
   LDA in2   ; contains 2 4-bit sections of words
   STA out2
   AND #$0F
   STA out3
   LDA in1   ; contains one 6-bit word and one 2-bit section
   LSR A
   ROR out2
   LSR A
   ROR out2
   STA out1   ; first word done
   LSR out2
   LSR out2   ; second word done
   LDA in3   ; contains one 2-bit section and one 6-bit word
   ASL A
   ROL out3
   ASL A
   ROL out3   ; third word done
   LSR A
   LSR A
   STA out4   ; fourth word done
If you have an exact multiple of 3 bytes to encode, then just wrap this in a loop (maybe change in1/2/3 so they can index over a larger input buffer without a separate copy step) and add lookups of out1/2/3/4 in the alphabet string afterwards.

If you might not have an exact multiple of 3 bytes to encode, then you need to be more careful. First, pad the input to a multiple of 3 bytes with appended zeroes, and run the above loop on the padded input. Then replace as many characters at the end of the Base64 output as you appended pad bytes (one or two) with the '=' character, so the output string stays a multiple of 4 characters. The '=' stands in for the zero padding and also informs the decoder about the true length of the input.
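As a cross-check for the shift sequence above, here is a minimal Python reference model (not part of the original post). It splits each 3-byte group into four 6-bit words with the same bit layout, pads the input with zero bytes, and then swaps the trailing characters for '='. The function names `split3` and `encode` are made up for this sketch.

```python
import base64

ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def split3(in1, in2, in3):
    """Split three 8-bit bytes into four 6-bit words (out1..out4)."""
    out1 = in1 >> 2                          # top 6 bits of in1
    out2 = ((in1 & 0x03) << 4) | (in2 >> 4)  # low 2 of in1, high 4 of in2
    out3 = ((in2 & 0x0F) << 2) | (in3 >> 6)  # low 4 of in2, high 2 of in3
    out4 = in3 & 0x3F                        # low 6 bits of in3
    return out1, out2, out3, out4


def encode(data):
    """Pad to a multiple of 3, encode, then replace the tail with '='."""
    pad = (3 - len(data) % 3) % 3
    padded = bytes(data) + bytes(pad)        # append 0, 1 or 2 zero bytes
    chars = []
    for i in range(0, len(padded), 3):
        chars.extend(ALPHABET[w] for w in split3(*padded[i:i + 3]))
    if pad:
        chars[-pad:] = "=" * pad             # one '=' per pad byte
    return "".join(chars)


if __name__ == "__main__":
    # Compare against the standard library for a few short inputs.
    for sample in (b"", b"M", b"Ma", b"Man"):
        assert encode(sample) == base64.b64encode(sample).decode("ascii")
        print(sample, "->", encode(sample))
```

Running the assembly on the same inputs and comparing out1..out4 against `split3` is a quick way to catch a misplaced shift.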


PostPosted: Fri Mar 22, 2019 12:17 am 
Here's my stab at base64.

Appreciate I haven't written this much assembly in, what, 30+ years. Last time I wrote anything of note was when I converted a Maze generation for the Atari from BASIC to assembly.

Originally I had a clever idea of doing the bytes backward, pushing each encoded char on to the stack, then pulling them off at the end.

But when it came to reusing them for the stragglers, it was extra messy. So I just brute forced it.

The code seems to actually work.

Code:

; Copyright 2019 Will Hartung
; Apache License 2.0

        *=0

; BUF and OUTBUF are required to be in Zero Page, the others can be anywhere

BUF     .WORD   0
BUFLEN  .WORD   0
OUTBUF  .WORD   0
OUTLEN  .WORD   0
WORK    .BYTE   0

        *=$0200

; BASE64 encoder.
; Process as many 3 byte chunks as possible, leave the remaining 1 or 2 as special case.
;
; Entry:
;   BUF    = address of buffer to encode
;   BUFLEN = length of buffer
;   OUTBUF = address of output buffer
;   OUTLEN = 0 (the routine only adds to it, so clear it before calling)
;
; Returns:
;   OUTLEN = number of bytes in output buffer
;   BUF, BUFLEN, OUTBUF, and all registers are not preserved.


BASE64  LDA     BUFLEN+1        ; Check if we have at least 3 characters to process.
        BNE     PROC3           ; Bunches, off to happy path of 3 at a time.
        LDA     BUFLEN
        CMP     #3
        BCS     PROC3           ; 3 or more characters, handle the happy path
        LDA     BUFLEN          ; BUFLEN = 0, we're done
        BEQ     DONE
       
        ; handle the remaining 1 or 2 bytes
        LDY     #3
        LDA     #'='            ; Pad end of outbuf with two =
        STA     (OUTBUF),Y
        DEY
        STA     (OUTBUF),Y

        LDY     #0              ; Grab first byte
        LDA     (BUF),Y
        JSR     PROC1ST         ; Process that
        LDY     #1
        DEC     BUFLEN          ; Check if we had 1 or 2 bytes
        BNE     L1              ; 2 bytes, skip ahead
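        LDA     #0              ; else prime WORK with a 0 byte so the spare bits encode as zero
        STA     WORK
        BEQ     L2
L1      DEC     BUFLEN          ; remove last of BUFLEN
        LDA     (BUF),Y         ; fill WORK with 2nd byte
        STA     WORK
        AND     #$0F            ; 2 byte case: low 4 bits of the 2nd byte,
        ASL                     ; shifted left twice, give the third character
        ASL
        TAX
        LDA     CHRTBL,X
        LDY     #2
        STA     (OUTBUF),Y      ; replaces the first '=' so only one pad remains
L2      LDY     #0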
        LDA     (BUF),Y
        JSR     PROC2ND

        ;; done, finish with the math

        CLC                     ; Update counter
        LDA     OUTLEN
        ADC     #4
        STA     OUTLEN
        LDA     OUTLEN+1
        ADC     #0
        STA     OUTLEN+1

DONE    RTS                     ; All done

; Encode 3 bytes

PROC3   LDY     #0              ; Grab first byte
        LDA     (BUF),Y
        JSR     PROC1ST         ; process it
        LDY     #1              ; Grab second byte
        LDA     (BUF),Y
        STA     WORK
        DEY
        LDA     (BUF),Y         ; paired with first
        JSR     PROC2ND         ; process it
        LDY     #2              ; Grab third byte
        LDA     (BUF),Y
        STA     WORK
        DEY
        LDA     (BUF),Y         ; paired with second
        JSR     PROC3RD         ; process it

        ; All done with the 3 bytes, update pointers and counters

        CLC                     ; Output gets 4 bytes longer
        LDA     OUTLEN
        ADC     #4
        STA     OUTLEN
        LDA     OUTLEN+1
        ADC     #0
        STA     OUTLEN+1

        CLC                     ; Output ptr moves 4 bytes
        LDA     OUTBUF         
        ADC     #4
        STA     OUTBUF
        LDA     OUTBUF+1
        ADC     #0
        STA     OUTBUF+1

        CLC                     ; Buffer pointer moves 3 bytes farther
        LDA     BUF
        ADC     #3
        STA     BUF
        LDA     BUF+1
        ADC     #0
        STA     BUF+1

        SEC                     ; Buf len gets 3 bytes shorter
        LDA     BUFLEN
        SBC     #3
        STA     BUFLEN
        LDA     BUFLEN+1
        SBC     #0
        STA     BUFLEN+1
        JMP     BASE64

        ; PROC1ST is easy, shift out 2 LSB, and indexed in to the table.

PROC1ST LSR
        LSR
        TAX
        LDA     CHRTBL,X
        LDY     #0
        STA     (OUTBUF),Y
        RTS

        ; PROC2ND assumes byte 1 is in A, and byte 2 is in WORK
        ; Strips the top 6 bits, and then ROLs 4 bits of WORK in to it
        ; then indexes in to the character table

PROC2ND AND     #$03
        ROL     WORK
        ROL
        ROL     WORK
        ROL
        ROL     WORK
        ROL
        ROL     WORK
        ROL
        TAX
        LDA     CHRTBL,X
        LDY     #1
        STA     (OUTBUF),Y
        RTS

        ; PROC3RD assumes byte 2 is in A, and byte 3 is in WORK
        ; Similar to PROC2ND, strips the top 4 bits from A, and ROLs in
        ; 2 bits from WORK
        ;
        ; This could theoretically be a little more optimized
        ; because it is only used in the 3 byte scenario,
        ; whereas the other 2 are used in the 3 byte and remaining
        ; byte scenarios, so they need to be a little more reusable.
        ; This could be removed as a JSR and inlined for starters.

PROC3RD AND     #$0F
        ROL     WORK
        ROL
        ROL     WORK
        ROL
        TAX
        LDA     CHRTBL,X
        LDY     #2
        STA     (OUTBUF),Y
        LDA     WORK
        ; WORK has the correct 6 bits, just in the wrong place
        ; so we shift them back.
        LSR
        LSR
        TAX
        LDA     CHRTBL,X
        LDY     #3
        STA     (OUTBUF),Y
        RTS
       
CHRTBL  .BYTE 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

        .END
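
The ROL pairs in PROC2ND and PROC3RD move bits from WORK into A through the carry flag. A small Python model of that rotate chain (a sketch for checking the bit movement, not a translation of the whole routine) shows that the incoming carry never reaches the table index. The function names mirror the labels above; `rol` and the `carry` parameter are additions for this sketch.

```python
def rol(value, carry):
    """6502 ROL on one byte: returns (new_value, new_carry)."""
    result = ((value << 1) | carry) & 0xFF
    return result, (value >> 7) & 1


def proc2nd(byte1, work, carry=0):
    """Model of PROC2ND: returns (table index, WORK afterwards)."""
    a = byte1 & 0x03                      # keep the low 2 bits of byte 1
    for _ in range(4):
        work, carry = rol(work, carry)    # ROL WORK: top bit goes to carry
        a, carry = rol(a, carry)          # ROL A: carry lands in bit 0
    return a, work


def proc3rd(byte2, work, carry=0):
    """Model of PROC3RD: returns (third index, fourth index)."""
    a = byte2 & 0x0F                      # keep the low 4 bits of byte 2
    for _ in range(2):
        work, carry = rol(work, carry)
        a, carry = rol(a, carry)
    return a, work >> 2                   # two LSRs move the 6 bits back down


if __name__ == "__main__":
    # "Man" is $4D $61 $6E; the indexes should be 19, 22, 5, 46 ("TWFu").
    print(0x4D >> 2, proc2nd(0x4D, 0x61)[0], *proc3rd(0x61, 0x6E))
```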


PostPosted: Mon Apr 01, 2019 5:08 pm 
This is most of an answer to Ed's question of command reading and decoding...

It's an interesting subject, one that can quickly lead down many rabbit holes if you're not careful.... However here is my solution - it's working on my Ruby system right now.

First, a demo of it actually running:

Code:
Ruby 6502 - Boot 1.2. GO

Ruby OS 64K

* help
Builtin commands:
  d
  fill
  fx
  go
  help
* d e000 e0ff
E000: D8 78 A2 FF 9A A9 FF 8D  |  x       |
E008: 30 FE A9 81 8D 10 FE 20  | 0        |
E010: 84 E3 20 46 E7 20 AA FF  |    F     |
E018: 0D 0A 52 75 62 79 20 4F  |   Ruby O |
E020: 53 20 36 34 4B 0D 0A 0A  | S 64K    |
E028: 00 20 AA FF 2A 20 00 20  |     *    |
E030: 98 FF C0 01 F0 F3 A0 FF  |          |
E038: C8 B1 F8 C9 20 F0 F9 C9  |          |
E040: 2A F0 F5 C9 0D F0 E2 C9  | *        |
E048: 23 F0 DE 20 77 E2 20 AD  | #   w    |
E050: E2 C9 0D F0 D4 A2 00 64  |        d |
E058: F2 5A B1 F8 F0 09 DD B1  |  Z       |
E060: E0 D0 25 E8 C8 80 F3 BD  |   %      |
E068: B1 E0 D0 1C 7A A5 F2 0A  |     z    |
E070: A8 B9 C4 E0 8D 85 E0 C8  |          |
E078: B9 C4 E0 8D 86 E0 20 84  |          |
E080: E0 4C 29 E0 4C DE E1 E8  |  L) L    |
E088: BD B1 E0 D0 FA E8 BD B1  |          |
E090: E0 F0 06 7A 5A E6 F2 80  |    zZ    |
E098: C1 20 AA FF 55 6E 6B 6E  |     Unkn |
E0A0: 6F 77 6E 20 63 6F 6D 6D  | own comm |
E0A8: 61 6E 64 0D 0A 00 4C 29  | and   L) |
E0B0: E0 64 00 66 69 6C 6C 00  |  d fill  |
E0B8: 66 78 00 67 6F 00 68 65  | fx go he |
E0C0: 6C 70 00 00 DE E1 8C E1  | lp       |
E0C8: E2 E0 6D E1 37 E1 20 AA  |   m 7    |
E0D0: FF 46 61 69 6C 6C 65 64  |  Failled |
E0D8: 20 61 67 61 69 6E 0D 0A  |  again   |
E0E0: 00 60 20 AD E2 C9 0D F0  |  `       |
E0E8: 35 20 B1 EA 8D 1B E1 9C  | 5        |
E0F0: 1C E1 9C 1D E1 20 AD E2  |          |
E0F8: C9 0D F0 13 20 B1 EA 8D  |          |
* fill 0
Fill RAM with $00 ...

Ruby OS 64K

* go 1000
-> $1000

[BRK 1001:00]
* fx 0

[BRK EED1:00] Ruby 6502 2.0
* help
Builtin commands:
  d
  fill
  fx
  go
  help


The prompt is "star space" because that's what I prefer. Leading spaces and stars are ignored and '#' is a comment line. I'd also go as far as to suggest that Ruby has an operating system rather than a "monitor" as such. That's my aim, anyway.

The help command just lists the commands, but (soon) it will also scan for a BBC Micro style ROM image and dump that, if present.

'd' is a simple memory dump, 'fill' fills the whole of RAM apart from what the OS runs in. 'go' - jumps to an address. So above, I dumped some RAM, (part of the OS which lives in $E000-$FDFF), filled RAM with zeros, jumped to $1000 which promptly sat on a BRK instruction, then ran another command familiar to BBC Micro users.

I'll list the code in bits rather than a big blob - mostly because it's split over several files, but also so I can put in additional comments...

First the main-loop of the command interpreter:

Code:
;*********************************************************************************
;* Main Ruby command interpreter
;*********************************************************************************

        jsr     strout
        .byte   13,10,"Ruby OS 64K",13,10,10,0

rubyOsCmd:
        jsr     strout
        .asciiz "* "
        jsr     getline
        cpy     #1              ; Just one character (CR) ?
        beq     rubyOsCmd

; Skip leading spaces and/or stars

        ldy     #$FF
:       iny
        lda     (b0),y
        cmp     #KEY_SPACE
        beq     :-
        cmp     #'*'
        beq     :-

; Newline?

        cmp     #KEY_CR
        beq     rubyOsCmd

; # for comment?

        cmp     #'#'
        beq     rubyOsCmd


We need to dive a bit deeper here - the strout and getline functions. I'll leave strout, but getline does this:
Code:
; getline:
;       Handy short-cut to osWord 0 with fixed parameters for command
;       line entry in Ruby OS.
;********************************************************************************

.proc   _getline
        ldx     #<_getlineData
        ldy     #>_getlineData
        lda     #0
        jmp     osWord

_getlineData:
        .word   $0300           ; Address of input buffer
        .byte   250             ; Max length
        .byte   32              ; Smallest value to accept
        .byte   126             ; largest...
.endproc


That's going to be alien to some - it calls an operating system routine called osWord - and osWord 0 is "read in a line of text with simple editing". I could go deeper, but I'll leave it at that. Suffice to say that the input line will be placed at $0300, max. 250 characters long, and it will accept characters between space and ~. Simple editing is DEL/BS and Ctrl-U to kill the line. There is a zero-page pointer (b0) which is set to the start address which subsequent code relies on. It returns with the length in Y.
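
For readers who don't know the BBC Micro conventions, the five-byte parameter block can be pictured with a short Python sketch. It only models the layout shown in `_getlineData` (a little-endian word followed by three bytes) and the accepted character range described above; `getline_block` and `accepts` are hypothetical helper names, not part of Ruby OS.

```python
import struct


def getline_block(buffer_addr=0x0300, max_len=250, lo=32, hi=126):
    """Pack the block: 16-bit little-endian address, then three bytes."""
    return struct.pack("<HBBB", buffer_addr, max_len, lo, hi)


def accepts(block, ch):
    """True if character code ch lies inside the block's accepted range."""
    _, _, lo, hi = struct.unpack("<HBBB", block)
    return lo <= ch <= hi


if __name__ == "__main__":
    block = getline_block()
    print(block.hex(" "))                 # bytes in memory order
    print(accepts(block, ord("~")), accepts(block, 0x7F))
```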

Next is the code to start to parse the command-line:

Code:
; We appear to have something, so ...

        jsr     setupArgs

; Get first arg. which is command name

        jsr     getArg
        cmp     #KEY_CR                 ; Shouldn't get this, but...
        beq     rubyOsCmd


The function 'setupArgs' does some initialisation with the command-line (pointed to by (b0),y) and getArg isolates the next argument/token on the command line. It leaves (b0),y pointing to a zero terminated string, or returns CR in A if there is no more data on the input line. The very first one should always return something because we check for that in the code above, but it's generic and callable from elsewhere.
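
The source of setupArgs and getArg isn't shown, so the following Python sketch is only one way the described behaviour could work: skip spaces, zero-terminate the token in place, and report "nothing left" once the CR has been reached. The class name `ArgScanner`, the `end` field and the use of `None` in place of "CR in A" are all assumptions made for illustration.

```python
KEY_CR = 13
KEY_SPACE = 32


class ArgScanner:
    def __init__(self, line: bytes):
        # Stand-in for setupArgs: remember where the CR-terminated line ends.
        self.buf = bytearray(line)
        self.end = self.buf.index(KEY_CR)
        self.y = 0

    def get_arg(self):
        """Return the next token as bytes, or None when the line is used up
        (the point at which the real getArg would return CR in A)."""
        while self.y < self.end and self.buf[self.y] == KEY_SPACE:
            self.y += 1                   # skip separating spaces
        if self.y >= self.end:
            return None
        start = self.y
        while self.y < self.end and self.buf[self.y] != KEY_SPACE:
            self.y += 1
        self.buf[self.y] = 0              # zero-terminate in place
        token = bytes(self.buf[start:self.y])
        self.y += 1                       # step past the terminator
        return token


if __name__ == "__main__":
    scanner = ArgScanner(b"fx 200  3\r")
    while (arg := scanner.get_arg()) is not None:
        print(arg)
```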

The next few lines search the command table for what we've typed and jump to the code to handle it:

Code:
; find command

        ldx     #0                      ; Index into command table
        stz     cmdI
        phy                             ; Y is index in to argument list
findCommand:
        lda     (b0),y                  ; Get character of argument
        beq     found                   ; ... Zero is end
        cmp     commandTable,x
        bne     nextCommand             ; No match - go to next command
        inx
        iny
        bra     findCommand

; found - we've gotten to the end of the typed command, make sure the
;       command-list is also at its end too...

found:
        lda     commandTable,x          ; Means argument is < length of keyword, but match.
        bne     nextCommand             ; e.g. we type fi which matches fill...

        ply                             ; Remove/dump search index
        lda     cmdI                    ; Get command index
        asl     a                       ; Double to index into jump table
        tay
        lda     commandList,y
        sta     wooly+1
        iny
        lda     commandList,y
        sta     wooly+2
        jsr     wooly                   ; JSR to the command and hope for an RTS
        jmp     rubyOsCmd

; The wooly jumper

wooly:  jmp     $FFFF                   ; Modified


Nearly there, just the code to move to the next command in the command table:

Code:
; nextCommand:
;       scan through the command table to find the start of the next one

nextCommand0:
        inx

; Start by skipping to the end of the one we're currently testing against

nextCommand:
        lda     commandTable,x
        bne     nextCommand0

        inx                             ; Skip over the zero
        lda     commandTable,x          ; Check for another zero
        beq     badCommand

        ply                             ; Reset argument pointer
        phy
        inc     cmdI
        bra     findCommand

badCommand:
        ply                             ; Drop the argument index saved before findCommand
        jsr     strout
        .byte   "Unknown command",13,10,0
        jmp     rubyOsCmd


and finally here, the command table looks like:
Code:
; Command table

commandTable:
        .asciiz "d"
        .asciiz "fill"
        .asciiz "fx"
        .asciiz "go"
        .asciiz "help"
        .byte   0

commandList:
        .word   dump
        .word   fill
        .word   fx
        .word   go
        .word   help
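
The table search can be mirrored in a few lines of Python, which is handy for checking edge cases such as a prefix ("fi") or an over-long word ("helpx"). This is a sketch of the same idea, not a line-for-line port: `find_command` is a made-up name, and the strings in `COMMAND_LIST` stand in for the handler addresses.

```python
COMMAND_TABLE = b"d\0fill\0fx\0go\0help\0\0"
COMMAND_LIST = ["dump", "fill", "fx", "go", "help"]   # stand-ins for addresses


def find_command(arg: bytes):
    """Return the index of the command matching arg exactly, or None."""
    x = 0                                   # index into COMMAND_TABLE (X)
    cmd_i = 0                               # command number (cmdI)
    while COMMAND_TABLE[x] != 0:            # a zero here is the end of table
        y = 0
        while y < len(arg) and arg[y] == COMMAND_TABLE[x]:
            x += 1
            y += 1
        if y == len(arg) and COMMAND_TABLE[x] == 0:
            return cmd_i                    # both strings ended together
        while COMMAND_TABLE[x] != 0:        # skip the rest of this keyword
            x += 1
        x += 1                              # and its terminating zero
        cmd_i += 1
    return None


if __name__ == "__main__":
    for word in (b"d", b"fi", b"fill", b"fx", b"helpx", b"zap"):
        index = find_command(word)
        print(word, "->", None if index is None else COMMAND_LIST[index])
```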


A command example: The fx command which can take 1, 2 or 3 arguments: (this will be somewhat alien if you've no experience of the BBC Micro)

Code:
; fx:
;       Call osByte from the keyboard.
;       Expects 1, 2 or 3 parameters for the A, X and Y registers
;********************************************************************************

fx:

; Get first argument

        jsr     getArg
        cmp     #KEY_CR
        beq     noparam
        jsr     atoi8
        sta     fxA

; Optional 2nd & 3rd

        stz     fxX
        stz     fxY

        jsr     getArg
        cmp     #KEY_CR
        beq     doFx
        jsr     atoi8
        sta     fxX

        jsr     getArg
        cmp     #KEY_CR
        beq     doFx
        jsr     atoi8
        sta     fxY

doFx:
        lda     fxA
        ldx     fxX
        ldy     fxY
        jmp     osByte

fxA:    .byte   0
fxX:    .byte   0
fxY:    .byte   0

noparam:
        jsr     strout
        .byte   "Parameter expected",13,10,0
        rts


To fill in one gap: atoi8 converts the argument at (b0),y into an 8-bit number. Decimal by default, but $ for hex or % for binary.
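
The prefix rules are easy to model in Python. This is only a sketch of the described behaviour; the real atoi8 isn't shown, and wrapping out-of-range values to 8 bits is an assumption here, as is accepting anything Python's `int()` accepts.

```python
def atoi8(text: str) -> int:
    """Parse decimal by default, '$' for hex, '%' for binary."""
    if text.startswith("$"):
        value = int(text[1:], 16)
    elif text.startswith("%"):
        value = int(text[1:], 2)
    else:
        value = int(text, 10)
    return value & 0xFF        # assumption: larger values wrap to 8 bits


if __name__ == "__main__":
    for arg in ("200", "$FF", "%1010"):
        print(arg, "->", atoi8(arg))
```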

So it all seems to work, and it's a nice little framework I'm building up the operating system with - which is becoming more and more BBC Micro-like every day. This was not my initial intention, but it's led on from a simple "how hard can it be" when I was thinking about BBC BASIC. Ah well.

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Mon Apr 01, 2019 8:14 pm 
Very thorough! Thanks for sharing.

