6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Fri Sep 20, 2024 2:27 pm

All times are UTC




Post new topic Reply to topic  [ 14 posts ] 
Author Message
 Post subject: Base64 encoding
PostPosted: Wed Mar 20, 2019 7:20 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Hi All!

I have been trying to figure out how to encode a string into a base64-encoded one in 6502 assembly.
Even though i do understand how the encoding works (on paper), i can't seem to get my head around how to do it in ASM...

The only way i could think of was to convert every character into 8 bits, store those in memory (in sequence), then read them back 6 bits at a time, padding them with zero's at the two MSB's, and convert them to the table-characters from that.
But it seems so unpractical to go that route.

I suspect it could be done more economically by shifting/AND-ing etc.

Do any of you guru's know a way?

Thanks in advance!


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 7:29 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Welcome! (Is this a homework / class assignment question? It looks a bit like one.)


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 7:43 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Thanks for your welcome!

No, not really.

I am working on a rudimentary E-mail sender for my homebrew 65C02 SBC.
To log in to the mail server, i need to send my login/password as base64 encoded ASCII strings.
That's what it is for.

All of this exercise may be pretty futile, but good learning material nonetheless :)


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 7:55 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Might be worth a look at Rosetta Code for this:
https://rosettacode.org/wiki/Base64_encode_data

My guess is, you surely do have to pick off 6 bits, unaligned, from your 8 bit input, which pretty much means some shifting. Dealing with three 8 bit inputs at a time might be a good idea, as you get exactly four 6 bit outputs.

Then, you need to convert from the 6 bit number to one of the 64 characters. A lookup into a 64 character array is surely the very easiest way - or you could at least split off the two alphabets and the run of digits with some comparisons. This is the array you'd want, it seems:
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 8:13 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
The 'handle 3 8-bit inputs' tip is indeed a smart one!

I mostly wish i would find a way not first having to go over splitting every character into binary.
Of course i could spit out a series of binary values from the entire string, store that into memory
and pick off the first 6 bits, then shifting the entire binary sequence 6 places, but it is so messy.


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 8:20 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
I'm not quite sure I follow. In a 6502 program, if you have a string of characters, it is already binary.

'A' is decimal 65 is hex 41 is binary 01000001
So 'ABC' is - already - 0100 0001 0100 0010 0100 0011
and so the 6-bit values we need are, perhaps,
010000
010100
001001
000011
and so the characters we need to output are, maybe
QUJD
(checks out OK compared to https://www.base64encode.org/)


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 8:25 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Yes, that is correct.
But to pick off 6 bits from an 8-bit ASCII character, and then shifting it 6 bits over is the heart of the problem.
Sorry if my previous post was confusing. I am confusing myself too ;)


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 8:32 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
It might be easiest always to be shifting three bytes. What comes out of the carry of one goes into the carry of the next. We can place our output byte to the left of the first (leftmost) character.

memory layout:
output; first; second; third

onebit shift:
asl third
rol second
rol first
rol output

sixbit shift:
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift

(This could be made smaller, or done as a loop, but perhaps the pedestrian way is easiest to get right.)

So now maybe it's as simple as
LDA #0
STA output
JSR sixbitshift
LDA output
JSR dosomethingwithit

and repeat that four times.


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 8:42 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Yes, this works like a charm!

Thanks for your help so far, i've got leads to carry on with this now.


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Wed Mar 20, 2019 10:05 pm 
Offline

Joined: Tue Jul 24, 2012 2:27 am
Posts: 674
The "handle 3 bytes" style is actually necessary. The output will always be in blocks of 4 characters (6 bits each = 24 bits), padded with '=' characters if there are too few. This corresponds to a chunking 3 byte <-> 4 character transformation.

I think the main special case you have to worry about is padding the final output. My suggestion would be to always start your 4 byte output buffer filled with "=" characters, then overwrite it with the next (up to) 3 bytes' transformation, then send out that 4 byte output buffer. Not the most blazing fast, but might be the most straightforward. But since each of the 3 bytes encodes to a different alignment, you might have code for each and test for the end at each point, knowing exactly how many "=" characters bytes you need to pad from that point forward.

_________________
WFDis Interactive 6502 Disassembler
AcheronVM: A Reconfigurable 16-bit Virtual CPU for the 6502 Microprocessor


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Fri Mar 22, 2019 11:45 am 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Yeah, i got the encoding working like this, but indeed the padding at the end is still an issue.
Can't get that to work yet.


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Fri Mar 22, 2019 1:22 pm 
Offline
User avatar

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
This is my attempt.
Code:
;===============================================================================
;
; Andrew Jacobs
; 2018-10-11
;-------------------------------------------------------------------------------

                .65816          ; But actually 6502 code

;===============================================================================
; Workspace
;-------------------------------------------------------------------------------

                .page0
                .org    $10
               
CH              .space  1               ;
BITS            .space  1

LEN             .space  1               ; Length of input data
PTR             .space  2               ; Address of input data

;===============================================================================
; The emulator expects the S28 to contain a ROM image so this is kicked off via
; the dummy reset vector.
;-------------------------------------------------------------------------------

                .code
                .org    $f000
               
Reset:
                sei
                cld
                ldx     #$ff
                txs
               
                lda     #TEST_END-TEST
                sta     LEN
                lda     #<TEST
                sta     PTR+0
                lda     #>TEST
                sta     PTR+1
               
                jsr     Base64
               
AllDone:
                wdm     #$ff            ; Stop the emulator
                bra     AllDone

; This should encode as:
;
; QSBmb29sIHRoaW5rcyBoaW1zZWxmIHRvIGJlIHdpc2UsIGJ1dCBhIHdpc2UgbWFuIGtub3dzIGhp
; bXNlbGYgdG8gYmUgYSBmb29sLg==

TEST:           .byte   "A fool thinks himself to be wise, but a wise man "
                .byte   "knows himself to be a fool."
TEST_END:

; Output Utilities

UartTx:
                wdm     #$01            ; Print a character
                rts

NewLine:
                lda     #10             ; LF
                bra     UartTx

;-------------------------------------------------------------------------------

Base64:
                ldy     #0              ; Load index into data
                sty     CH              ; .. and initialise
                sty     BITS
                repeat
                 jsr    Extract         ; Output
                 jsr    UartTx
                 jsr    Extract
                 jsr    UartTx
                 jsr    ExtractPad
                 jsr    UartTx
                 jsr    ExtractPad
                 jsr    UartTx
                 cpy    LEN             ; Until out of data
                until cs
                jsr     NewLine
                rts
               
               
ExtractPad:
                cpy     LEN
                if      cs
                 lda    BITS
                 beq    .Done
                 bpl    Extract
.Done            lda    #'='
                 rts
                endif
Extract:
                lda     #0              ; Shift out six bits
                jsr     Shift2
                jsr     Shift2
                jsr     Shift2
                tax                     ; And translate to character
                lda     LEXICON,x
                rts
                               
Shift2:
                ldx     BITS
                if eq
                 cpy    LEN
                 if cc
                  pha
                  lda   (PTR),y
                  sta   CH
                  iny
                  lda   #8
                  sta   BITS
                  pla
                 endif
                endif
                asl     CH
                rol     a
                asl     CH
                rol     a
                dec     BITS
                dec     BITS
                rts
               
LEXICON:        .byte   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                .byte   "abcdefghijklmnopqrstuvwxyz"
                .byte   "0123456789+/"
               
       
                .org    $fffc
                .word   Reset           ; Dummy RESET vector
               
                .end           

Everything need to build and run this is in this ZIP


Attachments:
Base64.zip [654.16 KiB]
Downloaded 97 times

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Fri Mar 22, 2019 1:35 pm 
Offline
User avatar

Joined: Mon Dec 08, 2008 6:32 pm
Posts: 143
Location: Brighton, England
EDIT: Bitwise posted while I was posting this so sorry if I have duplicated his efforts. Must read his post and see if he came up with something clever! - Yep, he did :D

The easy way to handle the packing is to have three seperate encoding routines, one handles 1 byte, one handles 2 bytes and one handles 3 bytes. Much of the three encoding routines will be the same and so this can be split off as a subroutine that all three encoding routines call. You then determine, before you start encoding, how many bytes you have to encode and call the relevant routine. This can be done with:
Code:
Assuming the number of bytes to encode is in A

            ASL   A
            TAX
            JMP   (JmpTable,X)

JmpTable    Address of Encode1 routine
            Address of Encode2 routine
            Address of Encode3 routine


We double the number of bytes to encode by shifting A, then transfer this value to X and use it as an offset into a jump table.

Often, it is easier to split a function up like this than try to create a single, general purpose routine that can handle all the possible cases.

_________________
Shift to the left,
Shift to the right,
Mask in, Mask Out,
BYTE! BYTE! BYTE!


Top
 Profile  
Reply with quote  
 Post subject: Re: Base64 encoding
PostPosted: Sat Mar 23, 2019 9:16 pm 
Offline

Joined: Wed Mar 20, 2019 7:14 pm
Posts: 28
Location: Rotterdam, The Netherlands
Thank you, Bitwise! Your code works flawlessly!
And thanks to everyone else for giving me such good input for learning something new!
What a wonderful community you are.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 14 posts ] 

All times are UTC


Who is online

Users browsing this forum: SamCoVT and 21 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: