Page 1 of 1
Base64 encoding
Posted: Wed Mar 20, 2019 7:20 pm
by jaccodewijs
Hi All!
I have been trying to figure out how to encode a string into a base64-encoded one in 6502 assembly.
Even though i do understand how the encoding works (on paper), i can't seem to get my head around how to do it in ASM...
The only way i could think of was to convert every character into 8 bits, store those in memory (in sequence), then read them back 6 bits at a time, padding them with zero's at the two MSB's, and convert them to the table-characters from that.
But it seems so unpractical to go that route.
I suspect it could be done more economically by shifting/AND-ing etc.
Do any of you guru's know a way?
Thanks in advance!
Re: Base64 encoding
Posted: Wed Mar 20, 2019 7:29 pm
by BigEd
Welcome! (Is this a homework / class assignment question? It looks a bit like one.)
Re: Base64 encoding
Posted: Wed Mar 20, 2019 7:43 pm
by jaccodewijs
Thanks for your welcome!
No, not really.
I am working on a rudimentary E-mail sender for my homebrew 65C02 SBC.
To log in to the mail server, i need to send my login/password as base64 encoded ASCII strings.
That's what it is for.
All of this exercise may be pretty futile, but good learning material nonetheless

Re: Base64 encoding
Posted: Wed Mar 20, 2019 7:55 pm
by BigEd
Might be worth a look at Rosetta Code for this:
https://rosettacode.org/wiki/Base64_encode_data
My guess is, you surely do have to pick off 6 bits, unaligned, from your 8 bit input, which pretty much means some shifting. Dealing with three 8 bit inputs at a time might be a good idea, as you get exactly four 6 bit outputs.
Then, you need to convert from the 6 bit number to one of the 64 characters. A lookup into a 64 character array is surely the very easiest way - or you could at least split off the two alphabets and the run of digits with some comparisons. This is the array you'd want, it seems:
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
Re: Base64 encoding
Posted: Wed Mar 20, 2019 8:13 pm
by jaccodewijs
The 'handle 3 8-bit inputs' tip is indeed a smart one!
I mostly wish i would find a way not first having to go over splitting every character into binary.
Of course i could spit out a series of binary values from the entire string, store that into memory
and pick off the first 6 bits, then shifting the entire binary sequence 6 places, but it is so messy.
Re: Base64 encoding
Posted: Wed Mar 20, 2019 8:20 pm
by BigEd
I'm not quite sure I follow. In a 6502 program, if you have a string of characters, it is already binary.
'A' is decimal 65 is hex 41 is binary 01000001
So 'ABC' is - already - 0100 0001 0100 0010 0100 0011
and so the 6-bit values we need are, perhaps,
010000
010100
001001
000011
and so the characters we need to output are, maybe
QUJD
(checks out OK compared to
https://www.base64encode.org/)
Re: Base64 encoding
Posted: Wed Mar 20, 2019 8:25 pm
by jaccodewijs
Yes, that is correct.
But to pick off 6 bits from an 8-bit ASCII character, and then shifting it 6 bits over is the heart of the problem.
Sorry if my previous post was confusing. I am confusing myself too

Re: Base64 encoding
Posted: Wed Mar 20, 2019 8:32 pm
by BigEd
It might be easiest always to be shifting three bytes. What comes out of the carry of one goes into the carry of the next. We can place our output byte to the left of the first (leftmost) character.
memory layout:
output; first; second; third
onebit shift:
asl third
rol second
rol first
rol output
sixbit shift:
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift
JSR onebitshift
(This could be made smaller, or done as a loop, but perhaps the pedestrian way is easiest to get right.)
So now maybe it's as simple as
LDA #0
STA output
JSR sixbitshift
LDA output
JSR dosomethingwithit
and repeat that four times.
Re: Base64 encoding
Posted: Wed Mar 20, 2019 8:42 pm
by jaccodewijs
Yes, this works like a charm!
Thanks for your help so far, i've got leads to carry on with this now.
Re: Base64 encoding
Posted: Wed Mar 20, 2019 10:05 pm
by White Flame
The "handle 3 bytes" style is actually necessary. The output will always be in blocks of 4 characters (6 bits each = 24 bits), padded with '=' characters if there are too few. This corresponds to a chunking 3 byte <-> 4 character transformation.
I think the main special case you have to worry about is padding the final output. My suggestion would be to always start your 4 byte output buffer filled with "=" characters, then overwrite it with the next (up to) 3 bytes' transformation, then send out that 4 byte output buffer. Not the most blazing fast, but might be the most straightforward. But since each of the 3 bytes encodes to a different alignment, you might have code for each and test for the end at each point, knowing exactly how many "=" characters bytes you need to pad from that point forward.
Re: Base64 encoding
Posted: Fri Mar 22, 2019 11:45 am
by jaccodewijs
Yeah, i got the encoding working like this, but indeed the padding at the end is still an issue.
Can't get that to work yet.
Re: Base64 encoding
Posted: Fri Mar 22, 2019 1:22 pm
by BitWise
This is my attempt.
Code: Select all
;===============================================================================
;
; Andrew Jacobs
; 2018-10-11
;-------------------------------------------------------------------------------
.65816 ; But actually 6502 code
;===============================================================================
; Workspace
;-------------------------------------------------------------------------------
.page0
.org $10
CH .space 1 ;
BITS .space 1
LEN .space 1 ; Length of input data
PTR .space 2 ; Address of input data
;===============================================================================
; The emulator expects the S28 to contain a ROM image so this is kicked off via
; the dummy reset vector.
;-------------------------------------------------------------------------------
.code
.org $f000
Reset:
sei
cld
ldx #$ff
txs
lda #TEST_END-TEST
sta LEN
lda #<TEST
sta PTR+0
lda #>TEST
sta PTR+1
jsr Base64
AllDone:
wdm #$ff ; Stop the emulator
bra AllDone
; This should encode as:
;
; QSBmb29sIHRoaW5rcyBoaW1zZWxmIHRvIGJlIHdpc2UsIGJ1dCBhIHdpc2UgbWFuIGtub3dzIGhp
; bXNlbGYgdG8gYmUgYSBmb29sLg==
TEST: .byte "A fool thinks himself to be wise, but a wise man "
.byte "knows himself to be a fool."
TEST_END:
; Output Utilities
UartTx:
wdm #$01 ; Print a character
rts
NewLine:
lda #10 ; LF
bra UartTx
;-------------------------------------------------------------------------------
Base64:
ldy #0 ; Load index into data
sty CH ; .. and initialise
sty BITS
repeat
jsr Extract ; Output
jsr UartTx
jsr Extract
jsr UartTx
jsr ExtractPad
jsr UartTx
jsr ExtractPad
jsr UartTx
cpy LEN ; Until out of data
until cs
jsr NewLine
rts
ExtractPad:
cpy LEN
if cs
lda BITS
beq .Done
bpl Extract
.Done lda #'='
rts
endif
Extract:
lda #0 ; Shift out six bits
jsr Shift2
jsr Shift2
jsr Shift2
tax ; And translate to character
lda LEXICON,x
rts
Shift2:
ldx BITS
if eq
cpy LEN
if cc
pha
lda (PTR),y
sta CH
iny
lda #8
sta BITS
pla
endif
endif
asl CH
rol a
asl CH
rol a
dec BITS
dec BITS
rts
LEXICON: .byte "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
.byte "abcdefghijklmnopqrstuvwxyz"
.byte "0123456789+/"
.org $fffc
.word Reset ; Dummy RESET vector
.end
Everything need to build and run this is in this ZIP
Re: Base64 encoding
Posted: Fri Mar 22, 2019 1:35 pm
by PaulF
EDIT: Bitwise posted while I was posting this so sorry if I have duplicated his efforts. Must read his post and see if he came up with something clever! - Yep, he did
The easy way to handle the packing is to have three seperate encoding routines, one handles 1 byte, one handles 2 bytes and one handles 3 bytes. Much of the three encoding routines will be the same and so this can be split off as a subroutine that all three encoding routines call. You then determine, before you start encoding, how many bytes you have to encode and call the relevant routine. This can be done with:
Code: Select all
Assuming the number of bytes to encode is in A
ASL A
TAX
JMP (JmpTable,X)
JmpTable Address of Encode1 routine
Address of Encode2 routine
Address of Encode3 routine
We double the number of bytes to encode by shifting A, then transfer this value to X and use it as an offset into a jump table.
Often, it is easier to split a function up like this than try to create a single, general purpose routine that can handle all the possible cases.
Re: Base64 encoding
Posted: Sat Mar 23, 2019 9:16 pm
by jaccodewijs
Thank you, Bitwise! Your code works flawlessly!
And thanks to everyone else for giving me such good input for learning something new!
What a wonderful community you are.