6502.org

Posted: **Fri Aug 18, 2017 7:34 am**

So, I successfully wrote a PC Huffman encoder in C++. I use canonical codes and only store the "huffman tree" as a 256 byte table where each byte indicate the code length of the corresponding index (or $00 if a symbol unused). Cannonical modes means that if several symbols use the same bit length, then their encoding is sorted in order. For example if 'A', 'B' and 'C' are encoded in 5-bits (and nothing else is), then A is necessarily "11100", B "11101" and C "11110". Symbols coded on 6 and more bits will always start with 5 1s, and symbols starting with less than 3 1s are encoded in less than 5-bits.

I am looking at how to implement Huffman decoding in 6502. It doesn't have to be fast, but if it happen to be, then it's nice. Currently the C++ code is as follows :

Code: Select all

static block_t decode_block(block_t & block, block_t const& depth_table, uint16_t size)
{
	block_t out_buff;
	BitArrayRead b(block);

	for(uint16_t i=0; i<size; i++)
	{
		//Decode the input bitstream
		//This decoding method is slow, however it requires no extra RAM or variables,
		//which makes it interesting for embedded applications.

		unsigned int nbits = 0;
		//This hodls the currently read code, and the code we'll compare it to
		long code = 0, matchcode = 0;
		bool found = false;
		uint8_t value = 0;
		do
		{
			//We do the "clever algorithm" again but this time we search to which symbol
			//the code we have belongs to
			nbits++;
			code <<= 1;
			if(b) code |=1;
			matchcode <<= 1;

			//Search if the read code corresponds to something...
			for(unsigned int j = min; !found && j <= max; j++)
			{
				unsigned int match_len = depth_table[j-min];
				if(match_len != nbits) continue;

				if(matchcode == code)
				{
					found = true;
					value = j;
				}
				matchcode++;
			}
			//No -> continue searching among longer codes
		}
		while(!found);
		//Yes -> we decoded a byte
		out_buff.push_back(value);
	}
	return out_buff;
}

My assembly code so far looks like this (untested).

Code: Select all

; Decode one symbol from a cannonical huffman encoded bistream
; (To decode multiple symbols this have to be called within a loop)
; Note that some variables must be inintialized before decoding the 1st symbol

Get_Huff_Value
        lda #$00
        sta nbits
        sta code_l
        sta code_m
        sta code_h              ; Needed only if some codes are longer than 16-bit
        sta match_code_l
        sta match_code_m
        sta match_code_h        ; Needed only if some codes are longer than 16-bit

main_loop:
        inc nbits

        jsr Read_bit_from_stream        ;Read one bit and shift in the current code variable
        rol code_l
        rol code_m
        rol code_h

        asl match_code_l
        rol match_code_m
        rol match_code_h

        ; Search if the read code correspond to something...
        ldy #$00
search_loop:
        lda Huffman_length_tbl, Y          ; Search for a node that has the current length
        cmp nbits
        bne not_found

found_length:
        lda match_code_l
        cmp code_l
        bne code_wrong
        lda match_code_m
        cmp code_m
        bne code_wrong
        lda match_code_h
        cmp code_h
        bne code_wrong

found:
        tya                ; Value in Y
        rts

code_wrong:
        inc match_code_l        ; If the code is not this one, match_code is incremented by '1' and continue searching
        bne not_found
        inc match_code_m
        bne not_found
        inc match_code_h

not_found:
        iny                    ; Next code
        bne search_loop        ; Continue for searching for same length
        beq main_loop          ; Searching for a longer length

Read_bit_from_stream:         ; Read a single bit from input stream and return it in C flag
        dec HuffBitCtr
        bpl _not_next_byte

        lda #$07
        sta HuffBitCtr
        inc HuffPointerL
        bne +
        inc HuffPointerH

+       lda (HuffPointerL),Y  ; Read next byte in shift register
        sta HuffShiftReg
_not_next_byte
        lsr HuffShiftReg
        rts

Read bit from stream is a separate subroutine only for clarity, but it could easily be inlined in the main loop.
I wonder whether this decoding code can be (significantly) improved for speed and/or size. An obvious improvement is that if codes do never surpass 16-bits (they rarely do) then the 24-bit variables can be collapsed to 16-bits only, however it is not rare that some code lengths are above 16-bit if some symbols are very rare.

A problem with this approach is that the whole 256 byte table is traversed for every possible length (including unused lengths). So if we decode a symbol encoded in, say, 17 bits, the loops looks for the whote 256 byte table 17 times before it can decode the symbol, neededless to say it's slow. However it's slow for the rare symbols and faster for the frequent symbols, as such it's not that bad.

Posted: **Fri Aug 18, 2017 10:43 am**

Here's a small improvement to the subroutine to get a bit, which doesn't require a counter. Just requires HuffShiftReg to be initialized to zero:

Code: Select all

Read_bit_from_stream:          ; Read a single bit from input stream and return it in C flag
        lsr HuffShiftReg
        bne _not_next_byte

        lda (HuffPointerL),Y   ; Read next byte in shift register
        inc HuffPointerL
        bne +
        inc HuffPointerH
+       sec
        ror a
        sta HuffShiftReg
_not_next_byte
        rts

Posted: **Fri Aug 18, 2017 2:34 pm**

Wow this is quite clever !

Apparently unlike the full canonical decoding I did, which relies only on a length table, some decoding requires the sorted canonical codes and a table of how many codes there is per length instead. In examples of length going up to 17 bits (this is what I have for my current data pool !), then it means 17 more bytes to be stored, however the decoding can likely be done much faster as it needs to loop only 17 times instead of 17 * 256.

The problem is that I do not know how this optimization works at all.

Posted: **Fri Aug 18, 2017 5:18 pm**

Just realised this can be further optimized by omitting the SEC, and initializing HuffShiftReg to 1 instead. Shifting out the last sentinel bit will always result in carry set!

I'm surprised to see such a concise bit of 6502 to do Huffman decoding! It's not a compression method I've ever pursued (always preferred some variant of LZ77 instead).

Posted: **Sat Aug 19, 2017 8:58 am**

I used Huffman encoding to compress a lid of cookie strings into a ROM for the BBC micro. I had a 256 byte lookup table indexed by the current code. I shifted bits into the index until lda table,x was non-zero. To reduce the length of the codes I used a control character to indicate when then next character should be capitalised.

Posted: **Sat Sep 16, 2017 5:55 am**

is there a working 6502 (source) implementation of Huffman decoding somewhere?

Posted: **Wed Sep 20, 2017 4:44 pm**

Bregalad wrote:

some decoding requires the sorted canonical codes and a table of how many codes there is per length instead. In examples of length going up to 17 bits (this is what I have for my current data pool !), then it means 17 more bytes to be stored, however the decoding can likely be done much faster as it needs to loop only 17 times instead of 17 * 256.

The problem is that I do not know how this optimization works at all.

This needn't even take more space to store the codebook, as you only need to store characters which appear in the alphabet in the "sorted canonical codes". It's also really easy to decode, and, as you say, much faster, as you can immediately deduce the symbol instead of needing to continually loop and keep checking.

Here's how you might do it (pseudocode, untested):

Code: Select all

array: numCodesWithNumBits[maxBits]   ; for an index n, yields how many characters are represented by n+1 bits
array: codes[]   ; list of sorted canonical codes, smallest bit length first

inputCode = 0
firstCodeWithNumBits = 0
startIndexForCurrentNumBits = 0

for numBits = 0 to maxBits - 1

    ; build input code
    inputCode = (inputCode << 1) | getNextBit()

    ; get how many canonical codes have this many bits
    numCodes = numCodesWithNumBits[numBits]

    ; if input code so far is within the range of the first code with the current number of bits, it's a match
    indexForCurrentNumBits = inputCode - firstCodeWithNumBits
    if indexForCurrentNumBits < numCodes
        return codes[startIndexForCurrentNumBits + indexForCurrentNumBits];
    else
        ; otherwise, move to the next bit length
        firstCodeWithNumBits = (firstCodeWithNumBits + numCodes) << 1
        startIndexForCurrentNumBits += numCodes
    endif

next

; Shouldn't get here; means the codebook was invalid

I've never attempted to write this algorithm in 6502, but it looks like it would translate really easily. Combined with an LZ77 type decoder, you could get a pretty efficient INFLATE implementation with this!

Posted: **Wed Feb 07, 2018 10:58 pm**

I has the code. Took me a long while to understand precisely how the canonical code works, but when you see the codes laid out it becomes clear how to figure out to do the decoding without having to scan the entire code table.

Here's a blog post where I go into a bit more detail.

https://github.com/djangojames/explorat ... -in-8-bits

This code only supports 16 bit Huffamn codes, but it can be made to support 24 bits by adding another byte to the accumulator, another ROL, CMP, SBC and the extra bytes in the table at the end.
I apologise in for the code addresses not being annotated inline (my disassembler doesn't do that...)

To read a symbol we do JSR huffman-next until we read our specific EOS indicator. Before we do any of that, we need to set the address, HUFFMAN-PTR, to where we want to read from and additionally set HUFFMAN-BITS to 1.

It has been unit tested with a bunch of strings round-trip fashion so it does work... I'm going to port my adventure game to use it over the next few days.

Code: Select all

OUTPUT-STRING-INDEX -> 0000
HUFFMAN-POP-TABLE -> 0001
HUFFMAN-PTR -> 0003
HUFFMAN-BITS -> 0005
DECODER:ACC-HI -> 000C
DECODER:ACC-LO -> 000D
DECODER:NEXT-BYTE -> 000E

Code: Select all

HUFFMAN-NEXT 070E A000     LDY #$00
                     0710 840C     STY $0C   ;clear accumulator
                     0712 840D     STY $0D
                     0714 88       DEY       ;Start at zero in pop table, following the first INY
   DECODER:FETCH-BIT 0715 C605     DEC $05
                     0717 D014     BNE $072D ;DECODER:GOT-BITS 
                     ;New byte required
                     ;Save our pop table index
                     0719 98       TYA
                     071A AA       TAX
                     071B A000     LDY #$00
                     071D B103     LDA ($03),Y
                     071F 850E     STA $0E
                     0721 E603     INC $03   ;PTR
                     0723 D002     BNE $0727 ;L1:INC16-DONE 
                     0725 E604     INC $04   ;PTR + 1
       L1:INC16-DONE 0727 A908     LDA #$08
                     0729 8505     STA $05
                     ;Restore the pop table index
                     072B 8A       TXA
                     072C A8       TAY
                     ;Lets rotate a bit from the next byte
                     ;into the accumulator
    DECODER:GOT-BITS 072D 060E     ASL $0E
                     072F 260D     ROL $0D
                     0731 260C     ROL $0C
                     0733 C8       INY
                     0734 B101     LDA ($01),Y ;Any symbols at this length?
                     0736 F0DD     BEQ $0715 ;DECODER:FETCH-BIT 
                     0738 C8       INY       ;Skip row indicator byte
                     ;We have some symbols- does the accumulator hold one?
                     ;It does if the accumulator is less than the max-code
                     ;for codes of this length. So let's compare it.
                     0739 B101     LDA ($01),Y
                     073B C50C     CMP $0C
                     073D 3014     BMI $0753 ;DECODER:GT1  ;acc-hi > max-code(len)-hi
                     073F C8       INY
                     0740 B101     LDA ($01),Y
                     0742 C50D     CMP $0D
                     0744 300E     BMI $0754 ;DECODER:GT2  ;acc-lo > max-code(len)-lo
                     0746 C8       INY
                     ;Now acc<=max-code(len), this means we have a symbol woo!
                     ;To get the symbol index we subtract the offset
                     ;which has kindly been pre-computed in the pop table
                     0747 38       SEC
                     0748 A50D     LDA $0D
                     074A F101     SBC ($01),Y ;- offset lo
                     074C AA       TAX
                     074D C8       INY
                     074E A50C     LDA $0C
                     0750 F101     SBC ($01),Y ;- offset hi
                     ;Return to caller with index-hi in A and index-lo in X
                     0752 60       RTS
                     ;Skip rest of row and get next bit
         DECODER:GT1 0753 C8       INY
         DECODER:GT2 0754 C8       INY
                     0755 C8       INY
                     0756 D0BD     BNE $0715 ;DECODER:FETCH-BIT 
                     ;String Table
                     ;General strings huffman population
                     ;0 symbols of length 1
                     ;Max Code: NIL Offset: NIL
    STRING-POP-TABLE 0758 00       DB $00
                     ;1 symbols of length 2
                     ;Max Code: 0000 Offset: 0000
                     0759 FF0000.. DB $FF, $00, $00, $00, $00
                     ;0 symbols of length 3
                     ;Max Code: NIL Offset: NIL
                     075E 00       DB $00
                     ;6 symbols of length 4
                     ;Max Code: 0009 Offset: 0003
                     075F FF0009.. DB $FF, $00, $09, $03, $00
                     ;6 symbols of length 5
                     ;Max Code: 0019 Offset: 000D
                     0764 FF0019.. DB $FF, $00, $19, $0D, $00
                     ;7 symbols of length 6
                     ;Max Code: 003A Offset: 0027
                     0769 FF003A.. DB $FF, $00, $3A, $27, $00
                     ;8 symbols of length 7
                     ;Max Code: 007D Offset: 0062
                     076E FF007D.. DB $FF, $00, $7D, $62, $00
                     ;4 symbols of length 8
                     ;Max Code: 00FF Offset: 00E0
                     0773 FF00FF.. DB $FF, $00, $FF, $E0, $00
                 END 0778 00       BRK       ;HUFFMAN-PTR ;HUFFMAN-PTR + 1

6502.org

Huffman decoding on a 6502

Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502

Re: Huffman decoding on a 6502