Garth: That looks like something I might try (cassette audio <-> PC soundcard), if I'm understanding your design properly. VICE doesn't allow writing to the simulated tape drive.
I'm still proceeding with the strategy of compressing source code into pages and storing them from the top of RAM downward, then dumping large chunks of that out to tape with the SAVE and LOAD routines.
Here's my RLE encoder/decoder code, and another compressor I've been playing with (just the packer, not the unpacker yet). It crams four 6-bit characters into three bytes, which probably isn't new art, but I've been calling it "shamrock compression" because of the 3 bytes::4 characters thing. The aim is to reduce the memory required by up to 25% by not storing the upper two bits, which mostly repeat from one character to the next.
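For anyone who wants to poke at the idea off the PET, here is a rough Python model of the packer as I read the listing further down. It is a sketch, not a port: the names `to_codes`, `pack` and `unpack_codes` are made up for illustration, the bit order (first code in the low six bits of the first output byte) is inferred from the `lsr`/`ror` chain in `shamrockshift`, and `unpack_codes` only undoes the 3-byte grouping, since the real unpacker hasn't been written yet.

```python
ESC = 0          # 6-bit code 0 is the escape, per the listing's header comment
AT_CODE = 0x10   # follows ESC for a byte whose low six bits are zero (lda #$10)
PAD = 0x3F       # filler for an incomplete final group ("pad with 1 bits")


def to_codes(data: bytes) -> list:
    """Turn 8-bit bytes into 6-bit codes, escaping changes of the top two bits."""
    codes = []
    quad = None  # no quad selected yet, so the first byte always escapes
    for i, b in enumerate(data):
        q, low = b >> 6, b & 0x3F
        if q != quad:
            nxt = data[i + 1] >> 6 if i + 1 < len(data) else None
            if nxt == q:
                codes += [ESC, 4 + q]   # next byte agrees: switch quad for good
                quad = q
            else:
                codes += [ESC, q]       # one-off quad for this byte only
        codes += [ESC, AT_CODE] if low == 0 else [low]
    return codes


def pack(codes: list) -> bytes:
    """Pack 6-bit codes four at a time into three bytes (24-bit, low byte first)."""
    codes = list(codes) + [PAD] * (-len(codes) % 4)
    out = bytearray()
    for i in range(0, len(codes), 4):
        c0, c1, c2, c3 = codes[i:i + 4]
        word = c0 | c1 << 6 | c2 << 12 | c3 << 18
        out += word.to_bytes(3, "little")
    return bytes(out)


def unpack_codes(packed: bytes) -> list:
    """Hypothetical inverse of pack(): split each 3-byte group into four codes."""
    codes = []
    for i in range(0, len(packed), 3):
        word = int.from_bytes(packed[i:i + 3], "little")
        codes += [(word >> shift) & 0x3F for shift in (0, 6, 12, 18)]
    return codes


if __name__ == "__main__":
    codes = to_codes(b"HELLO")
    print(codes, pack(codes).hex())
```

Every quad switch costs two extra codes in this scheme, so the saving approaches 25% only when long stretches of text stay within one quad.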
There is a line-wrap table at $E0-$F8 that tells the screen editor which lines are 40-column lines and which are continuation lines of an 80-character logical line (bit 7 off). Otherwise the entries hold the high byte (the page) of the address where that line of video memory starts. The low bytes of these addresses are stored elsewhere (in a ROM table).
>C:00e0 80 00 80 80 80 80 80 81 81 81 81 81 81 82 82 82 ................
>C:00f0 82 82 82 82 83 83 83 83 83 00 00 00 00 00 00 00 ................
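To make the dump easier to read, here is a small Python sketch that decodes the 25 table bytes. It assumes screen memory at $8000 with 40 bytes per row; `DUMP`, `expected_high` and `low_byte` are names invented for this example, and the low bytes are computed here rather than read from the ROM table.

```python
# The 25 bytes at $E0-$F8, copied from the monitor dump above.
DUMP = bytes.fromhex(
    "80 00 80 80 80 80 80 81 81 81 81 81 81 82 82 82 "
    "82 82 82 82 83 83 83 83 83"
)

SCREEN_BASE = 0x8000  # assumed start of PET screen memory
COLS = 40             # bytes per physical screen line


def expected_high(row: int) -> int:
    """High byte (page) of the address where a screen row starts."""
    return (SCREEN_BASE + row * COLS) >> 8


def low_byte(row: int) -> int:
    """Low byte of the row address (the part kept in a ROM table)."""
    return (SCREEN_BASE + row * COLS) & 0xFF


# Rows whose entry has bit 7 clear are continuation lines.
continuations = [row for row, b in enumerate(DUMP) if not b & 0x80]

# Every other entry should equal the page of that row's first byte.
mismatches = [row for row, b in enumerate(DUMP)
              if b & 0x80 and b != expected_high(row)]

if __name__ == "__main__":
    print("continuation rows:", continuations)
    print("rows with unexpected page:", mismatches)
```

In this dump only the second line (the entry at $E1) has bit 7 clear, so it is the one continuation line on that screen.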
There is a BASIC trick of forcing a bunch of return characters into the keyboard buffer and then printing commands on the screen spaced appropriately, then letting the screen editor take over. Self-modifying BASIC code without the use of POKE is possible. I'm wondering: if I were to change these high-order bytes from $80xx-$83xx to the $7Cxx-$7Fxx range, what would the screen editor do? Would continuation lines work?
So, having 1000 characters on the 40x25 screen and 25 bytes in the line wrap table, that's 1025 bytes per screen. Perfect! Not! Commodore 64s had this annoying "one extra byte" too, with Koala images being 10,001 bytes long (background color).
As it turns out (apparently, from my experimenting in the screen editor), it's impossible for the top line ($E0) to be a continuation line. The PET double-scrolls the screen. So I only have to stash $E1-$F8 (24 bytes) along with the text/graphics on the screen for a neat 1024 bytes.
Not that it matters much. I'm compressing it all. But it's nice to be able to compress a contiguous region of memory ($8000-$83FF) instead of the 1000-byte chunk and the 25-byte (turns out really 24-byte) chunk.
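The byte counting above can be sketched in Python like this. It is only a model of the layout: `snapshot` and `restore` are hypothetical names, and putting the 24 wrap bytes straight after the 1000 visible characters (offset 1000, i.e. $83E8 if the block sits at $8000) is my assumption about where they get stashed.

```python
VISIBLE = 1000                     # 40 columns x 25 rows
WRAP_FIRST, WRAP_LAST = 0xE0, 0xF8  # line-wrap table in zero page


def snapshot(screen: bytes, zero_page: bytes) -> bytes:
    """Return the screen text plus the wrap bytes for $E1-$F8 (top line skipped)."""
    wrap = zero_page[WRAP_FIRST + 1:WRAP_LAST + 1]
    return screen[:VISIBLE] + wrap


def restore(block: bytes):
    """Split a 1024-byte snapshot back into (screen text, wrap bytes $E1-$F8)."""
    return block[:VISIBLE], block[VISIBLE:]


if __name__ == "__main__":
    block = snapshot(bytes(1000), bytes(256))
    print(len(block))
```

Dropping the $E0 entry is what brings 1025 bytes down to 1024, so one screen fits exactly four pages.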
As for performance of these two compression algorithms, measured as space saved relative to the original 1000 bytes, it goes:
1000 bytes -> RLE -> 490 bytes (51% saved)
490 bytes -> shamrock -> 405 bytes (59% saved overall)
The data used for testing was a typical-looking-enough block of volksForth source.
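For reference, the arithmetic behind those figures, plus the saving of the shamrock stage taken on its own, works out as below. The three sizes are the ones quoted above; the `saved` helper is just for illustration.

```python
original, after_rle, after_shamrock = 1000, 490, 405


def saved(before: int, after: int) -> float:
    """Fraction of space saved going from `before` bytes to `after` bytes."""
    return 1 - after / before


rle_saved = saved(original, after_rle)          # RLE alone
total_saved = saved(original, after_shamrock)   # RLE followed by shamrock
stage_saved = saved(after_rle, after_shamrock)  # shamrock stage by itself

if __name__ == "__main__":
    print(f"RLE: {rle_saved:.1%}  total: {total_saved:.1%}  "
          f"shamrock stage: {stage_saved:.1%}")
```

The shamrock stage by itself saves about 17% on this sample, short of the 25% ceiling, which fits with escape codes eating some of the gain.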
Code:
;--------------------------------------------------------------
;
; SHAMPACK ( from to size -- size2 )
;
; reduce 8-bit bytes to 6-bit codes and pack four codes into every
; three output bytes until "size" bytes have been packed from the
; input stream starting at "from".
; return "size2" as the number of bytes in the packed buffer at "to"
;
; set quad for 1 character (octal)
; 00 00
; 00 01
; 00 02
; 00 03
;
; set quad for multiple characters
; 00 04
; 00 05
; 00 06
; 00 07
;
; an '@' sign (low six bits all zero) within this quad
; 00 20 ( = $10, the value the code below sends)
;
; A = byte to pack
shamrockshift ldy #6
shamrockshift01 lsr
ror n+5
ror n+6
ror n+7
dey
bne shamrockshift01
dex
bne shamrockshift05
ldx #2
shamrockshift02 lda n+5,x ; write bytes to output stream
jsr rle_write
dex
bpl shamrockshift02
ldx #4 ; make a new shamrock
shamrockshift05 rts
; A = byte to pack
shamrockshake bcc shamrockshake02 ; read?
lda zi ; original byte
eor n+4
and #$c0
beq shamrockshake03
lda #0 ; new quad here
jsr shamrockshift ; escape character
lda zi
eor (n+2),y ; quad of byte after current byte
and #$c0 ; just the quad bits, please
php
pla
lsr
lsr ; C is a copy of Z flag (from and #$c0)
lda zi
and #$c0
bcc shamrockshake02
sta n+4 ; set quad as permanent
shamrockshake02 rol ; carry bit
rol ; quad
rol ; bits
jsr shamrockshift ; send command
shamrockshake03 lda zi
and #$3f
bne shamrockshake04 ; handle "@" sign
jsr shamrockshift
lda #$10
shamrockshake04 jmp shamrockshift ; send character & exit
shampacklfa .word $adde
.byt (shampack-*-1)|bit7
.asc "SHAMPAC","K"|bit7
xyzzy
shampack ldy #2
jsr setup ; 4 bytes in tos tos+1 n n+1
stx storex
lda n+1
eor #$ff
pha
lda n
eor #$ff
pha ; preserve original "to" address
lda zi
pha ; elbow room
ldx #4 ; bytes per shamrock
inc tos+1 ; bias +$0100 for bne loop
jsr rle_read ; get first byte
sta zi
and #$c0 ; mask off hi two bits ( quad )
eor #$c0 ; make it the wrong quad
sta n+4 ; current quad
lda zi
shampack01 sec ; write
jsr shamrockshake
jsr rle_read
sta zi
bne shampack01 ; ~ might be missing the last byte
shampack02 cpx #4 ; ~ find out when I write unpack
beq shampack03
lda #$3f ; pad last shamrock with 1 bits
ora n+4 ; use current quad to avoid spillover
sec ; write
jsr shamrockshake
bne shampack02 ; bra
shampack03 ldx storex
pla
sta zi
sec
pla
adc n ; this is really subtraction
sta tos
pla
adc n+1
sta tos+1
jmp next
;--------------------------------------------------------------
;
; RLDECODE ( from to size -- )
;
; run-length decodes beginning at "from" to address "to" until
; "size" bytes have been decoded
rld_read lda (n+2),y
inc n+2
bne rldecode01
inc n+3
rldecode01 rts
rld_write sta (n),y
inc n
bne rldecode02
inc n+1
rldecode02 dec tos
bne rldecode03
dec tos+1
rldecode03 rts
rldecodelfa .word $adde
.byt (rldecode-*-1)|bit7
.asc "RLDECOD","E"|bit7
rldecode ldy #2
jsr setup ; n = to; n+2 = from; tos = howmany
stx storex
inc tos+1 ; bias $0100 for bne loop
rldecode04 jsr rld_read
rldecode05 sta n+4 ; lastbyte
jsr rld_write
beq rldecode09
jsr rld_read
cmp n+4
bne rldecode05
jsr rld_write
beq rldecode09
jsr rld_read
tax
beq rldecode04
lda n+4
rldecode07 jsr rld_write
beq rldecode09
dex
bne rldecode07
beq rldecode04
rldecode09 ldx storex
jmp pops
;--------------------------------------------------------------
;
; RLENCODE ( from to size -- size2 )
;
; run-length encodes beginning at "from" for "size" bytes to address "to"
; returns "size2" as the number of compressed bytes at "to"
;
rle_read lda (n+2),y
inc n+2
bne rlencode01
inc n+3
rlencode01 dec tos
bne rlencode02
dec tos+1
rlencode02 rts
rle_write sta (n),y
inc n
bne rlencode03
inc n+1
rlencode03 rts
rlencodelfa .word $adde
.byt (rlencode-*-1)|bit7
.asc "RLENCOD","E"|bit7
rlencode ldy #2
jsr setup ; n = to; n+2 = from; tos = howmany
lda n
sta n+6
lda n+1
sta n+7 ; copy original to address
stx storex
inc tos+1 ; bias $0100 for bne loop
rlencode04 jsr rle_read
sta n+4 ; lastbyte
beq rlencode10
jsr rle_write
rlencode05 jsr rle_read
beq rlencode10 ; done
jsr rle_write
tax
eor n+4
bne rlencode08 ; newchar
ldx #1
rlencode06 jsr rle_read
pha
beq rlencode09 ; done rle
eor n+4
bne rlencode07 ; endrle
pla
inx
bne rlencode06 ; rleloop
rlencode07 dex
txa ; overflow
jsr rle_write
inx
beq rlencode04 ; handle overflow - restart
pla
tax
jsr rle_write
rlencode08 stx n+4
jmp rlencode05 ; next
rlencode09 txa
jsr rle_write
pla
rlencode10 jsr rle_write
ldx storex
sec
lda n
sbc n+6
sta tos
lda n+1
sbc n+7
sta tos+1
jmp next
; hex 8000 7c00 decimal 1000 rlencode
; hex 7c00 8000 decimal 1000 rldecode
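For checking the format on a PC, here is a Python model of the RLE scheme as I read the two routines: bytes are copied as literals, and whenever a byte equals the one before it, a count byte (0 to 255) of further repeats follows, after which the comparison starts fresh. This is a sketch of the format only. The function names are made up, and end-of-input handling is simplified (the model always writes a count after a pair), so its output is not guaranteed to match the assembly byte for byte on the final bytes of a buffer.

```python
def rle_encode(data: bytes) -> bytes:
    """Literal bytes; after two equal bytes, a count of extra repeats follows."""
    out = bytearray()
    i, n = 0, len(data)
    prev = None  # None means "fresh": the next byte is not compared
    while i < n:
        b = data[i]
        i += 1
        out.append(b)
        if b == prev:
            run = 0
            while i < n and data[i] == b and run < 255:
                run += 1
                i += 1
            out.append(run)  # 0..255 further copies of b
            prev = None      # restart, as the listing does after a count
        else:
            prev = b
    return bytes(out)


def rle_decode(packed: bytes, size: int) -> bytes:
    """Decode until `size` bytes have been produced, like RLDECODE's count."""
    out = bytearray()
    i = 0
    prev = None
    while len(out) < size:
        b = packed[i]
        i += 1
        out.append(b)
        if b == prev and len(out) < size:
            run = packed[i]
            i += 1
            out += bytes([b]) * run
            prev = None
        else:
            prev = b
    return bytes(out[:size])


if __name__ == "__main__":
    sample = b"AAAAB" + b" " * 40 + b": TEST ;"
    packed = rle_encode(sample)
    print(len(sample), "->", len(packed), rle_decode(packed, len(sample)) == sample)
```

A run only pays off from three bytes up: an isolated pair grows to three bytes (pair plus a zero count), which is the usual trade-off with this style of RLE.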