6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Post new topic Reply to topic  [ 18 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Mon May 11, 2020 8:53 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
I was wondering if anyone has ever seen, written, or used any 6502 code which implemented any kind of Forward Error Correction?

I vaguely remember thinking that it wouldn't be too hard to compute and send row and column parity for a block of data bytes: if there was then exactly one row and one column parity error, that gives a single bit to flip back to fix it. (If there was exactly one parity error in total, it can also be treated as a single bit error.)

But I never wrote any code.

(Mentioned recently in this thread about floppy formats)
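
For the record, the row-and-column scheme can be modelled in a few lines of Python (purely illustrative, not 6502 code; the function names are mine). One even-parity bit per byte gives the rows, and a single XOR byte over the block gives the columns:

```python
def rc_encode(block):
    # one even-parity bit per byte (rows), plus one XOR byte over the block (columns)
    col = 0
    rows = []
    for b in block:
        col ^= b
        rows.append(bin(b).count("1") & 1)
    return col, rows

def rc_correct(block, col, rows):
    # recompute both parities over the received block and compare
    bad_rows = [i for i, b in enumerate(block)
                if (bin(b).count("1") & 1) != rows[i]]
    diff = col
    for b in block:
        diff ^= b
    bad_cols = [i for i in range(8) if (diff >> i) & 1]
    if not bad_rows and not bad_cols:
        return list(block)                      # clean
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed = list(block)
        fixed[bad_rows[0]] ^= 1 << bad_cols[0]  # flip the bit at the crossing
        return fixed
    if len(bad_rows) + len(bad_cols) == 1:
        return list(block)                      # a parity bit itself took the hit
    raise ValueError("more than a single-bit error")
```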


PostPosted: Mon May 11, 2020 9:03 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8509
Location: Southern California
I've thought about it from time to time over the years, but never done it.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Tue May 12, 2020 4:10 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
The only concrete example I know of is the stock C64 tape format, which simply repeats every block and header twice. If there's a read error somewhere, it gets a second chance. Each byte is parity-checked, so a form of erasure coding may be possible - i.e. only re-read the bytes that failed to decode - but I somehow doubt that the C64 KERNAL routines are that smart.

Of course, there are many implementations of checksums to detect errors. The topic here is correcting those errors, up to some limit, without needing to re-read the data; on a tape, the latter option would require asking the user to manually rewind the tape to go over the bad block. There are a number of well-known techniques for this, with varying tradeoffs of space, speed, and effectiveness. Space and speed are traditionally at something of a premium on a 6502… It should be reasonably straightforward to implement Hamming codes, though.

If we assume the Acorn block format, which has 28 bytes of meaningful overhead (from filename field to data CRC inclusive) and use 192-byte data blocks, then we have 220 meaningful bytes to operate on. Adding 9 further bytes of selective parity information, and inserting them strategically into the byte sequence, gives us an "extended Hamming code" capable of correcting any one-byte error in that block (strictly, any one-bit error in any one byte-wise bit position), and detecting any two-byte error. Since the total number of bytes in the block is at most 256, these 9 bytes are adequate for the purpose, and we can efficiently index over the block on our 6502. Indeed we can accommodate up to 247 payload bytes alongside the 9 parity bytes in a 256-byte block, before we have to enlarge the code.

In this extended Hamming code, bytes within the block are numbered starting from 0. The parity bytes are inserted at positions 0, 1, 2, 4, 8, 16, 32, 64, 128; the payload bytes occupy the other positions. It will be necessary to rearrange the bytes in the input so that they conform to this format, and restore them after decoding. The eight latter parity bytes are calculated as the exclusive-OR of all the bytes in positions whose addresses AND the parity-byte's own position are non-zero. The parity byte in position 0 is calculated as the exclusive-OR over all bytes in the block, including the other parity bytes. So, generating the parity bytes is very straightforward.

Decoding is also fairly easy if you assume that a complete byte has been corrupted, rather than random bits possibly scattered across several bytes. The principle can be extended to cover the latter case, with some complication in the code.

Start by generating a fresh copy of the parity bytes from the received data, and storing them separately. Ensure that they are exclusive-ORed with their corresponding parity bytes as received (which is easy to arrange for as part of the recalculation). If they are all zero, then there are no errors to correct. If they are all either zero or some common value (and, in particular, the 0 parity is this common value), then this common value indicates which bits are wrong in the errored byte, and the OR function over the non-zero parity bytes' positions gives the position of the byte in error (which, let's not forget, might actually be one of the parity bytes). A simple exclusive-OR operation is then sufficient to correct that byte. But if neither of these conditions holds, then more than one byte is in error, which is beyond this code's power to correct.

The Hamming code is much less powerful than, say, a Reed-Solomon code. The latter would be able to correct any 4 byte-errors using the same 9 bytes of added parity information, assuming no erasure information was available. But implementing a Reed-Solomon code is considerably more difficult - something of a project in its own right.

Code:
; Data rearrangement.
; We simply append the payload bytes displaced by parity bytes to the end of the block.
; On entry, X contains the length of the payload.  We assume it's at least 129 and at most 247, so the routine is self-reversing.
; On exit, X contains the length of the payload plus parity.  This will read zero if the total is 256.
HammingShuffle:
  ; overall parity byte in position 0
  LDA buffer+0
  LDY buffer,X
  STY buffer+0
  STA buffer,X
  INX
  ; parity over odd bytes in position 1
  LDA buffer+1
  LDY buffer,X
  STY buffer+1
  STA buffer,X
  INX
  ; parity mod 4 in position 2
  LDA buffer+2
  LDY buffer,X
  STY buffer+2
  STA buffer,X
  INX
  ; parity mod 8 in position 4
  LDA buffer+4
  LDY buffer,X
  STY buffer+4
  STA buffer,X
  INX
  ; parity mod 16 in position 8
  LDA buffer+8
  LDY buffer,X
  STY buffer+8
  STA buffer,X
  INX
  ; parity mod 32 in position 16
  LDA buffer+16
  LDY buffer,X
  STY buffer+16
  STA buffer,X
  INX
  ; parity mod 64 in position 32
  LDA buffer+32
  LDY buffer,X
  STY buffer+32
  STA buffer,X
  INX
  ; parity mod 128 in position 64
  LDA buffer+64
  LDY buffer,X
  STY buffer+64
  STA buffer,X
  INX
  ; parity mod 256 in position 128
  LDA buffer+128
  LDY buffer,X
  STY buffer+128
  STA buffer,X
  INX
  RTS

; Calculate the nine parity bytes over a block, leaving them in zero page.
; On entry, assume that the space in a 256-byte page after the end of the block has been zeroed.
HammingParity:
  ; First, calculate selective parity bytes
  LDX #1
  STX parity+0    ; use this byte as the address mask for now
: LDY #0
  STZ parity,X
: TYA
  BIT parity+0    ; check if this byte is covered by the current parity
  BEQ :+
  LDA buffer,Y
  EOR parity,X
  STA parity,X
: INY
  BNE :--
  INX
  ASL parity+0    ; go on to the next parity
  BNE :---
  ; Finally, calculate overall parity
  LDA #0
: EOR buffer,Y
  INY
  BNE :-
  STA parity+0
  RTS

; Attempt to correct an error (if necessary and possible) based on calculated parity over received data.
; Sets Carry flag if error was detected but not corrected.
HammingCorrection:
  LDA parity+0
  ORA parity+1
  ORA parity+2
  ORA parity+3
  ORA parity+4
  ORA parity+5
  ORA parity+6
  ORA parity+7
  ORA parity+8
  BNE :+
  ; Parity received is consistent with that calculated, nothing to do.
  CLC
  RTS
: CMP parity+0
  BEQ :+
  ; There is a double-bit error in at least one bit position, which the Hamming code cannot correct
  SEC
  RTS
: TAY
  ; Establish location of errored byte, and whether there is exactly one errored byte.
  ; The latter is true if every parity byte is either zero or equal to the overall parity byte.
  LDA #0
  LDX parity+1
  BEQ :+
  ORA #1
  CPY parity+1
  BNE @error
: LDX parity+2
  BEQ :+
  ORA #2
  CPY parity+2
  BNE @error
: LDX parity+3
  BEQ :+
  ORA #4
  CPY parity+3
  BNE @error
: LDX parity+4
  BEQ :+
  ORA #8
  CPY parity+4
  BNE @error
: LDX parity+5
  BEQ :+
  ORA #16
  CPY parity+5
  BNE @error
: LDX parity+6
  BEQ :+
  ORA #32
  CPY parity+6
  BNE @error
: LDX parity+7
  BEQ :+
  ORA #64
  CPY parity+7
  BNE @error
: LDX parity+8
  BEQ :+
  ORA #128
  CPY parity+8
  BNE @error
: TAX
  ; We can now correct the errored byte.
  TYA
  EOR buffer,X
  STA buffer,X
  CLC
  RTS
@error:
  ; An interesting situation: the errored bits are spread between more than one byte.
  ; A problem to tackle another day.
  SEC
  RTS

HammingEncode:
  LDX #bufferLength
: STZ buffer,X
  INX
  BNE :-
  LDX #bufferLength
  JSR HammingShuffle
  JSR HammingParity
  LDA parity+1
  STA buffer+1
  LDA parity+2
  STA buffer+2
  LDA parity+3
  STA buffer+4
  LDA parity+4
  STA buffer+8
  LDA parity+5
  STA buffer+16
  LDA parity+6
  STA buffer+32
  LDA parity+7
  STA buffer+64
  LDA parity+8
  STA buffer+128
  ; The overall parity must also cover the other eight parity bytes, but their
  ; buffer slots still held zero when HammingParity ran - so fold them in now.
  LDA parity+0
  EOR parity+1
  EOR parity+2
  EOR parity+3
  EOR parity+4
  EOR parity+5
  EOR parity+6
  EOR parity+7
  EOR parity+8
  STA buffer+0
  RTS

HammingDecode:
  LDX #bufferLength+9
  BEQ :++
: STZ buffer,X
  INX
  BNE :-
: JSR HammingParity
  JSR HammingCorrection
  PHP
  LDX #bufferLength    ; X was clobbered above; HammingShuffle needs the payload length
  JSR HammingShuffle
  PLP
  RTS
I'm sure there are more efficient ways to implement this, but sometimes the most straightforward way is the correct one.
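
For anyone wanting to sanity-check the scheme without a 6502 to hand, the same byte-wise extended Hamming code is easy to model in Python (parity positions and rules exactly as described above; the function names are mine):

```python
PARITY_POS = (0, 1, 2, 4, 8, 16, 32, 64, 128)
MASKS = (1, 2, 4, 8, 16, 32, 64, 128)

def hamming_encode(payload):
    # lay the payload into the non-parity positions of the block
    n = len(payload) + 9
    assert n <= 256
    block = [0] * n
    data = iter(payload)
    for pos in range(n):
        if pos not in PARITY_POS:
            block[pos] = next(data)
    # selective parities: XOR of all bytes whose position ANDs with the mask
    # (the parity slots themselves still hold zero at this point)
    for mask in MASKS:
        p = 0
        for pos in range(n):
            if pos & mask:
                p ^= block[pos]
        block[mask] = p
    # overall parity covers everything, including the other parity bytes
    p = 0
    for b in block:
        p ^= b
    block[0] = p
    return block

def hamming_decode(block):
    # recompute each parity over the received block; XORing over the covered
    # positions automatically folds in the received parity byte
    syndromes = []
    for mask in MASKS:
        s = 0
        for pos, b in enumerate(block):
            if pos & mask:
                s ^= b
        syndromes.append(s)
    overall = 0
    for b in block:
        overall ^= b
    if overall == 0 and not any(syndromes):
        return block, True                      # no error
    pos = 0
    for mask, s in zip(MASKS, syndromes):
        if s:
            if s != overall:
                return block, False             # more than one byte in error
            pos |= mask
    fixed = list(block)
    fixed[pos] ^= overall                       # flip the errored bits back
    return fixed, True
```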


PostPosted: Tue May 12, 2020 8:49 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Splendid! Dare I ask if this is tested? Do you think you might use this approach?


PostPosted: Tue May 12, 2020 9:49 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Completely untested so far, but it's potentially useful due to its simplicity and speed, and indeed it's so conceptually simple that it's hard to get wrong (barring implementation bugs). It's likely to have the most value for cases where the alignment of bytes is reliable but the value of an individual byte might be lost. Once the basic scheme is implemented with confidence, it can be extended to larger block sizes with proportionately greater corrective power, by interleaving - for example, using five interleaved 205-byte payloads to cover a 1024-byte block, requiring 45 additional bytes of parity, and being able to correct a 5-byte burst error.
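
The interleaving step itself is trivial - in Python terms (an illustrative sketch, with the depth as a parameter):

```python
def interleave(block, depth):
    # sub-block i takes every depth-th byte starting at offset i, so a burst of
    # up to `depth` consecutive bad bytes lands at most once in each sub-block
    return [block[i::depth] for i in range(depth)]

def deinterleave(subs):
    # inverse operation: re-merge the sub-blocks byte by byte
    out = []
    for i in range(max(len(s) for s in subs)):
        for s in subs:
            if i < len(s):
                out.append(s[i])
    return out
```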

And then there is the possibility of cross-interleaving for greater corrective power; CDs use cross-interleaved Reed-Solomon codes. One way of doing that is to treat entire blocks as elements of the code, much as I'm using bytes as elements in the above. On the other hand, a different sort of code might be more efficient for that purpose.

I've also just come across Golay codes, which may be a useful middle ground between Hamming and Reed-Solomon codes. Golay codes require as much parity data as payload, but they are more powerful than Hamming codes while still being tolerably easy to understand and decode. They come in a fixed size - 12 bits payload expanding to 24 bits with the parity data - so would need to be used in interleaved form. But I have a feeling this is a code type better suited to communication than storage.


PostPosted: Sat May 16, 2020 3:15 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Some good news: I've finally wrapped my head sufficiently far around Galois Field arithmetic to understand how to implement a Reed-Solomon encoder on 8-bit symbols using the 6502. There doesn't seem to be any particular difficulty introduced by the 6502's capabilities for that half of the operation, though running it efficiently will require about 1KB of RAM to be set aside - half for buffers, the other half for lookup tables (which could be in ROM).

The algorithms needed for Reed-Solomon decoding are considerably more complex. However I have some hope that they are also reasonably implementable on the 6502. Verifying that there are no errors, however, is simply a matter of generating the parity symbols from the received data, and comparing them to the received parity symbols. So the complex decoding only needs to be run in the case where that verification fails, and hence there is an error to correct.


PostPosted: Sat May 16, 2020 6:33 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
> the complex decoding only needs to be run in the case where that verification fails, and hence there is an error to correct.

That's hopeful!


PostPosted: Sat May 16, 2020 8:49 am 

Joined: Sun Jun 29, 2014 5:42 am
Posts: 351
Chromatix wrote:
The algorithms needed for Reed-Solomon decoding are considerably more complex. However I have some hope that they are also reasonably implementable on the 6502. Verifying that there are no errors, however, is simply a matter of generating the parity symbols from the received data, and comparing them to the received parity symbols. So the complex decoding only needs to be run in the case where that verification fails, and hence there is an error to correct.

It's not in 6502, but nearly 20 years ago I designed a synthesisable Reed-Solomon encoder/decoder in Verilog. This was a heavily pipelined parallel implementation, where the code could be specified as a parameter.

There are a couple of HP Labs Tech Reports that describe this work in quite a lot of detail:

Design of a Synthesisable Reed-Solomon ECC Core

Verification of a Synthesisable Reed-Solomon ECC Core

These might be of some use to you, as they cover both the encoder and decoder implementations at quite a low level.

As part of the verification, this was synthesised to a Xilinx Virtex-E XCV1000E-6 FPGA, and took about 80% of the available space, including all the test fixtures. This device is similar in size to a XC6SLX25. The code we used was a RS(160,128,t=16) code.

I still have all the textbooks as well.

It would be great to see some 6502 code for Reed Solomon coding.

Dave


PostPosted: Sat May 16, 2020 9:05 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
What's the net effect of this R-S coding for 8 bit symbols: how many bits in an input chunk get transformed into how many bits of output chunk?


PostPosted: Sat May 16, 2020 9:11 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
One of the quirks of Reed-Solomon is that there are quite a few different methods of decoding it. I'll probably go for some of the more conceptually simple decoding algorithms at each step, even if they turn out a bit slower. The aim is to recover from a medium error somewhat gracefully, and it doesn't have to be in realtime (unlike say a CD-ROM drive). With any luck it should be practical to build into the firmware of a standalone drive controller, which could in principle be attached to a variety of SBCs or micros.

For example, I've seen papers which recommend using Cramer's rule (involving finding the determinant of matrices) to solve the linear equation system that comes right at the end of the decoding process. As soon as you have more than two symbols to correct, you have to recurse into the determinant function because the direct rule only works for 2x2 matrices, and that sounds like it would be awkward on the 6502. So I'm more comfortable with applying the Gauss-Jordan method, which is mainly iterative rather than recursive.
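
To illustrate why Gauss-Jordan suits the 6502's iterative style, here is a model of the elimination over GF(256) in Python (illustrative only; the brute-force inverse stands in for a log/exp table lookup):

```python
POLY = 0x11B   # the usual GF(256) generator polynomial

def gf_mul(x, y):
    # shift-and-reduce multiply in GF(256)
    r = 0
    while y:
        if y & 1:
            r ^= x
        x <<= 1
        if x & 0x100:
            x ^= POLY
        y >>= 1
    return r

def gf_inv(x):
    # brute-force inverse, standing in for a log/exp table lookup
    for c in range(1, 256):
        if gf_mul(x, c) == 1:
            return c
    raise ZeroDivisionError("0 has no inverse")

def gf_solve(A, b):
    # Gauss-Jordan: solve A.x = b over GF(256).  Addition is XOR, so row
    # "subtraction" is XOR too - purely iterative, no recursion anywhere.
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]             # augmented copy
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col])  # assumes non-singular
        M[col], M[pivot] = M[pivot], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(inv, v) for v in M[col]]            # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [v ^ gf_mul(f, p) for v, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]
```

With 9 parity bytes the system to solve is at most about 4x4 (one row per error), so even this naive version stays small.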

With that said, I think I have the concepts lined up reasonably well now, even with no actual code written. What I need next is to adapt my 65C02 simulator (long overdue) so that I can efficiently use it as a unit test rig.

Quote:
What's the net effect of this R-S coding for 8 bit symbols: how many bits in an input chunk get transformed into how many bits of output chunk?

R-S works on blocks of symbols, some of which are allocated to data and some to parity (conceptually similar to the Hamming code I posted above). The maximum block size is one symbol less than the size of the alphabet - in this case, 255 bytes. You need one parity byte per erasure (known-bad byte) or two per error (bad byte whose location is unknown). So a direct analogue to the Hamming code, using 9 parity bytes, would be able to correct 4 errors plus one erasure, or nine erasures, or something in between.
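
That budget is simple arithmetic - as a Python one-liner (the name is mine):

```python
def rs_correctable(parity_symbols, errors, erasures):
    # a Reed-Solomon block decodes successfully when 2*errors + erasures
    # does not exceed the number of parity symbols
    return 2 * errors + erasures <= parity_symbols
```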


PostPosted: Sat May 16, 2020 9:48 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
So one could rig up something that dealt in, say 64 byte data + 7 parity bytes?

(Thanks for the links to your previous work Dave - very impressive!)


PostPosted: Sat May 16, 2020 9:57 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
That's right. CD-ROM uses a cross-interleaved RS code in which the primary (outer) code has 28 data bytes and 4 parity bytes, then the secondary (inner) code has 24 data bytes and also 4 parity bytes. The two codes are essentially at right-angles to each other, forming a 28x32 byte rectangle, and decoding failures in the outer code are processed as erasures in the inner code, increasing the effective decoding power dramatically.

A slightly more space-efficient scheme could interleave 17 RS blocks, each with 241 data bytes and 14 parity bytes, to form a 4096-byte payload block. This would be able to correct 7 errors in each sub-block; properly interleaved, that could be up to 119 consecutive byte errors corrected without erasure information.


PostPosted: Sat May 16, 2020 10:21 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Oh right, so a natural level to apply RS is about the length of a floppy track? And you wouldn't have to decode too deeply if all had gone well. And less than 10% overhead.


PostPosted: Sun May 17, 2020 7:13 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
So first, let's get the basics of Galois Field arithmetic out of the way. Galois Fields support addition, subtraction, multiplication, and division - but the rules for doing so are very different from those for ordinary integers. Understanding these rules is fundamental for understanding the algorithms involved in Reed-Solomon codes.

A Reed-Solomon code (as usually applied over 8-bit bytes) is based on discrete polynomials over two different Galois Fields: GF(2) and GF(256). One of these polynomials is used to generate GF(256) from GF(2); the other is used to generate the parity symbols of a message block. Decoding the received message involves solving the roots of this polynomial in a particular way.

GF(2) is simply a binary digit. GF(256) is derived from it by grouping eight GF(2) values together - which is of course convenient to do on a byte-oriented CPU like the 6502. The trick is that there are two different representations of a GF(256) value, one most suited to multiplication, and one most suited to addition. We convert between the two by computing what amounts to the discrete logarithm and discrete exponent. If you've studied cryptography to any extent, at least one of those terms should be familiar. To make this efficient, we can build a pair of lookup tables, and this doesn't take very much memory.

Addition in GF(2) and GF(256) is straightforward - it is the exclusive-OR function. The 6502 has this operation natively. This also means that subtraction is identical to addition.

To multiply in GF(256), we first check whether either operand is zero - if so, the result is also zero. Otherwise, we look up the discrete logarithm of each operand, add them modulo 255 (avoiding zero), and then take the discrete exponent. The following code should suffice:
Code:
; On entry, X and Y contain the values to multiply.  On exit, A contains the result.  X is clobbered.
GF_Mult:
  TXA
  BEQ @exit
  TYA
  BEQ @exit
  ; neither argument is zero
  LDA logTable,Y
  DEC A
  CLC
  ADC logTable,X
  ADC #0    ; modulo 255, avoiding zero
  TAX
  LDA expTable,X
@exit:
  RTS
This presupposes we have filled 256-entry logTable and expTable with appropriate values; it is possible to precompute these and stick them in ROM, or to generate them dynamically in RAM. Doing so requires repeatedly evaluating the generator polynomial of GF(256) over GF(2), which requires some explanation. In effect we are computing successive powers of 2, modulo the generator polynomial, using GF arithmetic - and this generates a repeating sequence of length 255 within GF(256).

There are several generator polynomials which satisfy this property, but Reed-Solomon coders usually take the numerically smallest one, which is often quoted in the literature as X^8 + X^4 + X^3 + X + 1. But X is 2 in this context - since the base Galois Field involved is GF(2) - and it is convenient to write this as the 9-bit value $11B. Obtaining the next power of two is then as simple as multiplying by two (ie. shifting left one place), then if the most significant bit of the result matches the most-significant bit of the polynomial, subtract the polynomial. Note that the usual sense of "greater" or "lesser" doesn't apply here, since we are in a modular number system. And the 6502 makes this operation easy for us, even though it briefly involves a 9-bit value:
Code:
; Generate logTable and expTable for GF(256) over $11B
GF_Init:
  ; For consistency, both logarithm and exponent of zero are taken as zeroes; really this is 0 == a^(-inf).
  ; The real tables begin with entry 1 == a^0, both at index 1.
  STZ logTable
  STZ expTable
  LDA #1
  TAX
@loop:
  ; store matching pair of entries
  TAY
  STA expTable,X
  STX logTable,Y
  ; calculate next exponent
  ASL A
  BCC :+
  EOR #$1B
  ; and the next index
: INX
  BNE @loop
  ; A should now contain 1 again
  CMP #1
  BEQ :+
  BRK ; oops
: RTS
This method of taking the remainder of a division by the generator polynomial works because a polynomial over GF(2) can have coefficients only of 0 or 1, so we only have a binary decision as to whether to keep or subtract, and don't need to worry about multiplying the generator polynomial by anything. But when we start dealing with polynomials over GF(256), that convenience goes away, as the coefficients can take many values. When calculating by hand, we can use a form of long division, but the notation quickly gets confusing due to the need to refer to the discrete logarithm tables every time we switch between multiplying (to match the values of the leftmost digits) and subtracting.

But this is a good time to establish the correct method of dividing two values in GF(256). In effect, we do this by multiplying the dividend by the reciprocal of the divisor, a process which leaves no remainder. We can use the logarithm table to discover the reciprocal, by observing that we only need to negate the logarithm, and then adjust the resulting table index for the wrap-around of the exponentiation sequence. It probably helps to observe that the reciprocal of zero is undefined (we can take it as zero) and the reciprocal of 1 (== a^0) is also 1 (== a^-0). But the reciprocal of 2 (== a^1) is a^-1 which is also a^254, to be found at offset 255 in the exponent table. Generally, for values other than 0 and 1, subtracting the logarithm from 256 will give the correct logarithm of the reciprocal, to which the multiplication method seen above can be applied.
Code:
; On entry, X is dividend, Y is divisor; on exit, A is quotient.  X is clobbered.
GF_Div:
  TXA
  BEQ @exit       ; zero divided by anything is zero
  CPY #1
  BEQ @exit       ; dividing by 1 returns the dividend unchanged
  TYA
  BEQ @exit       ; division by zero is undefined; return zero
  ; neither argument is zero and the divisor is not 1
  LDA logTable,X
  SEC             ; with carry set, SBC forms the true difference of logarithms
  SBC logTable,Y
  ADC #0          ; fold the borrow back in: modulo-255 wrap plus the table's +1 bias
  TAX
  LDA expTable,X
@exit:
  RTS
The subtraction above quietly combines three things - negating the divisor's logarithm, cancelling the +1 bias that both table entries carry, and wrapping modulo 255 - by letting the carry flag do double duty across the SBC and the ADC #0. It is worth tracing a few cases by hand (x/x, 8/2, 2/8) to convince yourself it lands on the right exponent-table index, but it is awfully convenient.

I'll save the problem of calculating the remainder over a polynomial of GF(256) for another post, as that's the fundamental operation for generating the parity symbols of a Reed-Solomon block.


PostPosted: Sun May 17, 2020 2:08 pm 

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
If you get the Reed-Solomon decoding working, then I have around 200 'DiskOnChip' (early flash) devices that I bought thinking they might make nice storage devices for small micros. They have a built-in encoder that calculates the checksum as bytes are written, but you have to decode and fix the errors in software.

If I can't find a use for them then they are going to be upcycled as an art project.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs

