6502.org

Posted: **Tue Mar 14, 2023 3:09 pm**

No, I don't have 6502 version of my code written out, but I have tested it with Dieharder/TestU01. Results are mixed. You can get good results with sufficient number of state bytes and/or multiple iterations, but the smallest/fastest versions aren't so good.

I don't know where the bug is in the optimized code, but I converted it to C, and it produces the same numbers as your test output.

Code: Select all

    uint64_t LFSR = 0xffffffff;
    uint8_t OUT;

    while( 1 )
    {
        for( i = 0; i < 8; i++ )
        {
            LFSR <<= 1;
            OUT = OUT >> 1;
            if( LFSR & 0x100000000UL )
            {
                LFSR ^= 0x1ea000001;
                OUT |= 0x80;
            }
        }
        putchar( OUT );
    }

Posted: **Tue Mar 14, 2023 3:37 pm**

Ok Arlet.

Posted: **Tue Mar 14, 2023 3:41 pm**

I think the

Code: Select all

rol mask

should be

Code: Select all

asl mask

My version does not have the mask, so it doesn't have this bug.

Posted: **Tue Mar 14, 2023 3:47 pm**

Arlet wrote:

I think the

Code: Select all

rol mask

should be

Code: Select all

asl mask

My version does not have the mask, so it doesn't have this bug.

Well done ! that was the bug !!

Posted: **Tue Mar 14, 2023 5:00 pm**

Alex1 wrote:

I'm really sorry to tell you and I hope you won't get offended but your algorithm is not random at all and is full of short cycles. not at all 2^32.

nah you're good.

Alex1 wrote:

I guess you never did any tests on it, did you?

not really. i just thought it would work due to the pointer itself only repeating every 2^32 calls so it would guarantee a new value each time. maybe if the function takes the pointer as well as the data from the table to generate a number. since the whole thing is always 256 bytes large the function could be made larger/more complex.
but why bother when others have made much better functions that are way smaller and faster?

Alex1 wrote:

All al lot of short cycles, I stopped the test well before 2^32 :
The first 20-byte sequence repeats non-stop :

Quote:

Found 'A0C3BEA2C0BE731CBE16371CBEA0C1BEA2BEBE53' at index 0 !

hmm, when i turn the function into a small logic circuit i get:

Quote:

A1 C3 BE A3 C0 BE 74 1C BE 16 38 1C BE A1 C1 BE A3 BE BE 54

at the start. with the first repeating at index $D00, the next at $7B00, then $D0000, then $D0D00, and so on.
maybe there was some small mistake when converting the assembly to... whatever you use to run the test, did you forget the carry from the first to the second add? if i remove that carry line in the circuit i get exactly what you had:

Quote:

A0 C3 BE A2 C0 BE 73 1C BE 16 37 1C BE A0 C1 BE A2 BE BE 53

either way it doesn't matter, the RNG function does suck

Posted: **Tue Mar 14, 2023 5:23 pm**

Proxy wrote:

maybe there was some small mistake when converting the assembly to... whatever you use to run the test, did you forget the carry from the first to the second add? if i remove that carry line in the circuit i get exactly what you had:

Yes you are right, there was a carry problem in my code to test your RNG algorithm. But, as you said, the loops are always there with or without carry.
The problem is, in fact, not the 2^32 pointer but the table of 256 bytes too small.

Conversely with LFSR we can prove mathematically that there will be no loops while with a random data table like yours, we cannot mathematically demonstrate that there will be no loops. According to the data of the table, the algorithm can go in loop without warning.

Posted: **Wed Mar 15, 2023 12:05 am**

Quote:

I think the
Code:
rol mask

should be
Code:
asl mask

My version does not have the mask, so it doesn't have this bug.

Yeah, that occurred to me a bit later. The problem stems from what I called LFSR. Any time a set bit is shifted into the carry flag and the branch to 'noupdate' is not taken, it's still there when 'mask' is rotated. So it adds bits to mask; probably eight of them if the initial seed value starts with $FF. The loop will still stop after eight rotations, but there might be a whole lot of extra set bits in 'mask'. ASL instead of ROL always shifts in a zero bit, so 'mask' never has more than one set bit.

Good catch.

Posted: **Wed Mar 15, 2023 3:07 am**

White Flame wrote:

For really compact but dumb PRNGs, a while ago I ported the basic LFSR algorithm and gave it a full 256|65536 period, as opposed to the default of just 255|65535. Of course, the period is only as long as the number itself, but I wanted something where graphical effects would go through all 256 values of a grid once in a "random" order.

https://codebase64.org/doku.php?id=base ... 8-bit_prng
https://codebase64.org/doku.php?id=base ... 6-bit_prng

The 8-bit one is quite tiny, even with the fixup for full range:

Code: Select all

        lda seed
        beq doEor
         asl
         beq noEor
         bcc noEor
doEor:    eor #$1d  ; 16 possible magic numbers work here, see the link
noEor:  sta seed

I have come to the realization that my VTL02 PRNG is not up to snuff. It involves a swap-rotate-add-update "thingy" between the 16-bit seed (tick, tick+1) and the most recent 16-bit numeric assignment value (arg+2, arg+3). VTL02 code consists of nothing but assignment statements and optional comments, so I figured the assignments should be frequent enough to keep the mix fresh. Some informal experiments have shown me that it has a strong tendency to suck, especially in short loops:

Code: Select all

    lda  arg+2      ;
    ...
    sta  (arg)      ;
    adc  tick+1     ; store arg[{1}] in the left-side
    rol             ;   variable
    tax             ;
    ldy  #1         ;
    lda  arg+3      ;
    sta  (arg),y    ;
    adc  tick       ; pseudo-randomize {'}
    rol             ;
    sta  tick+1     ;
    stx  tick       ;
execrts:
    rts             ;

I think I should spend a few bytes replacing it with White Flame's 16-bit code, unless someone smarter than I can come up with a better plan. Updating the seed only when used instead of after every numeric assignment executed should improve overall performance slightly too.

Posted: **Wed Mar 15, 2023 2:09 pm**

As PRNGs seem to interest some people I also give you my LFSR 64 which also dates from 1988,

It generates a random sequence of (2^64)-1 bits. It's a 64-bit GALOIS LFSR. Each call to the routine generates an 8-bit byte. We can therefore obtain (2^64)/8 bytes before repeating the sequence. There is no short cycle for all seed values.

First a 8-bytes seed must be put at seed:
The seed can vary from 1 to $FFFFFFFFFFFFFFFF but never 0 otherwise the LFSR produces only bits to 0.

Each call to the start: function returns a pseudorandom byte in rndbyte:

First init: must be call one time.
After start: must be call for each generated byte (rndbyte:) that can be use directly.

The test vector is :

For a seed = $FFFFFFFFFFFFFFFF
the first 50 bytes are :

Quote:

C4 B0 C2 1F BB C2 A2 D6 EA 28 87 FE 9E EA D6 14 D1 6E 0E 7A 46 84 41 FE 08 8A 49 74 4E 0E 07 7A 24 7F 23 05 6A 71 49 74 E0 05 A9 71 8A 74 23 05 A0 71

Code: Select all

//----------------------------------------------------
// 			Pukall LFSR 64
//			    1988
//----------------------------------------------------

//----------------------------------------------------
// 			Main Program
//----------------------------------------------------

init:	
  	ldx #0
loop:	jsr start
	inx
	cpx #64
	bne loop
	rts
	
start:	lda #0
	sta rndbyte
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit1
	sta rndbyte
		
bit1:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit2
	lda rndbyte
	ora #$2
	sta rndbyte

bit2:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit3
	lda rndbyte
	ora #$4
	sta rndbyte

bit3:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit4
	lda rndbyte
	ora #$8
	sta rndbyte

bit4:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit5
	lda rndbyte
	ora #$10
	sta rndbyte
	
bit5:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit6
	lda rndbyte
	ora #$20
	sta rndbyte

bit6:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq bit7
	lda rndbyte
	ora #$40
	sta rndbyte

bit7:	lda #0
	sta tmp
	jsr begin
	lda tmp
	cmp #0
	beq ret
	lda rndbyte
	ora #$80
	sta rndbyte
ret:	rts

begin:
        asl seed
        rol seed+1
        rol seed+2
        rol seed+3
	rol seed+4
	rol seed+5
	rol seed+6
	rol seed+7
        bcc not
        lda seed
        eor #$01
        sta seed
        lda seed+7
        eor #$B0
        sta seed+7
	lda #1
	sta tmp
not:
	rts

seed: .byte $FF,$FF,$FF,$FF,$FF,$FF,$FF,$FF
tmp:  .byte $00
rndbyte: .byte $00

A FIPS 140-2 test shows that bytes are seen as random:

Quote:

rngtest: starting FIPS tests...
rngtest: bits received from input: 40000032
rngtest: FIPS 140-2 successes: 2000
rngtest: FIPS 140-2 failures: 0
rngtest: FIPS 140-2(2001-10-10) Monobit: 0
rngtest: FIPS 140-2(2001-10-10) Poker: 0
rngtest: FIPS 140-2(2001-10-10) Runs: 0
rngtest: FIPS 140-2(2001-10-10) Long run: 0
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=130.640; avg=3263.214; max=9536.743)Mibits/s
rngtest: FIPS tests speed: (min=1021.455; avg=16349.437; max=17147.717)Kibits/s
rngtest: Program run time: 2404057 microseconds

The optimized code :

Code: Select all

//----------------------------------------------------
// 			Pukall LFSR 64
//			    1988
//        Optimized by Teamtempest 2023
//----------------------------------------------------


//----------------------------------------------------
// 			Main Program
//----------------------------------------------------


init:	
  	ldx #0
loop:	jsr start
	inx
	cpx #64
	bne loop
	rts
	
start:	lda #$00
        sta rndbyte
        lda #$01
        sta mask
	
begin:
        asl seed
        rol seed+1
        rol seed+2
        rol seed+3
	rol seed+4
	rol seed+5
	rol seed+6
	rol seed+7
        bcc not
        lda seed
        eor #$01
        sta seed
        lda seed+7
        eor #$B0
        sta seed+7

	lda rndbyte
        ora mask
        sta rndbyte
not:
	asl mask
        bcc begin
        rts

seed: .byte $FF,$FF,$FF,$FF,$FF,$FF,$FF,$FF
mask:  .byte $00
rndbyte: .byte $00

Posted: **Wed Mar 15, 2023 3:43 pm**

I'm all in favour of LFSRs!

Thanks for sharing your code. Just a couple of things you're doing are not idiomatic in 6502 land.

Code: Select all

   lda tmp
   cmp #0
   beq bit7

You almost never need to compare with zero, because the Z flag is set by a load.

Code: Select all

   sta tmp
   jsr begin
   lda tmp

Here, I would probably use PHA and PLA, if tmp isn't needed inside the subroutine. (But in this case it is in fact the return value.)

But I think it will be more natural, faster, and shorter, to use carry as the return value.

Code: Select all

        bcc not
...
   lda #1
   sta tmp
not:
   rts

So we just need a SEC instead:

Code: Select all

        bcc not
...
   sec
not:
   rts

and now C is the return boolean. Which is nice, because we no longer need to initialise tmp.

So

Code: Select all

   jsr begin
   lda tmp
   cmp #0
   beq bit7

becomes

Code: Select all

   jsr begin
   bcc bit7

(probably!)

But also, as I think as noted upthread, your 8 calls to begin, with preamble and postamble, can be compressed into a loop. In which case begin need not be a subroutine, it can be placed inline.

I think it can be useful though, to start by writing code however it occurs to you, and get it debugged, and then to perform some transformations to improve it.

Posted: **Wed Mar 15, 2023 4:45 pm**

BigEd, it is an unoptimized 1988 code. Thank you for suggesting improvements.

Posted: **Sat Mar 18, 2023 10:54 am**

Here's my attempt at a random number generator optimized for 6502. An 8 bit random number is returned in A. It needs 7 bytes of RAM (preferably zero page), and has a period guaranteed to be at least 32 bits. It passes Dieharder and PractRand tests (up to at least 4GB of output).

Code: Select all

CLC
LDA #$45
ADC seed+0
STA seed+0
ADC seed+1
STA seed+1
ADC seed+2
STA seed+2
ADC seed+3
STA seed+3
EOR seed+6
ADC seed+4
STA seed+4
ADC seed+5
STA seed+5
ADC seed+6
STA seed+6
EOR seed+4
RTS

Posted: **Sat Mar 18, 2023 3:28 pm**

Nice! It's going to be tough to beat that for quality of output in the general case, I think, at least when considering code footprint.

Posted: **Mon Mar 20, 2023 4:11 pm**

Hi Arlet,

Is your algorithm based on math? What is the principle in this case? Or did you create randomly by trial and error?

Posted: **Mon Mar 20, 2023 5:09 pm**

Alex1 wrote:

Is your algorithm based on math? What is the principle in this case? Or did you create randomly by trial and error?

A lot of trial and error, and then using the errors to refine my intuition. There's also some math involved, but I didn't realize that until it was already somewhat successful. Particularly, a chain of ADC instructions like this results in a maximum length permuted counter that will go through all possible states, but in a randomish order (provided that initial input is odd, and carry flag is cleared). The problem with just a simple chain of ADCs is that the lower order bits are quite regular, so I mixed in some EORs to try to break that pattern, and then used PractRand to see if it was successful.

The "magic" $45 constant input isn't very critical. Other odd values will work too.

6502.org

Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation

Re: Random-number generation