fast 24 bit PRNG

Programming the 6502 microprocessor and its relatives in assembly and other languages.
Post Reply
User avatar
Broti
Posts: 28
Joined: 07 Sep 2023
Location: Moers, Germany

fast 24 bit PRNG

Post by Broti »

Just found this fast 24 bit PRNG algorithm by Wim Couwenberg:

Code: Select all

lda a           ; Operation 7 (with carry clear).
asl
eor b
sta b
rol             ; Operation 9.
eor c
sta c
eor a           ; Operation 5.
sta a
lda b           ; Operation 15.
ror
eor c
sta c
eor b           ; Operation 6.
sta b
For more details: https://wimcouwenberg.wordpress.com/202 ... processor/
ROR A? Where we're coding, we don't need A.
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: fast 24 bit PRNG

Post by BigDumbDinosaur »

Broti wrote:
Just found this fast 24 bit PRNG algorithm by Wim Couwenberg...

Interesting.

His code could be easily adapted to the 65C816 running in native mode, which would make extending it to 32 bits relatively painless.  Also, 16-bit code running on the 816 might be faster than 8-bit code running on a 65C02 at the same Ø2 rate.  I may try this out in the Kowalski simulator to get some execution time data.

Going beyond that, if some “noise” can be used to seed the terms before computing the PRN, randomness ought to be better.  That “noise” could come from a hardware timer or an interrupt-driven jiffy clock.  In my POC unit, I can get “noise” from the 16-bit counter/timer in DUART A, which free-runs and underflows 100 times per second—the “noise” range is $0000-$4800.  That timer could be read on the fly as a 16-bit operation when some “noise” is needed.  An even better “noise” source might be the currently unused timer in DUART B, which could be preset to $FFFF and set to run free (counter mode).
x86?  We ain't got no x86.  We don't NEED no stinking x86!
gfoot
Posts: 871
Joined: 09 Jul 2021

Re: fast 24 bit PRNG

Post by gfoot »

Interesting - it is worth reading this past thread as well, especially Arlet's short and simple RNGs: viewtopic.php?f=2&t=587&hilit=Arlet+rng ... 95#p100642
User avatar
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England
Contact:

Re: fast 24 bit PRNG

Post by BigEd »

A good find, thanks Broti!

And thanks for the link to that previous thread George.
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: fast 24 bit PRNG

Post by BigDumbDinosaur »

BigDumbDinosaur wrote:
Going beyond that, if some “noise” can be used to seed the terms before computing the PRN, randomness ought to be better...In my POC unit, I can get “noise” from the 16-bit counter/timer in DUART A, which free-runs and underflows 100 times per second—the “noise” range is $0000-$4800.  That timer could be read on the fly as a 16-bit operation when some “noise” is needed.  An even better “noise” source might be the currently unused timer in DUART B, which could be preset to $FFFF and set to run free (counter mode).
Sort of.  :D

Here’s my test code, which I wrote with the M/L monitor’s assembler.  I added labels to branch targets for clarity:

Code: Select all

         rep #%11111111        ;reset all SR bits
         sep #%00110000        ;8-bit registers
         lda #%00110000        ;operate timer B...
         sta $c104             ;as a counter
         rep #%00100000        ;16-bit .A
         stz $c106             ;preset timer to zero
         bra restart           ;start timer & halt
;
;
;	here we read the timer on the fly...
;
read     ldx $c10f             ;stop timer &...
         lda $c104             ;read it
         sta $c104             ;write it back as new preset
;
restart  ldx $c10e             ;restart timer
         sta $a0               ;save timer value
         brk
         nop
         bra read              ;play it again, Sam

I’m doing this with timer B—I don’t want to monkey with timer A, since it is responsible for generating jiffy IRQs.

Next step will be to integrate this with the PRNG code and see what happens.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
User avatar
commodorejohn
Posts: 299
Joined: 21 Jan 2016
Location: Placerville, CA
Contact:

Re: fast 24 bit PRNG

Post by commodorejohn »

I don't know enough about this stuff to say about the algorithm in question, but if it's vulnerable to the same "all zeros = frozen state" issue as LFSRs are, you wanna be careful injecting arbitrary "noise" values into it - maybe take the timer value and OR it with the PRNG state rather than loading it directly.
gfoot
Posts: 871
Joined: 09 Jul 2021

Re: fast 24 bit PRNG

Post by gfoot »

I wonder whether "increment if zero" is a reasonable way to avoid that particular issue - however randomness is one of those cases where I know just enough to know that I don't know, and that it's easy to go in with good intentions but come out with really bad results!
User avatar
Yuri
Posts: 371
Joined: 28 Feb 2023
Location: Texas

Re: fast 24 bit PRNG

Post by Yuri »

Typically the thing I've seen in the PRNGs I've worked with is the use of the fact that adding prime numbers within the domain of an even number (e.g. the limit of a byte/word/dword) will cover all the numbers in that space eventually.

E.g.: A common one I've seen is:
newSeed = oldSeed * 524287 + oldSeed * 8191 + 127;

The last + should avoid a frozen state.
User avatar
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: fast 24 bit PRNG

Post by BigDumbDinosaur »

commodorejohn wrote:
I don't know enough about this stuff to say about the algorithm in question, but if it's vulnerable to the same "all zeros = frozen state" issue as LFSRs are, you wanna be careful injecting arbitrary "noise" values into it - maybe take the timer value and OR it with the PRNG state rather than loading it directly.

What I’m going to do is 65C816-ize the algorithm, run it for a number of iterations to get a feel for how it works, and assuming it doesn’t trip over the “frozen” state, then add the “noise” and see if that breaks something.

I slightly reworked my “noise generator” to create a 32-bit result that will never be zero.  In theory, even if the algorithm freezes, the “noise” ought to rescue it.

I decided to write some fresh code in the Kowalski assembler to make it easier to fiddle around.

Code: Select all

	.opt proc65816,caseinsensitive,swapbin
;===============================================================================
;
;PRNG SEED NOISE GENERATOR
;
;	—————————————————————————————————————————————————————————————————————
;	This bit of code sets up timer B as a free-running counter driven by
;	the X1 clock without scaling.  The timer will run down from $FFFF to
;	$0000, reload & repeat.  THe “noise” is gotten by briefly stopping
;	the timer, doing a 16-bit read of its registers & then restarting it.
;	The result will be a value somewhere in the 16-bit range.
;	—————————————————————————————————————————————————————————————————————
;
acrb     =$00c104              ;DUART ‘B’ aux control
ctb      =$00c106              ;DUART ‘B’ timer
ctstrtb  =$00c10e              ;DUART ‘B’ timer start
ctstopb  =$00c10f              ;DUART ‘B’ timer stop
;
setasctr =%00100000            ;counter mode using 1× X1 clock
;
noise    =$a0                  ;noise storage
;
         *=$000100
;
main     rep #%11111111        ;reset all SR bits
         lda !#0
         tax
         tay
         phk
         plb
         sep #%00010000        ;8-bit index
         ldx #setasctr         ;operate timer B...
         stx acrb              ;as a counter
         stz ctb               ;initial preset
         stz noise             ;reset...
         stz noise+2           ;noise
         bra restart           ;start timer & pause
;
read     ldx ctstopb           ;stop timer
         lda ctb               ;read timer
         sta ctb               ;write back as preset
         xba                   ;reverse endianess &...
         sta noise             ;save noise LSW
         eor !#-1              ;mix up things
         sta noise+2           ;save noise MSW
;
restart  ldx ctstrtb           ;start timer
         brk
         nop
         bra read              ;get some noise
;
	.end
x86?  We ain't got no x86.  We don't NEED no stinking x86!
John West
Posts: 383
Joined: 03 Sep 2002

Re: fast 24 bit PRNG

Post by John West »

If the PRNG is any good, messing with it in an attempt to introduce "more randomness" (without theory or testing) is only going to make it worse. Almost certainly much worse. If you want non-determinism, use it when you're selecting the seed and then leave it alone.

In this case, there is a cycle of length 1 when the state is all 0, so you want to avoid 0 as a seed. That's easy - seed it with 16 bits of timer and any non-zero constant byte, call the generator a few times to mix it up, and you're set.

I'm suspicious about this one though. Each bit of the state seems to affect only 3 to 7 other bits, when (from what I've read on the subject) you really want them to affect about half (12 in this case). I would expect that to result in some correlation between consecutive values. And that should show up on a "x = rand() y = rand() drawpixel(x,y)" test. But it doesn't: here's 32768 iterations, with this generator on the left and whatever luajit uses on the right.

One of these days I will have to write some tests for little PRNGs. Anything that's suitable for a slow 8 bit processor is going to fail pretty much immediately when the "proper" test suites are thrown at them, but that's OK - our requirements are a lot more modest.
Attachments
rand.png
gfoot
Posts: 871
Joined: 09 Jul 2021

Re: fast 24 bit PRNG

Post by gfoot »

John West wrote:
One of these days I will have to write some tests for little PRNGs. Anything that's suitable for a slow 8 bit processor is going to fail pretty much immediately when the "proper" test suites are thrown at them, but that's OK - our requirements are a lot more modest.
An option I've used recently is to make my code output a string of binary bits and feed it in here: https://mzsoltmolnar.github.io/random-bitstream-tester/

But I believe what Arlet was doing in the older thread I linked above was using code on a PC to try out different combinations of simple operations, feeding results through a standard randomness testing suite (PractRand?) and determining which combinations did lead to good randomness results from the test suite. It feels like a better approach to me rather than relying on either randomness testing on-device, or my method of copying back bitstreams, both of which are really slow and impractical for large enough amounts of random data.
barnacle
Posts: 1831
Joined: 19 Jan 2004
Location: Potsdam, DE
Contact:

Re: fast 24 bit PRNG

Post by barnacle »

gfoot wrote:
An option I've used recently is to make my code output a string of binary bits...
My code is generally a random string of binary bits anyway :mrgreen:

Neil
User avatar
commodorejohn
Posts: 299
Joined: 21 Jan 2016
Location: Placerville, CA
Contact:

Re: fast 24 bit PRNG

Post by commodorejohn »

If you're trying to test/measure, a 24-bit PRNG has a maximum sequence length of "only" 16.7 million values, which is well within the realm of generating & caching for analysis on modern machines.
dmsc
Posts: 153
Joined: 17 Sep 2018

Re: fast 24 bit PRNG

Post by dmsc »

Hi!
Broti wrote:
Just found this fast 24 bit PRNG algorithm by Wim Couwenberg:

Code: Select all

lda a           ; Operation 7 (with carry clear).
asl
eor b
sta b
rol             ; Operation 9.
eor c
sta c
eor a           ; Operation 5.
sta a
lda b           ; Operation 15.
ror
eor c
sta c
eor b           ; Operation 6.
sta b
For more details: https://wimcouwenberg.wordpress.com/202 ... processor/
Sadly, that PRNG is really bad.... see the results testing with PractRand, it fails after only 8k outputs!:

Code: Select all

$ minisim -b couwenberg.xex  | ../../pr-095/RNG_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
RNG_test using PractRand version 0.95
RNG = RNG_stdin8, seed = unknown
test set = expanded, folding = standard (8 bit)

rng=RNG_stdin8, seed=unknown
length= 1 kilobyte (2^10 bytes), time= 0.1 seconds
  no anomalies in 16 test result(s)

rng=RNG_stdin8, seed=unknown
length= 2 kilobytes (2^11 bytes), time= 0.4 seconds
  no anomalies in 30 test result(s)

rng=RNG_stdin8, seed=unknown
length= 4 kilobytes (2^12 bytes), time= 1.0 seconds
  no anomalies in 38 test result(s)

rng=RNG_stdin8, seed=unknown
length= 8 kilobytes (2^13 bytes), time= 1.8 seconds
  Test Name                         Raw       Processed     Evaluation
  BRank(18):128(1)                  R= +2133  p~=  3.3e-643   FAIL !!!!!!!   
  ...and 60 test result(s) without anomalies

rng=RNG_stdin8, seed=unknown
length= 16 kilobytes (2^14 bytes), time= 2.8 seconds
  Test Name                         Raw       Processed     Evaluation
  BRank(18):128(2)                  R= +3017  p~=  3.3e-909   FAIL !!!!!!!   
  ...and 79 test result(s) without anomalies

rng=RNG_stdin8, seed=unknown
length= 32 kilobytes (2^15 bytes), time= 4.2 seconds
  Test Name                         Raw       Processed     Evaluation
  BRank(18):128(2)                  R= +3017  p~=  3.3e-909   FAIL !!!!!!!   
  ...and 95 test result(s) without anomalies

rng=RNG_stdin8, seed=unknown
length= 64 kilobytes (2^16 bytes), time= 5.8 seconds
  Test Name                         Raw       Processed     Evaluation
  BRank(18):128(2)                  R= +3017  p~=  3.3e-909   FAIL !!!!!!!   
  BRank(18):256(1)                  R= +4889  p~=  1e-1472    FAIL !!!!!!!!  
  [Low1/8]BRank(18):128(1)          R= +2133  p~=  3.3e-643   FAIL !!!!!!!   
  ...and 110 test result(s) without anomalies
The period is as advertised, 2^24-1 bytes, but the way it is constructed means that each 3 bytes of output are exactly repeated twice between the period, the output is very regular.

As mentioned earlier, the generators that Arlet came with are not only much better (only giving problems at 2^29 bytes of output), but also faster, see the minimal one at: https://github.com/Arlet/pseudo-random- ... l-6502.asm

Have Fun!
White Flame
Posts: 704
Joined: 24 Jul 2012

Re: fast 24 bit PRNG

Post by White Flame »

gfoot wrote:
I wonder whether "increment if zero" is a reasonable way to avoid that particular issue - however randomness is one of those cases where I know just enough to know that I don't know, and that it's easy to go in with good intentions but come out with really bad results!
I tackled this 0->0 cycle problem properly in my 8-bit PRNG: https://codebase.c64.org/doku.php?id=ba ... 8-bit_prng

For any input, you choose whether or not to XOR the magic byte, based on the high bit shifted out. 01 through FF have a single continuous chain between them in pseudorandom order, for certain XOR values. But 00, since it doesn't XOR, just loops back to 00. So I forced XOR on 00, but now it creates the same result as 80. So I forced 80 not to XOR, and it becomes the only value whose output is 00. And that links 00 properly into the chain.

Code: Select all

        lda seed
        beq doEor  ;if the input was $00, force EOR
         asl
         beq noEor ;if the input was $80, skip the EOR
         bcc noEor
doEor:    eor #$1d
noEor:  sta seed


; $00 : (shift and) perform EOR
; $80 : just shift
; %1xxxxxxx : shift and perform EOR
; %0xxxxxxx : just shift
By the way, I also tried just switching the XOR decision based on the high bit clear instead of set, and didn't find ANY magic XOR bytes that worked in that arrangement, even for just a 255-byte cycle!

It doesn't look like that simple 24-bit one is easily fixable at all, but it's also good to have quick & dirty "bad" PRNGs because not everything needs full coverage.
Post Reply