Page 1 of 1
fast 24 bit PRNG
Posted: Thu Oct 09, 2025 3:12 pm
by Broti
Just found this fast 24 bit PRNG algorithm by Wim Couwenberg:
Code: Select all
lda a ; Operation 7 (with carry clear).
asl
eor b
sta b
rol ; Operation 9.
eor c
sta c
eor a ; Operation 5.
sta a
lda b ; Operation 15.
ror
eor c
sta c
eor b ; Operation 6.
sta b
For more details:
https://wimcouwenberg.wordpress.com/202 ... processor/
Re: fast 24 bit PRNG
Posted: Thu Oct 09, 2025 4:50 pm
by BigDumbDinosaur
Just found this fast 24 bit PRNG algorithm by Wim Couwenberg...
Interesting.
His code could be easily adapted to the 65C816 running in native mode, which would make extending it to 32 bits relatively painless. Also, 16-bit code running on the 816 might be faster than 8-bit code running on a 65C02 at the same Ø2 rate. I may try this out in the Kowalski simulator to get some execution time data.
Going beyond that, if some “noise” can be used to seed the terms before computing the PRN, randomness ought to be better. That “noise” could come from a hardware timer or an interrupt-driven jiffy clock. In my POC unit, I can get “noise” from the 16-bit counter/timer in DUART A, which free-runs and underflows 100 times per second—the “noise” range is $0000-$4800. That timer could be read on the fly as a 16-bit operation when some “noise” is needed. An even better “noise” source might be the currently unused timer in DUART B, which could be preset to $FFFF and set to run free (counter mode).
Re: fast 24 bit PRNG
Posted: Thu Oct 09, 2025 5:40 pm
by gfoot
Interesting - it is worth reading this past thread as well, especially Arlet's short and simple RNGs:
viewtopic.php?f=2&t=587&hilit=Arlet+rng ... 95#p100642
Re: fast 24 bit PRNG
Posted: Thu Oct 09, 2025 6:05 pm
by BigEd
A good find, thanks Broti!
And thanks for the link to that previous thread George.
Re: fast 24 bit PRNG
Posted: Thu Oct 09, 2025 10:04 pm
by BigDumbDinosaur
Going beyond that, if some “noise” can be used to seed the terms before computing the PRN, randomness ought to be better...In my POC unit, I can get “noise” from the 16-bit counter/timer in DUART A, which free-runs and underflows 100 times per second—the “noise” range is $0000-$4800. That timer could be read on the fly as a 16-bit operation when some “noise” is needed. An even better “noise” source might be the currently unused timer in DUART B, which could be preset to $FFFF and set to run free (counter mode).
Sort of. 
Here’s my test code, which I wrote with the M/L monitor’s assembler. I added labels to branch targets for clarity:
Code: Select all
rep #%11111111 ;reset all SR bits
sep #%00110000 ;8-bit registers
lda #%00110000 ;operate timer B...
sta $c104 ;as a counter
rep #%00100000 ;16-bit .A
stz $c106 ;preset timer to zero
bra restart ;start timer & halt
;
;
; here we read the timer on the fly...
;
read ldx $c10f ;stop timer &...
lda $c104 ;read it
sta $c104 ;write it back as new preset
;
restart ldx $c10e ;restart timer
sta $a0 ;save timer value
brk
nop
bra read ;play it again, Sam
I’m doing this with timer B—I don’t want to monkey with timer A, since it is responsible for generating jiffy IRQs.
Next step will be to integrate this with the PRNG code and see what happens.
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 12:13 am
by commodorejohn
I don't know enough about this stuff to say about the algorithm in question, but if it's vulnerable to the same "all zeros = frozen state" issue as LFSRs are, you wanna be careful injecting arbitrary "noise" values into it - maybe take the timer value and OR it with the PRNG state rather than loading it directly.
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 1:50 am
by gfoot
I wonder whether "increment if zero" is a reasonable way to avoid that particular issue - however randomness is one of those cases where I know just enough to know that I don't know, and that it's easy to go in with good intentions but come out with really bad results!
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 2:16 am
by Yuri
Typically the thing I've seen in the PRNGs I've worked with is the use of the fact that adding prime numbers within the domain of an even number (e.g. the limit of a byte/word/dword) will cover all the numbers in that space eventually.
E.g.: A common one I've seen is:
newSeed = oldSeed * 524287 + oldSeed * 8191 + 127;
The last + should avoid a frozen state.
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 3:47 am
by BigDumbDinosaur
I don't know enough about this stuff to say about the algorithm in question, but if it's vulnerable to the same "all zeros = frozen state" issue as LFSRs are, you wanna be careful injecting arbitrary "noise" values into it - maybe take the timer value and OR it with the PRNG state rather than loading it directly.
What I’m going to do is 65C816-ize the algorithm, run it for a number of iterations to get a feel for how it works, and assuming it doesn’t trip over the “frozen” state, then add the “noise” and see if that breaks something.
I slightly reworked my “noise generator” to create a 32-bit result that will never be zero. In theory, even if the algorithm freezes, the “noise” ought to rescue it.
I decided to write some fresh code in the Kowalski assembler to make it easier to fiddle around.
Code: Select all
.opt proc65816,caseinsensitive,swapbin
;===============================================================================
;
;PRNG SEED NOISE GENERATOR
;
; —————————————————————————————————————————————————————————————————————
; This bit of code sets up timer B as a free-running counter driven by
; the X1 clock without scaling. The timer will run down from $FFFF to
; $0000, reload & repeat. THe “noise” is gotten by briefly stopping
; the timer, doing a 16-bit read of its registers & then restarting it.
; The result will be a value somewhere in the 16-bit range.
; —————————————————————————————————————————————————————————————————————
;
acrb =$00c104 ;DUART ‘B’ aux control
ctb =$00c106 ;DUART ‘B’ timer
ctstrtb =$00c10e ;DUART ‘B’ timer start
ctstopb =$00c10f ;DUART ‘B’ timer stop
;
setasctr =%00100000 ;counter mode using 1× X1 clock
;
noise =$a0 ;noise storage
;
*=$000100
;
main rep #%11111111 ;reset all SR bits
lda !#0
tax
tay
phk
plb
sep #%00010000 ;8-bit index
ldx #setasctr ;operate timer B...
stx acrb ;as a counter
stz ctb ;initial preset
stz noise ;reset...
stz noise+2 ;noise
bra restart ;start timer & pause
;
read ldx ctstopb ;stop timer
lda ctb ;read timer
sta ctb ;write back as preset
xba ;reverse endianess &...
sta noise ;save noise LSW
eor !#-1 ;mix up things
sta noise+2 ;save noise MSW
;
restart ldx ctstrtb ;start timer
brk
nop
bra read ;get some noise
;
.end
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 1:11 pm
by John West
If the PRNG is any good, messing with it in an attempt to introduce "more randomness" (without theory or testing) is only going to make it worse. Almost certainly much worse. If you want non-determinism, use it when you're selecting the seed and then leave it alone.
In this case, there is a cycle of length 1 when the state is all 0, so you want to avoid 0 as a seed. That's easy - seed it with 16 bits of timer and any non-zero constant byte, call the generator a few times to mix it up, and you're set.
I'm suspicious about this one though. Each bit of the state seems to affect only 3 to 7 other bits, when (from what I've read on the subject) you really want them to affect about half (12 in this case). I would expect that to result in some correlation between consecutive values. And that should show up on a "x = rand() y = rand() drawpixel(x,y)" test. But it doesn't: here's 32768 iterations, with this generator on the left and whatever luajit uses on the right.
One of these days I will have to write some tests for little PRNGs. Anything that's suitable for a slow 8 bit processor is going to fail pretty much immediately when the "proper" test suites are thrown at them, but that's OK - our requirements are a lot more modest.
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 1:57 pm
by gfoot
One of these days I will have to write some tests for little PRNGs. Anything that's suitable for a slow 8 bit processor is going to fail pretty much immediately when the "proper" test suites are thrown at them, but that's OK - our requirements are a lot more modest.
An option I've used recently is to make my code output a string of binary bits and feed it in here:
https://mzsoltmolnar.github.io/random-bitstream-tester/
But I believe what Arlet was doing in the older thread I linked above was using code on a PC to try out different combinations of simple operations, feeding results through a standard randomness testing suite (PractRand?) and determining which combinations did lead to good randomness results from the test suite. It feels like a better approach to me rather than relying on either randomness testing on-device, or my method of copying back bitstreams, both of which are really slow and impractical for large enough amounts of random data.
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 2:07 pm
by barnacle
An option I've used recently is to make my code output a string of binary bits...
My code is generally a random string of binary bits anyway
Neil
Re: fast 24 bit PRNG
Posted: Fri Oct 10, 2025 2:10 pm
by commodorejohn
If you're trying to test/measure, a 24-bit PRNG has a maximum sequence length of "only" 16.7 million values, which is well within the realm of generating & caching for analysis on modern machines.
Re: fast 24 bit PRNG
Posted: Sat Oct 11, 2025 2:31 pm
by dmsc
Hi!
Just found this fast 24 bit PRNG algorithm by Wim Couwenberg:
Code: Select all
lda a ; Operation 7 (with carry clear).
asl
eor b
sta b
rol ; Operation 9.
eor c
sta c
eor a ; Operation 5.
sta a
lda b ; Operation 15.
ror
eor c
sta c
eor b ; Operation 6.
sta b
For more details:
https://wimcouwenberg.wordpress.com/202 ... processor/
Sadly, that PRNG is really bad.... see the results testing with PractRand, it fails after only 8k outputs!:
Code: Select all
$ minisim -b couwenberg.xex | ../../pr-095/RNG_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
RNG_test using PractRand version 0.95
RNG = RNG_stdin8, seed = unknown
test set = expanded, folding = standard (8 bit)
rng=RNG_stdin8, seed=unknown
length= 1 kilobyte (2^10 bytes), time= 0.1 seconds
no anomalies in 16 test result(s)
rng=RNG_stdin8, seed=unknown
length= 2 kilobytes (2^11 bytes), time= 0.4 seconds
no anomalies in 30 test result(s)
rng=RNG_stdin8, seed=unknown
length= 4 kilobytes (2^12 bytes), time= 1.0 seconds
no anomalies in 38 test result(s)
rng=RNG_stdin8, seed=unknown
length= 8 kilobytes (2^13 bytes), time= 1.8 seconds
Test Name Raw Processed Evaluation
BRank(18):128(1) R= +2133 p~= 3.3e-643 FAIL !!!!!!!
...and 60 test result(s) without anomalies
rng=RNG_stdin8, seed=unknown
length= 16 kilobytes (2^14 bytes), time= 2.8 seconds
Test Name Raw Processed Evaluation
BRank(18):128(2) R= +3017 p~= 3.3e-909 FAIL !!!!!!!
...and 79 test result(s) without anomalies
rng=RNG_stdin8, seed=unknown
length= 32 kilobytes (2^15 bytes), time= 4.2 seconds
Test Name Raw Processed Evaluation
BRank(18):128(2) R= +3017 p~= 3.3e-909 FAIL !!!!!!!
...and 95 test result(s) without anomalies
rng=RNG_stdin8, seed=unknown
length= 64 kilobytes (2^16 bytes), time= 5.8 seconds
Test Name Raw Processed Evaluation
BRank(18):128(2) R= +3017 p~= 3.3e-909 FAIL !!!!!!!
BRank(18):256(1) R= +4889 p~= 1e-1472 FAIL !!!!!!!!
[Low1/8]BRank(18):128(1) R= +2133 p~= 3.3e-643 FAIL !!!!!!!
...and 110 test result(s) without anomalies
The period is as advertised, 2^24-1 bytes, but the way it is constructed means that each 3 bytes of output are exactly repeated twice between the period, the output is very regular.
As mentioned earlier, the generators that Arlet came with are not only much better (only giving problems at 2^29 bytes of output), but also faster, see the minimal one at:
https://github.com/Arlet/pseudo-random- ... l-6502.asm
Have Fun!
Re: fast 24 bit PRNG
Posted: Thu Oct 16, 2025 9:57 pm
by White Flame
I wonder whether "increment if zero" is a reasonable way to avoid that particular issue - however randomness is one of those cases where I know just enough to know that I don't know, and that it's easy to go in with good intentions but come out with really bad results!
I tackled this 0->0 cycle problem properly in my 8-bit PRNG:
https://codebase.c64.org/doku.php?id=ba ... 8-bit_prng
For any input, you choose whether or not to XOR the magic byte, based on the high bit shifted out. 01 through FF have a single continuous chain between them in pseudorandom order, for certain XOR values. But 00, since it doesn't XOR, just loops back to 00. So I forced XOR on 00, but now it creates the same result as 80. So I forced 80 not to XOR, and it becomes the only value whose output is 00. And that links 00 properly into the chain.
Code: Select all
lda seed
beq doEor ;if the input was $00, force EOR
asl
beq noEor ;if the input was $80, skip the EOR
bcc noEor
doEor: eor #$1d
noEor: sta seed
; $00 : (shift and) perform EOR
; $80 : just shift
; %1xxxxxxx : shift and perform EOR
; %0xxxxxxx : just shift
By the way, I also tried just switching the XOR decision based on the high bit clear instead of set, and didn't find ANY magic XOR bytes that worked in that arrangement, even for just a 255-byte cycle!
It doesn't look like that simple 24-bit one is easily fixable at all, but it's also good to have quick & dirty "bad" PRNGs because not everything needs full coverage.