Page 12 of 17
Re: Random-number generation
Posted: Tue Apr 18, 2023 11:04 pm
by dmsc
Hi!
I was thinking specifically that with only 4 bytes of state, you can at best get a permutation of 2^32 outputs. You can't get the same 32 bit output twice. So I'd expect such a setup to fail these statistical tests in a much bigger way than a 5 byte setup.
The trick is that you don´t output the full 32 bits of "mangled state", you only output 8 bits each time. This means that you don´t see the permutation, and with a proper mixing of all the state bits you can easily pass statistical tests up to 2^32/4 = 2^30 samples. And IMHO, 2^30 outputs is big enough for any 6502 use - think that with 200 cycles between RNG calls, and a 12MHz CPU clock, you would need five hours to start to notice something bad.
For comparison, the RNG in the Atari 8-bit BASIC uses a 17-bit hardware LFSR running at the CPU frequency, and "hopes" that reads of the hw register are not correlated enough, but in reality the quality of the generated numbers is very bad.
Have Fun!
Re: Random-number generation
Posted: Wed Apr 19, 2023 3:41 am
by Arlet
Here's one that's even smaller than the 'minimal-6502.asm', with slightly less performance, but 4 cycles saved. It uses 4 bytes for guaranteed 4GB period, plus 1 extra state byte to improve randomness.
Code: Select all
CLC
LDA #$41
ADC state+0 ; iterate state
STA state+0 ;
ADC state+1 ;
STA state+1 ;
ADC state+2 ;
STA state+2 ;
ADC state+3 ;
STA state+3 ; end of iterator
ADC state+4 ; output function with feedback
ASL
ADC state+3
STA state+4
EOR state+2
RTS
I made a few attempts at reducing the code even more, but they all made things dramatically worse.
Owlet code
Re: Random-number generation
Posted: Wed Apr 19, 2023 5:00 am
by dmsc
Hi!
Here's one that's even smaller than the 'minimal-6502.asm', with slightly less performance, but 4 cycles saved. It uses 4 bytes for guaranteed 4GB period, plus 1 extra state byte to improve randomness.
Code: Select all
CLC
LDA #$45
ADC state+0 ; iterate state
STA state+0 ;
ADC state+1 ;
STA state+1 ;
ADC state+2 ;
STA state+2 ;
ADC state+3 ;
STA state+3 ; end of iterator
ADC state+4 ; output function with feedback
ASL
ADC state+3
STA state+4
EOR state+2
RTS
Very good!
Passes PractRand up to 2^29, IMHO, very good performance for such a tiny code:
Code: Select all
$ minisim arlet8-minimal.65 | ./conv8 | ../../pr-095/RNG_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
RNG_test using PractRand version 0.95
RNG = RNG_stdin8, seed = unknown
test set = expanded, folding = standard (8 bit)
rng=RNG_stdin8, seed=unknown
length= 1 kilobyte (2^10 bytes), time= 0.1 seconds
no anomalies in 16 test result(s)
rng=RNG_stdin8, seed=unknown
length= 2 kilobytes (2^11 bytes), time= 0.3 seconds
no anomalies in 30 test result(s)
.... snip ....
rng=RNG_stdin8, seed=unknown
length= 512 megabytes (2^29 bytes), time= 203 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/16:all R= +6.1 p = 3.5e-5 unusual
[Low1/8]Gap-16:A R= +5.9 p = 1.7e-4 unusual
...and 313 test result(s) without anomalies
rng=RNG_stdin8, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 381 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/64:(0,14-0) R= +8.4 p = 2.4e-7 mildly suspicious
[Low1/8]FPF-14+6/64:all R= +10.5 p = 2.2e-9 VERY SUSPICIOUS
[Low1/8]FPF-14+6/32:(0,14-0) R= +9.0 p = 5.8e-8 suspicious
[Low1/8]FPF-14+6/32:all R= +9.4 p = 2.5e-8 very suspicious
[Low1/8]FPF-14+6/16:(0,14-0) R= +6.8 p = 6.0e-6 unusual
[Low1/8]FPF-14+6/16:all R= +8.7 p = 1.1e-7 very suspicious
[Low1/8]Gap-16:A R= +11.9 p = 1.2e-9 VERY SUSPICIOUS
...and 328 test result(s) without anomalies
rng=RNG_stdin8, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 774 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/64:(0,14-0) R= +12.6 p = 3.0e-11 VERY SUSPICIOUS
[Low1/8]FPF-14+6/64:(1,14-0) R= +11.7 p = 1.9e-10 very suspicious
[Low1/8]FPF-14+6/64:all R= +17.8 p = 3.6e-16 FAIL !
[Low1/8]FPF-14+6/32:(0,14-0) R= +13.6 p = 3.7e-12 VERY SUSPICIOUS
[Low1/8]FPF-14+6/32:(1,14-0) R= +9.4 p = 2.7e-8 suspicious
[Low1/8]FPF-14+6/32:all R= +17.4 p = 8.7e-16 FAIL !
[Low1/8]FPF-14+6/16:(0,14-0) R= +10.3 p = 4.0e-9 very suspicious
[Low1/8]FPF-14+6/16:(1,14-0) R= +7.0 p = 4.4e-6 unusual
[Low1/8]FPF-14+6/16:(2,14-0) R= +8.3 p = 2.5e-7 mildly suspicious
[Low1/8]FPF-14+6/16:all R= +16.1 p = 1.4e-14 FAIL
[Low1/8]FPF-14+6/4:(0,14-0) R= +7.0 p = 4.2e-6 unusual
[Low1/8]FPF-14+6/4:all R= +11.5 p = 2.6e-10 VERY SUSPICIOUS
[Low1/8]Gap-16:A R= +22.7 p = 8.7e-18 FAIL !
[Low1/8]Gap-16:B R= +11.6 p = 1.7e-9 VERY SUSPICIOUS
...and 341 test result(s) without anomalies
Have Fun!
Re: Random-number generation
Posted: Wed Apr 19, 2023 5:02 am
by Arlet
I just edited the code a little bit. The constant has been changed from $45 to $41, and that seems to work a bit better.
Re: Random-number generation
Posted: Wed Apr 19, 2023 6:17 am
by Arlet
Previously I tested Bob Jenkins 8-bit Small Fast PRNG, but it translates to bigger 6502 code and only passes PractRand 0.94, not 0.95, as it fails the new "mod3n(0):(0,9-6)" test at 2^18 bytes.
Is that the jsf8 code below ? This code uses a different strategy. It's not an iterator+output function, but a chaotic feedback function, which suffers from the problem that some bad seeds can cause short periods. They work around that problem by the "raninit" function that forces a restriction on possible seed values, so that they guarantee a long (but less than 4GB) period. The advantage of this method is that you have the potential for better randomness in fewer operations. However, the rot8() is not well suited to 6502, so the translation to 6502 assembly adds a bunch of cycles again.
It would be interesting to explore the concept, but starting from scratch with 6502 targeted operations. Edit: I tried a few things, but not much hope. The nice thing about the ADC/STA sequence is that the accumulator is already loaded with the value that will be used to modify the next state byte. If you try to do something like the code below, you're going to need some LDAs, which seem like a waste of time, because they are only moving data around, not actually changing it. Maybe there is a solution, but at best I would expect a tiny improvement in cycles, at the bigger cost of not being able to use any random seed
Code: Select all
typedef uint8_t u1;
typedef struct ranctx { u1 a; u1 b; u1 c; u1 d; } ranctx;
#define rot8(x,k) (((x)<<(k))|((x)>>(8-(k))))
u1 ranval( ranctx *x ) {
u1 e = x->a - rot8(x->b, 1);
x->a = x->b ^ rot8(x->c, 4);
x->b = x->c + x->d;
x->c = x->d + e;
x->d = e + x->a;
return x->d;
void raninit( ranctx *x, u1 seed ) {
x->a = 0xed, x->b = x->c = x->d = seed;
for (u1 i=0; i<20; ++i) {
(void)ranval(x);
}
}
Re: Random-number generation
Posted: Wed Apr 19, 2023 4:39 pm
by Alex1
When i try to compile arlet-medium.asm with ca65 i've some erros like "Segment `STARTUP' does not exist"
Which command line do you use to compile arlet-medium.asm and did you add some code in the arlet-medium.asm ?
Re: Random-number generation
Posted: Wed Apr 19, 2023 8:40 pm
by dmsc
Hi!
When i try to compile arlet-medium.asm with ca65 i've some erros like "Segment `STARTUP' does not exist"
Which command line do you use to compile arlet-medium.asm and did you add some code in the arlet-medium.asm ?
Code: Select all
cl65 -tatari -Catari-asm.cfg arlet-medium.asm
This will produce an Atari 8-bit executable that the simulator can run.
Have Fun!
Re: Random-number generation
Posted: Wed Apr 19, 2023 10:29 pm
by Alex1
Thanks.
Re: Random-number generation
Posted: Wed Apr 19, 2023 10:46 pm
by dmsc
Hi!
Hi!
Code: Select all
cl65 -tatari -Catari-asm.cfg arlet-medium.asm
This will produce an Atari 8-bit executable that the simulator can run.
Thanks you for your answer but it doesn't work.
Inside arlet-medium.asm
Code: Select all
prng:
CLC
LDA #$45
ADC state+0
STA state+0
ADC state+1
STA state+1
ADC state+2
STA state+2
ADC state+3
STA state+3
ADC state+4
STA state+4
LDA state+5
EOR state+7
ADC state+4
STA state+5
ADC state+6
STA state+6
LDA state+7
ASL
ADC state+6
STA state+7
EOR state+2
RTS
state: .res 8
Code: Select all
cl65 -tatari -catari-asm.cfg arlet-medium.asm
It produce only arlet-medium.o which is an object file, not an executable file.
Code: Select all
./atarisim arlet-medium.o
./atarisim: error reading binary file.
You are using a lowercase "-c", that means "compile but don't link", the correct incantation is "-Catari-asm.cfg", with a uppercase "C", that means "use this configuration file".
Have Fun!
Re: Random-number generation
Posted: Thu Apr 20, 2023 12:36 am
by Alex1
@dmsc thanks for your answer.
BTW do you have a link for your conv8 converter ?
Re: Random-number generation
Posted: Thu Apr 20, 2023 6:52 pm
by Arlet
Instead of printing the output in hex and then piping through conv8, isn't it easier/faster just to putc the 8 bit binary ?
Re: Random-number generation
Posted: Thu Apr 20, 2023 7:36 pm
by Alex1
Instead of printing the output in hex and then piping through conv8, isn't it easier/faster just to putc the 8 bit binary ?
Not on Atari : 0x9B from .asm is converted to 0x0D by Atari ROM on the pipe and 0x12 from .asm is converted to 0x2D on the pipe too. And then rng_test failed.
Same type of problem as on windows with 0x0D 0x0A but only 0x0A on linux.
Re: Random-number generation
Posted: Thu Apr 20, 2023 7:57 pm
by Alex1
i found a solution, i modify this in the .asm code, no carriage return (to gain speed with one less character) :
Code: Select all
; Generate and print numbers
loop:
jsr rand
jsr print_hex_byte
----> ; lda #$9B
----> ; jsr putc
inc cnt
bne loop
inc cnt+1
bne loop
inc cnt+2
bne loop
inc cnt+3
bne loop
inc cnt+4
bne loop
inc cnt+5
bne loop
rts
and use xxd :
Code: Select all
./atarisim arlet-medium.65 | xxd -r -p | rng_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
Seems to work. Do you confirm dmsc ?
Re: Random-number generation
Posted: Thu Apr 20, 2023 8:31 pm
by dmsc
Hi!
i found a solution, i modify this in the .asm code, no carriage return (to gain speed with one less character) :
Code: Select all
; Generate and print numbers
loop:
jsr rand
jsr print_hex_byte
----> ; lda #$9B
----> ; jsr putc
inc cnt
bne loop
inc cnt+1
bne loop
inc cnt+2
bne loop
inc cnt+3
bne loop
inc cnt+4
bne loop
inc cnt+5
bne loop
rts
and use xxd :
Code: Select all
./atarisim arlet-medium.65 | xxd -r -p | rng_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
Seems to work. Do you confirm dmsc ?
Yes, seems to work ok, with or without the $9B.
About direct binary output, you can use the "-b" option to minisim and if will skip the ATASCII to ASCII conversion:
Code: Select all
./atarisim -b arlet-medium.65 | rng_test stdin8 -te 1 -tlmin 10 -tlmax 50 -tlmaxonly -multithreaded
Have Fun!
Re: Random-number generation
Posted: Thu Apr 20, 2023 8:38 pm
by Alex1
By the way, @dmsc i don't understand your code, you define :
But how does it works? what are the init values for state? I suppose it will be random, it will use some free memory parts with random data inside to allocate the state ?
I use this code with fixed values :
Code: Select all
.export start
ICPTL = $0346
ICPTH = $0347
.zeropage
zp_start:
;state: .res 8
cnt: .res 6
zp_end:
.code
.proc rand
; fast PRNG (pseudo random number generator) written in 6502 assembly language
; (C) Arlet Ottens 2023 <arlet@c-scape.nl>
; https://github.com/Arlet/pseudo-random-number-generator/blob/main/medium-6502.asm
CLC
LDA #$45
ADC state+0
STA state+0
ADC state+1
STA state+1
ADC state+2
STA state+2
ADC state+3
STA state+3
ADC state+4
STA state+4
LDA state+5
EOR state+7
ADC state+4
STA state+5
ADC state+6
STA state+6
LDA state+7
ASL
ADC state+6
STA state+7
EOR state+2
RTS
.endproc
state: .byte $FF,$FF,$FF,$FF,$FF,$FF,$FF,$FF
.proc print_hex_byte
pha
lsr
lsr
lsr
lsr
jsr print_hex
pla
print_hex:
and #$0F
tax
lda hextab,x
; Fall through
::putc:
ldx #0
jump: jmp $FFFF
hextab: .byte "0123456789ABCDEF"
.endproc
start:
; Init fast putchar
lda ICPTL
clc
adc #1
sta print_hex_byte::jump+1
lda ICPTH
adc #0
sta print_hex_byte::jump+2
; Clear state
ldx #zp_end-zp_start-1
lda #0
zpclr:
sta zp_start, x
dex
bpl zpclr
; Generate and print numbers
loop:
jsr rand
jsr print_hex_byte
; lda #$9B
; jsr putc
inc cnt
bne loop
inc cnt+1
bne loop
inc cnt+2
bne loop
inc cnt+3
bne loop
inc cnt+4
bne loop
inc cnt+5
bne loop
rts
With state: .byte $FF,$FF,$FF,$FF,$FF,$FF,$FF,$FF i've a problem with 1kb:
length= 1 kilobyte (2^10 bytes), time= 2.1 seconds
Test Name Raw Processed Evaluation
FPF-14+6/64:all R= +17.3 p~= 7e-5 mildly suspicious
...and 15 test result(s) without anomalies
With state state: .byte $01,$00,$00,$00,$00,$00,$00,$00 , i've an unusual with 8 kb:
length= 8 kilobytes (2^13 bytes), time= 6.8 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/64:all R= +14.9 p~= 5e-4 unusual
...and 51 test result(s) without anomalies
With state: .byte $00,$00,$00,$00,$00,$00,$00,$01 , i've a FAIL with 1kb:
length= 1 kilobyte (2^10 bytes), time= 1.2 seconds
Test Name Raw Processed Evaluation
FPF-14+6/64:all R= +33.5 p~= 4e-13 FAIL
...and 15 test result(s) without anomalies
With state state: .byte $01,$00,$00,$00,$00,$00,$00,$01 , i've an unusual with 64 kb:
length= 64 kilobytes (2^16 bytes), time= 16.4 seconds
Test Name Raw Processed Evaluation
[Low1/8]Gap-16:B R= -3.3 p =1-1.0e-3 unusual
...and 90 test result(s) without anomalies
Could someone confirm this ?