Random-number generation

Programming the 6502 microprocessor and its relatives in assembly and other languages.
Post Reply
User avatar
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England
Contact:

Re: Random-number generation

Post by BigEd »

Just possibly compiler explorer can help here - look at the machine code generated by a compiler. (Various compilers, various target CPUs, whichever options you like.)
https://godbolt.org/
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

The output is the same

Code: Select all

11ec9d01 : 8A A0 25 94 BA 34 EF = 55
11ec9d02 : CF 6F 95 29 81 B6 A5 = 24
11ec9d03 : 14 84 19 43 67 1E C4 = A3
11ec9d04 : 59 DD F6 39 65 84 48 = 2D
11ec9d05 : 9E 7B 72 AC 49 CE 16 = 5F
11ec9d06 : E3 5E D1 7D B5 83 9A = 2F
11ec9d07 : 28 87 58 D6 01 85 1F = 1E
11ec9d08 : 6D F4 4C 23 3E C3 E2 = DC
11ec9d09 : B2 A6 F3 16 33 F7 D9 = EA
11ec9d0a : F7 9D 91 A8 A4 9B 75 = D1
11ec9d0b : 3C DA 6B 14 06 A2 17 = 11
11ec9d0c : 81 5B C7 DB D2 74 8C = 5E
11ec9d0d : C6 21 E9 C4 1B 90 1C = 07
11ec9d0e : 0B 2D 16 DB E2 72 8F = 6D
11ec9d0f : 50 7D 93 6E C4 37 C7 = 03
11ec9d10 : 95 12 A6 14 98 D0 97 = 0F
11ec9d11 : DA EC 92 A7 C8 98 30 = F8
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Maybe there's a problem in your loop detector.
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet wrote:
Maybe there's a problem in your loop detector.
oh yes, i find it. My loop dectector count each output char, then "0101" in the output A07E0101, will be on index 4.
i suppose your index will only be 2?
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Indeed, when I check at hex digit number 11EC9D07, I see the digits 008AA873 in the output, but they are shifted by one nibble:

Code: Select all

08f64e81 : 0A 38 7D A1 48 03 87 = CF
08f64e82 : 4F 87 04 A6 69 6C F3 = 9A
08f64e83 : 94 1B 20 C6 9E 0A FE = 60
08f64e84 : D9 F4 14 DB C3 CD CB = 08
08f64e85 : 1E 13 28 03 8C 5A 26 = AA
08f64e86 : 63 76 9E A1 13 6E 94 = 87
08f64e87 : A8 1E BD 5E DE 4C E1 = 3F
08f64e88 : ED 0B C9 27 A5 F2 D3 = 76
08f64e89 : 32 3E 07 2F A1 94 68 = C9
The random number generator produced 60, 08, AA, 87, 3F, and that triggers your loop detector, which saw 008AA873 in the output stream.
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet wrote:
Here's when I see the bytes 00 8A A8 73 in sequence:

Code: Select all

(index)    : internal state       - output 4-byte sequence
0000000004 : 14 b3 66 74 dc ce af - 73 008aa873
001b4c11af : 2b 0c f5 cc 69 0f 1a - 73 008aa873
031110bb51 : d5 19 99 54 ff aa 8c - 73 008aa873
036d723c01 : 45 71 51 42 13 8c 60 - 73 008aa873
038707f8e7 : 43 72 c4 30 59 ea 2a - 73 008aa873
As you can see, even though the sequence 00 8A A8 73 shows up repeatedly, the internal state differs. That's because you're only watching 4 consecutive bytes, but there are 7 bytes of state.
the problem with your PRNG is always repeating output. With a 32-bit LFSR, the same 32-bit output does not repeat or 1 time maximum. With your PRNG, this is repeated much more, even if the internal state is not the same.
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Here's the first real repeat of the 00 8A A8 73 sequence:

Code: Select all

1b4c11ab : 5C 59 0D E7 62 13 62 = 00
1b4c11ac : A1 FA 07 EF EF 02 65 = 8A
1b4c11ad : E6 E0 E8 D7 A2 A5 0A = A8
1b4c11ae : 2B 0C F5 CC 69 0F 1A = 73
1b4c11af : 70 7C 71 3E 8E 9D B7 = 39
1b4c11b0 : B5 31 A3 E1 E4 81 39 = DD
But as we compare that with the initial sequence, you can see it's not a loop. It's just a coincidence. The next bytes are different.

Code: Select all

00000000 : 45 45 45 45 45 45 45 = 00
00000001 : 8A CF 14 5A 64 A9 EE = 8A
00000002 : CF 9E B3 0D 48 F2 E0 = A8
00000003 : 14 B3 66 74 DC CE AF = 73
00000004 : 59 0C 73 E7 24 F3 A2 = 86
00000005 : 9E AA 1D 05 CC BF 62 = AE
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet wrote:
Indeed, when I check at hex digit number 11EC9D07, I see the digits 008AA873 in the output, but they are shifted by one nibble:

Code: Select all

08f64e81 : 0A 38 7D A1 48 03 87 = CF
08f64e82 : 4F 87 04 A6 69 6C F3 = 9A
08f64e83 : 94 1B 20 C6 9E 0A FE = 60
08f64e84 : D9 F4 14 DB C3 CD CB = 08
08f64e85 : 1E 13 28 03 8C 5A 26 = AA
08f64e86 : 63 76 9E A1 13 6E 94 = 87
08f64e87 : A8 1E BD 5E DE 4C E1 = 3F
08f64e88 : ED 0B C9 27 A5 F2 D3 = 76
08f64e89 : 32 3E 07 2F A1 94 68 = C9
The random number generator produced 60, 08, AA, 87, 3F, and that triggers your loop detector, which saw 008AA873 in the output stream.
yes my loop dectector is able to detect loop in bits. Usually I do research directly on bit strings.
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Alex1 wrote:
yes my loop dectector is able to detect loop in bits. Usually I do research directly on bit strings.
But the consequence is that it finds the same pattern more often. I find that the 4 byte initial pattern only repeats once in the first 4GB of data, which is expected. The 4GB limitation was a design criterion to keep it as small as possible.
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet wrote:
Alex1 wrote:
yes my loop dectector is able to detect loop in bits. Usually I do research directly on bit strings.
But the consequence is that it finds the same pattern more often. I find that the 4 byte initial pattern only repeats once in the first 4GB of data, which is expected. The 4GB limitation was a design criterion to keep it as small as possible.
A first conclusion is that rng_test is not a good test, it does not detect bit loops. I'm doing a search on the bits themselves and we'll see what happens.
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet wrote:
Alex1 wrote:
yes my loop dectector is able to detect loop in bits. Usually I do research directly on bit strings.
But the consequence is that it finds the same pattern more often. I find that the 4 byte initial pattern only repeats once in the first 4GB of data, which is expected. The 4GB limitation was a design criterion to keep it as small as possible.
I have always done my tests on LFSRs with bits and not bytes and LFSRs do not repeat. We will see if your PRNG repeats at the bit level.
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Quote:
A first conclusion is that rng_test is not a good test, it does not detect bit loops
But there is no loop. There is just a certain 32 bit pattern that repeats. If you check per individual bit, you would expect to see any 32 bit pattern occur, on average, once every 4 billion bits, or 512 million bytes.

Try your loop detector on /dev/random or a known good generator such as chacha 20.

Code: Select all

RNG_output "chacha(20)" inf | xxd -p | tr -d '\n' 
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Here's a histogram of how many times repeated 4-byte patterns there are in my random generator with seed=0, after 4GB of data, compared to expected values for perfect random numbers.

Code: Select all

0 1580036882 1580030169
1 1580006676 1580030169
2  790043016  790015084
3  263329368  263338361
4   65831187   65834590
5   13167200   13166918
6    2194782    2194486
7     313927     313498
8      39295      39187
9       4480       4354
10       442        435
11        37         40
12         4          3
13         0          0
Alex1
Posts: 80
Joined: 13 Mar 2023

Re: Random-number generation

Post by Alex1 »

Arlet, yes you are right it's patterns not loops.

What i say is that LFSR is based on maths and there is no pattern. For my LFSR 32, the output is 2^32 different patterns. The maths says that and the real world too :

With my LFSR 32, the same pattern is only after 2^32 bits :
Quote:
Found '10001111001001101010111011010011' at index 0 !
Found '10001111001001101010111011010011' at index FFFFFFFF !
With your PRNG :
Quote:
Found '00000000100010101010100001110011' at index 0 !
Found '00000000100010101010100001110011' at index 4422DC31 !
Found '00000000100010101010100001110011' at index 47B2741C !
Found '00000000100010101010100001110011' at index 90215B01 !
Found '00000000100010101010100001110011' at index A1E6E039 !
Found '00000000100010101010100001110011' at index DA608D58 !
But you said your PRNG produces 2^32 bytes (4GB) then it's (2^32)*8 bits = 32 Giga-bits :

and with 32 Giga-bits here are the patterns with your PRNG :
Quote:
Found '00000000100010101010100001110011' at index 0 !
Found '00000000100010101010100001110011' at index 4422DC31 !
Found '00000000100010101010100001110011' at index 47B2741C !
Found '00000000100010101010100001110011' at index 90215B01 !
Found '00000000100010101010100001110011' at index A1E6E039 !
Found '00000000100010101010100001110011' at index DA608D58 !
Found '00000000100010101010100001110011' at index 1DDC91D77 !
Found '00000000100010101010100001110011' at index 41FFA3A69 !
Found '00000000100010101010100001110011' at index 4F774146B !
Found '00000000100010101010100001110011' at index 57C03DB9D !
Found '00000000100010101010100001110011' at index 5AFA0662E !
Found '00000000100010101010100001110011' at index 5BF3AE07E !
Found '00000000100010101010100001110011' at index 6CDF53905 !
The PRNG are building blocks. If they do not meet certain specifications, anything that can be produced on them afterwards may have big problems.

Chacha20 is based on strong maths for building blocks so you can't compare a final encryption system as chacha20 with a building block like your PRNG.

And the other problem is that rng_test don't detect patterns in your PRNG.

From what I understand is that you say that the repetition of patterns in a PRNG is not important. I do not think so. Because if we have patterns then there is linear relationships not foreseen in the algorithm and if we used this PRNG as building block to create an encryption system, then linear cryptanalysis could break it easily.
Last edited by Alex1 on Fri Mar 24, 2023 1:47 pm, edited 1 time in total.
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Random-number generation

Post by Arlet »

Alex1 wrote:
For my LFSR 32, the output is 2^32 different patterns.
Correct, and that's not a good thing. If the numbers are truly random, you would expect to get some duplicates. If you take random 32 bit numbers, the chance of getting at least one duplicate grows to 1% even after drawing 10000 numbers. After 80000 numbers, the probability grows to more than 50%. If there's still not a single duplicate after half a million numbers, it's a safe bet to assume that the random number generator is broken.
Quote:
you can't compare a final encryption system as chacha20 with a basic brick like your PRNG.
You can run both through a test tool, such as your loop detector, and compare the differences. I bet you'll see quite similar behavior in number distribution over the first 4GB of output.
Quote:
if we used this PRNG as basic brick to create an encryption system, then linear cryptanalysis could break it easily.
Probably, but it was never supposed to be cryptographically secure.
Post Reply