I've been looking at how important the seeding process is and the need to "warm up" the RNG after seeding it. I used the 32-bit version that BDD is using for this testing, as it's probably the worst case as it has the most state.
The TL;DR: whatever you do with seeding,
I'd recommend running the RNG at least 8 times (preferably 16) straight away afterwards and discarding the results, because the first samples can show significant bias based on the seed. A second point is that
it doesn't matter too much exactly how you set the state from your seed (whether you spread the entropy around the state vector's bits, etc.) - it affects things a bit, but the best way to do this shuffling is probably just to run the RNG itself, because mixing bits is exactly what it's good at.
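As a concrete sketch of that advice, here's roughly what "seed, then warm up" looks like in Python. The real generator in this thread is a 128-bit-state RNG (eight 16-bit words) whose step function isn't shown here, so the xorshift below is a stand-in of my own, not the actual algorithm:

```python
# Hedged sketch of "seed, then warm up". xorshift_step is a stand-in step
# function, NOT the generator discussed in the post; only the pattern of
# seeding lazily and discarding early outputs is the point.

WARMUP = 16  # discard this many outputs after seeding

def xorshift_step(state):
    """Advance the state once and return a 32-bit output (stand-in only)."""
    x = (state[0] | (state[1] << 16)) or 1  # avoid the all-zero fixed point
    x ^= (x << 13) & 0xFFFFFFFF
    x ^= x >> 17
    x ^= (x << 5) & 0xFFFFFFFF
    state[0], state[1] = x & 0xFFFF, (x >> 16) & 0xFFFF
    return x

def seeded_rng(seed):
    """Lazy seeding: seed into state[0], then run and discard WARMUP outputs."""
    state = [0] * 8
    state[0] = seed & 0xFFFF
    for _ in range(WARMUP):
        xorshift_step(state)  # throw the early, seed-correlated samples away
    return state
```

After the warm-up loop, two different seeds have diverged well beyond the single word they differed in.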
I experimented with - but didn't observe any benefit in - picking different "LDA" values. It does affect the quality of the random sequences overall (usually negatively) but it doesn't seem to have a noticeable effect on the correlation issues I was looking at here.
As an initial illustration, here's a graph showing successive 32-bit values (scaled down, of course!) returned immediately after setting the seed very lazily to all zeros but with a small number (1, 2, 3, ..., 8) in state vector entry 0:

(Attachment: sc2.png)
It's pretty clear that the first few samples are all very similar regardless of the change in seed value; only after 4 or 5 samples do things start to look better. Of course this is only looking at 8 different seed values, and not making any effort to spread the bits around even the first 16-bit word, let alone the whole 128-bit vector.
It's hard to visualise a large number of seeds, so what I decided to do was to plot - for each of 2048 different seed values horizontally - the first value returned vertically, then repeat for the second value on another frame, the third on another, and so on, up to 8 frames, and put it all in an animated GIF. So the animation shows, for each seed, the evolution of the values returned over the first 8 calls to the random function. The correlations with the seed values are always clearly visible in the first few frames, and you can see how many frames it takes to get a random-looking arrangement for the various strategies of setting the whole seed vector.
I would again like to highlight that even in the worst case there's no visible pattern after 16 samples, and that even in the best case it still takes 6-8 samples before the pattern disappears. This is why I advise always discarding the first 8 samples; if you discard the first 16, you can worry less about how you actually populated the seed vector.
These are animated GIFs but you might need to click through to play them and see all the frames.
First up - this is setting state[0] to the seed number (0-2047) directly, with the other state entries all set to zero. It is pretty much the worst thing you can do (actually setting a later state entry is worse...) and you can see it takes until about frame 6 before you see any irregularity in the pattern at all, and frame 8 before it looks fairly random. I put extra frames in this one because it was so slow to become random.
Next - still only setting state[0], but scaling the seed number to span the whole range of 16-bit values. This is quite a lot better, with irregularities appearing by frame 4, and frame 6 being the last one that doesn't look random to my eye.
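For illustration, the scaling step might look like this. The post doesn't state the exact scale factor; with 2048 seeds, multiplying by 32 spans the 16-bit range, so that's what I assume here:

```python
def seed_scaled(seed):
    """Scale an 11-bit seed (0-2047) up so it spans the full 16-bit word.
    The factor of 32 is my assumption: 2048 * 32 = 65536."""
    state = [0] * 8
    state[0] = (seed * 32) & 0xFFFF
    return state
```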
Now I did something like state[0] = seed + (seed << 4) + (seed << 8) + (seed << 12) so that the most varying bits are spread more widely over the word. It is a bit different but still pretty much takes until frame 7 to look random.
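That mixing expression, written out (masked to 16 bits, which I'm assuming the real code gets for free from its word size):

```python
def seed_mixed(seed):
    """Spread the seed's low bits across the word by summing shifted copies,
    as in the post: seed + (seed << 4) + (seed << 8) + (seed << 12)."""
    state = [0] * 8
    state[0] = (seed + (seed << 4) + (seed << 8) + (seed << 12)) & 0xFFFF
    return state
```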
What if we go back to the version that scaled the seed up to span the whole 16-bit value, but now write that same value to all 8 state vector entries? I would say this still takes until frame 7 to look reasonably random, but it is easier to do and saves a lot of bitshifting.
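In code, that replication strategy is just (again assuming the ×32 scaling from before):

```python
def seed_replicated(seed):
    """Write the same scaled seed value into every one of the 8 state words."""
    return [(seed * 32) & 0xFFFF] * 8
```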
Now the same thing but with the "mixed up" version that added bitshifted copies of seed to spread the entropy around. This is now better and frame 6 looks pretty random:
And finally, I changed it to set different values in all of the state entries, still based on the same 11-bit seed number but just adding more variety across the state vector. This performs similarly to the previous one, I think, perhaps looking almost random on frame 5 but not quite - and, critically, it is still really bad before that point:
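The post doesn't give the exact formula used for this last variant, but one hypothetical way to derive a different value for each state word from the same 11-bit seed is to add a different odd multiple per entry:

```python
def seed_varied(seed):
    """Hypothetical per-entry variation (not the post's actual formula):
    start from the mixed value and add a different odd multiple for each
    state word (0x9E37 is an arbitrary odd constant), so all 8 entries
    are guaranteed distinct."""
    mixed = (seed + (seed << 4) + (seed << 8) + (seed << 12)) & 0xFFFF
    return [(mixed + i * 0x9E37) & 0xFFFF for i in range(8)]
```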
I think the summary is pretty much what I said in the TL;DR at the top: discard about 16 samples - it is the quickest and easiest way to avoid these problems. Aside from that, so long as you find a decent way to get a unique number of up to 64 bits to use as a seed, you can just write it straight into the bottom of the state vector (in this version of the algorithm - some others would want it at the top), and you're guaranteed to get a unique sequence of numbers as a result.