
All times are UTC




PostPosted: Sat Oct 20, 2018 1:47 pm 

Joined: Sat Oct 20, 2018 1:39 pm
Posts: 3
Hello! I have been playing around with the visual6502 simulation code. I ported it to Lua in an attempt to understand it better (which worked). However, in testing some things I discovered something that confused me: shuffling the updates done in recalcNodeList(), or even just shuffling the allNodes() recalc done in initChip(), causes the simulation to fail. Which assumption(s) that the simulation relies on are these shuffles breaking? It seems like the node numbering is significant.


PostPosted: Sat Oct 20, 2018 2:06 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Oh dear, that would not be good news. Because the simulation iterates (after each input change) until nothing's changing any more, it's always been hoped, and believed, that the order of evaluation doesn't matter. Certainly, the node order is more or less arbitrary - no special care has been taken as far as I know.

Can you give any details, or examples, so we could try to track this down in the JavaScript?

(If you know C, you can similarly experiment with Michael Steil's perfect6502, a port to C of the JS original. It's faster, of course.)
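
To make the "iterate until nothing's changing" idea concrete, here is a toy sketch. It is not the visual6502 code: the three gates, the node names a to d and the settle() helper are all invented for illustration. It suggests that a purely combinational network ends up in the same final state for either evaluation order, while the intermediate states (and the number of passes) can differ.

```javascript
// Toy settle loop, NOT the visual6502 algorithm: hypothetical gates and node names.
// Each gate computes one output node from the current node states.
const gates = [
  { out: 'b', fn: s => !s.a },        // b = NOT a
  { out: 'c', fn: s => !s.b },        // c = NOT b
  { out: 'd', fn: s => s.a && s.c },  // d = a AND c
];

// Re-evaluate the gates in the given order, updating in place,
// until a full pass changes nothing.
function settle(state, order, maxPasses = 100) {
  const s = { ...state };
  for (let pass = 1; pass <= maxPasses; pass++) {
    let changed = false;
    for (const i of order) {
      const g = gates[i];
      const v = Boolean(g.fn(s));
      if (s[g.out] !== v) {
        s[g.out] = v;
        changed = true;
      }
    }
    if (!changed) return { state: s, passes: pass };
  }
  throw new Error('did not settle');
}

const start = { a: true, b: true, c: true, d: false };
const forward = settle(start, [0, 1, 2]);
const backward = settle(start, [2, 1, 0]);

console.log('passes:', forward.passes, backward.passes);
console.log('same final state:',
  JSON.stringify(forward.state) === JSON.stringify(backward.state));
```

In the backward order, node d changes more than once before it settles. In a circuit with state-holding nodes, a transient like that could in principle be captured, which is one way evaluation order might end up mattering.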


PostPosted: Sat Oct 20, 2018 2:06 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Oh, and welcome to the forum!


PostPosted: Sat Oct 20, 2018 3:50 pm 

Joined: Sat Oct 20, 2018 1:39 pm
Posts: 3
Thanks! I do know C, but I've been playing with the JS version, so I'll stick with that for now. (:

I may have spoken a bit too soon. It seems that although the states the simulator goes through are different (I checksum a list of all high nodes), the output is the same, at least up to the 10k cycles I was testing just now. I do occasionally see different values in some registers (X, Y, SP), but it doesn't seem to have affected the program's execution (the PC and memory match).
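
The shim below calls adler32() without defining it, so that function is presumably provided by one of the concatenated files. For anyone reproducing this without it, here is a minimal sketch of a standard Adler-32 over a string. The name adler32Hex and the 8-digit hex output format are assumptions made here, so the values need not match the checksums printed further down.

```javascript
// Standard Adler-32 over the character codes of a string.
// Assumes ASCII input, which holds for a comma-separated list of node numbers.
// adler32Hex is a hypothetical name, not a function from visual6502.
function adler32Hex(text) {
  const MOD = 65521; // largest prime below 2^16
  let a = 1;
  let b = 0;
  for (let i = 0; i < text.length; i++) {
    a = (a + text.charCodeAt(i)) % MOD;
    b = (b + a) % MOD;
  }
  // Combine as (b << 16) | a, force to unsigned, print as 8 hex digits.
  const sum = ((b << 16) | a) >>> 0;
  return sum.toString(16).padStart(8, '0');
}

console.log(adler32Hex('1,5,42'));
```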

I'm using nodejs and a small shim script to run the simulator from the command line. The program being executed by the chip is just the small one included in testprogram.js.

Code:
V6502DIR=../visual6502

cat \
    $V6502DIR/segdefs.js \
    $V6502DIR/transdefs.js \
    $V6502DIR/nodenames.js \
    $V6502DIR/chipsim.js \
    $V6502DIR/macros.js \
    $V6502DIR/wires.js \
    $V6502DIR/testprogram.js \
    shim.js | node -


shim.js:
Code:
golden = 'cdf976ab';

now = Date.now;
function setCellValue() {}
function selectCell() {}
function refresh() {}
function setStatus(a, b, c) {
    var chk = adler32(activeNodes());
    if (chk == golden) chk += ' OK';
    console.log((a+' '+b+' '+c).replace('&#8209', '-').replace(/ Hz:.*/, '')+' '+chk);
}
consolebox = {};
userCode = [];
userResetLow = undefined;
userResetHigh = undefined;

function activeNodes() {
    var list = [];
    for (var n in nodes) {
        if (nodes[n].state)
            list.push(parseInt(n));
    }
    // by default, list.sort() compares as string (even when given numbers)
    list.sort(function(a, b) {
        return a < b ? -1 : 1;
    });
    return list.join(',');
}

function shuffle(a) {
    var j, x, i;
    for (i = a.length - 1; i > 0; i--) {
        j = Math.floor(Math.random() * (i + 1));
        x = a[i];
        a[i] = a[j];
        a[j] = x;
    }
    return a;
}

setupNodes();
setupTransistors();
loadProgram();
initChip();
for (var i = 0; i < 10000; i++) {
    halfStep();
    cycle++;
    chipStatus();
}
chipStatus();

console.log('memory', memory.slice(0, 32).join(','));


Shuffle test locations:
Code:
// in macros.js, replace
   recalcNodeList(allNodes());
// with
   var list = allNodes(); shuffle(list);
   recalcNodeList(list);

// and in chipsim.js, just insert the call at the top of the function
function recalcNodeList(list){
    shuffle(list);


With no shuffles, the last lines of the simulation are:
Code:
 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:b3 Y:4d SP:fb Nv-BdIzc cdf976ab OK
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,


The 242 is the number the test program is incrementing.

And here are some examples with both shuffles enabled:
Code:
 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:b3 Y:4d SP:fb Nv-BdIzc 597c751f
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,

 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:56 Y:8d SP:fb Nv-BdIzc 14b177ac
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,

 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:b3 Y:4d SP:fb Nv-BdIzc 3b7571a4
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,

 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:c3 Y:8d SP:fb Nv-BdIzc 842d700e
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,

 halfcyc:10000 phi0:0 AB:000f D:f2 RnW:0  PC:0014 A:16 X:c3 Y:7d SP:1f Nv-BdIzc a62275c9
memory 169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,242,232,136,230,15,56,105,2,96,,,,,,,,


I'm assuming this isn't caused by the way I've pulled the simulator out of the rest of the code, but that is entirely possible. Perhaps I'll check against the C version.

I'm not convinced I know what's going on at all here. Intuitively it makes sense to me that different converged states could lead to the same behaviors (some state information just gets discarded), but it is kind of disconcerting to see different values showing up in the registers.

Update before I actually post: it happens in the C version too. Like with the JS version, it doesn't seem to actually cause the simulation to diverge (aside from X, Y, and occasionally SP containing garbage). I took it out to 100k and 1M cycles and it seemed like it was behaving?
Code:
halfcyc:100017 phi0:0 AB:0016 D:69 RnW:1 PC:0016 A:EB X:FA Y:06 SP:FB P:15 IR:69
169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,58,232,136,230,15,56,105,2,96,0,0,0,0,0,0,0,0

halfcyc:100017 phi0:0 AB:0016 D:69 RnW:1 PC:0016 A:EB X:7A Y:CC SP:38 P:15 IR:69
169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,58,232,136,230,15,56,105,2,96,0,0,0,0,0,0,0,0

The four tests I ran at 1M cycles actually all came out identical:
Code:
halfcyc:1000017 phi0:0 AB:01FD D:10 RnW:1 PC:0004 A:43 X:C1 Y:3F SP:10 P:14 IR:20
169,0,32,16,0,76,2,0,0,0,0,0,0,0,0,1,232,136,230,15,56,105,2,96,0,0,0,0,0,0,0,0


PostPosted: Sat Oct 20, 2018 3:57 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Hmm, well, I don't like the sound of register values coming out different! Can a modest-length simulation show the same? I'm hoping we can track down the point of divergence somehow. Looks like we'll also need to use a PRNG which gives us control of the seed.
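
As a sketch of what a seed-controlled shuffle could look like: the block below pairs mulberry32 (a small, widely circulated 32-bit PRNG, swapped in here for Math.random) with the same Fisher-Yates loop as shuffle() in shim.js. The names seededShuffle and the seed 12345 are made up for illustration.

```javascript
// mulberry32: a small seeded 32-bit PRNG (not cryptographic).
// Returns a function producing floats in [0, 1), like Math.random.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same Fisher-Yates shuffle as in shim.js, but with an injectable random source.
function seededShuffle(a, rand) {
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const x = a[i];
    a[i] = a[j];
    a[j] = x;
  }
  return a;
}

// Two runs with the same seed give the same permutation, so a failing
// order can be replayed exactly.
const first = seededShuffle([1, 2, 3, 4, 5, 6, 7, 8], mulberry32(12345));
const second = seededShuffle([1, 2, 3, 4, 5, 6, 7, 8], mulberry32(12345));
console.log(first.join(','));
console.log(second.join(','));
```

Logging the seed next to each run's checksum would then let a diverging run be reproduced and bisected by half-cycle.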

As it happens, Dave (hoglet) has found a need to improve the resolution of node values, for a different MPU. I wonder if that will solve this one too. See this commit.


PostPosted: Sat Oct 20, 2018 6:12 pm 

Joined: Sat Oct 20, 2018 1:39 pm
Posts: 3
I suspect the garbage in the registers is due to settling during initialization. The test program doesn't store into X and Y; it only in/decrements them.

I also tested splitting recalcNodeList() into two pieces: the first which calculates the new states of node groups, and the second which updates the transistors controlled by individual nodes. This removes the dependence on evaluation order, and it still seems to simulate equivalently. It's basically treating the transistors as being "slower" than any path of propagation of state through nodes. I don't think it's necessarily better than the other way though. The visual6502 way is still completely deterministic unless you switch around node numbers (untested), or deliberately shuffle the recalc order (tested above).
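
Here is a toy sketch of that two-phase idea. It is not the actual change to recalcNodeList(): the gates, node names and helpers are hypothetical, and the real split is between node-group states and transistor switches rather than between gate outputs. Phase 1 computes every new value from a frozen snapshot, and phase 2 commits them all at once, so the order within a step cannot affect the result.

```javascript
// Hypothetical toy network: each gate computes one output node.
const gates = [
  { out: 'b', fn: s => !s.a },        // b = NOT a
  { out: 'c', fn: s => !s.b },        // c = NOT b
  { out: 'd', fn: s => s.a && s.c },  // d = a AND c
];

// One two-phase step: phase 1 reads only the old snapshot,
// phase 2 commits all pending values together.
function twoPhaseStep(state, order) {
  const pending = {};
  for (const i of order) {
    pending[gates[i].out] = Boolean(gates[i].fn(state)); // phase 1: compute
  }
  return { ...state, ...pending };                        // phase 2: commit
}

// Record every intermediate state until a step changes nothing.
function trace(state, order, maxSteps = 100) {
  const seen = [JSON.stringify(state)];
  let s = state;
  for (let n = 0; n < maxSteps; n++) {
    const next = twoPhaseStep(s, order);
    const key = JSON.stringify(next);
    if (key === seen[seen.length - 1]) return seen;
    seen.push(key);
    s = next;
  }
  throw new Error('did not settle');
}

const start = { a: true, b: true, c: true, d: false };
const t1 = trace(start, [0, 1, 2]);
const t2 = trace(start, [2, 1, 0]);
console.log('identical traces:', t1.join('|') === t2.join('|'));
console.log('final state:', t1[t1.length - 1]);
```

The trade-off is that every change takes a full step to propagate, so settling can take more steps than an in-place update that happens to evaluate in a lucky order.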


PostPosted: Sat Oct 20, 2018 8:42 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Interesting, thanks. Sounds like it's worth (me or you or someone) doing some experimenting with code which does initialise the programmer-visible state.

For example this program computes pi:
http://goo.gl/FXuoI
(not a great choice of test program, but I knew where to find it)


PostPosted: Mon Apr 12, 2021 8:27 am 

Joined: Thu Jul 20, 2017 9:58 am
Posts: 91
BigEd wrote:
Oh dear, that would not be good news. Because the simulation iterates (after each input change) until nothing's changing any more, it's always been hoped, and believed, that the order of evaluation doesn't matter. Certainly, the node order is more or less arbitrary - no special care has been taken as far as I know.


I can confirm that the order of evaluation DOES MATTER! :(

I have created a netlist for the MOS8501R4 and started debugging it with M. Steil's perfect6502 code. What I can say so far is:

The CPU crashes right after the reset sequence with a wrong PC, due to a glitch on a few signals.
BUT: If I reverse the order of evaluation, the CPU enters the user code correctly, with no glitch.
I have already tracked down the datapath control lines responsible for this misbehavior, but I haven't nailed down the exact line and timing yet.


PostPosted: Mon Apr 12, 2021 12:02 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
ooh...


PostPosted: Mon Apr 12, 2021 12:31 pm 

Joined: Thu Jul 20, 2017 9:58 am
Posts: 91
Here's a snippet of a problematic section:

Attachment: Glitch.png


Here's the output of one bit of the ALU with the two signals ADD_ADL and ADD_SB.

Unfortunately, either ADL or SB gets tied to ground before the input lines of the ALU are disconnected, leading to a destroyed PC on the rising edge of PHI2. If I simply reverse the order of evaluation in the code, the PC doesn't get destroyed.

In order to track this down I added some dump functionality for specific signals, and I can see that a few signals change state TWO times in one iteration.


PostPosted: Tue Apr 13, 2021 6:29 am 

Joined: Thu Jul 20, 2017 9:58 am
Posts: 91
I have now tracked down one problematic section and can explain it.

Some background info:
In the simulation, signals propagate iteratively from a set of (connected) nodes (let's call it Group #A) to the nodes affected by changes of the gates connected to Group #A. Let's call this Group #B (or #(A+1), as it's the next node group that will be evaluated).

The problem:
We have the non-overlapping clock signals PHI1 and PHI2, which means that one of them goes high only after the other one has gone low. Let's assume PHI1 lies in Group #X. Because the signals are non-overlapping, PHI2 has to lie in Group #(X+1) at the earliest (that is, it changes at least one iteration later).

So far no problem... BUT: the registers for the datapath control lines also generate and use a signal PHI1#, which is derived directly from PHI1. Hence it lies in Group #(X+1) too.

In my case the problem stems from the fact that PHI2 goes high BEFORE PHI1# goes high in the simulation. PHI1# is used to invalidate some signals controlling the ALU input, and PHI2 is used to control the ALU output.

In other words: the ALU output gets connected to an internal bus before the input to the ALU has been disconnected.

Attachment: Race_Condition.jpg
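
The race can be boiled down to a toy model, sketched below in JavaScript to match the shim earlier in the thread. This is not the 8501 netlist or the perfect6502 code; the function applyGroup and the two flags are hypothetical. Both updates land in the same iteration group, so the list order alone decides whether the ALU input and output are ever connected at the same time.

```javascript
// Toy model of two updates that fall into the same iteration group.
// Flag names and applyGroup are invented for illustration.
function applyGroup(order) {
  const s = { aluInputConnected: true, aluOutputConnected: false };
  let overlap = false;
  for (const sig of order) {
    if (sig === 'PHI1#') s.aluInputConnected = false; // PHI1# invalidates the ALU input controls
    if (sig === 'PHI2') s.aluOutputConnected = true;  // PHI2 gates the ALU output onto the bus
    // Check after every single update, as an in-place simulator would see it.
    if (s.aluInputConnected && s.aluOutputConnected) overlap = true;
  }
  return overlap;
}

console.log('PHI2 first: ', applyGroup(['PHI2', 'PHI1#'])); // true: input and output overlap
console.log('PHI1# first:', applyGroup(['PHI1#', 'PHI2'])); // false: no overlap
```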


PostPosted: Sun Apr 18, 2021 6:43 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
IIRC, those datapath control drivers are circuits which show evidence of late changes in the 6502 layout, so possibly this race was also present in early prototypes.
http://visual6502.org/wiki/index.php?ti ... timing_fix


PostPosted: Sun Apr 18, 2021 7:27 pm 

Joined: Thu Jul 20, 2017 9:58 am
Posts: 91
I was able to work around that issue by disconnecting the on-chip clock generator and generating both signals (PHI1 and PHI2) manually.
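
Roughly speaking, driving both clocks by hand amounts to stepping through a fixed sequence with a dead time between the phases. The sketch below is in JavaScript to match the shim earlier in the thread, not the C I actually changed; PHASES and twoPhaseClock are hypothetical names, and how the values get written to the PHI1/PHI2 nodes is left out.

```javascript
// Four-step cycle with a dead time (both low) between the two phases,
// so PHI1 and PHI2 are never high at the same time.
const PHASES = [
  { phi1: 1, phi2: 0 },
  { phi1: 0, phi2: 0 }, // dead time
  { phi1: 0, phi2: 1 },
  { phi1: 0, phi2: 0 }, // dead time
];

// Hypothetical generator: yields the clock levels for each simulation step.
function* twoPhaseClock(steps) {
  for (let i = 0; i < steps; i++) {
    yield PHASES[i % PHASES.length];
  }
}

for (const { phi1, phi2 } of twoPhaseClock(8)) {
  // A real driver would set the two clock nodes here and let the chip settle.
  console.log('PHI1=' + phi1 + ' PHI2=' + phi2);
}
```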

After fixing a "flaw" in the layout of the 8501R4 (obviously present in ALL 850x derivatives), which made my pull-up detection fail, I got the simulation running.

