internal states

ptorric · Post by **ptorric** » Mon Dec 21, 2009 7:59 pm

i think this forum is really very interesting; i'm sorry i cannot contribute much.

professionally i'm software oriented; i also love hardware, but i have only basic (old school) electronics competence.

so, i'd like to understand the internal 6502 process better, because i can understand what an alu or a buffer is, but i can't find a good explanation of the internal decode / state of a cpu, for example a simple one like our beloved 6502.

in a wdc pdf diagram, there is a timing unit that drives most of the internal processes, but i think there must be an internal state that guides instruction execution, because the timing unit seems to be driven only by the external clock.
if i understand it well (a big if), instruction execution must be done as a sequence of micro commands, but i cannot see how the cpu knows what's done and what's still to do.
for example:

lda $1000

needs a lot of steps, done over more than one clock.

as usual, sorry for my english.
thanks to all who want to explain it to me!

BigEd · Post by **BigEd** » Mon Dec 21, 2009 11:06 pm

Have a read of the article on pagetable where it describes the internal states T1-T7 (and a couple of others) - the big schematic of the 6502 shows the shift register which counts these states.

In effect, the 6502 is counting cycles from the SYNC cycle, through to the next SYNC, orchestrating the memory accesses, ALU ops and register transfers for each state.

ptorric · Post by **ptorric** » Tue Dec 22, 2009 9:07 am

BigEd wrote:

Have a read of the article on pagetable where it describes the internal states T1-T7 (and a couple of others) -

now it's a bit more clear: in this schematic i can see that there is a way for the decode rom to drive timing, via the "random control logic" (strange name ;) )

GARTHWILSON · Post by **GARTHWILSON** » Tue Dec 22, 2009 9:57 am

This topic is related.

The low number of clock cycles required to carry out an instruction on the 6502 is something that often mystifies newcomers who have already learned other, less efficient processors like the Z80. The 6502 can do more than one operation per clock, so an instruction takes very few clocks to fetch, decode, and execute. ADC# (add with carry, immediate) is an example given in the programming manual, which says it requires five distinct steps, and yet it completes in two clocks, meaning 2 µs at 1 MHz, 100 ns at 20 MHz, etc. That count of five could be increased if you add the incrementing of the program counter and the implied automatic CMP#0 instruction; so conceivably you could say it does at least 8 or 9 operations in two clocks' time. This is partly why even in 1980, a Z80 had to run at 4 MHz or more to keep up with a 1 MHz 6502 in terms of how long it took to get a job done.
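The timing arithmetic above can be checked with a short sketch. The function name `instruction_time_ns` is made up for illustration, and the only figure taken from the post is the two-clock cost of ADC immediate; the 4 MHz row is just an extra data point, not a measured Z80 comparison.

```python
def instruction_time_ns(cycles: int, clock_hz: float) -> float:
    """Wall-clock time, in nanoseconds, for an instruction taking `cycles` clocks."""
    return cycles * 1e9 / clock_hz


# ADC immediate takes two clocks on the 6502 (figure from the post above).
ADC_IMMEDIATE_CYCLES = 2

for mhz in (1, 4, 20):
    ns = instruction_time_ns(ADC_IMMEDIATE_CYCLES, mhz * 1e6)
    print(f"{mhz:>2} MHz: {ns:.0f} ns")
```

The time scales linearly with the cycle count and inversely with the clock frequency, which is all the "2 µs at 1 MHz" statement relies on.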

There are no internally generated clock signals, at least not in the sense of deriving higher internal frequencies or splitting the clock into four phases or anything like that. The 6502 also does not use microcode. The logic is all in hardware. WDC's data sheet tells what is on the bus in individual clocks; but where the processor needs an extra clock here or there before it is ready for the next bus transaction, the data sheet just says "IO" for "internal operation," with no description of what that is.

BigEd · Post by **BigEd** » Tue Dec 22, 2009 11:05 am

Google's translation of Beregnyei Balazs' 6502 reverse-engineering site isn't excellent, but you can make out some of the story about the T-state shift register here - including the RDY signal acting as a multiplexor, to stop or allow the shift. Oddly, the RDY input pad is missing - the schematic isn't complete.

(I should say, the T-states are labels for full clock cycles, there is indeed no internal fast clock as with the early x86.)

If you want to know more about MOS technology, and how transistors are used to make up logic gates, see here

Although it's said (repeatedly and emphatically) that the 6502 isn't micro-coded, it is notable that the bulk of the control complexity is hidden in the large PLA, which is a regular structure, and which takes six of the T-state signals as inputs. So, not an addressable ROM, and not built by writing microcode, but nonetheless a structured approach which must have helped to get the product to market.
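A rough way to picture the T-state shift register is as a one-hot register that shifts one place per clock, holds while RDY is low, and returns to its first position when the instruction ends. The sketch below is only a behavioural toy under those assumptions: the function name, the six-entry width (borrowed from the six T-state inputs to the PLA) and the `end_of_instruction` flag are illustrative, and the real chip's timing logic has special cases this ignores.

```python
def next_t_state(t_state: list[int], rdy: bool, end_of_instruction: bool) -> list[int]:
    """Return the one-hot T-state register value for the next clock cycle."""
    if not rdy:
        # RDY low: recirculate the current value, so the state does not advance.
        return list(t_state)
    if end_of_instruction:
        # Restart the count for the next opcode fetch.
        return [1] + [0] * (len(t_state) - 1)
    # Normal case: shift the single 1 along by one position.
    return [0] + t_state[:-1]


state = [1, 0, 0, 0, 0, 0]
state = next_t_state(state, rdy=True, end_of_instruction=False)   # advances one place
state = next_t_state(state, rdy=False, end_of_instruction=False)  # held by RDY
state = next_t_state(state, rdy=True, end_of_instruction=True)    # back to the start
```

The `rdy` branch plays the role of the multiplexer: it selects between the shifted value and the unchanged one.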

As for the T-states, I'll see if I can sketch an operation like your

1000    LDA $4321

as accurately as I can - some guesswork involved!

Cycle 0, start: PC value of 1000 placed on address bus, Sync high, RnW high.
Cycle 0, middle: computing PC=PC+1, possibly committing results of previous opcode to registers.
Cycle 0, end: capture data bus (opcode) into internal PreDecode Register

Cycle 1, start: PC value of 1001 placed on address bus, Sync low, RnW high, PreDecode Register transferred to Instruction Register
Cycle 1, middle: computing PC=PC+1, deriving control signals for next cycle
Cycle 1, end: capture data bus (low address operand) into internal Input Data Latch

Cycle 2, start: PC value of 1002 placed on address bus, Sync low, RnW high, Input Data Latch transferred to B input register
Cycle 2, middle: computing PC=PC+1, deriving control signals for next cycle, add 0 to B register
Cycle 2, end: capture data bus (high address operand) into internal Input Data Latch, capture adder output (low address operand) into Adder Hold Register

Cycle 3, start: Input Data Latch value placed on high byte of address bus, Adder Hold Register placed on low byte of address bus, Sync low, RnW high
Cycle 3, middle: deriving control signals for next cycle, PC not incremented
Cycle 3, end: capture data bus (content of $4321) into internal Input Data Latch

Cycle 4, start: PC value of 1003 placed on address bus, Sync high, RnW high, Input Data Latch transferred to Accumulator

Note that the final action of the opcode occurs during the first cycle of the next instruction. The control signals for this action were set up in the previous cycle, and this action will occur even if (for example) an interrupt is taken.
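The bus activity in the cycle-by-cycle sketch above can be replayed in a few lines of Python. This models only what appears on the address bus and the point at which the accumulator is written, and it inherits the same guesswork as the listing; the function name, the dictionary used as memory and the sample data byte $5A are assumptions for illustration. $AD is the opcode for LDA absolute.

```python
LDA_ABSOLUTE = 0xAD  # opcode for LDA with absolute addressing


def trace_lda_absolute(memory: dict[int, int], pc: int):
    """Replay one LDA absolute; return ([(cycle, address, sync)], accumulator)."""
    trace = []

    # Cycle 0: opcode fetch, SYNC high.
    trace.append((0, pc, True))
    opcode = memory[pc]
    assert opcode == LDA_ABSOLUTE, "this sketch only handles LDA absolute"
    pc = (pc + 1) & 0xFFFF

    # Cycle 1: fetch the low byte of the operand address.
    trace.append((1, pc, False))
    low = memory[pc]
    pc = (pc + 1) & 0xFFFF

    # Cycle 2: fetch the high byte of the operand address.
    trace.append((2, pc, False))
    high = memory[pc]
    pc = (pc + 1) & 0xFFFF

    # Cycle 3: read the data byte; PC is not incremented here.
    operand_address = (high << 8) | low
    trace.append((3, operand_address, False))
    data = memory[operand_address]

    # Cycle 4: next opcode fetch (SYNC high); the accumulator is written now.
    trace.append((4, pc, True))
    accumulator = data
    return trace, accumulator


memory = {0x1000: 0xAD, 0x1001: 0x21, 0x1002: 0x43, 0x4321: 0x5A}
trace, accumulator = trace_lda_absolute(memory, 0x1000)
for cycle, address, sync in trace:
    print(f"cycle {cycle}: address ${address:04X} sync={'high' if sync else 'low'}")
```

The last tuple in the trace belongs to the following instruction's fetch, which is where the write to the accumulator overlaps.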

Note that cycle 1 must always do the same work: there has not yet been time to act on the instruction fetch. This is why the second cycle always reads the byte after the opcode, and why NOP takes two cycles. An instruction which takes only two cycles (like, say, DEX) will use the ALU and commit the result in the third cycle, which overlaps with the fetch of the next instruction.

(I'm not quite sure what happens with PC. Here's a guess: at the end of Cycle 1 of a NOP, although PC=PC+1 has been computed for the second time, it won't be captured, and so the start of Cycle 2 will present the same singly-incremented value of PC onto the bus, to fetch the next instruction. The schematic shows 4 control signals for each of PC high and low, and the x3 signal might be used to accept or ignore the incremented value.)

ptorric · Post by **ptorric** » Wed Dec 23, 2009 9:01 pm

GARTHWILSON wrote:

This topic is related.

yes, i read it; i used the search function before bothering you.

thank you all for the replies. sorry, but i'm missing something, which i'll try to explain in one question: between two clock edges, what happens?
because the 6502 is static, i think there are internal states (i.e. t1-t8) implemented in some way that let the chip progress through its work, doing a lot of things.
how is it possible that, without a controlled (static) internal clock, a processor can progress through little steps like incrementing the pc after fetching an address?

anyway, where can i find other info?

btw: merry christmas to all, i like this forum very much

BigEd · Post by **BigEd** » Wed Dec 23, 2009 9:43 pm

Each circuit can do only one thing in a clock cycle. The PC incrementer can compute PC+1. The ALU can add, or subtract, or do a logical operation. So, there are many circuits, and each one does one specific operation, according to the control signals, which are computed as a function of the opcode sitting in the Instruction Register and of the T state (cycle counter, more or less.) Each register (or, generally, each storage node) can capture only one result, at the end of the clock cycle. Whether or not it captures, and from which of the several internal busses, and which circuits place their output values onto those busses, is again under the control of the control signals.
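One way to read this is that the control logic is a pure function of (opcode, T state). Below is a toy lookup table in that spirit for LDA absolute. The signal names are invented for readability and do not correspond to the real 6502 control lines, and the T-state numbers simply follow the cycle numbers in the sketch earlier in the thread.

```python
LDA_ABSOLUTE = 0xAD  # opcode for LDA with absolute addressing

# Hypothetical control signals asserted in each cycle of LDA absolute.
CONTROL_TABLE: dict[tuple[int, int], frozenset[str]] = {
    (LDA_ABSOLUTE, 1): frozenset(
        {"pc_to_address_bus", "increment_pc", "latch_data_bus"}
    ),
    (LDA_ABSOLUTE, 2): frozenset(
        {"pc_to_address_bus", "increment_pc", "latch_data_bus", "alu_pass_low_byte"}
    ),
    (LDA_ABSOLUTE, 3): frozenset(
        {"operand_to_address_bus", "latch_data_bus"}
    ),
}


def control_signals(opcode: int, t_state: int) -> frozenset[str]:
    """Combinational decode: same inputs always give the same set of signals."""
    return CONTROL_TABLE.get((opcode, t_state), frozenset())


for t in (1, 2, 3):
    print(t, sorted(control_signals(LDA_ABSOLUTE, t)))
```

Nothing in the function remembers anything between calls; the only state is the opcode held in the Instruction Register and the T-state counter that feed it.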

The only subtlety is that phase 1 and phase 2 are distinct, because of the style of storage elements used. Have a look at this lecture for example. (Edit: PDF here)
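For a feel of why the two phases matter, here is a generic two-phase latch model, not taken from the 6502 schematic: two level-sensitive latches are enabled on alternate phases, so a new value gets only halfway through during phase 1 and reaches the output during phase 2. The class and function names are illustrative.

```python
class TransparentLatch:
    """Level-sensitive latch: follows its input while enabled, holds otherwise."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, enable: bool, d: int) -> int:
        if enable:
            self.q = d
        return self.q


def phase1(first: TransparentLatch, second: TransparentLatch, d: int) -> int:
    """Phase 1 high: the first latch is transparent, the second holds its value."""
    first.update(True, d)
    return second.update(False, first.q)


def phase2(first: TransparentLatch, second: TransparentLatch) -> int:
    """Phase 2 high: the first latch holds, the second becomes transparent."""
    return second.update(True, first.q)


first, second = TransparentLatch(), TransparentLatch()
phase1(first, second, 7)
phase2(first, second)                             # 7 has now reached the output
visible_during_phase1 = phase1(first, second, 9)  # output still shows the old value
visible_after_phase2 = phase2(first, second)      # new value appears after phase 2
```

Because the two latches are never transparent at the same time, a value advances by exactly one storage stage per full clock cycle.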

BigEd · Post by **BigEd** » Thu Dec 24, 2009 1:53 pm

ptorric wrote:

the "random control logic" (strange name ;) )

Not so strange: much of the control is in the PLA, which is structured logic with a highly regular (and easily tweaked) layout. The rest of the control is in the form of cascades of complex logic gates (6 inputs, 8 inputs), more difficult to lay out and to modify - it's not unusual to call this random logic, although of course it isn't random.

ptorric · Post by **ptorric** » Thu Dec 24, 2009 2:02 pm

BigEd wrote:

ptorric wrote:

the "random control logic" (strange name ;) )

Not so strange:....

I was kidding!
BigEd, thanks for your info. i think i didn't explain myself well in my last post; right now i'm in a hurry 'cause i'm hunting for the last xmas gifts, so maybe i'll write a new post later.