Extra cycle in crossing page boundary question

Bryan Parkoff · Post by **Bryan Parkoff** » Thu Apr 16, 2015 2:13 am

I wonder why the STA instruction takes five clock cycles even when it DOES NOT cross a page boundary. If crossing the page boundary puts an invalid address on the address bus, then the R/W signal is ALWAYS read during clock cycle 4 before it is switched to write during clock cycle 5. That is good!

However, I believe the 6502 designers decided to cut the transistor count from thousands down to 3510. Unfortunately, they were unwilling to implement a four-cycle STA for the case where the page boundary is NOT crossed, with the R/W signal switched to write during clock cycle 4. Had they done so, they would have needed hundreds more transistors in order to save clock cycles.

Why didn't the 6502 designers leave the reserved instructions unimplemented and add protection such as an exception handler? As it is, they let the 6502 misbehave in minor ways or crash outright, as with the HLT instruction. They wanted to reduce the number of transistors or save a few gates.

Perhaps they did not have time to finish drawing the complete schematic by hand before they released the 6502 to customers.

Would you be willing to add the extra transistors and build a revised 6502D from scratch?

Do you have any idea why the ASL, LSR, ROL, and ROR instructions do not support indexed indirect and indirect indexed addressing, which would take a total of 8 clock cycles?

Bryan

GARTHWILSON · Post by **GARTHWILSON** » Thu Apr 16, 2015 5:59 am

Quote:

However, I believe the 6502 designers decided to cut the transistor count from thousands down to 3510.

The NMOS 6502 apparently had about 9,000 transistors, about 3500 gates. Ed, who's on the visual6502 project, can probably tell us.

Quote:

I wonder why the STA instruction takes five clock cycles even when it DOES NOT cross a page boundary.

STA absolute always takes 4 clocks, and as far as I can tell, its other addressing modes' numbers of clocks do not vary based on whether or not indexing results in a page crossing.

Quote:

Why didn't the 6502 designers leave the reserved instructions unimplemented and add protection such as an exception handler? As it is, they let the 6502 misbehave in minor ways or crash outright, as with the HLT instruction. They wanted to reduce the number of transistors or save a few gates.

Perhaps they did not have time to finish drawing the complete schematic by hand before they released the 6502 to customers.

I suspect it had more to do with cost and with not doing something that would pull the maximum clock speed down. An invalid op code read would probably indicate that problems exist that are too big to remedy with a HaLT interrupt. I've never had a need for such an interrupt myself.

Dr Jefyll · Post by **Dr Jefyll** » Thu Apr 16, 2015 6:11 am

Hi Bryan. There's a two-page article here that talks about some of the early 6502 design decisions.

You mentioned indexed STA taking 5 cycles, and I guess it's the absolute, X and absolute, Y indexed modes you mean. It sounds like you maybe already have the answer, but just in case...

The potential for page crossings creates uncertainty about how soon the complete, proper address will become available. The address presented in cycle 4 may or may not be correct, which would mean problems if writes were allowed then. So they're not. A dummy read occurs instead, simply because we're forced to wait, and with 6502 all cycles not writes must be reads -- there's no control pin that can tell the bus to do nothing. Then the write occurs on cycle 5, when the address is sure to be correct.

An absolute indexed read such as LDA can sometimes omit cycle 5. By the end of cycle 4 it'll be known whether that cycle's read had the complete, valid address. If it did, there's no reason to re-run the cycle.
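
As a rough illustration of the bus activity described above, here is a minimal Python sketch of the NMOS 6502 absolute,X sequence. The function name and the trace format are made up for this example, X is assumed to be in the range 0..255, and only addresses and the R/W line are modelled, not data.

```python
def absolute_x_bus_trace(opcode_addr, base, x, is_store):
    """Return a list of (cycle, address, 'R' or 'W') for an abs,X access.

    Hypothetical helper for illustration; models addresses and R/W only.
    """
    lo = (base & 0xFF) + x
    carry = lo > 0xFF                       # True when a page boundary is crossed
    # Address with X added to the low byte only; high byte not yet fixed up.
    unfixed = (base & 0xFF00) | (lo & 0xFF)
    effective = (base + x) & 0xFFFF

    trace = [
        (1, opcode_addr, "R"),                 # fetch opcode
        (2, (opcode_addr + 1) & 0xFFFF, "R"),  # fetch low byte of base address
        (3, (opcode_addr + 2) & 0xFFFF, "R"),  # fetch high byte of base address
        (4, unfixed, "R"),                     # read; may be in the wrong page
    ]
    if is_store:
        trace.append((5, effective, "W"))      # write once the address is certain
    elif carry:
        trace.append((5, effective, "R"))      # re-run the read in the right page
    return trace


if __name__ == "__main__":
    for label, store in (("STA", True), ("LDA", False)):
        for base, x in ((0x1000, 0x05), (0x10F0, 0x20)):
            trace = absolute_x_bus_trace(0x8000, base, x, store)
            steps = " ".join(f"{c}:{a:04X}{rw}" for c, a, rw in trace)
            print(f"{label} ${base:04X},X (X=${x:02X}): {steps}")
```

For a store the trace always has five entries, with the write last. For a load the fifth entry appears only when adding X carries into the high byte.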

Quote:

Do you have any idea why the ASL, LSR, ROL, and ROR instructions do not support indexed indirect and indirect indexed addressing, which would take a total of 8 clock cycles?

This and some of your other comments have light shed upon them by the article. For instance Chuck Peddle mentions that, for economy reasons, the design was purposely limited to seven states, meaning an 8-cycle instruction would be impossible. [Edit: relating states directly to cycles was overly simplistic of me. See Ed's comments below.]

cheers,
Jeff

BigEd · Post by **BigEd** » Thu Apr 16, 2015 7:39 am

Dr Jefyll wrote:

The potential for page crossings creates uncertainty about how soon the complete, proper address will become available. The address presented in cycle 4 may or may not be correct, which would mean problems if writes were allowed then. So they're not.

Quite so - the crucial point is that the CPU does not yet know if the page boundary was crossed. Of course, if more transistors were spent on this, or on other things, it could have been otherwise: but the whole point of the original 6502 was to be cheap, and to do that it had to be small, and that limited the amount of logic on chip. The photo at Jeff's link shows the (3 inch? 4 inch??) wafer: it's useful to picture how many complete rectangular die you can fit on such a wafer, and to realise that a small change in die size can make a big difference. Making an array of rectangles fit on a circle is a challenge!
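
To get a feel for the die-size point, here is a small Python sketch that counts whole rectangular dies inside a circular wafer. It is only a toy model under stated assumptions: the function name is invented, the grid is simply aligned to the wafer centre (not an optimal placement), edge exclusion and scribe lanes are ignored, and the 76.2 mm (3 inch) wafer and the die sizes in the demo are illustrative guesses, not figures for the real 6502.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_w_mm, die_h_mm):
    """Count whole dies, on a grid aligned to the wafer centre, inside the circle."""
    r = wafer_diameter_mm / 2.0
    cols = int(r // die_w_mm) + 1
    rows = int(r // die_h_mm) + 1
    count = 0
    for i in range(-cols, cols):
        for j in range(-rows, rows):
            x0, y0 = i * die_w_mm, j * die_h_mm
            x1, y1 = x0 + die_w_mm, y0 + die_h_mm
            # A rectangle lies inside the circle iff its farthest corner does.
            far = max(math.hypot(x, y) for x in (x0, x1) for y in (y0, y1))
            if far <= r:
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical square die sizes on an assumed 3 inch wafer.
    for side in (4.0, 4.4, 5.0):
        print(f"{side} mm die: {dies_per_wafer(76.2, side, side)} whole dies")
```

Running it with a few nearby die sizes shows how the count of complete dies changes as the die edge grows, which is the effect described above.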

Some good info here on the history and benefit of ever-increasing wafer sizes (which is the converse of trying to make a given design in the smallest die size)

So, for a chip designer, it's all about die size. That acts as a constraint on several things:
- number of pins
- number and connectivity of on-chip busses
- design style of logic (structured or unstructured)
- amount of complexity (number of transistors)
- speed (size of transistors)

On the subject of transistor count:

Quote:

However, I believe the 6502 designers decided to cut the transistor count from thousands down to 3510.

The NMOS 6502 apparently had about 9,000 transistors, about 3500 gates. Ed, who's on the visual6502 project, can probably tell us.

I found this: "The 6502 chip is made up of 4528 transistors (3510 enhancement transistors and 1018 depletion pullup transistors)."

Cheers
Ed

barrym95838 · Post by **barrym95838** » Fri Apr 17, 2015 3:07 pm

BigEd wrote:

... it's useful to picture how many complete rectangular die you can fit on such a wafer, and to realise that a small change in die size can make a big difference ...

I got a smile out of this link ...

http://www2.stetson.edu/~efriedma/squincir/

Mike B.

satpro · Post by **satpro** » Fri Apr 17, 2015 7:09 pm

Dr Jefyll wrote:

Quote:

Do you have any idea why the ASL, LSR, ROL, and ROR instructions do not support indexed indirect and indirect indexed addressing, which would take a total of 8 clock cycles?

This and some of your other comments have light shed upon them by the article. For instance Chuck Peddle mentions that, for economy reasons, the design was purposely limited to seven states, meaning an 8-cycle instruction would be impossible.

The 8-cycle instruction Bryan is referring to is for the '816 in 16-bit native mode (viewtopic.php?f=1&t=3273), and not for a 6502.

BigEd · Post by **BigEd** » Fri Apr 17, 2015 7:40 pm

I think Bryan's point is that the 6502 has RMW instructions, and indirect addressing modes, but it doesn't offer the combination of an RMW instruction in an indirect addressing mode, and he wonders why.

Jeff's linked article says that the 6502 instruction sequencer goes up to 7 states, and not to 8, because to go to 8 would have made the chip too big.

RichTW · Post by **RichTW** » Mon Apr 20, 2015 6:43 am

But the 6502 does have 8 cycle RMW indirect instructions - just not documented ones. So it doesn't seem to be a limitation per se.

The state machine description here states that RMW instructions have an extra two cycles (named SD1 and SD2) which always perform the two stores and which don't require state logic from the PLA. So I assume there could have been a way to implement the regular RMW instructions with indirect addressing.
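
The arithmetic can be sketched in a few lines of Python. This assumes the model just described: an RMW instruction costs the same as a read in that addressing mode taken down the worst-case (page-fixup) path, plus two store cycles. The table and function names are invented for this example; the read counts are the usual NMOS 6502 timings.

```python
# Cycles for a read that always takes the worst-case (page-fixup) path,
# as stores and RMW instructions do on the NMOS 6502.
WORST_CASE_READ_CYCLES = {
    "zp": 3,
    "zp,X": 4,
    "abs": 4,
    "abs,X": 5,
    "abs,Y": 5,
    "(zp,X)": 6,
    "(zp),Y": 6,
}


def rmw_cycles(mode):
    """Worst-case read cycles plus the two store cycles (SD1 and SD2)."""
    return WORST_CASE_READ_CYCLES[mode] + 2


if __name__ == "__main__":
    for mode in WORST_CASE_READ_CYCLES:
        print(f"RMW {mode}: {rmw_cycles(mode)} cycles")
```

The documented modes come out at 5, 6, 6 and 7 cycles for zp, zp,X, abs and abs,X, and both indirect modes come out at 8, which matches the undocumented RMW opcodes.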

BigEd · Post by **BigEd** » Mon Apr 20, 2015 8:30 am

Ah - I've never studied the unsupported/illegal/undocumented instructions. But indeed you're right, according to
http://www.visual6502.org/wiki/index.ph ... 56_Opcodes
we have DCP, SLO, RLA, ISC, RRA, SRE, all of which do an RMW in both indirect addressing modes. The instructions are described at
http://www.ataripreservation.org/websit ... lopc31.txt
I wonder if the 6502 people got as far as fully implementing more useful versions of these before they had to hack the chip size down. (For example, DCP is like a DECrement on memory, but it doesn't set the flags properly)

As we see in Segher's document you linked, the 6502 state machine isn't a simple linear sequence, so it's a simplification to say that it's a 7-step machine but not an 8-step machine. It's a machine with a certain graph of states, and to make it more complex (and correct) would take more chip area.
