We can see in the implementation the decoding of three specific opcode byte patterns: one of these patterns, xxxx10x0, is used to create a signal called ONEBYTE. It's the ONEBYTE instructions which need special handling: although the following byte is necessarily always read (because in setting up that cycle the machine doesn't know what the opcode is) the PC should not be advanced a second time.
The operations RTS and RTI might seem to be exceptions, but as they will restore the PC it doesn't matter if it's incremented a second time. (And indeed it is.)
Similarly, BRK (opcode 00) does not match the ONEBYTE pattern, so the PC is incremented a second time.
(As interrupt dispatch begins with the use of the ClearIR signal to put a zero in where the opcode would have landed, for many purposes it looks like a BRK. However, in this case the famous D1x1 signal is already in play to prevent the PC increment.)
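To make the pattern concrete, here is a minimal sketch of the xxxx10x0 test in Python. It is a toy model, not a transcription of the netlist: the names is_onebyte and second_pc_increment are made up for illustration, and the interrupt case (where D1x1 is involved) is deliberately left out.

```python
# Toy model of the ONEBYTE decode; not derived from the actual 6502 netlist.
# The pattern xxxx10x0 fixes bit 3 = 1, bit 2 = 0 and bit 0 = 0.
ONEBYTE_MASK = 0b00001101   # the bits the pattern cares about
ONEBYTE_VALUE = 0b00001000  # the values those bits must have


def is_onebyte(opcode: int) -> bool:
    """Return True if the opcode matches the bit pattern xxxx10x0."""
    return (opcode & ONEBYTE_MASK) == ONEBYTE_VALUE


def second_pc_increment(opcode: int) -> bool:
    """Return True if the PC is advanced again after the opcode fetch.

    Hypothetical helper: models ordinary instruction fetch only,
    not interrupt dispatch.
    """
    return not is_onebyte(opcode)


if __name__ == "__main__":
    for name, op in [("NOP", 0xEA), ("CLC", 0x18), ("PHA", 0x48),
                     ("BRK", 0x00), ("RTS", 0x60), ("RTI", 0x40)]:
        print(f"{name} ${op:02X}: ONEBYTE={is_onebyte(op)}")
```

NOP, CLC and PHA match the pattern, while BRK, RTS and RTI do not, which is the behaviour described above.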
So, the 6502's behaviour is natural and simple, as appropriate for an extremely cost-reduced processor. Later efforts were able to spend more transistors, decode more opcodes, and behave in more complex ways - but they didn't change BRK, for obvious reasons of compatibility.
Given the way BRK works, if a BRK handler intends to return, it will be easiest to return as if BRK is a two byte instruction. And given that this is so, it's simplest to explain to the user that BRK is a two byte instruction. And it's not harmful for the customer to believe it. And it's then fruitful to think of how to use that unexpected operand. And that story can then be written into training materials and the operand can be called 'the signature byte'.
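As a small worked example of the two-byte view, the arithmetic a handler sees can be sketched like this. The function names are hypothetical, and the sketch assumes only that the PC has been advanced twice past the BRK opcode before it is pushed, and that RTI resumes at the pushed address.

```python
# Sketch of the addresses involved when BRK is treated as a two-byte
# instruction. Function names are illustrative, not from any real tool.

def brk_pushed_pc(brk_addr: int) -> int:
    """Address pushed on the stack: the PC after two increments past BRK."""
    return (brk_addr + 2) & 0xFFFF


def signature_addr(pushed_pc: int) -> int:
    """Address of the byte following the BRK opcode (the 'signature byte')."""
    return (pushed_pc - 1) & 0xFFFF


if __name__ == "__main__":
    brk_at = 0x1000
    ret = brk_pushed_pc(brk_at)
    print(f"BRK at ${brk_at:04X}: returns to ${ret:04X}, "
          f"signature byte at ${signature_addr(ret):04X}")
```

So a handler that simply returns will skip the byte after the BRK opcode, and a handler that wants to use that byte can find it one below the pushed return address.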
Which is to say, if you start from the documents, the story looks one way, but if you start from the implementation, the story looks different. I have no doubt the implementation came first, in this case.
Ref:
http://visual6502.org/wiki/index.php?ti ... PC_control
http://visual6502.org/wiki/index.php?ti ... ing_States
http://visual6502.org/wiki/index.php?ti ... te_Machine
http://nparker.llx.com/a2/opcodes.html

Without revisiting the implementation, my recollection is that one-byte opcodes have to be handled specially, because the PC must not be incremented in the second cycle - and yet that's the earliest the machine can know what the opcode is. An interrupt is taken by forcing the incoming opcode byte to a zero.