wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 1:18 pm
by barnacle
My search-fu is poor today; I know I've seen one of these for an NMOS 6502 (though I can't currently find it!) but I'm curious to know what happens when reading with some of the indexed modes. I recall that some of these did two read cycles of the target address in the NMOS version, but I'm unsure whether that was fixed in the wdc65c02; the data sheet rather hints that it was, but it's unclear.
My potential issue is reading a CF card in ISA 8-bit mode; grabbing data requires (iirc) 256 sequential reads, so I don't want to have accidental unexpected reads.
Neil
Re: wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 3:08 pm
by barrym95838
IIRC, the "wasted" read comes from the following instruction, not the ill-formed operand address. I'm link-challenged ATM.
Re: wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 3:50 pm
by John West
I'm not aware of that sort of document for the 65C02. The 65816 datasheet does document the cycle-by-cycle behaviour, and that looks similar enough to what I remember of the NMOS 6502 that I'll believe the 65C02 will be the same. Instructions like LDA abs,X will have an extra cycle that reads from the wrong page if adding X to the address crosses a page boundary. STA abs, X always has the extra cycle: it will read from an address that has X added to the low 8 bits, then write to the final address. INC abs, X will have an extra read (possibly from the wrong page), then the real read, then write the old data, then write the new data.
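The NMOS behaviour described above can be sketched in a few lines of Python. This is only an illustration of the address arithmetic as described in this post (X added to the low byte, with the carry into the high byte not yet applied); the function names are made up for the example, and it says nothing about what the 65C02 actually does.

```python
def nmos_abs_x_extra_read(base: int, x: int) -> int:
    """Address of the extra read for abs,X on the NMOS 6502 as described above:
    X is added to the low byte only, and the high byte is left uncorrected."""
    return (base & 0xFF00) | ((base + x) & 0x00FF)


def abs_x_final_address(base: int, x: int) -> int:
    """Fully corrected effective address, wrapped to 16 bits."""
    return (base + x) & 0xFFFF


if __name__ == "__main__":
    x = 0x02
    for base in (0x8000, 0x80FD, 0x80FE, 0x80FF):
        extra = nmos_abs_x_extra_read(base, x)
        final = abs_x_final_address(base, x)
        note = "wrong page" if extra != final else "same as target"
        print(f"STA ${base:04X},X  extra read ${extra:04X}  write ${final:04X}  ({note})")
```

With X = $02, a base of $80FE gives an extra read at $8000 and a write at $8100; a base of $8000 gives an extra read at $8002, the same address as the write.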
Re: wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 3:53 pm
by BigDumbDinosaur
Perhaps the attached data sheet will help. It’s from 2004, but should still be valid. See page 32.
- 65c02_2004.pdf
- 65C02 Data Sheet (2004)
Re: wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 4:37 pm
by barnacle
Thanks BDD. I have two other data sheets for the wdc65c02, and neither includes that table... which answers my question.
Neil
Re: wdc65c02 cycle by cycle
Posted: Tue Jul 23, 2024 1:56 am
by BigDumbDinosaur
Thanks BDD. I have two other data sheets for the wdc65c02, and neither includes that table... which answers my question.
BTW, the 65C02 has an undocumented behavior when using absolute-indexed addressing.
Let’s suppose you are executing STA $8000,X and .X is loaded with $02. The 65C02 will do a dummy read of $8000, followed by the write to $8002. If $8000 is RAM or ROM, the dummy read will be harmless. However, if $8000 is a hardware register that is affected by reading it, you will have a problem on your hands.
I seem to recall floobydust encountered this when he was trying to implement my NXP 2692 driver code on his Pocket PC, which is C02-powered. I consider this behavior to be a bug, since it is not mentioned in any 65C02 data sheet I have...and my collection goes back to 1984.
Re: wdc65c02 cycle by cycle
Posted: Tue Jul 23, 2024 8:19 pm
by Osric
Let’s suppose you are executing STA $8000,X and .X is loaded with $02. The 65C02 will do a dummy read of $8000, followed by the write to $8002. If $8000 is RAM or ROM, the dummy read will be harmless. However, if $8000 is a hardware register that is affected by reading it, you will have a problem on your hands.
I'm not very sure how to read these datasheets. But in the table you cited, it looks to me like section 3a applies to the situation for STA $8000,X and note (1) applies to the PC+2 cycle and note (1) says "Add 1 cycle for indexing across page boundaries, or write. This cycle contains invalid addresses." Where is the page boundary? I can't tell if this note applies or even be sure I am reading the right section of the datasheet table, but if $8000 to $8002 is considered to cross a page boundary, it seems like the invalid address is documented? Actually, reading it again, it says "or write" so maybe this invalid address happens for every STA?!
Re: wdc65c02 cycle by cycle
Posted: Tue Jul 23, 2024 8:59 pm
by Osric
I failed to reproduce the situation in my cycle by cycle trace - the extra cycle occurs, but the address being read is the address being written to. The bus state is captured and printed on the rising edge of Φ2.
Code:
000001 ff00: 1 a9 ff00 a9 02 lda #$02
000002 ff01: 1 02
000003 ff02: 1 aa ff02 aa tax
000004 ff03: 1 9d ff03 9d 00 80 sta $8000,x
000005 ff03: 1 9d
000006 ff04: 1 00
000007 ff05: 1 80
000008 8002: 1 02
000009 8002: 0 02
000010 ff06: 1 9d ff06 9d 01 80 sta $8001,x
000011 ff07: 1 01
000012 ff08: 1 80
000013 8003: 1 02
000014 8003: 0 02
000015 ff09: 1 9d ff09 9d 02 80 sta $8002,x
000016 ff0a: 1 02
000017 ff0b: 1 80
000018 8004: 1 02
000019 8004: 0 02
000020 ff0c: 1 4c L ff0c 4c 00 ff jmp $ff00
(edit) I had a bug where my code didn't know that TAX takes two cycles, so the STA was decoded a cycle early and a spurious instruction was shown. Fixed in this edit. Still, focus on the left side of the output, which is cycle count, address, RWB, data bus, and (if the data byte is A-Z or a-z) its ASCII character, before the long gap to the disassembler's version of the trace. The text on the right is from the disassembler and not from the bus.
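For anyone who wants to pick the bus columns out of a trace like this mechanically, here is a minimal Python sketch. It assumes the column layout described above (cycle count, address followed by a colon, RWB, data byte) and ignores everything to the right, including the disassembler text; the names `BusCycle`, `parse_cycle` and `reads_before_writes` are invented for this example.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class BusCycle(NamedTuple):
    cycle: int      # cycle counter from the first column
    address: int    # address bus value
    is_read: bool   # True when RWB is 1 (read), False when 0 (write)
    data: int       # data bus value


# cycle count, 4 hex digit address + ':', RWB, 2 hex digit data byte
_LINE = re.compile(r"^\s*(\d+)\s+([0-9a-fA-F]{4}):\s+([01])\s+([0-9a-fA-F]{2})\b")


def parse_cycle(line: str) -> Optional[BusCycle]:
    """Parse the bus columns of one trace line; return None if it does not match."""
    m = _LINE.match(line)
    if m is None:
        return None
    cycle, addr, rwb, data = m.groups()
    return BusCycle(int(cycle), int(addr, 16), rwb == "1", int(data, 16))


def reads_before_writes(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (read address, write address) for each read directly followed by a write."""
    cycles = [c for c in map(parse_cycle, lines) if c is not None]
    return [
        (prev.address, cur.address)
        for prev, cur in zip(cycles, cycles[1:])
        if prev.is_read and not cur.is_read
    ]
```

Feeding it the three lines for cycles 7 to 9 above yields one pair, read at $8002 followed by write at $8002, which is the "extra read at the write address" case.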
Re: wdc65c02 cycle by cycle
Posted: Tue Jul 23, 2024 10:52 pm
by fachat
That extra read at the write address is probably because there's an optimized core in the WDC chip.
IIRC the 65816 as originally planned was faster in terms of cycles than the 6502 - but this broke the Apple II disk controller when they wanted to use the 816 in the Apple IIgs, so some extra cycles had to be reintroduced. Thanks, Woz!
This could be one such occasion. The original 6502 would have probably always read the unmodified operand address ($8000 here) before accessing the updated one. The 65816 originally was that one cycle faster, but had to be 'fixed'. But they probably just added a read from the [edit: not 'final address' but] PC address instead of reintroducing the access to the wrong address.
The W65C02 could have at some point just inherited that from the more modern 65C816 core.
Note: this is deduced from what I remember without consulting any datasheet, and more or less pure speculation, so take with a grain of salt.
André
Re: wdc65c02 cycle by cycle
Posted: Wed Jul 24, 2024 12:21 am
by Osric
I'm not aware of that sort of document for the 65C02. The 65816 datasheet does document the cycle-by-cycle behaviour, and that looks similar enough to what I remember of the NMOS 6502 that I'll believe the 65C02 will be the same. Instructions like LDA abs,X will have an extra cycle that reads from the wrong page if adding X to the address crosses a page boundary. STA abs, X always has the extra cycle: it will read from an address that has X added to the low 8 bits, then write to the final address. INC abs, X will have an extra read (possibly from the wrong page), then the real read, then write the old data, then write the new data.
I should have read John's answer more carefully. He describes the situation perfectly I think - the address arithmetic is being done piecemeal and results in the wrong high byte on the read and an invalid access as a result. My attempt to reproduce is a bit confusing though:
Code:
000000 fffd: 1 ff
000001 ff00: 1 a9 ff00 a9 02 lda #$02
000002 ff01: 1 02
000003 ff02: 1 aa ff02 aa tax
000004 ff03: 1 9d ff03 9d fd 80 sta $80fd,x
000005 ff03: 1 9d
000006 ff04: 1 fd
000007 ff05: 1 80
000008 80ff: 1 7f
000009 80ff: 0 02
000010 ff06: 1 9d ff06 9d fe 80 sta $80fe,x
000011 ff07: 1 fe
000012 ff08: 1 80
000013 ff08: 1 80
000014 8100: 0 02
000015 ff09: 1 9d ff09 9d ff 80 sta $80ff,x
000016 ff0a: 1 ff
000017 ff0b: 1 80
000018 ff0b: 1 80
000019 8101: 0 02
000020 ff0c: 1 4c L ff0c 4c 00 ff jmp $ff00
Why do the extra read cycles happen at the PC? Arguably this is better behaviour in *both* scenarios, so if this is a "fix" why not fix it all the time?
Re: wdc65c02 cycle by cycle
Posted: Wed Jul 24, 2024 5:41 am
by barrym95838
Why do the extra read cycles happen at the PC? Arguably this is better behaviour in *both* scenarios, so if this is a "fix" why not fix it all the time?
That's the behavior I was clumsily hinting about in my post above. You seem to have explained it better (i.e. more accurately), but your question adds an extra twist that may not have an easy answer. Perhaps this is a detailed peek at the "Woz factor" in action.
Re: wdc65c02 cycle by cycle
Posted: Wed Jul 24, 2024 7:34 am
by John West
Code:
000003 ff02: 1 aa ff02 aa tax
000004 ff03: 1 9d ff03 9d fd 80 sta $80fd,x
000005 ff03: 1 9d
000006 ff04: 1 fd
000007 ff05: 1 80
000008 80ff: 1 7f
000009 80ff: 0 02
Why do the extra read cycles happen at the PC?
Are you talking about the two fetches of $9d from $ff03? The first actually belongs to the TAX, which is a two cycle instruction. Every 6502 instruction starts with a fetch of the byte after the opcode: the opcode has only just been latched so decoding hasn't taken place, the next cycle has to do something, and the next byte is a useful default as most instructions have at least one byte of operand.
So it's more like
Code:
000003 ff02: 1 aa ff02 aa tax
000004 ff03: 1 9d
000005 ff03: 1 9d ff03 9d fd 80 sta $80fd,x
000006 ff04: 1 fd
000007 ff05: 1 80
000008 80ff: 1 7f
000009 80ff: 0 02
Oh, or are you talking about the second and third STAs, which read from $ff08/$ff0b in cycle 13/18? That one I can't explain. I'm fairly sure the NMOS 6502 would have read from $8000/$8001 there.
My earlier description was speculation regarding the 65C02, as I wasn't aware of any documentation about it and wasn't able to pull out a board and experiment at the time (I hope my post made that clear). I'm fairly confident on the NMOS 6502 behaviour though - what STA does across page boundaries is the sort of thing I'd have been interested in when I had a logic analyser hooked up to one.
Re: wdc65c02 cycle by cycle
Posted: Wed Jul 24, 2024 11:15 am
by Osric
Are you talking about the two fetches of $9d from $ff03? The first actually belongs to the TAX, which is a two cycle instruction. ...
So it's more like
Code:
000003 ff02: 1 aa ff02 aa tax
000004 ff03: 1 9d
000005 ff03: 1 9d ff03 9d fd 80 sta $80fd,x
000006 ff04: 1 fd
000007 ff05: 1 80
000008 80ff: 1 7f
000009 80ff: 0 02
Not this one - this is a result of an ongoing bug in my disassembly display in the monitor where I am not using the right value for the cycle time required for tax in the output. As you say this read is just the bus state while TAX executes before the real fetch of 9d.
Oh, or are you talking about the second and third STAs, which read from $ff08/$ff0b in cycle 13/18? That one I can't explain. I'm fairly sure the NMOS 6502 would have read from $8000/$8001 there. ... I'm fairly confident on the NMOS 6502 behaviour though - what STA does across page boundaries is the sort of thing I'd have been interested in when I had a logic analyser hooked up to one.
This. The oddball thing to me here is that when the addition crosses page boundaries it reads from the PC, but when it doesn't it reads from the correctly computed destination of STA. I still argue that always reading from the PC would be better behaviour in both scenarios, so if they fixed the old NMOS behaviour by doing it this way (to avoid reading the wrong address from the prior page whose effects could be quite unpredictable) they should have fixed it in the non-page boundary case too (because even in this case a read from a hardware register before a write might be undesirable).
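To put the two cases side by side, here is a small Python model of the STA abs,X bus sequence as it appears in the traces posted in this thread. It is only a sketch of what those traces show on one W65C02 setup, not something taken from a data sheet, and the function name is made up for the example.

```python
from typing import List, Tuple


def observed_sta_abs_x(opcode_addr: int, base: int, x: int) -> List[Tuple[int, int]]:
    """Model the five bus cycles of STA abs,X as (address, RWB) pairs,
    following the traces in this thread (an assumption, not a data sheet fact):
      - no page crossing: the extra read hits the final target address
      - page crossing:    the extra read repeats the last operand byte address
    """
    final = (base + x) & 0xFFFF
    crosses_page = (base & 0xFF00) != (final & 0xFF00)
    last_operand = (opcode_addr + 2) & 0xFFFF
    extra_read = last_operand if crosses_page else final
    return [
        (opcode_addr, 1),                   # opcode fetch
        ((opcode_addr + 1) & 0xFFFF, 1),    # operand low byte
        (last_operand, 1),                  # operand high byte
        (extra_read, 1),                    # extra read cycle
        (final, 0),                         # the actual write
    ]


if __name__ == "__main__":
    for pc, base in ((0xFF03, 0x80FD), (0xFF06, 0x80FE), (0xFF09, 0x80FF)):
        cycles = observed_sta_abs_x(pc, base, 0x02)
        print(" ".join(f"{addr:04x}:{rwb}" for addr, rwb in cycles))
```

For the three STAs in the second trace this gives an extra read at $80ff for `sta $80fd,x` (no page crossing) and at $ff08 and $ff0b for the two that cross into page $81, matching the captured cycles.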