6502.org Forum

All times are UTC




 Post subject: wdc65c02 cycle by cycle
Posted: Mon Jul 22, 2024 1:18 pm

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 984
Location: Potsdam, DE
My search-fu is poor today; I know I've seen one of these for an NMOS 6502 (though I can't currently find it!) but I'm curious to know what happens when reading with some of the indexed modes. I recall that some of these did two read cycles of the target address in the NMOS version, but I'm unsure if that was fixed in the wdc65c02 - the data sheet rather hints that it was, but it's unclear.

My potential issue is reading a CF card in ISA 8-bit mode; grabbing data requires (iirc) 256 sequential reads, so I don't want to have accidental unexpected reads.

Neil


Posted: Mon Jul 22, 2024 3:08 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
IIRC, the "wasted" read comes from the following instruction, not the ill-formed operand address. I'm link-challenged ATM.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Mon Jul 22, 2024 3:50 pm

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 336
I'm not aware of that sort of document for the 65C02. The 65816 datasheet does document the cycle-by-cycle behaviour, and that looks similar enough to what I remember of the NMOS 6502 that I'll believe the 65C02 will be the same. Instructions like LDA abs,X will have an extra cycle that reads from the wrong page if adding X to the address crosses a page boundary. STA abs,X always has the extra cycle: it will read from an address that has X added to the low 8 bits, then write to the final address. INC abs,X will have an extra read (possibly from the wrong page), then the real read, then write the old data, then write the new data.
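
The address arithmetic described above can be sketched in a few lines of Python. This is only a model of the description (X added to the low byte first, with the carry into the high byte applied later), not something checked against real silicon, and the function names are made up for illustration.

```python
def indexed_addresses(base, x):
    """Return (partial, final) addresses for abs,X indexing.

    partial: X added to the low 8 bits only, carry into the high byte dropped
    final:   the full 16-bit sum, i.e. the address the program meant
    """
    partial = (base & 0xFF00) | ((base + x) & 0x00FF)
    final = (base + x) & 0xFFFF
    return partial, final


def page_crossed(base, x):
    """True when adding X carries out of the low byte."""
    partial, final = indexed_addresses(base, x)
    return partial != final


if __name__ == "__main__":
    # Operand addresses chosen to match the examples discussed in this thread.
    for base in (0x8000, 0x80FD, 0x80FE, 0x80FF):
        partial, final = indexed_addresses(base, 0x02)
        print(f"${base:04X},X  partial=${partial:04X}  final=${final:04X}  "
              f"crossed={page_crossed(base, 0x02)}")
```

When no page is crossed the two addresses are identical, so under this model the extra read lands on the real target; only a crossing makes it land on the wrong page.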


Posted: Mon Jul 22, 2024 3:53 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8506
Location: Midwestern USA
Perhaps the attached data sheet will help.  It’s from 2004, but should still be valid.  See page 32.

Attachment:
File comment: 65C02 Data Sheet (2004)
65c02_2004.pdf [1011.99 KiB]

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Jul 22, 2024 4:37 pm

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 984
Location: Potsdam, DE
Thanks BDD. I have two other data sheets for the wdc65c02, and neither includes that table... which answers my question.

Neil


Posted: Tue Jul 23, 2024 1:56 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8506
Location: Midwestern USA
barnacle wrote:
Thanks BDD. I have two other data sheets for the wdc65c02, and neither includes that table... which answers my question.

BTW, the 65C02 has an undocumented behavior when using absolute-indexed addressing.

Let’s suppose you are executing STA $8000,X and .X is loaded with $02.  The 65C02 will do a dummy read of $8000, followed by the write to $8002.  If $8000 is RAM or ROM, the dummy read will be harmless.  However, if $8000 is a hardware register that is affected by reading it, you will have a problem on your hands.

I seem to recall floobydust encountered this when he was trying to implement my NXP 2692 driver code on his Pocket PC, which is C02-powered.  I consider this behavior to be a bug, since it is not mentioned in any 65C02 data sheet I have...and my collection goes back to 1984.



Posted: Tue Jul 23, 2024 8:19 pm

Joined: Sat Jul 20, 2024 3:27 pm
Posts: 51
BigDumbDinosaur wrote:
Let’s suppose you are executing STA $8000,X and .X is loaded with $02.  The 65C02 will do a dummy read of $8000, followed by the write to $8002.  If $8000 is RAM or ROM, the dummy read will be harmless.  However, if $8000 is a hardware register that is affected by reading it, you will have a problem on your hands.
I'm not very sure how to read these datasheets. But in the table you cited, it looks to me like section 3a applies to the situation for STA $8000,X and note (1) applies to the PC+2 cycle and note (1) says "Add 1 cycle for indexing across page boundaries, or write. This cycle contains invalid addresses." Where is the page boundary? I can't tell if this note applies or even be sure I am reading the right section of the datasheet table, but if $8000 to $8002 is considered to cross a page boundary, it seems like the invalid address is documented? Actually, reading it again, it says "or write" so maybe this invalid address happens for every STA?!
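
One way to read note (1) is as a rule for the cycle count. The sketch below assumes the usual published timings for absolute,X (4 cycles for a read such as LDA, 5 for a write such as STA) and treats "page boundary" as a carry out of the low address byte when X is added. The function name is hypothetical, and this is an interpretation of the note rather than anything taken from the data sheet itself.

```python
def abs_x_cycles(base, x, is_write):
    """Cycle count for an absolute,X load or store under note (1).

    One extra cycle is added when the index carries out of the low
    address byte (a page crossing), or unconditionally for a write.
    """
    crossed = ((base & 0xFF) + (x & 0xFF)) > 0xFF
    return 4 + (1 if (is_write or crossed) else 0)


if __name__ == "__main__":
    print("STA $8000,X with X=2:", abs_x_cycles(0x8000, 0x02, is_write=True))
    print("LDA $8000,X with X=2:", abs_x_cycles(0x8000, 0x02, is_write=False))
    print("LDA $80FF,X with X=2:", abs_x_cycles(0x80FF, 0x02, is_write=False))
```

Under this reading, $8000 to $8002 crosses no page boundary, and the extra cycle on STA $8000,X comes from the "or write" half of the note.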


Posted: Tue Jul 23, 2024 8:59 pm

Joined: Sat Jul 20, 2024 3:27 pm
Posts: 51
I failed to reproduce the situation in my cycle by cycle trace - the extra cycle occurs, but the address being read is the address being written to. The bus state is captured and printed on the rising edge of Φ2.

Code:
000001 ff00: 1  a9                   ff00 a9 02    lda  #$02
000002 ff01: 1  02
000003 ff02: 1  aa                   ff02 aa       tax
000004 ff03: 1  9d                   ff03 9d 00 80 sta  $8000,x
000005 ff03: 1  9d
000006 ff04: 1  00
000007 ff05: 1  80
000008 8002: 1  02
000009 8002: 0  02
000010 ff06: 1  9d                   ff06 9d 01 80 sta  $8001,x
000011 ff07: 1  01
000012 ff08: 1  80
000013 8003: 1  02
000014 8003: 0  02
000015 ff09: 1  9d                   ff09 9d 02 80 sta  $8002,x
000016 ff0a: 1  02
000017 ff0b: 1  80
000018 8004: 1  02
000019 8004: 0  02
000020 ff0c: 1  4c L                 ff0c 4c 00 ff jmp  $ff00


(edit) I had a bug where my code didn't know TAX requires two cycles, so the STA was decoded a cycle early and a spurious instruction was shown. Fixed in this edit. Still, focus on the left side of the output, which is cycle count, address, RWB, data bus (and ASCII for the data bus if A-Z a-z), before the long gap to the disassembler's version of the trace. The text on the right is from the disassembler and not from the bus.
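
For anyone who wants to pick such a trace apart mechanically, here is a small Python sketch that reads only the bus columns described above (cycle count, address, RWB, data bus) and ignores the disassembler text on the right. The `BusCycle` type and the function names are hypothetical helpers, not part of the monitor that produced the trace.

```python
import re
from typing import NamedTuple, Optional


class BusCycle(NamedTuple):
    cycle: int    # decimal cycle counter from the first column
    address: int  # 16-bit address bus value
    rwb: int      # 1 = read, 0 = write
    data: int     # 8-bit data bus value


# cycle, "addr:", RWB, data; anything after the data byte is ignored
LINE_RE = re.compile(
    r"^(\d+)\s+([0-9a-fA-F]{4}):\s+([01])\s+([0-9a-fA-F]{2})\b"
)


def parse_line(line: str) -> Optional[BusCycle]:
    """Parse one trace line, or return None if it is not a bus line."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return BusCycle(int(m.group(1)), int(m.group(2), 16),
                    int(m.group(3)), int(m.group(4), 16))


def writes(trace_text: str):
    """Return every write cycle (RWB low) found in a block of trace text."""
    parsed = (parse_line(line) for line in trace_text.splitlines())
    return [c for c in parsed if c is not None and c.rwb == 0]


SAMPLE = """\
000007 ff05: 1  80
000008 8002: 1  02
000009 8002: 0  02
"""

if __name__ == "__main__":
    for c in writes(SAMPLE):
        print(f"cycle {c.cycle}: write ${c.data:02x} to ${c.address:04x}")
```

Filtering on RWB this way makes it easy to line up each write with the read cycle immediately before it, which is the cycle in question here.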


Posted: Tue Jul 23, 2024 10:52 pm

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1043
Location: near Heidelberg, Germany
That extra read at the write address is probably because there's an optimized core in the WDC chip.

IIRC the 65816 as originally planned was faster in terms of cycles than the 6502 - but this broke the Apple II disk controller when they wanted to use the 816 in the Apple IIGS, so some extra cycles had to be reintroduced. Thanks, Woz!

This could be one such occasion. The original 6502 would have probably always read the unmodified operand address ($8000 here) before accessing the updated one. The 65816 originally was that one cycle faster, but had to be 'fixed'. But they probably just added a read from the [edit: not 'final address' but] PC address instead of reintroducing the access to the wrong address.

The W65C02 could have at some point just inherited that from the more modern 65C816 core.

Note: this is deduced from what I remember without consulting any datasheet, and more or less pure speculation, so take with a grain of salt.

André

_________________
Author of the GeckOS multitasking operating system, the usb65 stack, designer of the Micro-PET and many more 6502 content: http://6502.org/users/andre/


Last edited by fachat on Wed Jul 24, 2024 6:48 am, edited 1 time in total.

Posted: Wed Jul 24, 2024 12:21 am

Joined: Sat Jul 20, 2024 3:27 pm
Posts: 51
John West wrote:
I'm not aware of that sort of document for the 65C02. The 65816 datasheet does document the cycle-by-cycle behaviour, and that looks similar enough to what I remember of the NMOS 6502 that I'll believe the 65C02 will be the same. Instructions like LDA abs,X will have an extra cycle that reads from the wrong page if adding X to the address crosses a page boundary. STA abs,X always has the extra cycle: it will read from an address that has X added to the low 8 bits, then write to the final address. INC abs,X will have an extra read (possibly from the wrong page), then the real read, then write the old data, then write the new data.
I should have read John's answer more carefully. He describes the situation perfectly I think - the address arithmetic is being done piecemeal and results in the wrong high byte on the read and an invalid access as a result. My attempt to reproduce is a bit confusing though:
Code:
000000 fffd: 1  ff
000001 ff00: 1  a9                   ff00 a9 02    lda  #$02
000002 ff01: 1  02
000003 ff02: 1  aa                   ff02 aa       tax
000004 ff03: 1  9d                   ff03 9d fd 80 sta  $80fd,x
000005 ff03: 1  9d
000006 ff04: 1  fd
000007 ff05: 1  80
000008 80ff: 1  7f
000009 80ff: 0  02
000010 ff06: 1  9d                   ff06 9d fe 80 sta  $80fe,x
000011 ff07: 1  fe
000012 ff08: 1  80
000013 ff08: 1  80
000014 8100: 0  02
000015 ff09: 1  9d                   ff09 9d ff 80 sta  $80ff,x
000016 ff0a: 1  ff
000017 ff0b: 1  80
000018 ff0b: 1  80
000019 8101: 0  02
000020 ff0c: 1  4c L                 ff0c 4c 00 ff jmp  $ff00

Why do the extra read cycles happen at the PC? Arguably this is better behaviour in *both* scenarios, so if this is a "fix" why not fix it all the time?


Posted: Wed Jul 24, 2024 5:41 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Osric wrote:
Why do the extra read cycles happen at the PC? Arguably this is better behaviour in *both* scenarios, so if this is a "fix" why not fix it all the time?

That's the behavior I was clumsily hinting about in my post above. You seem to have explained it better (i.e. more accurately), but your question adds an extra twist that may not have an easy answer. Perhaps this is a detailed peek at the "Woz factor" in action.



Posted: Wed Jul 24, 2024 7:34 am

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 336
Osric wrote:
Code:
000003 ff02: 1  aa                   ff02 aa       tax
000004 ff03: 1  9d                   ff03 9d fd 80 sta  $80fd,x
000005 ff03: 1  9d
000006 ff04: 1  fd
000007 ff05: 1  80
000008 80ff: 1  7f
000009 80ff: 0  02

Why do the extra read cycles happen at the PC?

Are you talking about the two fetches of $9d from $ff03? The first actually belongs to the TAX, which is a two cycle instruction. Every 6502 instruction starts with a fetch of the byte after the opcode: the opcode has only just been latched so decoding hasn't taken place, the next cycle has to do something, and the next byte is a useful default as most instructions have at least one byte of operand.
So it's more like
Code:
000003 ff02: 1  aa                   ff02 aa       tax
000004 ff03: 1  9d
000005 ff03: 1  9d                   ff03 9d fd 80 sta  $80fd,x
000006 ff04: 1  fd
000007 ff05: 1  80
000008 80ff: 1  7f
000009 80ff: 0  02


Oh, or are you talking about the second and third STAs, which read from $ff08/$ff0b in cycle 13/18? That one I can't explain. I'm fairly sure the NMOS 6502 would have read from $8000/$8001 there.

My earlier description was speculation regarding the 65C02, as I wasn't aware of any documentation about it and wasn't able to pull out a board and experiment at the time (I hope my post made that clear). I'm fairly confident on the NMOS 6502 behaviour though - what STA does across page boundaries is the sort of thing I'd have been interested in when I had a logic analyser hooked up to one.


Posted: Wed Jul 24, 2024 11:15 am

Joined: Sat Jul 20, 2024 3:27 pm
Posts: 51
John West wrote:
Are you talking about the two fetches of $9d from $ff03? The first actually belongs to the TAX, which is a two cycle instruction. ...
So it's more like
Code:
000003 ff02: 1  aa                   ff02 aa       tax
000004 ff03: 1  9d
000005 ff03: 1  9d                   ff03 9d fd 80 sta  $80fd,x
000006 ff04: 1  fd
000007 ff05: 1  80
000008 80ff: 1  7f
000009 80ff: 0  02

Not this one - this is a result of an ongoing bug in my disassembly display in the monitor where I am not using the right value for the cycle time required for tax in the output. As you say this read is just the bus state while TAX executes before the real fetch of 9d.
John West wrote:

Oh, or are you talking about the second and third STAs, which read from $ff08/$ff0b in cycle 13/18? That one I can't explain. I'm fairly sure the NMOS 6502 would have read from $8000/$8001 there. ... I'm fairly confident on the NMOS 6502 behaviour though - what STA does across page boundaries is the sort of thing I'd have been interested in when I had a logic analyser hooked up to one.
This. The oddball thing to me here is that when the addition crosses page boundaries it reads from the PC, but when it doesn't it reads from the correctly computed destination of STA. I still argue that always reading from the PC would be better behaviour in both scenarios, so if they fixed the old NMOS behaviour by doing it this way (to avoid reading the wrong address from the prior page whose effects could be quite unpredictable) they should have fixed it in the non-page boundary case too (because even in this case a read from a hardware register before a write might be undesirable).
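
To pin down exactly what the two traces show, here is a Python sketch that generates the bus sequence for STA abs,X as observed on this one chip: the extra read goes to the final address when no page is crossed, and to the address of the last operand byte (where the PC is pointing) when one is. It models only the traces posted in this thread; the function name is made up, and nothing here is taken from a data sheet.

```python
def observed_sta_abs_x_bus(opcode_addr, base, x):
    """Return [(address, rwb), ...] for STA abs,X as seen in the traces.

    rwb is 1 for a read cycle and 0 for the final write cycle.
    """
    final = (base + x) & 0xFFFF
    crossed = ((base & 0xFF) + (x & 0xFF)) > 0xFF
    operand_hi_addr = (opcode_addr + 2) & 0xFFFF  # last operand byte

    # Observed: page crossing -> re-read the last operand byte;
    # no crossing -> read the (already correct) destination address.
    extra_read = operand_hi_addr if crossed else final

    return [
        (opcode_addr, 1),                  # opcode fetch
        ((opcode_addr + 1) & 0xFFFF, 1),   # operand low byte
        (operand_hi_addr, 1),              # operand high byte
        (extra_read, 1),                   # the extra cycle in question
        (final, 0),                        # the write itself
    ]


if __name__ == "__main__":
    for opcode_addr, base in ((0xFF03, 0x80FD), (0xFF06, 0x80FE)):
        seq = observed_sta_abs_x_bus(opcode_addr, base, 0x02)
        print(" ".join(f"{addr:04x}:{rwb}" for addr, rwb in seq))
```

A model like this is handy for checking whether a given indexed store would ever put a read on a read-sensitive hardware register before the write.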

