Posted: Wed Aug 30, 2023 9:06 pm
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Today I looked into the bus activity during a WAI instruction on the WDC 65C02, and noticed that it is pulling RDY low during its second cycle, at the rising edge of phi2, if IRQB is high. Also, it seems to be a three cycle instruction - if you discount all the cycles where RDY was low at the end of phase two, there are three cycles left.

So I started to wonder - it's widely stated that it allows you to respond to an interrupt in only one clock cycle, but how can it possibly do that? It pauses itself during its second cycle, and surely needs to re-execute its second cycle with RDY high, and then execute its third cycle, before the next instruction can start. That's at least one and a half clock cycles, and maybe half a cycle more if the interrupt arrives soon after the falling edge of phi2.

I had already built a circuit to test WAI's bus behaviour so it was pretty easy to capture the timing on a scope. This system is running at 10MHz, and only consists of a CPU and some ROM.

In this scope trace the blue lower line shows the level on IRQB and the yellow upper line shows RWB. The CPU was executing WAI followed by STZ 0. I've aligned the scope trace so that the centre of the screen is at the start of phase 1 of the first cycle of the STZ instruction.
Attachment: 20230830_195547.jpg

I count two whole cycles (and change) after the IRQB transition before the centre of the screen where STZ starts. Note that at 10 MHz one clock cycle is two large grid squares wide.
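
For anyone wanting to reproduce the capture, a minimal sketch of a test loop of this kind is shown below, in ca65 syntax. Only the WAI followed by STZ 0 comes from the description above; the labels, the explicit SEI and the branch back to WAI are assumptions for illustration, and vectors and ROM placement are left to the linker configuration.

```asm
; Assumes the assembler is in a CPU mode that accepts the WDC 65C02
; opcodes (WAI, STZ, BRA).

reset:                          ; hypothetical entry point, reached via the reset vector
        sei                     ; keep IRQs masked so execution resumes after WAI
                                ; instead of going through the IRQ vector
loop:
        wai                     ; RDY is pulled low here until IRQB (or NMIB) goes low
        stz     $00             ; first instruction after WAI; its write shows up on RWB
        bra     loop            ; assumed: go round again for the next capture
```

If IRQB is still low when the loop comes round, WAI will not wait, so the interrupt source needs to be released between captures.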

My first capture was even worse, due I think to the point at which IRQB changed during the clock cycle:
Attachment: 20230830_195052.jpg


So I'm curious - has anyone here timed this before, and did your results match this? Is this timing what you'd expect? And is the "within one clock cycle" claim just a myth?

Edit - I thought it could be related to RDY having a slow rise rate due to the resistor pull-up that's required. Perhaps the one cycle response time only applies to the variants that have a separate pin for WAI, or the 65816.

I was using a 1k pull-up, which is already rather low - I think the datasheet suggests we should only draw 1.6mA to keep the voltage below 0.4V when the CPU is pulling it low. So my resistor is already about a third of the value it should be. I recaptured the test case with RDY and PHI2 measured as well, and it's pretty clear that RDY starts to rise at the beginning of phase 1 after IRQB goes low. It was quite slow but with a 10 MHz clock it got to about 4V before phase 2 and was pretty stable by the end of the cycle. So I think that cycle must then count as the second cycle of the WAI instruction, with the third following, just before the middle of my oscilloscope screen.

I also tried replacing the 1k pull-up with 330 ohms. This is probably pulling up with about 15mA, nearly ten times what the datasheet said. The voltage still seemed nice and low, though I didn't measure it exactly. The rise time of RDY was now about a third of a cycle, in line with what you'd expect relative to the 1k pull-up - it was comfortably high by the time the second cycle's phase 2 started. The overall cycle counts were unchanged by this, so I don't think the delay is due to the weak pull-up; it seems to be just the way the CPU works.
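
As a rough check of the pull-up figures above, the arithmetic can be written out as follows. This assumes a 5 V supply, which is not stated explicitly, and treats RDY as sitting close to 0 V while the CPU pulls it low, so the currents are upper bounds.

```latex
\begin{align*}
R_{\min} &= \frac{V_{DD} - V_{OL}}{I_{OL}}
          = \frac{5\,\mathrm{V} - 0.4\,\mathrm{V}}{1.6\,\mathrm{mA}}
          \approx 2.9\,\mathrm{k\Omega} \\
I_{1\,\mathrm{k}} &\approx \frac{5\,\mathrm{V}}{1\,\mathrm{k\Omega}}
          = 5\,\mathrm{mA} \approx 3.1 \times 1.6\,\mathrm{mA} \\
I_{330} &\approx \frac{5\,\mathrm{V}}{330\,\Omega}
          \approx 15\,\mathrm{mA} \approx 9.5 \times 1.6\,\mathrm{mA}
\end{align*}
```

So a 1k pull-up is about a third of the roughly 3k that the 1.6mA figure implies, and 330 ohms draws nearly ten times that current.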


Posted: Thu Aug 31, 2023 3:21 am
Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
AFAIK the way WAI works is that it takes 3 cycles to set up (like the datasheet shows), but after that it sits in a loop and can respond to an interrupt in 1 cycle, which I interpret as "if IRQ/NMI is pulled low in this cycle, it takes 1 additional cycle to finish the instruction, and then the cycle after that it continues execution or starts the interrupt sequence".

That is pretty much how I expect it to work, and it seems to mostly line up with what you measured, ±1 cycle or so.


Posted: Thu Aug 31, 2023 10:55 am
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
That's what I expected too, but my measurements show it taking two clear cycles after the interrupt before the next instruction starts. I also haven't seen anything very specific in the WDC 65C02 datasheet - especially no mention of cycle counts, it just talks about RDY being bidirectional, and that in the soft core model it's a separate pin and you need to plumb them together yourself or it won't wait at all.

I took some more captures, with blue=RDY, yellow=SYNC, red=A0, green=RWB - the horizontal timebase is set to one square per clock cycle now. This one is the start of the WAI instruction:
Attachment: 20230831_112407.jpg

And this one is the interrupt ending the WAI instruction - the interrupt will have been during the cycle before the trigger point:
Attachment: 20230831_112546.jpg

So it's clear that WAI does not execute its three cycles and then wait - it waits during its second cycle, and on resuming, needs to re-execute the second cycle, and then the third cycle.

I also captured this which was interesting - RDY was briefly pulled low again at the end of the second cycle, my manually-driven interrupt must have bounced:
Attachment: 20230831_112625.jpg

This trace seems unusual because as in my first photo above, WAI normally pulls RDY low halfway through the cycle, at the start of phase 2 - not at the end of the cycle. Perhaps it is responding to IRQB being high asynchronously throughout phase 2, not just checking it at the start of phase 2. The unpause is definitely synchronous though, releasing RDY in time with the end of phase 2. It is unfortunate, as if it did this at the start it might save a cycle in some cases.

Regarding the third cycle, I wonder why it is there at all - WAI is really not doing anything except manipulating RDY, which it does during its second cycle. The third cycle doesn't seem to need to do anything, but I'm sure the designers had a reason for it! It feels like an external flipflop could do a quicker job though, you could hook it up to a one-cycle NOP so that you're ready to execute the next instruction in less than one cycle; or even delay the pause a bit so that the next instruction is already loaded, perhaps on its second cycle, when RDY is pulled low... if you wanted really quick response times.


Posted: Thu Aug 31, 2023 11:37 am
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
While I've not actually implemented anything using the WAI instruction, I've not been able to find any information that states that the response from WAI is a single clock cycle. If you have a link to something that does state that, please share it.

Also, the current datasheets for the W65C02S are pretty "dumbed down" and missing a lot of detail. I went thru my older directories and found a datasheet from August 2002. This has much more information than any of the current ones and is attached here:

Attachment: W65C02S.pdf [1.72 MiB]


Page 34 states that WAI takes 3 clock cycles to execute and shows the status at each clock cycle. It also refers to Note 4.

Page 36: Note 4 states that the processor will wait at cycle 2 for two clock cycles "after" NMI or IRQ active input.

This would appear to back up what you're finding. Hope it helps.

_________________
Regards, KM
https://github.com/floobydust


Posted: Thu Aug 31, 2023 1:27 pm
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Ah yes, thanks, that's a better datasheet and it all matches up. The references I found before were unofficial - some on 6502.org I think, possibly also Garth's interrupts page, and I think a Wikipedia page.


Posted: Thu Aug 31, 2023 1:31 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
Attached is a 1990 copy of the 65C02 data sheet, the oldest one I have that mentions the WAI instruction. See the bottom of page 22 regarding the cycle-by-cycle behavior of WAI.

Attachment: 65c02_1990.pdf [2.32 MiB] (W65C02S Data Sheet, Feb 1990)

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Thu Aug 31, 2023 2:11 pm
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Nice, thanks - it's interesting to see a type-written one! The data seems mostly the same as in the 2002 one. Looking at the archive at http://6502.org/documents/datasheets/wdc/older/, it looks like the 2004 version still had this data but it was gone in the 2008 one - so that's when the cycle table was removed, and it's been this way for some time.

I wonder whether it's worth getting that 1990 one uploaded there as well, for historical interest?


Posted: Fri Sep 01, 2023 6:46 pm
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
I use WAI.

I've not tried to cycle-count it, but I have observed some interesting behaviour.

What I think can happen is that an incoming IRQ during the WAI setup time+ Rdy going low can cause the WAI to terminate and code carries on.

This caused me some headaches at first until I tracked it down.

So. e.g. I have a VIA that sends periodic interrupts to the CPU.

I execute WAI. And a cycle later the VIA IRQ fires and the WAI is effectively ignored.

My code is:

Code:
_hostCall:
        sta     sBufCmd
:       wai                     ; All STOP until the host pokes us.
        lda     sBufCmd         ; Check to see if the host actually did something
        bne     :-              ; ... if not

        rts


Now it's fully possible that this is expected behaviour and I just didn't read the documentation in full, or missed where this behaviour is noted. I have tried bracketing the WAI with CLI/SEI but found it didn't seem to help (or I didn't do enough testing); as it added latency, once I found that this way works I left it as it is. It happens very infrequently, but it does happen - my timer interrupt is 1000Hz (in a 16MHz system).

In case this isn't clear, what happens here is that the code we store in sBufCmd (part of a mutually exclusive shared RAM area between the '816 and host MCU) is never zero, and the host MCU has to set it to zero as part of its processing. If we check immediately after the WAI and it's not zero, we try again.


One other thing I do is block IRQs when Rdy is low. This is easy as it's done in a GAL and I'd need some logic there anyway to "wire-or" the IRQ sources due to the driven nature of the IRQ output from the 65C22N VIA. Otherwise any old IRQ would cause the WAI to terminate and that also causes headaches.

The host MCU sends an NMI when it's done - I have no other need for NMI, so that was easy.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Fri Sep 01, 2023 8:31 pm
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I think the intended usage of WAI is to be placed immediately before the interrupt service routine, and executed with interrupts already masked and the stack already prepared for a subsequent RTI. Thus when the CPU has nothing better to do, it can be set ready to respond to an interrupt with minimum latency, i.e. not having to go through the stacking of PC and P. If an interrupt is already signalled, continuing through to the service routine would be the expected behaviour.
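
A rough ca65-style sketch of that arrangement, for illustration only: the labels `idle`, `service` and `resume` are hypothetical, and a WDC 65C02 CPU mode is assumed. The stack is loaded by hand with what an interrupt would have pushed, so the service code can end in RTI.

```asm
idle:
        lda     #>resume        ; push the return address, high byte first,
        pha                     ; in the same order an interrupt stacks PC
        lda     #<resume
        pha
        php                     ; push status while I still reflects the caller's setting
        sei                     ; mask IRQs so the interrupt only ends the WAI
        wai                     ; wait with RDY low until IRQB or NMIB goes low
service:                        ; execution falls straight through to here
        ; ... acknowledge the interrupt source and do the work ...
        rti                     ; pulls status, then PC, and continues at resume
resume:
        ; ... normal code carries on here ...
```

RTI resumes at exactly the address pulled from the stack (unlike RTS, which adds one), which is why `resume` itself is pushed rather than `resume-1`.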

The '816 datasheet still has a detailed cycle table, and that does say that WAI continues for 2 cycles after an IRQ or NMI. Only then will the following opcode be fetched. In most respects the cycle behaviour of the '816 is very similar to that of the 6502.


Posted: Sat Sep 02, 2023 10:20 am
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
drogon wrote:
What I think can happen is that an incoming IRQ during the WAI setup time+ Rdy going low can cause the WAI to terminate and code carries on.

I think that's to be expected - WAI's purpose is to wait for an interrupt state, and if one already exists or occurs during its first cycle then it doesn't wait at all. NMIB is edge triggered so can't persist, but if IRQB is already low then there is a pending interrupt there (albeit masked) and WAI shouldn't wait (i.e. it won't pull RDY low).

From what I can tell, it looks like in practice it checks this at the rising edge of PHI2 during its second clock cycle and transitions RDY accordingly. This fits with the older datasheets that specify that the wait is during the second cycle.

Quote:
My code is:

Code:
_hostCall:
        sta     sBufCmd
:       wai                     ; All STOP until the host pokes us.
        lda     sBufCmd         ; Check to see if the host actually did something
        bne     :-              ; ... if not

        rts


I think if you're aiming for fast response to an anticipated interrupt then more typically you would SEI before starting the hardware operation that's going to cause the interrupt, then WAI, then handle the anticipated interrupt inline instead of having the interrupt handler do it, as Chromatix said. You may need to deal with the possibility that some other interrupt occurred and ended WAI though it sounds like you've tried to mask them out already.
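
As a sketch of that pattern in ca65 syntax, under stated assumptions: the VIA base address, the label names and the choice of Timer 1 as the anticipated source are all hypothetical, and a WDC 65C02 CPU mode is assumed. The loop also covers the case where some other interrupt ends the WAI.

```asm
VIA_BASE = $6000                ; hypothetical VIA address
VIA_IFR  = VIA_BASE + $0D       ; 6522 interrupt flag register (register 13)

fast_wait:                      ; hypothetical label
        sei                     ; mask IRQs before the interrupt can be raised
        ; ... start the hardware operation that will raise the interrupt ...
@wait:  wai                     ; RDY low until IRQB or NMIB goes low
        lda     VIA_IFR         ; see which source woke us
        and     #%01000000      ; bit 6 = Timer 1, assumed to be the anticipated source
        beq     @wait           ; something else ended WAI: wait again
        ; ... handle the anticipated interrupt inline and clear its flag ...
        cli                     ; optionally return to normal interrupt handling
```

Note that if another source is holding IRQB low, WAI will not pull RDY low at all, so the loop degrades to polling until the expected flag is set.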

The other purpose for WAI is saving power when a system is idle, in which case you don't know or care which interrupt will fire next, you just want to go into the low power state until one does, so using WAI in an idling loop with interrupts enabled makes sense as well, letting the regular interrupt vector deal with it.
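
A minimal idle loop of that kind might look like the following ca65 sketch; the label names are hypothetical and a WDC 65C02 CPU mode is assumed.

```asm
idle_loop:                      ; hypothetical label
        cli                     ; IRQs enabled: the regular handler runs via the vector
@sleep: wai                     ; stop with RDY low until any interrupt arrives
        ; by the time execution reaches here the handler has run and returned
        bra     @sleep          ; nothing else to do: go back to sleep
```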

Quote:
One other thing I do is block IRQs when Rdy is low.

I think you may be better off not using RDY to drive that - typically to get a fast response you'd need these other interrupts to be disabled even before RDY goes low, and as above, it might not go low at all if the other interrupts were already pending.

Quote:
The host MCU sends an NMI when it's done - I have no other need for NMI, so that was easy.

The edge trigger is nice. However NMI will always cause a branch through the vector so you'll lose the faster response that you could get with WAI and IRQB.


Posted: Sat Sep 02, 2023 11:05 am
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
gfoot wrote:
drogon wrote:
What I think can happen is that an incoming IRQ during the WAI setup time+ Rdy going low can cause the WAI to terminate and code carries on.

I think that's to be expected - WAI's purpose is to wait for an interrupt state, and if one already exists or occurs during its first cycle then it doesn't wait at all. NMIB is edge triggered so can't persist, but if IRQB is already low then there is a pending interrupt there (albeit masked) and WAI shouldn't wait (i.e. it won't pull RDY low).

From what I can tell, it looks like in practice it checks this at the rising edge of PHI2 during its second clock cycle and transitions RDY accordingly. This fits with the older datasheets that specify that the wait is during the second cycle.

Quote:
My code is:

Code:
_hostCall:
        sta     sBufCmd
:       wai                     ; All STOP until the host pokes us.
        lda     sBufCmd         ; Check to see if the host actually did something
        bne     :-              ; ... if not

        rts


I think if you're aiming for fast response to an anticipated interrupt then more typically you would SEI before starting the hardware operation that's going to cause the interrupt, then WAI, then handle the anticipated interrupt inline instead of having the interrupt handler do it, as Chromatix said. You may need to deal with the possibility that some other interrupt occurred and ended WAI though it sounds like you've tried to mask them out already.


It's been stable like this for over 3 years now. No plans to change. Stability is more important than anything else. The board works and runs for weeks at a time until I make a stupid programming mistake and have to hit reset...

Quote:
The other purpose for WAI is saving power when a system is idle, in which case you don't know or care which interrupt will fire next, you just want to go into the low power state until one does, so using WAI in an idling loop with interrupts enabled makes sense as well, letting the regular interrupt vector deal with it.


Power isn't an issue here. Or at least it's not something I care about. For me it's purely a communication means to the host µC. My Ruby 816 board draws about 180mA which is OK for me.

Quote:
Quote:
One other thing I do is block IRQs when Rdy is low.

I think you may be better off not using RDY to drive that - typically to get a fast response you'd need these other interrupts to be disabled even before RDY goes low, and as above, it might not go low at all if the other interrupts were already pending.


3+ years later and I'm OK with it that way.

Quote:
Quote:
The host MCU sends an NMI when it's done - I have no other need for NMI, so that was easy.

The edge trigger is nice. However NMI will always cause a branch through the vector so you'll lose the faster response that you could get with WAI and IRQB.


3+ years later and I'm OK with it that way.

I did use it to count host calls at one point when I was interested in that thing, but not really bothered now.

I know it's probably not the best way to do it - during development (of the 6502 predecessor) I went through several iterations of ideas. Trying to use the Rdy input so I could send unsolicited messages from the host µC didn't work well. Using an IRQ to wake up the WAI also didn't work well, as the wrong IRQ would wake it. I tried some other ideas too before settling on this way. If I were to do it again I'd stick the lot inside a CPLD and be done with it - but there's line-drawing to be done, else I'll just end up stuffing the whole thing inside some little ARM core or FPGA and emulating the lot in software....

But I am thinking of a board re-spin at some point, but also feel I've done quite enough with the '816 (and I've already ported my entire BCPL OS over to RISC-V, so who knows where next).

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/

