All times are UTC




Posted: Wed Apr 01, 2020 7:42 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
The original motivating idea was not exactly what came out as the challenge: it was to have a routine which detects that a port is connected to something interesting as opposed to something else. An interesting connection will be pretty active on all pins. If the port isn't connected to something interesting, I don't want the main program to run, so the loop should not exit. The loop is running much faster than expected changes on the port: let's say we get at least 10 samples per pulse, for a not-especially-tightly-coded loop.

In this original scenario, it's not especially important if any transitions are missed, but I don't expect to miss any.

To make it a short and simple challenge (something which is fun) I added in the idea of waiting for two transitions. Waiting for one transition is too easy; counting to ten is too hard.

So, please have fun, and don't feel a need to tie down a specification.
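
For anyone joining the thread at this point, a minimal sketch of the two-transition version might look like the following. It assumes a 65C02 and ca65 syntax; the addresses and the names `once` and `twice` are illustrative placeholders, not part of the challenge.

```asm
.setcpu "65C02"

port  = $C000        ; hypothetical port address, substitute your own
zp    = $10          ; hypothetical zero-page base
prev  = zp           ; previous port sample
once  = zp+1         ; bits that have changed at least once
twice = zp+2         ; bits that have changed at least twice

WaitPortTwoTrans:
  LDA port
  STA prev
  STZ once
  STZ twice
: LDA port
  TAY
  EOR prev           ; A = bits that differ from the previous sample
  STY prev
  TAY                ; keep the change mask in Y
  AND once           ; changed now and already changed before
  TSB twice          ; so those bits have had two transitions
  TYA
  TSB once           ; record first transitions
  LDX twice
  INX                ; $FF wraps to zero once every bit is flagged
  BNE :-
  RTS
```

The AND against `once` has to come before `once` is updated, otherwise a single transition would be counted twice.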


Posted: Wed Apr 01, 2020 3:09 pm

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Right, I think it was implicit in the original spec that the signals would be slow enough to make a software routine viable, otherwise a software routine would not have been requested. For faster signals you could build hardware that performs a similar function, and the corresponding software would just reset it, then watch for its status. Having written a software routine, you can then analyse it to see how fast it runs and whether that's fast enough in practice.

As for detecting ten transitions… *cracks knuckles*
Code:
port = $xxxx
zp = $zz

prev = zp
trans = zp  ; 1-based array of ten
first = trans+1
tenth = trans+10

WaitPortTenTrans:
  ; initialise
  LDA port
  STA prev
  LDX #10
: STZ trans,X
  DEX
  BNE :-

  ; main loop
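: LDA port
  TAY
  EOR prev       ; A = bits that differ from the previous sample
  STY prev       ; remember this sample for next time
  TAY            ; Y = mask of changed bits
  LDX #9
: AND trans,X    ; changed bits that already have X transitions recorded
  ORA first,X    ; merge them into the flags for X+1 transitions
  STA first,X
  TYA            ; return to bits just changed
  DEX
  BNE :-
  TSB first      ; record first transitions
  LDX tenth      ; now test to see if all final flags set
  INX            ; $FF wraps to zero only when all eight bits are flagged
  BNE :--
  RTS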
Obviously this is bigger and significantly slower than a routine which detects only two transitions per bit. It suffers slightly from the fact that TSB/TRB do not have indexed addressing modes, so I had to replace one TSB with an ORA/STA pair. I count roughly 200 cycles per sample; at 8MHz that is 25µs, and since each high or low phase of an input must last at least one sample period to be seen, fmax of the input signals would be 20kHz.

More speed could be obtained by unrolling the inner loop, which would remove 46 cycles of loop overhead and permit using TSB again, at the expense of code size. I think that brings the total down to 120 cycles per loop, so that's significantly faster.
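
As a rough sketch of that unrolled version (ca65 syntax, 65C02 only), the nine indexed AND/ORA/STA steps become AND/TSB pairs on fixed zero-page addresses. The `port` and `zp` values here are placeholders chosen only so that the listing assembles, and the routine name is hypothetical.

```asm
.setcpu "65C02"

port  = $C000        ; hypothetical port address, substitute your own
zp    = $10          ; hypothetical zero-page base

prev  = zp
trans = zp           ; 1-based array of ten flag bytes, as before
tenth = trans+10

WaitPortTenTransUnrolled:
  ; initialise
  LDA port
  STA prev
  LDX #10
: STZ trans,X
  DEX
  BNE :-

  ; main loop
: LDA port
  TAY
  EOR prev
  STY prev
  TAY              ; Y = mask of changed bits
  AND trans+9      ; highest stage first, so a change advances
  TSB trans+10     ; each bit by only one stage per sample
  TYA
  AND trans+8
  TSB trans+9
  TYA
  AND trans+7
  TSB trans+8
  TYA
  AND trans+6
  TSB trans+7
  TYA
  AND trans+5
  TSB trans+6
  TYA
  AND trans+4
  TSB trans+5
  TYA
  AND trans+3
  TSB trans+4
  TYA
  AND trans+2
  TSB trans+3
  TYA
  AND trans+1
  TSB trans+2
  TYA
  TSB trans+1      ; record first transitions
  LDX tenth        ; all final flags set?
  INX
  BNE :-
  RTS
```

Tallying the main loop with zero-page operands and no page-crossing branch gives about 117 cycles per sample, which is in line with the estimate of roughly 120.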

Another alternative approach, which saves ZP space now that there are more transitions than port bits, is to keep an individual counter per bit:
Code:
port = $xxxx
zp = $zz

prev = zp
waiting = zp+1  ; bitmask
trans = zp+1  ; 1-based array of eight
last = trans+8

WaitPortTenTrans:
  ; initialise
  LDA port
  STA prev
  LDA #$FF
  STA waiting
  LDA #10
  LDX #8
: STA trans,X
  DEX
  BNE :-

  ; main loop
: LDA port
  TAY
  EOR prev
  STY prev
  LDX #8
: ASL A         ; test each bit in turn for a transition
  BCC :++
  DEC trans,X  ; decrement corresponding counter, and check if it reached zero
  BNE :++

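  ; somewhat nasty routine to clear corresponding bit in waiting mask
  ; (X=8 yields mask $01, so the mask is bit-reversed relative to the port;
  ; that is harmless, since only waiting == 0 is ever tested)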
  PHX
  PHA
  LDA #0
  SEC
: ROR A
  DEX
  BNE :-
  TRB waiting
  PLA
  PLX

: DEX           ; move on to the next bit
  BNE :---
  LDX waiting  ; still waiting?
  BNE :----
  RTS
Timing analysis is more difficult here because the loop body is no longer branch-free. But this example can be used to test for an arbitrary number of transitions on all bits, up to the capacity of a byte. A simpler version could simply count the total number of bit transitions.
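
A sketch of that simpler total-count variant, under the same assumptions as the routines above (65C02, ca65 syntax). The addresses, the routine name and the `TOTAL` target are hypothetical placeholders.

```asm
.setcpu "65C02"

port  = $C000        ; hypothetical port address, substitute your own
zp    = $10          ; hypothetical zero-page base
prev  = zp           ; previous port sample
count = zp+1         ; transitions still to be seen, all bits combined
TOTAL = 80           ; hypothetical target, 1..255

WaitPortTotalTrans:
  ; initialise
  LDA port
  STA prev
  LDA #TOTAL
  STA count

  ; main loop
: LDA port
  TAY
  EOR prev           ; A = bits that changed since the last sample
  STY prev
  LDX #8
: ASL A              ; test each bit in turn for a transition
  BCC :+
  DEC count          ; one more transition seen
  BEQ :++            ; target reached
: DEX                ; move on to the next bit
  BNE :--
  BRA :---           ; take the next sample
: RTS
```

Unlike the per-bit versions, this returns even if all the activity is on a single pin, so it only shows that the port is busy, not that every pin is.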


Posted: Wed Apr 01, 2020 3:21 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
BigEd wrote:
... In this original scenario, it's not especially important if any transitions are missed, but I don't expect to miss any ...

I'm sensing that you may have an actual real-life use case for this code? If I'm right, and it's not for a top-secret defense project, when are you gonna spill the beans to us?

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Wed Apr 01, 2020 3:42 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Well, it's turned out that my idea probably has no traction, for other reasons... but it's in the context of PiTubeDirect, where a Raspberry Pi sits on a 6502 bus and emulates an 8 bit peripheral. The GPU is running a tight loop to interpret and respond to device accesses. I thought it might be a useful safety feature to check that we are on a bus and not just connected to some other system, such as might happen if someone put the wrong SD card into their Pi, or had their Pi connected to an FPGA but with a different design loaded. But one of the usual supported situations has the Pi behind a bus interface which doesn't pass through all the bus traffic, so in that case the bus looks quiet anyhow. Detecting a busy bus is no longer a useful tactic.


Posted: Wed Apr 01, 2020 5:21 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
Thanks for sharing! That was a fun little adventure, useful or not.

(I couldn't help but notice that hoglet was lurking during some of the thread activity)



Posted: Fri Apr 03, 2020 9:37 pm

Joined: Wed Mar 02, 2016 12:00 pm
Posts: 343
kakemoms wrote:
Alternatively, you can use two flip-flops (a 2-bit shift register) and an XOR per bit, then OR all the outputs and put them on the NMI/IRQ. That way you get an interrupt when any one of the bits changes.


Given the time delays of NMI/IRQ, and if you need this for every instance of I/O read (I was under the impression you wanted to look for bit changes as a trigger for reading the I/O data), you can connect the OR'ed signal through a NOT to the SO pin and use the overflow flag to trigger a read.

Code:
BVC *        ; spin here until a falling edge on SO sets V
CLV          ; clear V again, ready for the next change
LDA $IOREG





Posted: Fri Apr 03, 2020 11:03 pm

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
With a WDC 6502, you can use SEI : WAI, and then IRQ has no latency at all as it just resumes execution.


Posted: Sat Apr 04, 2020 4:51 am

Joined: Wed Mar 02, 2016 12:00 pm
Posts: 343
You can do WAI and IRQ, but I think it may require more cycles between each input than simply doing:
Code:
    LDX   #0
t   BVC   t          ; wait for SO to set V
    CLV              ; V is not cleared automatically, so re-arm it
    LDA   $REG
    STA   $mem,X
    INX
    BNE   t

The circuit may look something like this: (sorry about the part numbers)
Attachment: 8-input-V-trigger.png

Obviously this is not going to work, since SYNC will trigger at each instruction (but it would work if you only wanted to look for activity with BVC).
I then thought that you might use one of the address pins as the flip-flop trigger (instead of SYNC), but since the loop is under 16 bytes long, that would require a trigger every 16th byte, which means that one could use the A4 output and pad the loop with some extra NOPs.
As a last option, you could capture the inputs directly in hardware: use the CPU address lines (without A0) as the address of a separate storage memory and the input lines as its data, so that the only code needed is:
Code:
BVC  *        ; branch-to-self, i.e. offset byte $FE

repeated indefinitely, or for as long as the input requires.


Posted: Sat Apr 04, 2020 5:05 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
BVC takes 3 cycles on a taken branch. WAI takes 3 cycles to set up, then resumes normal execution as soon as an IRQ arrives. With I set, the interrupt handler is not entered and the usual overhead does not exist. So the performance is at worst similar, and you don't have to give up having a reliable V flag for arithmetic.
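
A rough sketch of the WAI-based equivalent of the BVC capture loop. It assumes a WDC 65C02 (WAI is a WDC-only instruction, so the assembler must be in a mode that accepts it) and external logic that pulls /IRQ low on each change and releases it once the port has been read; otherwise WAI falls straight through on the next pass. `ioreg`, `buf` and the routine name are hypothetical.

```asm
ioreg = $C000        ; hypothetical input register
buf   = $0200        ; hypothetical 256-byte capture buffer

CaptureOnIRQ:
  SEI                ; I=1: a pending IRQ wakes WAI, but no handler is entered
  LDX #0
: WAI                ; stall until /IRQ is asserted
  LDA ioreg          ; read the port as soon as execution resumes
  STA buf,X
  INX
  BNE :-             ; capture 256 samples, then return
  RTS
```

The V flag is left alone here, which is the advantage over the SO-pin approach.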


Posted: Sat Apr 04, 2020 9:16 pm

Joined: Wed Mar 02, 2016 12:00 pm
Posts: 343
Chromatix wrote:
BVC takes 3 cycles on a taken branch. WAI takes 3 cycles to set up, then resumes normal execution as soon as an IRQ arrives. With I set, the interrupt handler is not entered and the usual overhead does not exist. So the performance is at worst similar, and you don't have to give up having a reliable V flag for arithmetic.

OK! That's interesting. I never used WAI since it was a WDC-only instruction, but for new designs that would make sense unless you need the IRQ for something else.


Posted: Mon Apr 06, 2020 11:01 pm

Joined: Wed Mar 02, 2016 12:00 pm
Posts: 343
Chromatix wrote:
BVC takes 3 cycles on a taken branch. WAI takes 3 cycles to set up, then resumes normal execution as soon as an IRQ arrives. With I set, the interrupt handler is not entered and the usual overhead does not exist. So the performance is at worst similar, and you don't have to give up having a reliable V flag for arithmetic.


Thinking about it, the SYNC pulse would not be active while the CPU is stopped in WAI... so the flip-flops would not be clocked and no IRQ signal would reach the MPU. They could be clocked from the system clock instead, but then the IRQ would only be asserted for one clock. Maybe it's enough?


Posted: Tue Apr 07, 2020 6:13 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I believe /NMI is edge sensitive, but /IRQ is not - normally it would be sampled on a SYNC cycle. I think a single full cycle of IRQ is enough to wake up from WAI, though.


Posted: Tue Apr 07, 2020 6:44 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
That's a good idea for an experiment!


Posted: Tue Apr 07, 2020 8:18 am

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 334
Chromatix wrote:
I believe /NMI is edge sensitive, but /IRQ is not - normally it would be sampled on a SYNC cycle.


Yes, I'm pretty sure that's right (although I'm not sure where /IRQ is sampled). /NMI can't be level sensitive, or it would interrupt its own handler before the source could be acknowledged. /IRQ doesn't have that problem, as interrupts are disabled while the handler is running. Instead, it has to be level sensitive, because another source might request an interrupt while the handler for the first is running, and you don't want to miss that.

Some Commodore 64 games used this behaviour of /NMI to protect against 'freezer' cartridges. These cartridges would assert /NMI and enable their ROM so they could handle it, then save the entire machine state to a file. That file could then be used to restart the game at the point it was frozen, bypassing any copy protection in its loader.

But what if the game triggers an NMI itself and never acknowledges it? The /NMI line will stay asserted, so when the cartridge tries to assert it again, nothing happens. The game might crash, because it now has the freezer's ROM in memory instead of its own code or data, but the freezer never gets to run.
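
A commonly seen form of this trick uses CIA #2, whose interrupt output drives /NMI on the C64. The sketch below is only an illustration under that assumption: the label names are hypothetical, and it presumes the KERNAL is mapped in so that the RAM vector at $0318 is used. It lets timer A underflow once and never reads the interrupt control register, so the line is never released.

```asm
CIA2_TALO = $DD04    ; CIA #2 timer A latch, low byte
CIA2_TAHI = $DD05    ; CIA #2 timer A latch, high byte
CIA2_ICR  = $DD0D    ; CIA #2 interrupt control register
CIA2_CRA  = $DD0E    ; CIA #2 control register A
NMIVEC    = $0318    ; KERNAL's RAM vector for NMI

BlockNMI:
  LDA #<NmiStub      ; point the NMI vector at a do-nothing handler
  STA NMIVEC
  LDA #>NmiStub
  STA NMIVEC+1
  LDA #$00
  STA CIA2_CRA       ; stop timer A
  STA CIA2_TALO      ; latch = 0, so it underflows at once
  STA CIA2_TAHI
  LDA #$81
  STA CIA2_ICR       ; enable timer A as an NMI source
  LDA #$01
  STA CIA2_CRA       ; start timer A
  RTS

NmiStub:
  RTI                ; deliberately no read of CIA2_ICR: /NMI stays low
```

Because /NMI is edge sensitive, holding it low like this means no later falling edge can occur, whether from the RESTORE key or from a cartridge.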


Posted: Tue Apr 07, 2020 8:25 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8538
Location: Southern California
BigEd wrote:
Chromatix wrote:
I believe /NMI is edge sensitive, but /IRQ is not - normally it would be sampled on a SYNC cycle.

That's a good idea for an experiment!

The Rockwell data sheet for the 'C02 says, "IRQ\ is sampled at the falling edge of Φ2 prior to the last cycle of the instruction." So if the IRQ\ falls (with enough setup time) before the last cycle starts, there would be potentially several cycles of an instruction where it would be caught in time to avoid starting the next instruction before the interrupt sequence starts.

I did not try this specifically when I was doing tests on the '816 as we talked about some time back, but one thing I did see is that if IRQ\ falls during the last half of the last cycle of a conditional branch that is taken, the branch is taken, but the first instruction at the new address is read and discarded, not executed, before the interrupt sequence starts. Again, that's on the '816; but I suspect it's the same on the '02.

For NMI\, the Rockwell data sheet says it's sampled during Φ2, but does not say anything about which cycle in the instruction. It does not say anything about SYNC being involved in the IRQ\ or NMI\ sampling, and neither does the WDC data sheet. RDY does work with SYNC, but that's a different matter.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

