65816 COP instruction

Jmstein7 · Post by **Jmstein7** » Mon Sep 20, 2021 9:36 pm

barrym95838 wrote:

Jmstein7 wrote:

How does one handle COP as a co-processor decoded instruction?

Page one of this thread touches on some ideas, but I have no direct knowledge of anyone actually implementing these ideas in real life.

Naturally, you need some hardware goodies to get the '816 and the co-processor to play nicely with each other, and some software goodies to tell them how to play, but that's about as much as I can offer at this stage of the game.

Understood. I jumped into this thread because this is the next step in a project I’m working on, connecting a 65c816 to an FPGA. The connection is made. Now I want to connect various IPs as “coprocessors.”

This one: viewtopic.php?f=1&t=6788&start=15#p87207

BigDumbDinosaur · Post by **BigDumbDinosaur** » Mon Sep 20, 2021 10:08 pm

GARTHWILSON wrote:

Jmstein7 wrote:

Wait, SEI? Why are we turning off interrupts?

If it gets there from the hardware reset line, the SEI won't be necessary as it's an automatic part of the reset sequence; however, if you jump there from software for any reason, you'll want to ignore interrupts until things are set up properly again. I've never done it that way myself.

Correct. It's a hedge against something jumping to the reset routine while IRQs are active. Otherwise, SEI would be redundant.

Jmstein7 · Post by **Jmstein7** » Tue Sep 21, 2021 1:11 pm

Here is my solution (STKPC = $0B):

Code:

CO_PRO      SEI
            phb                   ;save DB
            phd                   ;save DP
            rep #%00110000        ;select 16 bit registers
            pha                   ;save .C
            phx                   ;save .X
            phy                   ;save .Y
            LDA  STKPC,S          ;save pc location
            DEC  a                ;less one
            STA  BITLOC           ;save signature byte location
            LDA  BITLOC           ;load sig byte
            STA  CPRO             ;store sig byte
            rep #%00110000        ;16 bit registers
            ply                   ;restore .Y
            plx                   ;restore .X
            pla                   ;restore .C
            pld                   ;restore DP
            plb                   ;restore DB
            CLI
            rti                   ;resume foreground task

I wrapped it in my general 'c816 ISR parameters.

barrym95838 · Post by **barrym95838** » Tue Sep 21, 2021 2:58 pm

Jmstein7 wrote:

Here is my solution (STKPC = $0B):

Code:

...
            LDA  STKPC,S          ;save pc location
            DEC  a                ;less one
            STA  BITLOC           ;save signature byte location
            LDA  BITLOC           ;load sig byte
            STA  CPRO             ;store sig byte
...

I wrapped it in my general 'c816 ISR parameters.

I think the LDA BITLOC needs to be LDA (BITLOC). You're actually loading a word instead of a byte, but that should be harmless. You also seem to be assuming that your data bank coincides with the COP-caller's program bank, and that may also be harmless. That stack offset of $0B may need to be carefully investigated, but I have no evidence at hand regarding its correctness or lack thereof. Those might not be the only issues, but the remainder is beyond the scope of my experience.

BDD wrote a nice article here. If you haven't seen it yet, you should give it a glance.
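One way to check the $0B offset is to write out the stack as the handler leaves it. The table below is a sketch that assumes the '816 is in native mode (so COP pushes PB, PC and SR) and that PHA, PHX and PHY run with 16-bit registers, as they do in the posted routine. STKPB is a hypothetical name that does not appear in the original code.

```asm
; Stack picture after COP (native mode) plus PHB, PHD, PHA, PHX, PHY.
; Offsets are relative to S, as used by "offset,S" addressing.
;
;   offset  size  contents
;   $01     2     .Y   (last pushed)
;   $03     2     .X
;   $05     2     .C
;   $07     2     DP
;   $09     1     DB
;   $0A     1     SR   (pushed by COP)
;   $0B     2     PC   (return address, pushed by COP)
;   $0D     1     PB   (pushed by COP, native mode only)
;
STKPC    =$0B                  ;return address, as in the post
STKPB    =$0D                  ;hypothetical: caller's program bank
```

Under those assumptions the offset works out to $0B. In emulation mode COP does not push PB, and with 8-bit registers the three register pushes shrink by one byte each, so the offset would have to be recomputed.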

Jmstein7 · Post by **Jmstein7** » Tue Sep 21, 2021 3:52 pm

Ugh, you're right. I want to get the value stored at the address location STORED explicitly at BITLOC. So confusing!

Jon

BigDumbDinosaur · Post by **BigDumbDinosaur** » Tue Sep 21, 2021 8:06 pm

As noted by Mike, LDA BITLOC needs to be LDA (BITLOC). Also, as he noted, you are assuming DB = PB, which may not be the case. Prior to fetching the signature you need to load DB with the stack copy of PB to guarantee you are fetching the signature from the bank in which the interrupted program resides.
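A minimal sketch of one way to address both points follows. It uses direct page indirect long addressing rather than changing DB, so it is an alternative to the DB-switching method and not the only fix. It assumes BITLOC is a 3-byte pointer in direct page, that DP points at the page holding BITLOC, that CPRO is reachable through the current DB, and that the stack frame is the one built by the posted handler. STKPB = $0D is a hypothetical symbol for the stacked copy of PB.

```asm
; Sketch only: fetch the COP signature from the caller's bank.
; BITLOC, BITLOC+1 = address, BITLOC+2 = bank (assumed layout).
; STKPC = $0B and STKPB = $0D assume native mode and 16-bit pushes.
         rep #%00100000        ;16-bit accumulator
         lda STKPC,S           ;return address pushed by COP
         dec a                 ;back up to the signature byte
         sta BITLOC            ;pointer bits 0-15
         sep #%00100000        ;8-bit accumulator
         lda STKPB,S           ;caller's program bank
         sta BITLOC+2          ;pointer bits 16-23
         lda [BITLOC]          ;fetch signature from PB:PC-1
         sta CPRO              ;hand it to the co-processor
         rep #%00100000        ;back to 16-bit accumulator
```

Because the accumulator is 8 bits wide at the fetch, only the signature byte is read, not a word.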

The SEI at the beginning of your ISR and the CLI at the end are unnecessary. All interrupts, be they hard or soft, automatically set the I bit in SR (status register) after SR has been pushed. When RTI is executed at the end of the ISR, SR will be restored as it was prior to the interrupt, which means the state of the I bit will likewise be restored: IRQs will be re-enabled if they were enabled at the time of the interrupt. The MPU does this for you at no extra charge.

Of course, if you are using COP as an operating system API calling mechanism, don't forget that your system might deadlock if IRQs remain disabled while an API call is being processed.

It is useful in any function that involves stack accesses to define local symbols for stack offsets. For example, in my POC firmware, I have the following preamble for the COP handler:

Code:

;===============================================================================
;
;icop: COPROCESSOR INTERRUPT SERVICE ROUTINE
;
icop     rep #%00110000        ;16-bit everything
         phb                   ;save machine state
         phd
         pha
         phx
         phy
;
;———————————————————————————————
;COP REGISTER STACK FRAME
;
cop_yrx  =1                    ;.Y
cop_xrx  =cop_yrx+s_word       ;.X
cop_arx  =cop_xrx+s_word       ;.A
cop_dpx  =cop_arx+s_word       ;DP
cop_dbx  =cop_dpx+s_mpudpx     ;DB
cop_srx  =cop_dbx+s_mpudbx     ;SR
cop_pcx  =cop_srx+s_mpusrx     ;PC
cop_pbx  =cop_pcx+s_mpupcx     ;PB
;———————————————————————————————
;
         jmp (ivcop)           ;COP indirect vector

Constants such as S_MPUDPX and S_WORD are externally defined, with the former defining the size in bytes of DP (direct page register) and the latter defining the size of a word (symbols that start with S_ are "size-of" constants). I, as a rule, never hard-code such constants as instruction operands and always define them in an INCLUDE file to assure all programs that refer to such constants are singing from the same hymnal, so to speak. With the above arrangement, I can, for example, pick up what was in PB at the time of the interrupt with LDA COP_PBX,S.
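For readers following along, the size-of constants might be defined along these lines. This is a hypothetical excerpt, not the actual INCLUDE file: the names s_word, s_mpudpx, s_mpudbx, s_mpusrx and s_mpupcx come from the stack frame above, s_byte and s_mpupbx are assumed additions, and the values are simply the native-mode register sizes of the 65C816.

```asm
; Hypothetical INCLUDE file excerpt: "size-of" constants.
s_byte   =1                    ;size of a byte
s_word   =2                    ;size of a word
s_mpudbx =s_byte               ;DB is 8 bits
s_mpudpx =s_word               ;DP is 16 bits
s_mpupbx =s_byte               ;PB is 8 bits
s_mpupcx =s_word               ;PC is 16 bits
s_mpusrx =s_byte               ;SR is 8 bits
```

With these values the frame resolves to cop_pcx = $0B and cop_pbx = $0D.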

Getting back to the program, so to speak, there is a method of fetching the signature that uses less code and no direct page space. It takes advantage of the fact that if a base address is $0000 it is possible to load from anywhere in the 64KB address space of any bank with a 16-bit register offset. The following procedure starts after the above preamble:

Code:

;	continuation of COP ISR after jumping thru IVCOP...
;
icopa    cli                   ;resume IRQ processing
         rep #%00110000        ;16-bit everything
         lda cop_pcx,S         ;get RTI address
         dec A                 ;point at signature
         tax                   ;use as index
         sep #%00100000        ;8-bit accumulator
         lda cop_pbx,S         ;calling bank
         pha                   ;set as a...
         plb                   ;temporary data bank
         lda \2$00,X           ;fetch signature...
;
;	———————————————————————————————————————————————————————
;	The \2 prefix forces absolute addressing.  Absent that,
;	the fetch would be from bank $00, not from the bank in
;	DB.
;	———————————————————————————————————————————————————————

...program continues...

Something any ISR needs to consider is the state of DP. You cannot assume DP is pointing to your operating system kernel's direct page, since the foreground may well have pointed DP somewhere else. Ergo you might have to add some code to set DP as required. Otherwise, you may end up wearing out the reset button.
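As a rough sketch of what "set DP as required" could look like: kerneldp is a hypothetical symbol for the address of the kernel's direct page, and the assembler is assumed to know the accumulator is 16 bits wide at this point. The caller's DP is already on the stack from the PHD in the preamble, so it is restored by the PLD on the way out.

```asm
; Sketch only: point DP at the kernel's direct page.
; kerneldp is a hypothetical 16-bit address in bank $00.
         rep #%00100000        ;16-bit accumulator
         lda #kerneldp         ;kernel's direct page address
         tcd                   ;DP = kernel direct page
```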

Similarly, you may need to point DB to where the kernel's data tables and other structures, e.g., buffers and queues, are located. Most often they would be in the same bank as the kernel, which means a simple PHK - PLB sequence is all that is needed. Otherwise, you will have to load the correct value for DB into an 8-bit register, push it and then execute PLB.
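Both cases can be sketched as follows. kerneldb is a hypothetical symbol for the bank that holds the kernel's data. As with DP, the caller's DB was saved by the PHB in the preamble and comes back with the PLB in the common return.

```asm
; Case 1: kernel data lives in the same bank as the kernel code.
         phk                   ;push PB (current program bank)
         plb                   ;DB = PB
;
; Case 2: kernel data lives elsewhere.
; kerneldb is a hypothetical bank number.
         sep #%00100000        ;8-bit accumulator
         lda #kerneldb         ;bank holding kernel data
         pha                   ;push one byte
         plb                   ;DB = kerneldb
         rep #%00100000        ;back to 16-bit accumulator
```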

Finally, it's useful to define a common interrupt return that all ISRs can use. Here's what I have set up:

Code:

;crti: COMMON INTERRUPT RETURN
;
crti     rep #%00110000        ;16-bit everything
         ply                   ;restore MPU state
         plx
         pla
         pld
         plb
         rti

All of the above and more is covered here.

Jmstein7 · Post by **Jmstein7** » Tue Sep 21, 2021 10:43 pm

That is a lot to digest! I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.

barrym95838 · Post by **barrym95838** » Tue Sep 21, 2021 10:53 pm

Jmstein7 wrote:

I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.

The 16-bit stuff is only internal. Externally it's still 8-bit, so the only real hardware complication is dealing with the way it sneezes out bank addresses, should you choose to utilize them.

Jmstein7 · Post by **Jmstein7** » Tue Sep 21, 2021 11:11 pm

I haven’t utilized them yet, but “always_latch” in SystemVerilog makes short work of grabbing the bank addresses.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Tue Sep 21, 2021 11:43 pm

Jmstein7 wrote:

That is a lot to digest!

Not really. The concepts are what matter, not the code.

Quote:

I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.

What Mike said.

Fetches and stores occur eight bits at a time, even when the register doing the fetching and storing is set to 16 bits. The first clock cycle that accesses memory or I/O accesses the least significant byte, which is at the target address. The next clock cycle accesses the most significant byte, which is at the target address +1. Needless to say, if you are accessing a chip register you want the MPU register doing the accessing to be set to 8 bits.
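A small sketch of the chip-register case, assuming ioreg is a hypothetical address of an 8-bit device register (for example, one exposed by the FPGA) in the bank currently selected by DB:

```asm
; Sketch only: read an 8-bit device register.
; ioreg is a hypothetical address, not a symbol from this thread.
         sep #%00100000        ;8-bit accumulator
         lda ioreg             ;one read cycle, at ioreg only
         rep #%00100000        ;back to 16-bit accumulator
;
; With a 16-bit accumulator the same LDA would read ioreg and
; then ioreg+1, touching a second register as a side effect.
```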

All of this is extensively covered in the Eyes & Lichty programming manual.

BigEd · Post by **BigEd** » Wed Sep 22, 2021 8:10 am

"sneezes"!!

Jmstein7 · Post by **Jmstein7** » Wed Sep 22, 2021 11:10 am

BigEd wrote:

"sneezes"!!

Ed, what about what you said before - the hardware method? Can you elucidate?

J

BigEd · Post by **BigEd** » Wed Sep 22, 2021 11:16 am

All I have is Sam's sketch in the second post of this thread. Once you know the 6502 family well enough, and you're adept enough (Jeff is a good example) you can do this kind of thing.

Note that in the fourth post, Sam (kc5tja) offers the idea that handling COP as an SWI is a fallback for systems which lack a hardware copro.

(Although Sam is still a relatively active hacker, he only very rarely visits the forum.)

Jmstein7 · Post by **Jmstein7** » Wed Sep 22, 2021 12:27 pm

Ed,

When he says "presenting a no-op," does he mean sending $EA back to the 65c816? I'm confused, because Jeff then says, "in this thread the term NOP doesn't necessarily imply $EA, the official NOP." Okay... then what does it mean? How do you "feed" the 'c816 no-ops from your external hardware?

BigEd · Post by **BigEd** » Wed Sep 22, 2021 12:40 pm

The rest of Jeff's sentence should help you. Without wishing to discourage you, it does appear that you need to read more carefully and think more deeply: this is not easy stuff, it requires study and consideration. What you're aiming to do is build up a mental model of the cycle by cycle behaviour of an MPU in its system, so you know what's happening. This isn't going to work if you just jump from one guess to another: the questions are harder than you think.

What's more, you risk running everyone out of patience: there are half a dozen people here with enough goodwill to help you, but unfortunately that's not an unlimited resource.

Here's what Jeff said. Get paper and pencil and work through cycle by cycle what happens when an MPU reads an instruction. This is going to take you at least a couple of hours.

Dr Jefyll wrote:

BTW in case anyone's wondering, in this thread the term NOP doesn't necessarily imply $EA, the official NOP. $EA isn't ideal for the job of manipulating the '816 to read (but ignore) a series of bytes at PC because it only accesses one byte every 2 cycles. A better choice is to feed the '816 WDM ($42) -- a two-byte NOP that executes in 2 cycles.
