Re: 65816 COP instruction
Posted: Mon Sep 20, 2021 9:36 pm
by Jmstein7
How does one handle COP as a co-processor decoded instruction?
Page one of this thread touches on some ideas, but I have no direct knowledge of anyone actually implementing these ideas in real life.
Naturally, you need some hardware goodies to get the '816 and the co-processor to play nicely with each other, and some software goodies to tell them how to play, but that's about as much as I can offer at this stage of the game.
Understood. I jumped into this thread because this is the next step in a project I’m working on, connecting a 65c816 to an FPGA. The connection is made. Now I want to connect various IPs as “coprocessors.”
This one:
viewtopic.php?f=1&t=6788&start=15#p87207
Re: 65816 COP instruction
Posted: Mon Sep 20, 2021 10:08 pm
by BigDumbDinosaur
Wait, SEI? Why are we turning off interrupts?
If it gets there from the hardware reset line, the SEI won't be necessary as it's an automatic part of the reset sequence; however, if you jump there from software for any reason, you'll want to ignore interrupts until things are set up properly again. I've never done it that way myself.
Correct. It's a hedge against something jumping to the reset routine while IRQs are active. Otherwise, SEI would be redundant.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 1:11 pm
by Jmstein7
Here is my solution (STKPC = $0B):
Code:
CO_PRO   SEI
         phb                   ;save DB
         phd                   ;save DP
         rep #%00110000        ;select 16 bit registers
         pha                   ;save .C
         phx                   ;save .X
         phy                   ;save .Y
         LDA STKPC,S           ;save pc location
         DEC a                 ;less one
         STA BITLOC            ;save signature byte location
         LDA BITLOC            ;load sig byte
         STA CPRO              ;store sig byte
         rep #%00110000        ;16 bit registers
         ply                   ;restore .Y
         plx                   ;restore .X
         pla                   ;restore .C
         pld                   ;restore DP
         plb                   ;restore DB
         CLI
         rti                   ;resume foreground task
I wrapped it in my general 'c816 ISR parameters.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 2:58 pm
by barrym95838
Here is my solution (STKPC = $0B):
Code:
...
         LDA STKPC,S           ;save pc location
         DEC a                 ;less one
         STA BITLOC            ;save signature byte location
         LDA BITLOC            ;load sig byte
         STA CPRO              ;store sig byte
...
I wrapped it in my general 'c816 ISR parameters.
I think the LDA BITLOC needs to be LDA (BITLOC). You're actually loading a word instead of a byte, but that should be harmless. You also seem to be assuming that your data bank coincides with the COP-caller's program bank, and that may also be harmless. That stack offset of $0B may need to be carefully investigated, but I have no evidence at hand regarding its correctness or lack thereof. Those might not be the only issues, but the remainder is beyond the scope of my experience.
BDD wrote a nice article here. If you haven't seen it yet, you should give it a glance.
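A minimal sketch of one way to address the two points above (the missing indirection and the DB = PB assumption), not the original poster's code. It assumes native mode, that BITLOC is a three-byte scratch location on a direct page the handler owns, that CPRO is reachable through the current DB, and that STKPB is a hypothetical symbol for the pushed program bank, which sits two bytes above the pushed PC. Long indirect addressing takes the bank from the pointer itself, so DB does not have to match the caller's program bank:

```
STKPC    =$0B                  ;pushed PC, as given in the original post
STKPB    =STKPC+2              ;pushed PB (hypothetical symbol)
;
         rep #%00100000        ;16-bit accumulator
         lda STKPC,S           ;return address pushed by COP
         dec a                 ;back up one byte to the signature
         sta BITLOC            ;pointer bits 0-15
         sep #%00100000        ;8-bit accumulator
         lda STKPB,S           ;caller's program bank
         sta BITLOC+2          ;pointer bits 16-23
         lda [BITLOC]          ;fetch signature through the 24-bit pointer
         sta CPRO              ;8-bit store of the signature byte
```

With the accumulator set to 8 bits for the final load and store, only the signature byte itself is read and written.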
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 3:52 pm
by Jmstein7
Ugh, you're right. I want the value stored at the address held in BITLOC, not the contents of BITLOC itself. So confusing!
Jon
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 8:06 pm
by BigDumbDinosaur
As noted by Mike, LDA BITLOC needs to be LDA (BITLOC). Also, as he noted, you are assuming DB = PB, which may not be the case. Prior to fetching the signature you need to load DB with the stack copy of PB to guarantee you are fetching the signature from the bank in which the interrupted program resides.
The SEI at the beginning of your ISR and the CLI at the end are unnecessary. All interrupts, be they hard or soft, automatically set the I bit in SR (status register) after SR has been pushed. When RTI is executed at the end of the ISR, SR will be restored as it was prior to the interrupt, which means the state of the I bit will be likewise restored: IRQs will be re-enabled if they were enabled at the time of the interrupt. The MPU does this for you at no extra charge.
Of course, you can't afford to forget that if you are using COP as an operating system API calling mechanism, your system might deadlock if IRQs remain disabled while an API call is being processed.
It is useful in any function that involves stack accesses to define local symbols for stack offsets. For example, in my POC firmware, I have the following preamble for the COP handler:
Code:
;===============================================================================
;
;icop: COPROCESSOR INTERRUPT SERVICE ROUTINE
;
icop     rep #%00110000        ;16-bit everything
         phb                   ;save machine state
         phd
         pha
         phx
         phy
;
;-------------------------------
;COP REGISTER STACK FRAME
;
cop_yrx  =1                    ;.Y
cop_xrx  =cop_yrx+s_word       ;.X
cop_arx  =cop_xrx+s_word       ;.A
cop_dpx  =cop_arx+s_word       ;DP
cop_dbx  =cop_dpx+s_mpudpx     ;DB
cop_srx  =cop_dbx+s_mpudbx     ;SR
cop_pcx  =cop_srx+s_mpusrx     ;PC
cop_pbx  =cop_pcx+s_mpupcx     ;PB
;-------------------------------
;
         jmp (ivcop)           ;COP indirect vector
Constants such as S_MPUDPX and S_WORD are externally defined, with the former defining the size in bytes of DP (direct page register) and the latter defining the size of a word (symbols that start with S_ are "size-of" constants). I, as a rule, never hard-code such constants as instruction operands and always define them in an INCLUDE file to assure all programs that refer to such constants are singing from the same hymnal, so to speak. With the above arrangement, I can, for example, pick up what was in PB at the time of the interrupt with LDA COP_PBX,S.
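For reference, a sketch of what the relevant part of such an INCLUDE file might contain. This is an assumption, not the actual POC include file: s_byte and s_mpupbx are hypothetical symbols that follow the same naming pattern, and the values are simply the native-mode register widths in bytes.

```
s_byte   =1                    ;size of a byte (hypothetical symbol)
s_word   =2                    ;size of a word
;
s_mpudbx =s_byte               ;DB is 8 bits
s_mpudpx =s_word               ;DP is 16 bits
s_mpupbx =s_byte               ;PB is 8 bits (hypothetical symbol)
s_mpupcx =s_word               ;PC is 16 bits
s_mpusrx =s_byte               ;SR is 8 bits
```

With these values the stack frame works out to cop_pcx = 11 ($0B) and cop_pbx = 13 ($0D), which agrees with the STKPC = $0B offset used earlier in the thread for the same push order.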
Getting back to the program, so to speak, there is a method of fetching the signature that uses less code and no direct page space. It takes advantage of the fact that if a base address is $0000 it is possible to load from anywhere in the 64KB address space of any bank with a 16-bit register offset. The following procedure starts after the above preamble:
Code:
; continuation of COP ISR after jumping thru IVCOP...
;
icopa    cli                   ;resume IRQ processing
         rep #%00110000        ;16-bit everything
         lda cop_pcx,S         ;get RTI address
         dec A                 ;point at signature
         tax                   ;use as index
         sep #%00100000        ;8-bit accumulator
         lda cop_pbx,S         ;calling bank
         pha                   ;set as a...
         plb                   ;temporary data bank
         lda \2$00,X           ;fetch signature...
;
; -------------------------------------------------------
; The \2 prefix forces absolute addressing. Absent that,
; the fetch would be from bank $00, not from the bank in
; DB.
; -------------------------------------------------------
         ...program continues...
Something any ISR needs to consider is the state of DP. You cannot assume DP is pointing to your operating system kernel's direct page, since the foreground may well have pointed DP somewhere else. Ergo you might have to add some code to set DP as required. Otherwise, you may end up wearing out the reset button. 
Similarly, you may need to point DB to where the kernel's data tables and other structures, e.g., buffers and queues, are located. Most often they would be in the same bank as the kernel, which means a simple PHK - PLB sequence is all that is needed. Otherwise, you will have to load the correct value for DB into an 8-bit register, push it and then execute PLB.
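A rough sketch of both adjustments, placed after the machine state has been saved so that the caller's DP and DB are already on the stack. kerneldp and kerneldb are hypothetical symbols for the kernel's direct page address and data bank; neither comes from the POC firmware. How the assembler is told the width of an immediate operand varies from one assembler to another.

```
         rep #%00100000        ;16-bit accumulator
         lda #kerneldp         ;kernel's direct page address (hypothetical)
         tcd                   ;DP = kernel's direct page
;
; kernel data in the same bank as the kernel code...
;
         phk                   ;push current program bank
         plb                   ;DB = PB
;
; ...or, alternatively, kernel data in some other bank
;
         sep #%00100000        ;8-bit accumulator
         lda #kerneldb         ;kernel's data bank (hypothetical)
         pha                   ;push it and...
         plb                   ;DB = kernel's data bank
```

Only one of the two DB sequences would be used in practice. Because the foreground's DP and DB were pushed on entry, the usual register restore before RTI puts them back.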
Finally, it's useful to define a common interrupt return that all ISRs can use. Here's what I have set up:
Code:
;crti: COMMON INTERRUPT RETURN
;
crti     rep #%00110000        ;16-bit everything
         ply                   ;restore MPU state
         plx
         pla
         pld
         plb
         rti
All of the above and more is covered here.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 10:43 pm
by Jmstein7
That is a lot to digest! I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 10:53 pm
by barrym95838
I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.
The 16-bit stuff is only internal. Externally it's still 8-bit, so the only real hardware complication is dealing with the way it sneezes out bank addresses, should you choose to utilize them.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 11:11 pm
by Jmstein7
I haven’t utilized them yet, but “always_latch” in system verilog makes short work of grabbing the bank addresses.
Re: 65816 COP instruction
Posted: Tue Sep 21, 2021 11:43 pm
by BigDumbDinosaur
Not really. The concepts are what matter, not the code.
I'm still getting used to the whole 16-bit thing. When the processor moves 16 bits of data at a time, I'm not quite sure how to get my FPGA data in/out to play nice.
What Mike said.
Fetches and stores occur eight bits at a time, even when the register doing the fetching and storing is set to 16 bits. The first clock cycle that accesses memory or I/O accesses the least significant byte, which is at the target address. The next clock cycle accesses the most significant byte, which is at the target address +1. Needless to say, if you are accessing a chip register you want the MPU register doing the accessing to be set to 8 bits.
All of this is extensively covered in the Eyes & Lichty programming manual.
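To illustrate the point about register width, here is a small sketch that assumes a hypothetical 8-bit FPGA register at fpgareg, with DB already pointing at its bank. The values are arbitrary.

```
         sep #%00100000        ;8-bit accumulator
         lda #$5a              ;arbitrary test value
         sta fpgareg           ;one write cycle, to fpgareg only
;
         rep #%00100000        ;16-bit accumulator
         lda #$1234            ;arbitrary 16-bit value
         sta fpgareg           ;two write cycles: $34 to fpgareg,
                               ;then $12 to fpgareg+1
```

The second store is the case to watch for with I/O hardware, since the write to fpgareg+1 lands on whatever happens to be decoded at the next address.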
Re: 65816 COP instruction
Posted: Wed Sep 22, 2021 8:10 am
by BigEd
"sneezes"!!
Re: 65816 COP instruction
Posted: Wed Sep 22, 2021 11:10 am
by Jmstein7
Ed, what about what you said before - the hardware method? Can you elucidate?
J
Re: 65816 COP instruction
Posted: Wed Sep 22, 2021 11:16 am
by BigEd
All I have is Sam's sketch in the second post of this thread. Once you know the 6502 family well enough, and you're adept enough (Jeff is a good example) you can do this kind of thing.
Note that in the fourth post, Sam (kc5tja) offers the idea that handling COP as an SWI is a fallback for systems which lack a hardware copro.
(Although Sam is still a relatively active hacker, he only very rarely visits the forum.)
Re: 65816 COP instruction
Posted: Wed Sep 22, 2021 12:27 pm
by Jmstein7
Ed,
When he says "presenting a no-op," does he mean sending $EA back to the 65c816? I'm confused, because Jeff then says, "in this thread the term NOP doesn't necessarily imply $EA, the official NOP." Okay... then what does it mean? How do you "feed" the 'c816 no-ops from your external hardware?

Re: 65816 COP instruction
Posted: Wed Sep 22, 2021 12:40 pm
by BigEd
The rest of Jeff's sentence should help you. Without wishing to discourage you, it does appear that you need to read more carefully and think more deeply: this is not easy stuff, it requires study and consideration. What you're aiming to do is build up a mental model of the cycle by cycle behaviour of an MPU in its system, so you know what's happening. This isn't going to work if you just jump from one guess to another: the questions are harder than you think.
What's more, you risk running everyone out of patience: there are half a dozen people here with enough goodwill to help you, but unfortunately that's not an unlimited resource.
Here's what Jeff said. Get paper and pencil and work through cycle by cycle what happens when an MPU reads an instruction. This is going to take you at least a couple of hours.
BTW in case anyone's wondering, in this thread the term NOP doesn't necessarily imply $EA, the official NOP. $EA isn't ideal for the job of manipulating the '816 to read (but ignore) a series of bytes at PC because it only accesses one byte every 2 cycles. A better choice is to feed the '816 WDM ($42) -- a two-byte NOP that executes in 2 cycles.