BRK opcode without using SYNC
I thought others might be interested in Brad Taylor's doc about using external hardware to make use of the 1-byte operand of BRK.
He explains how a BRK can be spotted by external hardware even when SYNC is not available - as on the NES - by watching for three consecutive write accesses.
The same idea could be used to relocate or modify all the vectors at top of memory (or top of bank 0) - which allows some extra freedom in the memory map.
Gideon Zweijtzer has written a document, "Safely freezing the C64 on an asynchronous event", which explores other useful ways of reacting to the pattern of reads and writes. (I found it best to read the document from Google's cache.)
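To make the detection rule concrete, here is a minimal sketch in Python. It is a model, not hardware, and it makes assumptions that go beyond the post: each bus cycle is given as a hypothetical `(rw, data)` pair, the CPU is an NMOS 6502 on which ordinary instructions write on at most two consecutive cycles (JSR and the read-modify-write instructions), and the third pushed byte is the status register, whose bit 4 (the B flag) is set for BRK and clear for IRQ/NMI.

```python
def find_interrupt_entries(cycles):
    """Return (cycle_index, is_brk) for every run of three consecutive writes.

    cycles: sequence of (rw, data) tuples, rw is "R" or "W".
    This trace format is hypothetical, chosen only for the sketch.
    """
    hits = []
    run = 0
    for index, (rw, data) in enumerate(cycles):
        if rw == "W":
            run += 1
            if run == 3:
                # Assumption: the third write is the pushed status byte,
                # and bit 4 (B) distinguishes BRK from IRQ/NMI.
                hits.append((index, bool(data & 0x10)))
        else:
            run = 0
    return hits


# BRK at $1232: opcode fetch, operand fetch, push PCH, PCL, P, vector reads.
brk_trace = [("R", 0x00), ("R", 0x05), ("W", 0x12), ("W", 0x34), ("W", 0x30),
             ("R", 0x00), ("R", 0xC0)]
# Same sequence for a hardware IRQ: B is clear in the pushed status byte.
irq_trace = [("R", 0xEA), ("R", 0xEA), ("W", 0x12), ("W", 0x32), ("W", 0x20),
             ("R", 0x00), ("R", 0xC0)]
# JSR writes only twice in a row, so it must not trigger.
jsr_trace = [("R", 0x20), ("R", 0x00), ("R", 0xFF), ("W", 0x12), ("W", 0x34),
             ("R", 0xC0)]

print(find_interrupt_entries(brk_trace))
print(find_interrupt_entries(irq_trace))
print(find_interrupt_entries(jsr_trace))
```

In real hardware the same thing would be a two-bit counter clocked by the bus clock and cleared on every read cycle; the Python only shows the rule.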
Completely untested as yet, but would something like this work?
Code:
IRQBRKV:    cld                 ; A source of no end of problems
            phx
            tsx
            pha
            inx
            inx
            lda 0x0100,x        ; Get stacked status register
            and #0x10           ; B bit set?
            bne 1$              ; If so, it's a BRK
            pla
            plx
            jmp [Z_IRQV]        ; If not, jump to IRQ handler
1$:         jmp BRK_DISP
Code:
BRK_DISP::  inx
            lda 0x0100,x        ; Get stacked return address
            sec
            sbc #0x01           ; Decrement it
            sta Z_BRKTMP+1      ; And store in zero page
            inx
            lda 0x0100,x
            sbc #0x00
            sta Z_BRKTMP+2
            ldx #0x00
            lda [Z_BRKTMP+1]    ; At that address is the second byte of BRK
            clc
            rol                 ; Shift left
            sta Z_BRKTMP+1      ; Forms low byte of vector
            rol
            and #0x01           ; This was bit 7 of the second byte of BRK
            ora #>BRK_VECTBL    ; High byte of BRK vector table address
                                ; (must be aligned on 512-byte boundary)
            sta Z_BRKTMP+2      ; High byte of vector
            lda #0x6C           ; Opcode for indirect JMP
            sta Z_BRKTMP
            pla                 ; Restore registers to entry state
            plx
            jsr Z_BRKTMP        ; Call addressed subroutine
            rti
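As a sanity check on the address arithmetic in BRK_DISP, here is a small Python model of it. It only mirrors the 8-bit steps of the listing (decrement the stacked return address, fetch the operand, shift it into a table slot); the function name, the dictionary used as memory, and the table base of $C000 are made up for the example.

```python
def brk_vector_slot(stacked_return, memory, table_base):
    """Return the address of the vector-table entry selected by a BRK operand.

    stacked_return: the 16-bit return address BRK pushed (BRK address + 2).
    memory: mapping from address to byte (hypothetical stand-in for RAM/ROM).
    table_base: start of the vector table, which must be 512-byte aligned.
    """
    if table_base % 512 != 0:
        raise ValueError("table_base must be 512-byte aligned")
    operand_addr = (stacked_return - 1) & 0xFFFF   # SEC / SBC #1 / SBC #0
    operand = memory[operand_addr]
    low = (operand << 1) & 0xFF                    # CLC / ROL: bit 7 goes to carry
    carry = operand >> 7
    high = (table_base >> 8) | carry               # ROL / AND #1 / ORA #>table
    return (high << 8) | low


# BRK at $1232 with operand $81 at $1233; the CPU stacks $1234.
memory = {0x1233: 0x81}
slot = brk_vector_slot(0x1234, memory, 0xC000)
print(hex(slot))
```

The result is simply the table base plus twice the operand, which is why the table has to sit on a 512-byte boundary: ORing the carried-out bit into the high byte only works if bit 0 of that byte starts out clear.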
This probably should go into its own thread. I don't see any relationship with detecting BRK at the hardware level versus how to discover BRK in software.
That being said, instead of:
Code:
        inx
        inx
        lda $0100,x
you could just write:
Code:
        lda $0102,x
Otherwise, your code looks reasonable on the surface.
As another simplification, I would not push and pull registers so much though. I'd keep the interrupted process context on the stack, like so:
Code:
        pha
        phx
        phy
        tsx
        lda $0104,x
        and #$10
        bne BRK_HANDLER
        ...
        ply
        plx
        pla
        rti
;
; ...
;
BRK_HANDLER:
        ...
        ply
        plx
        pla
        rti
This has some advantages -- for starters, you have the ability to tweak the interrupted code's registers if appropriate (for example, if you're using BRK to invoke OS calls, your OS routines can use the stacked registers to provide return values). Somewhat related, you can also update the calling program's PC to point back at the BRK instruction, with updated register values, to 'restart' an interrupted operation. This is called PC-LSRing (see http://www.falvotech.com/content/public ... pclsr.html ).
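For anyone wanting to check the stack offsets after the pha/phx/phy/tsx entry sequence, here is a tiny Python model of the pushes. It assumes the stack does not wrap within page 1 and starts from an arbitrary stack pointer of $FF; the function name is made up for the example.

```python
def frame_offsets(sp=0xFF):
    """Map each stacked item to its offset from $0100+X after TSX.

    Assumes the six pushes do not wrap around the bottom of page 1.
    """
    stack = {}
    # Hardware pushes PCH, PCL, P; the handler then pushes A, X, Y.
    for name in ("PCH", "PCL", "P", "A", "X", "Y"):
        stack[0x0100 + sp] = name
        sp = (sp - 1) & 0xFF
    x = sp  # TSX copies the stack pointer into X
    return {name: addr - (0x0100 + x) for addr, name in stack.items()}


for name, offset in sorted(frame_offsets().items(), key=lambda item: item[1]):
    print(f"${0x0100 + offset:04X},x  {name}")
```

The status byte comes out at offset 4, matching the `lda $0104,x` in the listing, and the stacked A, X and Y sit at offsets 3, 2 and 1, which are the slots an OS routine would overwrite to hand back return values.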
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: BRK opcode without using SYNC
BigEd wrote:
I thought others might be interested in Brad Taylor's doc about using external hardware to make use of the 1-byte operand of BRK.
Note that both the 'C02 and '816 have the useful VPB (Vector Pull) output that tells the system when one of the hardware vectors is being accessed. That signal could be used to modify vectors on the fly. Also, the '816 has the COP instruction, which is a quasi-interrupt, in that it takes its own vector and has an operand like BRK. So one is not as limited with the '816 when it comes to creative uses of software interrupts. In effect, one could code an '816 equivalent to the x86 INT N instruction, e.g., INT $13.
x86? We ain't got no x86. We don't NEED no stinking x86!
The only problem is performance. BRK vectoring is much slower than, say, declaring your OS entry point to be at $FFF0, at which sits a JMP (ROMTABLE,X) instruction.
On the other hand, BRK has the advantage of being completely operating mode independent (you have separate vectors for emulation vs. native modes). But, really, invoking $FFF0 for 6502 code and $FFF3 for 65816-native code doesn't seem like an overbearing requirement.
- BigDumbDinosaur
kc5tja wrote:
The only problem is performance. BRK vectoring is much slower than, say, declaring your OS entry point to be at $FFF0, at which sits a JMP (ROMTABLE,X) instruction.
On the other hand, BRK has the advantage of being completely operating mode independent (you have separate vectors for emulation vs. native modes). But, really, invoking $FFF0 for 6502 code and $FFF3 for 65816-native code doesn't seem like an overbearing requirement.
Regarding support of emulation mode, why bother if it's a new design? I can see reverting to emulation mode if the '816 is being used in a system that was originally 6502 powered so existing OS routines can be used. But for a new design, it wouldn't make any sense. That would be like getting a GPS receiver and then referring to a road map while on a trip.
BigDumbDinosaur wrote:
Without the need to maintain a jump table for the benefit of applications, the OS can be more portable, I would think (as well as slightly smaller).
Obscuring the call path, again, isn't much of a deal breaker for me. Consider CP/M, which explicitly defined CALL 0 as a means of rebooting, and CALL 5 to enter BDOS. 20 years later, the L4 microkernel defines an explicit call path to invoke its functions too.
One thing having a centralized receiver for all OS calls permits, though, is common preprocessing for all OS invocations. For example, you can copy buffers between kernel and user space (or vice versa), log system calls for tracing purposes, etc.
Quote:
Regarding support of emulation mode, why bother if it's a new design?
If he's intending on upgrading later on, a new entry point is warranted. It's not worth overloading a single entry point with mode detection logic.
Also, you wouldn't revert to emulation mode. The old entry point would include logic to go into, and return from, native-mode, not the other way around. That way, new apps get the full benefit, while old apps continue to run, albeit with a performance hit.
L4/x86 uses the jump to isolate apps from the fact that on x86 you can enter the kernel via the INT, SYSENTER and SYSCALL instructions - with differing performance and support.
L4/AMD64 uses SYSCALL only and directly. It's as fast as SYSENTER and supported everywhere.
(I would go into more depth but that's far too much to type on a touchscreen)
When you have multiple virtual address spaces, using something like BRK or SYSCALL makes more sense because the hardware can trigger the MMU to change to a known-good state (in the 65816's case, you'd use VPB to accomplish this task).
But, again, in the case of a 6502 (or even 65816), having a fixed address to dispatch into the OS with is not a sin, and should be considered along with its other merits. Having to back up in the code to examine the BRK operand byte (or COP if you use it) takes a huge amount of time above and beyond the 6502's overhead for tearing out BRKs from genuine IRQs.
Never have I advocated NOT using BRK for this purpose. I'm just saying, the 6502/65816 CPUs don't make that choice compelling, like it is with, for example, the x86 architecture.
If BRK is intended for entering the OS, encoding an operand byte into the opcode seems a bit pointless, and every other processor in existence seems to agree with the idea that it's better to pass the desired operation (or whatever) in a register (or occasionally on the stack)...
-----
OK, to cover the x86 weirdness from earlier, there are 3 methods commonly used for entering the kernel (I'm not going to cover call gates, because they're slow and only OS/2 uses them, task gates, because they're slow and nobody uses them, or intentionally invoking the illegal opcode handler, because that one's just silly)
- Software interrupts via INT xx (or INT3, a special 1 byte instruction often used for software breakpoints). Supported on everything since the 8086/8088. Slow. Lots of vectors available.
- SYSENTER. Intel's preferred method of entering the kernel. Significantly faster than software interrupts, but also requires somewhat more complex code. Works on all modern Intel processors in any mode, and AMD processors in 32-bit mode only (They designed the 64-bit extensions and deprecated SYSENTER; Intel ignored them)
- SYSCALL. AMD's preferred method of entering the kernel. Slightly more barebones than Intel's; does little more than store the return address in a register and jump to the kernel (OK, it fiddles about with the CS segment register a bit, but x86 is messy like that...); even leaves the kernel stack-less. Supported on all modern AMD processors in all modes, and Intel processors in 64-bit mode.
OwenS wrote:
or intentionally invoking the illegal opcode handler, because that one's just silly
I don't know if L4/amd64 uses this technique though.
Concerning the various methods of entering kernel-space, I already knew of all of them.
Back to the 65xx architecture, though, the operand byte of BRK and COP exist explicitly for "operation selection" purposes. That's how it's documented, and that's how the O.P. was intending on using it.
This is why I said, "I don't recommend that," in not so many words. I think it'd be much, much easier to load a table offset in the X register, and then put a jump table routine at a well-known address, and dispatch from there.
And, even simpler still, is to maintain a well-known jump table at fixed offsets relative to the top of ROM space. This has the advantage that you don't have to spend time pushing registers onto the stack just to compute the jump table offset. The Commodore KERNAL used this approach, as did GEOS for the Commodore platform.
Although, all things considered, the 6502's registers are so poor for use as general-purpose information-carrying tools that you might as well conduct your OS parameter passing using well-known zero-page locations anyway.
Code:
os6502ep:
        ; 6502 and 65816 emulation-mode entry-point.
        ; CPU registers are treated as caches for ZP locations ONLY.
        ; Hence, all registers are used for our purposes, regardless of their
        ; previous contents.
        ;
        ; Assumes the following four bytes in zero-page:
        ;   osCall: an 8-bit OS function ID
        ;   indjmp: $4C (JMP opcode)
        ;   indjmp+1/+2: address for the aforementioned JMP
        ;
        ; Also assumes osCall is already a byte offset into the table
        ; (function number times two), since each entry is a 16-bit word.
        ldx osCall
        lda jmpTabLegacy,x      ; low byte of handler address
        sta indjmp+1
        lda jmpTabLegacy+1,x    ; high byte of handler address
        sta indjmp+2
        jmp indjmp
Code:
os65816ep:
        jmp (jmpTabNative,x)
jmpTabNative:
        .word charOut, charIn, errorOut
        ; ...etc...
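To show what the table lookup amounts to, here is a short Python model of a dispatch through a table of little-endian 16-bit words, as `.word` would lay it out. The handler addresses ($E000 and so on) and the Python stand-ins for charOut, charIn and errorOut are invented for the sketch, and it assumes the index register holds a byte offset (function number times two).

```python
def make_table(addresses):
    """Lay out 16-bit addresses as little-endian bytes, like .word does."""
    table = bytearray()
    for address in addresses:
        table.append(address & 0xFF)   # low byte first
        table.append(address >> 8)     # then high byte
    return bytes(table)


def dispatch(table, offset, handlers):
    """Fetch the word at table+offset and call the handler at that address."""
    low = table[offset]
    high = table[offset + 1]
    return handlers[(high << 8) | low]()


# Hypothetical ROM addresses for the three routines named in the listing.
handlers = {
    0xE000: lambda: "charOut",
    0xE010: lambda: "charIn",
    0xE020: lambda: "errorOut",
}
jmp_tab = make_table([0xE000, 0xE010, 0xE020])

print(dispatch(jmp_tab, 2, handlers))
```

Note that the low and high bytes come from two different table positions; on the '816 (and the 65C02) the `jmp (table,x)` addressing mode does both fetches itself, whereas plain 6502 code has to copy them one at a time.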
kc5tja wrote:
The only problem is performance. BRK vectoring is much slower than, say, declaring your OS entry point to be at $FFF0, at which sits a JMP (ROMTABLE,X) instruction.
On the other hand, BRK has the advantage of being completely operating mode independent (you have separate vectors for emulation vs. native modes). But, really, invoking $FFF0 for 6502 code and $FFF3 for 65816-native code doesn't seem like an overbearing requirement.
In some systems I use AC, XR, YR and Carry for parameters - see "fwrite" in http://www.6502.org/users/andre/lib6502/lib6502.html for example. You could argue though to use different operations instead of the carry flag.
For my OS I have defined my relocatable file format, that allows late binding of operations to a program. I could, in the source define
Code:
        JMP BASE_OS+15
This way I use the very same program on a system where the kernel is located at $F000 (my selfbuilt computer, CBM8x96) or even on a PET 3032 where the kernel sits at $7000.
André
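The late-binding idea can be sketched as a loader that patches OS-relative operands once the kernel base is known. This is only an illustration in Python: the fixup list and the function name are hypothetical, and it is not the layout of the relocatable file format mentioned above.

```python
def bind_os_calls(image, fixups, base_os):
    """Return a copy of image with each fixup operand rebased onto base_os.

    image: program bytes whose OS-call operands hold an offset from BASE_OS.
    fixups: positions of the 16-bit little-endian operands to patch
            (hypothetical; a real format would carry this in its own tables).
    base_os: address where the kernel's entry points start on this machine.
    """
    patched = bytearray(image)
    for pos in fixups:
        offset = patched[pos] | (patched[pos + 1] << 8)
        target = (base_os + offset) & 0xFFFF
        patched[pos] = target & 0xFF
        patched[pos + 1] = target >> 8
    return bytes(patched)


# JMP BASE_OS+15, stored unbound as JMP $000F with a fixup on the operand.
unbound = bytes([0x4C, 0x0F, 0x00])

print(bind_os_calls(unbound, [1], 0xF000).hex())
print(bind_os_calls(unbound, [1], 0x7000).hex())
```

The program image stays identical across machines; only the load-time base changes, so the call costs nothing extra at run time compared with a hard-coded JMP.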