Re: My 65816 Computer Concept
Posted: Wed May 20, 2020 4:55 am
dagenius wrote:
I do like the relocatable page 0 of the '816, so 1 page of I/O allows apps that use it often to switch their ZP to it.
There is a booby trap involved with pointing direct page at the I/O block: if you do so, direct page addressing can no longer reach any pointers, indices and/or flags that are on the real direct page. Given that much I/O involves accessing such data structures, you may find that mapping direct page onto I/O isn't viable.
When I implemented SCSI on my POC V1.1 unit, I went through a fairly lengthy exercise of getting throughput as high as possible. One experiment was that of pointing direct page at the SCSI host adapter's base address (which falls on a page boundary) and using a 16-bit index register to point to the buffer. In order to get at the direct page variables needed by the SCSI driver I had to force the assembler to use absolute addressing so the '816 wouldn't mistakenly try to fetch or store from/to the host adapter's address space when the access was supposed to be RAM in the real direct page.
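As a rough sketch of what that experiment looks like (this is not the actual POC driver code): io_base, hba_data, buf, nbytes and drv_flag are hypothetical names, DB is assumed to be $00 so absolute addressing lands in bank $00 RAM, the assembler is assumed to track register widths from the rep/sep instructions, and the ! prefix for forcing absolute addressing is WDC-style syntax that your assembler may spell differently.
Code:
         phd                   ;save real direct page
         pea io_base           ;page-aligned I/O base (hypothetical)...
         pld                   ;...is now direct page
         sep #%00100000        ;8-bit accumulator
         rep #%00010000        ;16-bit index registers
         ldx #0                ;buffer index
loop     lda hba_data          ;hba_data = data port offset in the I/O page: DP fetch hits the host adapter
         sta buf,x             ;store in buffer (absolute,X)
         inx
         cpx #nbytes           ;hypothetical transfer size
         bne loop
         lda !drv_flag         ;forced absolute: reads RAM at $00xx, not the host adapter
         pld                   ;restore real direct page
With an 8-bit accumulator the port read is 3 cycles as a direct page access instead of 4 as an absolute one, but every forced-absolute access to a real direct page variable such as drv_flag costs that cycle (and a byte) right back.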
After running a series of tests reading and writing 32KB chunks of data from one of the disks and comparing the throughput to not having the host adapter appear in direct page, I abandoned the idea. Code compaction and better anticipation of branches that might be taken proved to be more productive.
The reality is that much of the activity involving I/O is managing buffers, queues and pointers, which is all compute-bound. So while having an I/O port appear in direct page improves access to the port, it may actually degrade overall performance if you have to resort to absolute addressing to get at direct page variables. Even if that isn't the case, only one instruction of the many in an I/O loop is saving a clock cycle.
A good use of direct page redirection is pointing direct page at the stack in a subroutine after that sub has allocated some stack space. Doing so opens the door to allocating a fugacious direct page for any subroutine that has to make use of direct page addressing. The code to do so is surprisingly simple:
Code:
         phd                   ;save current direct page
         rep #%00100000        ;16-bit accumulator
         sec
         tsc                   ;current stack pointer
         sbc #tmp_spac         ;number of bytes needed (16-bit value)
         tcs                   ;reserve stack space
         inc A                 ;point at offset $00 in temp space
         tcd                   ;which is now direct page
;
; -------------------------------------------------------------
; At this point, the instruction LDA $00 would actually fetch
; from SP + 1, which is the bottom of the reserved stack space.
; -------------------------------------------------------------
;
         .....                 ;program does things...
         rep #%00100000        ;now we clean up...
         clc                   ;after ourselves
         tsc                   ;current stack pointer
         adc #tmp_spac         ;number of bytes that were used
         tcs                   ;get rid of them
         pld                   ;restore old direct page location
         rts                   ;we're done
I use this technique extensively in much of my code.
Worth noting is that if DP is not pointing at the start of a page (that is, the low byte of DP is non-zero), a one-cycle penalty will occur on each fetch or store to direct page. The main value in doing the above is in using direct page addressing modes to get at data structures elsewhere in RAM. You can do indirect-long addressing on the stack, whereas stack-relative indirect addressing takes its bank from DB, so to reach another bank you'd have to change DB and would still be limited to a maximum of 64KB of addressable space at a time. With DP pointing at the stack and use of indirect-long addressing you would have 16MB of addressable space.
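To illustrate the indirect-long case, here is a minimal sketch of something that could sit where the "program does things" placeholder is in the code above. The offset ptr and the target address $123456 are made up for the example, tmp_spac is assumed to be at least 3, and the assembler is assumed to track register widths from the rep/sep instructions; symbol definition syntax will vary with your assembler.
Code:
ptr      =$00                  ;hypothetical 3-byte pointer at offset $00 in temp space
;
         rep #%00110000        ;16-bit accumulator & index
         lda #$3456            ;low word of the made-up target address
         sta ptr               ;lands at SP + 1 and SP + 2
         sep #%00100000        ;8-bit accumulator
         lda #$12              ;bank byte of the target
         sta ptr+2             ;lands at SP + 3
         ldy #0
         lda [ptr],y           ;fetches from $123456, regardless of DB
The stack-relative equivalent, lda (ofs,s),y, would take its bank from DB, which is where the 64KB limit comes from.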
Quote:
There definitely is a shortage of 65816 projects. A lot of people say it's harder, but the circuits for decoding the upper address lines are readily available in the datasheet, and the 16-bit operations make a lot of code smaller and easier.
There are other '816 projects around here: Marco Demont's, 8BIT's, mine, etc. However, they definitely are outnumbered by eight bit projects.
As for generating A16-A23, the data sheet circuit appears to have a potential timing issue as the Ø2 clock rate closes in on the 65C816's upper limits. Mainly, the use of an inverter to generate the latch strobe may not be satisfactory due to the cumulative lag of the inverter's propagation time and the time required for the latch to close when its /LE input goes low. There is discussion elsewhere on this.