POC Computer Version One
Re: POC Computer Version One
That is rather surprising.
I don't know about now, but Windows used to have a "legacy" limit of 64K bytes on the largest guaranteed size of a named pipe message. Even 64K can be a problem as a "hex" file representation more than doubles the size of the data. A full 32K ROM will not fit.
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: POC Computer Version One
Are we talking about an S-record file for a 10K binary, a 10K file of S-records or a 10K binary image constructed from S-records? (not that it matters much ... just curious ...)
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Mike B. (about me) (learning how to github)
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: POC Computer Version One
drogon wrote:
Possibly the s-record file (pipe) not being fully written to the server before the receiving program wakes up?
That's what I thought might be happening, but not so. I conclusively determined that the named pipe mechanism has a 10K buffering limit, contrary to what is claimed for the Linux 3 kernel. That jibes with what I was familiar with in traditional UNIX. Obviously, there is no limit to the aggregate data flow possible through a named pipe, but the most that seems bufferable (new word?) is 10K.
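For what it's worth, how much a given kernel will actually buffer in a FIFO can be measured directly: open both ends non-blocking, never read, and write until the kernel refuses. A quick Python sketch (the path is throwaway; the capacity reported will be whatever the running kernel provides, commonly 64KB on modern Linux, and may well differ on older kernels or other platforms):

```python
import os
import tempfile

# Probe how much data a FIFO will buffer before a non-blocking write
# fails with EAGAIN -- i.e., the kernel's pipe capacity.
path = os.path.join(tempfile.mkdtemp(), "probe.fifo")
os.mkfifo(path)
# The read end must be open first, or opening the write end non-blocking
# fails with ENXIO; we simply never read from it.
rd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
wr = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
total = 0
try:
    while True:
        total += os.write(wr, b"\0" * 4096)  # partial writes still count
except BlockingIOError:
    pass                                     # pipe is full
os.close(rd)
os.close(wr)
print(total)  # e.g. 65536 on a stock modern Linux kernel
```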
I abandoned the use of a named pipe—I'm not one for flogging lame horses. The new script just watches for the appearance of a (non-zero length) file—the S-record file—and when it detects it, runs rsync to send it on its way. Works like a charm, even with an S-record file representing 64KB of binary data.
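The script itself isn't shown above, but the watch-and-ship idea can be sketched like this (a Python stand-in for the shell script; the filename, destination, and rsync flags are made up for illustration):

```python
import os
import subprocess
import time

def ready(path):
    """True once the expected S-record file exists with non-zero length."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        return False

def watch(path, dest, poll=1.0):
    """Poll for the file, ship it with rsync, then remove it so the
    next build is detected.  'path' and 'dest' are hypothetical."""
    while True:
        if ready(path):
            subprocess.run(["rsync", "-a", path, dest], check=True)
            os.remove(path)
        time.sleep(poll)
```

Because the trigger is "file exists and is non-empty," a partially written file can still race the transfer; in practice, writing to a temporary name and renaming into place makes the appearance atomic.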
barrym95838 wrote:
Are we talking about an S-record file for a 10K binary, a 10K file of S-records or a 10K binary image constructed from S-records? (not that it matters much ... just curious ...)
It's an S-record file with a size of 10KB or larger. As Bill noted, the data part of an S-record file (or an equivalent Intel or MOS Technology hex file) is effectively twice the size of the binary data it represents. Added to that is overhead per record. So a 10KB S-record file would represent about 4.8KB of binary data.
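The arithmetic can be checked under an assumed record layout. Taking S1 records with 32 data bytes each (a common but by no means universal choice) and CRLF line endings, the per-record overhead pushes the ratio somewhat below one binary byte per two file bytes:

```python
def binary_bytes(srec_size, data_per_record=32, line_end=2):
    """Approximate binary payload of an S1-format file of a given size.
    Assumes 32 data bytes per record and CRLF line endings -- both are
    assumptions, not part of the S-record spec.
    Per record: 'S1' + count(2) + address(4) + 2*data + checksum(2)."""
    chars_per_record = 2 + 2 + 4 + 2 * data_per_record + 2 + line_end
    return (srec_size // chars_per_record) * data_per_record

binary_bytes(10 * 1024)  # 4288 -- a bit over 4KB for a 10KB file
```

Longer records amortize the overhead and push the figure closer to 5KB, which is where the "about half" rule of thumb comes from.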
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
POC Computer Version One: Reattaching SCSI
REATTACHING SCSI
Some software development is needed involving bare-metal code, which is my favorite kind of programming. With POC V1.3 now running in a stable fashion with firmware using interrupt-driven API calls, I decided to see about getting SCSI working on the unit.
Thanks to an increasing tendency for my heart to go into arrhythmia, requiring rescue therapy at least once in the last several weeks, I've had to lay low for a while. That means more time spent at home with light physical activity, which, in turn, means I have a good excuse to devote quite a bit of time to hobby computing—without my wife objecting too strenuously.
A key component of SCSI is, of course, the host bus adapter (HBA), of which I have one. That HBA, illustrated below, was designed to work with POC V1.1—its prototype was developed in 2011 to work with POC V1.0.
The V1.1 HBA is electrically compatible with POC V1.2 and V1.3, but none of the mounting holes align with their equivalents on those two units. The mounting differences were not a design goof. I decided starting with POC V1.2 that a different layout was best. Among other things, the new layout would use four screws to secure the HBA to the rest of the unit, thus eliminating a tendency for the HBA to occasionally unseat from the mainboard expansion socket. Also, during the layout work on V1.2, I had found it expedient to relocate the expansion socket to accommodate the two DUARTs that would give me four TIA-232 channels.
The plan was to build a mechanically-compatible HBA as soon as V1.2 was up-and-running. What derailed this plan was a change in status of the Mill-Max 351-10-114-type extender pin assemblies used to connect the HBA to the expansion socket. See below.
These assemblies were production items at the time I designed V1.2 but went to "special order" status right after I had completed V1.2 and had it operational. I had finished the new HBA layout, and it was immediately before placing an order for PCBs that I discovered I could no longer order the extender pin assemblies from distribution. At least I found out before the PCB order went in. A query to Mill-Max confirmed that the part was now special-order and could be had in a 50-piece minimum order—which amounted to over 700 USD. So much for that idea!
I also did a search for compatible alternatives, but ran into the same problem: non-stock parts with a minimum order requirement. Evidently, everyone decided at the same time that no one needed these parts anymore.
Almost makes me wish I had stocked up on them when they were readily available.
Meanwhile, I had commenced work on the design of POC V2.0, which was planned to use the same mechanical layout as V1.3, including the DIP28 expansion socket. That the extender pin assemblies had effectively become “unobtanium” prompted me to come up with a different expansion connector arrangement for V2.0. That, however, wasn't going to be helpful with V1.2 or V1.3. So what to do?
The “solution” was to monkey-rig the V1.1 HBA onto V1.3 with some long, skinny cable ties cinched so the extender pins would stay seated in the expansion socket. The only bad thing about this arrangement is the ties pass over the socket into which the real-time clock (RTC) plugs in (originally, the expansion socket was the RTC socket). Okay, I can live without the RTC; nothing in the firmware is dependent on its presence. It's an ugly arrangement but it mostly keeps the HBA in place, as long as I'm careful with handling the unit.
Anyhow, with the HBA more-or-less “installed,” I can commence work on some code. The SCSI driver I have is relatively old in the scheme of things. I started on it back when I had built the prototype HBA, and when I designed the current HBA unit with a more technically-advanced host interface, I patched the prototype driver and added features to support quasi-DMA. By then, it was getting very crowded in the 8KB of ROM that V1.1 had, so no more SCSI features could be added. The last significant change to the driver was made more than seven years ago.
With the driver being a mélange of prototype code and a bunch of patches, frankly it’s a mess. Furthermore, the driver doesn't know anything about extended RAM, limiting SCSI transactions to bank $00. Ergo the code sections that access command descriptor blocks and I/O buffers have to be reworked, either with more patches (which tactic is possible with V1.3 due to its 12KB of ROM) or with fresh code. I cogitated on both approaches for a while, contemplated my past work, cringed while reading some of what I had done, and decided to write a whole new driver from scratch. The concepts behind the design of the original driver were fine, the execution not so much.
As was the case with the original's development, this new driver will be run entirely in RAM while testing and debugging. This approach means I can quickly and easily upload test content to V1.3—no ROM swapping required. It also means if a driver error puts the machine into the ditch, I can press the “panic button” (NMI push button) to try to regain control or if that doesn't work, hit reset and start over. The test code loads at $00A000 and occupies about 1KB. RAM in the range $000200-$00B6FF will survive a reset (as will all extended RAM), so I should be able to conduct a post mortem following a major wreck, which will help with the debugging process.
The SCSI driver primitives are interrupt-driven, as are BIOS API calls, including those that access SCSI services. This means the test environment has to patch into the firmware's interrupt processing. So the first step in writing the new driver was to develop “wedge” and “unwedge” functions to make the test environment part of the interrupt system.
Code:
;wedge: WEDGE PATCHES INTO INTERRUPT SYSTEM
;
wedge rep #m_seta ;16-bit accumulator
lda ivcop ;current COP vector
cmp !#newcop ;already wedged?
beq .0000010 ;yes, skip
;
sta ivspareb ;no, save old vector &...
lda !#newcop ;set...
sta ivcop ;new vector
;
.0000010 lda ivirq ;current IRQ vector
cmp !#newirq ;already wedged?
beq .0000020 ;yes
;
sta ivsparea ;no
lda !#newirq
sta ivirq
;
.0000020 sep #m_setr
brk
;
;===============================================================================
;
;unwedge: UNWEDGE PATCHES FROM INTERRUPT SYSTEM
;
unwedge rep #m_seta ;16-bit accumulator
lda ivcop ;current COP vector
cmp !#newcop ;wedged?
bne .0000010 ;no, skip
;
lda ivspareb ;yes, get original vector
beq .0000010 ;not valid!
;
sta ivcop ;put it back
stz ivspareb ;invalidate alternate vector
;
.0000010 lda ivirq ;current IRQ vector
cmp !#newirq ;wedged?
bne .0000020 ;no
;
lda ivsparea
beq .0000020
;
sta ivirq
stz ivsparea
;
.0000020 sep #m_setr
brk
Here I take advantage of the page $01 indirect vectors that are set up during POST. I also take advantage of the fact that a 16-bit fetch or store is an atomic operation—no need to bracket the IRQ vector changes with SEI and CLI. Incidentally, the !# notation tells the Kowalski assembler to assemble immediate-mode operands as 16-bit quantities.
The IRQ patch itself is straightforward and checks for an interrupt generated by the HBA. If no HBA IRQ has occurred execution will continue with the ROM-resident IRQ handler. Otherwise, the 53CF94 SCSI controller registers that are of interest are read and returned to the foreground via the stack. Also, the return address that the 65C816 pushed while servicing the interrupt is modified so the foreground code is routed according to why the HBA interrupted.
Code:
;PATCH TO IRQ SERVICE ROUTINE
;
newirq phk ;select kernel's...
plb ;data bank
sep #m_setr ;8-bit registers
ldy io_scsi+sr_stat ;get HBA general status
bpl iirq0500 ;HBA not interrupting
;
tsc ;make IRQ stack frame...
tcd ;the local direct page
ldx io_scsi+sr_isr ;get HBA command status
lda io_scsi+sr_irqst ;get HBA interrupt status
;
; ——————————————————————————————————————————————————————————————————————
; The following code modifies the stack frame that was pushed by the IRQ
; preamble, thus affecting the behavior of the foreground code that was
; interrupted. The changes are as follows:
;
; Frame MPU
; Offset Register Description
; —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
; irq_arx .C HBA interrupt status
; irq_xrx .X HBA command status
; irq_yrx .Y HBA general status
; irq_pcx PC SCSI foreground execution vector
; irq_srx SR C & D cleared, m & x set
; —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;
; No analysis of status is made here; the foreground handles that.
; ——————————————————————————————————————————————————————————————————————
;
rep #m_setr ;16-bit registers
and !#%11111111 ;squelch noise in .B
sta irq_arx ;interrupt status
stx irq_xrx ;command status
sty irq_yrx ;general status
lda ivscsi ;get “next” driver vector &...
sta irq_pcx ;reroute foreground
sep #m_seta ;8-bit accumulator
lda irq_srx ;entry SR
ora #m_setr ;exit with m & x set
and #~{sr_bdm|sr_car} ;exit with d & c cleared
sta irq_srx ;exit SR
;
; ******************************************************
; next code segment is only for testing——it replaces the
; CRTI function in the firmware ISR...
; ******************************************************
;
rep #m_seta ;16-bit accumulator
ply ;restore MPU state
plx
pla
pld
plb
rti
;
iirq0500 jmp (ivsparea) ;goto regular IRQ
The COP wedge basically duplicates the firmware's COP handler so the API calls associated with SCSI services are intercepted and directed to the test environment—I won't display that code here, since it is described in previous posts. Were the COP wedge not there, the SCSI API calls would be intercepted in the firmware and fail with “undefined API” errors.
The above patching has been tested and, astonishingly, worked on the first try. It always seems when I write an IRQ patch, I make a silly typo, e.g., TXA when I meant TAX, and when the patch gets wedged, the system crashes and burns.
Next step is to flesh out the body of the driver.
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
POC Computer Version One: Reattaching SCSI
REATTACHING SCSI
A scratch-written SCSI driver is now part of V1.3's firmware, and I have video of POC V1.3 coming up from a cold start, bringing the disks online and trying to boot from one of them. The driver only supports block devices at this time, meaning disks and CD/DVD drives. The old driver could work with stream devices, such as tape drives, but that was a capability I hadn't used. So I left it out in the interest of expediency.
As an aside, I decided to design and build a host bus adapter (HBA) specifically to be mechanically compatible with POC V1.3 (and V1.2, although I'm not contemplating doing anything with that unit in the near future). The reason is my makeshift setup with V1.1's HBA is not very satisfactory. It's too easy to accidentally unseat the HBA and crash the computer. The new version will be physically secured to the mainboard. Although I can't get the Mill-Max extender pins I used with V1.1's HBA, I did come up with an alternative involving some strip headers. It's a little Mickey Mouse but should work.
I have one more firmware project to complete and that is to enhance the S-record loader so it can process records with 24-bit addresses (S2 records). With that capability, I will be able to assemble programs to run outside of bank $00 and have them directly load to the right place in RAM. Currently, I have to tell the Load monitor function in the firmware to load to other than bank $00. Forgetting to do that will result in a crash if the load goes into bank $00 and steps on a critical area in RAM.
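The change amounts to widening the address field. The real loader is 65C816 firmware, but a Python sketch of the record formats shows how S1 (16-bit) and S2 (24-bit) records differ only in address width; field layouts and the checksum rule follow the Motorola S-record format:

```python
def parse_record(line):
    """Parse one Motorola S-record (S1/S2/S3), returning
    (type, load_address, data).  The checksum is the ones' complement
    of the low byte of the sum of count, address, and data bytes."""
    rtype = line[:2]
    addr_len = {"S1": 2, "S2": 3, "S3": 4}[rtype]  # address bytes
    raw = bytes.fromhex(line[2:])
    count, cksum = raw[0], raw[-1]
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    if ((sum(raw[:-1]) & 0xFF) ^ 0xFF) != cksum:
        raise ValueError("bad checksum")
    payload = raw[1:-1]
    addr = int.from_bytes(payload[:addr_len], "big")
    return rtype, addr, payload[addr_len:]

parse_record("S206010000DEAD6D")  # ('S2', 0x010000, b'\xde\xad')
```

With the address width keyed off the record type, the same loop handles bank $00 loads (S1) and extended-RAM loads (S2) without being told where to put the data.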
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: POC Computer Version One: Reattaching SCSI
REATTACHING SCSI
BigDumbDinosaur wrote:
As an aside, I decided to design and build a host bus adapter (HBA) specifically to be mechanically compatible with POC V1.3...
The new SCSI host adapter has been built and tested. Everything appears to be copacetic.
About the RTC socket. Some pins on the RTC are no-connects, but it wasn't practical to place the socket elsewhere on the host adapter so the no-connects would be isolated from the bus interface. So the solution was to extract some pins out of a socket that was then plugged into the bus interface headers. As I said in my previous post, the bus interface is a bit Mickey Mouse.
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: POC Computer Version One
Not too long after I got SCSI back in operation I ran across an obscure bug in the driver that I had theorized might occur but had not seen with POC V1.1. It has to do with interrupt timing relative to the clock when the final byte is received from the SCSI host adapter. Part of the host adapter redesign was to increase the 53CF94 SCSI ASIC's clock rate to 25 MHz (maximum allowed is 40), to decrease SCSI bus latency. Previously, it was 20 MHz, but faster is not always better, at least when reading from a SCSI device. Due to the CF94 running faster, the time that elapses from when the final byte is read from the DMA port to when the CF94 interrupts due to a bus phase change is shorter than it used to be. That interrupt occasionally sneaks in between the fetch from the DMA port and the store to RAM, causing that final byte to be lost.
For now, the workaround is to bracket the fetch-store sequence with SEI-CLI. Regrettably, that adds four clock cycles per transfer loop iteration during reading, which means an additional 2048 clock cycles are expended in reading one block from a disk. As Ø2 is 16 MHz, that theoretically costs an extra 128 microseconds per block read, producing a theoretical transfer rate of 666 KB/second. Writing doesn't have this timing issue, so it runs faster, theoretically 800 KB/second.
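Those figures check out. Assuming reads and writes differ only by the SEI-CLI bracket (four cycles per byte moved), the numbers above fall out directly:

```python
CLOCK = 16_000_000        # Ø2 in Hz
BLOCK = 512               # bytes per disk block
WRITE_RATE = 800_000      # B/s; writes carry no SEI-CLI penalty

extra_cycles = 4 * BLOCK            # 4 cycles of SEI-CLI per byte
extra_time = extra_cycles / CLOCK   # 2048 / 16 MHz = 128 microseconds

write_time = BLOCK / WRITE_RATE     # 640 us per block written
read_rate = BLOCK / (write_time + extra_time)
print(round(read_rate))             # ~666,667 B/s -- the 666 KB/s figure
```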
Last edited by BigDumbDinosaur on Mon Mar 28, 2022 8:10 pm, edited 4 times in total.
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: POC Computer Version One
Interesting result.
If you revert back to your old clock rate for your SCSI controller, does the read transfer rate approach the write transfer rate?
I've also experienced similar read behavior with some of my projects in the past, and it's always been a bit unexpected until I dug into the components as you've described in your post.
Michael A.
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: POC Computer Version One
MichaelM wrote:
If you revert back to your old clock rate for your SCSI controller, does the read transfer rate approach the write transfer rate?
I haven't tried it yet. Reverting to the old 20 MHz clock should, in theory, result in the same sort of performance I was seeing with POC V1.1, but would be counter to my desire to reduce bus latency and (eventually) take advantage of the fast SCSI-2 features available with the 53CF94 controller.
Quote:
I've also experienced similar read behavior with some of my projects in the past, and it's always been a bit unexpected until I dug into the components as you've described in your post.
This particular problem is indirectly the result of doing “pretend DMA” with the 53CF94. If operated in PIO mode, the CF94 interrupts on every byte sent or received, which is too slow to be practical—reading or writing just one disk block would cause 512 interrupts in rapid succession. Ergo I came up with an alternative that reads/writes the CF94's DMA port, using two hardware handshaking lines for pacing purposes.
In DMA mode, the CF94 interrupts when the internal DMA counter goes down to zero or when there is a bus phase change. Once the target device has been selected, it controls the bus, not the host. The target will change the bus phase when an internal event occurs that requires the bus being in a different phase. For example, after selection, the target might switch the bus to the data-in phase. The CF94 would interrupt and also update a register to indicate the current bus phase. That information is used in my driver to re-vector the foreground to process the new phase.
During the data-in phase, the driver runs a loop that reads from the CF94's DMA port and stores to RAM. If the byte read from the CF94 is the last one sent by the target, reading it will trigger an IRQ. That IRQ could hit exactly as the opcode fetch for the store instruction occurs. The store instruction will not be fetched and due to the CF94 interrupt handler re-vectoring the foreground to accommodate the new bus phase, that last byte will never get stored.
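To make the failure mode concrete, here's a toy model of the pseudo-DMA read loop (not the driver itself, which is 65C816 assembly): when the phase-change "IRQ" re-vectors the foreground between the final fetch and its store, the last byte is lost, and masking interrupts around the fetch-store pair, as the SEI-CLI workaround does, prevents it:

```python
def transfer(n_bytes, irq_after_final_fetch, mask_irq):
    """Model of the data-in loop: fetch from the DMA port, then store
    to RAM.  An IRQ landing between the last fetch and its store
    re-vectors the foreground before the store runs -- unless masked."""
    ram = []
    for i in range(n_bytes):
        byte = i          # "fetch" from the DMA port
        if irq_after_final_fetch and i == n_bytes - 1 and not mask_irq:
            break         # phase-change IRQ re-vectors before the store
        ram.append(byte)  # "store" to RAM
    return ram

len(transfer(512, True, False))  # 511 -- final byte lost
len(transfer(512, True, True))   # 512 -- the SEI-CLI bracket saves it
```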
A hardware fix I've contemplated is that of intentionally inserting delay in the CF94's IRQ line to give the MPU enough time to finish that final store instruction when a bus phase change occurs. However, that is a Mickey Mouse solution at best.
Realistically, the best way to solve this (as well as greatly speed up transfers) is to have a DMA controller doing the work instead of the MPU.
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
POC Computer Version One: Firmware Update
From a while back:
Real-world experience with using .X to select the target serial I/O (SIO) channel has proved to be sub-optimal.
While it's convenient to simply load a register with a number and have data magically come from or go to a serial device, it means .X isn't available to act as a counter or index. Work I am doing right now on a programmable keyboard input module has highlighted this as an impediment to writing succinct code. It also has occasionally been a stumbling block with some other things on which I'm working.
So I am revisiting the SIO part of the API looking for alternatives that don't require use of .X. I've considered three routes:
By the way, anyone who is familiar with I/O operations on Commodore eight-bit machines will likely feel some deja vu, since the Commodore “kernal” has calls for selecting an input channel (CHKIN), output channel (CHKOUT), and resetting those channels back to the screen and keyboard (CLRCH). Or, you might consider what I am going to do to be an analog of the stdin and stdout features of UNIX-like operating systems.
All that said, it's time to fire up the editor-assembler and warm up the EPROM burner.
BigDumbDinosaur wrote:
The main purpose of building V1.2 is to create a machine on which I can concoct a virtual QUART (vQUART, vQUART meaning "virtual quad UART") driver...A BIOS call for SIO services will use the .X register as the zero-based channel index, which is convenient for higher-level applications.
- A separate BIOS API call to connect to each channel.
I dismissed this route almost immediately. Currently, there are two API calls for serial I/O: one for input and the other for output (there are also two calls “hard wired” to the console channel). In POC V1.2, V1.3 and (the nascent) V2.0, separate SIO calls per channel would mean there would be four input calls and four output calls. Combined with the two direct-to-console calls, it would result in 10 total calls in the above machines. A future design would increase the total number of serial channels to eight, which would result in 18 separate API calls.
I just don't see this route as workable in the long-term.
- Push the channel index to the stack before making the API call.
This route is convenient with the 65C816, since it's easy to “cherry pick” items from the stack. It also doesn't result in more API calls, since all that's being done is changing the way in which the channel index gets passed into the API.
However, using the stack that way runs counter to a basic design philosophy of the entire BIOS API, which is to only use the registers for parameter passing. I'd have to make a special case with the SIO part of the API to clean up the stack at call completion. Also, having to push the channel on every SIO API call would mean more processing time and may also mean a register gets clobbered anyhow. Of the three available push instructions that don't involve a register, only PEI can readily push run-time data — PEA and PER work with assembly-time data, unless self-modifying code is used.
While this route is workable and wouldn't pose any problems in a future system with more SIO channels, it is still sub-optimal due to the required stack shenanigans.
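For what it's worth, route 2 on the '816 would look something like the following. This is only a sketch: the call name sio_getch and the direct page location siochan are invented for illustration.

```asm
;hypothetical caller passing the channel index on the stack; siochan
;is an invented direct page location holding the run-time channel index
;
         pei (siochan)          ;push channel index (16 bits)
         jsr sio_getch          ;invented SIO input call
;
;on return, the API itself would have to discard the pushed word,
;e.g., by copying the return address down over the parameter before
;the RTS, which is the "stack shenanigans" referred to above
```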
- Use separate API calls to set the input and output channels.
With this route, a call for serial I/O would not specify a channel — the existing API calls would be made sans channel number. Instead, there would be three new API calls, one to set the input channel, one to set the output channel and a third to return the current channel settings to the caller. The “hard wired” connections to the console channel would remain, since they are convenient, especially during program debugging — normal I/O can go to a different channel, and debugging messages, register dumps, etc., can go to the console and not disrupt the display being used to receive program I/O.
I consider this a workable route, despite having to add three API calls to the BIOS, thus increasing the total number of SIO calls to seven; that number would be constant, regardless of the number of SIO channels. What led me to look at this route is that it had become apparent during software development that a program typically is “connected” to an SIO channel at startup and doesn't communicate with another channel thereafter. So in the majority of cases, channel selection would happen only once, at program startup.
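A sketch of how route 3 might look in practice; the call names sio_setinp, sio_setout, sio_getch and sio_putch are invented for illustration, as is the carry-set-on-empty convention.

```asm
;hypothetical program startup: select channels once, then do
;channel-less I/O, leaving .X free as a counter or index
;
         lda #1                 ;serial channel 1...
         jsr sio_setinp         ;...becomes the input channel
         lda #2                 ;serial channel 2...
         jsr sio_setout         ;...becomes the output channel
;
         ldx #0                 ;.X is available again
copy     jsr sio_getch          ;get datum from input channel
         bcs done               ;nothing more to read
         jsr sio_putch          ;send datum to output channel
         inx                    ;count bytes copied
         bra copy
done     rts
```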
I'm going to give this route a try and see how it goes. Ironically, I did consider it when I developed the virtual QUART driver for POC V1.2, but went with passing the channel in .X.
By the way, anyone who is familiar with I/O operations on Commodore eight-bit machines will likely feel some deja vu, since the Commodore “kernal” has calls for selecting an input channel (CHKIN), output channel (CHKOUT), and resetting those channels back to the screen and keyboard (CLRCH). Or, you might consider what I am going to do to be an analog of the stdin and stdout features of UNIX-like operating systems.
All that said, it's time to fire up the editor-assembler and warm up the EPROM burner.
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
POC Computer Version One: Revised SIO API
I made the changes to serial I/O (SIO) in the BIOS API that I had mentioned in the previous post and gave it a go. It appears everything is copacetic.
Much of my SIO activity involves communicating with a terminal, so it proved to be convenient to set the “default” SIO channels at program startup and only change one or both if needed. Code in the programs that do terminal I/O has gotten slightly smaller as a result, both because a channel doesn't have to be specified on each SIO call and because I can once again use .X as a counter or index instead of tying it up as a channel selector.
x86? We ain't got no x86. We don't NEED no stinking x86!
- Sheep64
- In Memoriam
- Posts: 311
- Joined: 11 Aug 2020
- Location: A magnetic field
Re: POC Computer Version One
BigDumbDinosaur on Fri 29 Oct 2021 wrote:
I made the above changes to the COP handler and everything seems to be copacetic.
BigDumbDinosaur on Thu 2 Dec 2021 wrote:
Everything appears to be copacetic.
BigDumbDinosaur on Sat 5 Feb 2022 wrote:
By the way, anyone who is familiar with I/O operations on Commodore eight-bit machines will likely feel some deja vu, since the Commodore “kernal” has calls for selecting an input channel (CHKIN), output channel (CHKOUT), and resetting those channels back to the screen and keyboard (CLRCH). Or, you might consider what I am going to do to be an analog of the stdin and stdout features of UNIX-like operating systems.
More importantly, you may be interested in Acorn stream functionality which differs from Commodore or Unix. Specifically, output streams may be set as a bit mask. This allows, for example, simultaneous output to console, disk and printer.
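If I understand the Acorn approach correctly, a fan-out loop along these lines would implement the bit-mask scheme on an eight-channel system. Again a sketch only: outchar, outmask and sio_putch_x are invented names.

```asm
;hypothetical fan-out write: the byte to be sent is in outchar, and
;outmask selects channels, with bit N enabling channel N; the invented
;call sio_putch_x writes outchar to the channel whose index is in .X
;
putall   ldx #0                 ;start with channel 0
         lda outmask            ;get channel selection mask
next     lsr                    ;shift channel's bit into carry
         bcc skip               ;bit clear, channel not selected
         pha                    ;preserve remaining mask bits
         jsr sio_putch_x        ;write byte to channel in .X
         pla                    ;recover mask
skip     inx                    ;next channel...
         cpx #8                 ;...of eight total
         bne next
         rts
```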
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
POC Computer Version One: Still Running
POC V1.3 has been running in an exemplary fashion as I create the bits and pieces for my 816NIX (version 2) filesystem. The V2 SCSI host adapter is doing fine as well. My only immediate wish is that I had a DMA controller to speed up SCSI transactions. Right now, V1.3 has an Autobahn on which to travel (SCSI bus, good for 10 MB/second), but is driving a 1950s-era Volkswagen Beetle.
x86? We ain't got no x86. We don't NEED no stinking x86!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: POC Computer Version One
I made one more detail change to the firmware. Previously, the stack pointer (SP) was set to $00BFFF, which is the highest accessible RAM address in bank $00. However, that turned out to be a bit of a problem.
High bank $00 RAM includes the eight serial I/O (SIO) circular queues, each of which is 256 bytes in size. Wanting to maximize available bank $00 RAM, the queue array started at $00B700, with the end of the highest queue at $00BEFF. The result was an effective stack size of 256 bytes, usually more than adequate.
However, since my programming model for all library functions uses the stack for parameter-passing and ephemeral workspace, stack growth got out of hand in at least one case, and the stack collided with the top-most SIO queue; SP just keeps going down with each push and doesn't wrap until it hits $0000. I didn't want to have to start determining stack usage with each function call, as that is an error-prone process and presupposes some characteristics of programs that use library functions.
The solution was to move the SIO queues one page higher in RAM, with the last queue now occupying what used to be stack space. The top-of-stack is now at $00B7FF, one byte below the first SIO queue. This means stack growth will not overrun the queues and cause a variety of strange issues. Of course, stack growth could step on something associated with a running program, but that’s an altogether different matter.
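For clarity, here is the before-and-after layout of high bank $00 RAM, using the addresses from the above; the equate names are invented.

```asm
;before:                          after:
;  $00BFFF  stack top               $00BFFF  end of SIO queues
;  $00BF00  256-byte stack bottom   $00B800  start of SIO queues
;  $00BEFF  end of SIO queues       $00B7FF  stack top (grows down)
;  $00B700  start of SIO queues
;
sioqueue =$00b800               ;8 queues x 256 bytes = $0800 bytes
hwstack  =$00b7ff               ;initial stack pointer (top-of-stack)
```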
x86? We ain't got no x86. We don't NEED no stinking x86!