The DREQ add-on circuit works as expected and now gives me the ability to make the W65C816S mimic the functions of a real DMA controller. The key word is "mimic." I'll have some more thoughts on how I might go about this in another post.
The purpose of fabricating a new host adapter was to make it possible to use the 53C94's DMA access functions. Programmed input/output (PIO) operation of the 'C94 is impractical during protracted data transfers, as the controller will interrupt for each byte moved. Reading a single hard disk block would entail 512 IRQs in rapid succession, plus other IRQs generated as the SCSI controller is sequenced. Throughput using PIO would be severely limited due to all the code the MPU would have to execute. Hence the focus on DMA.
When used with DMA, the 'C94 only interrupts when the last byte has been transferred (or an error occurs), which means that DMA transfer can be very efficient compared to PIO operation. I estimate the performance difference between PIO and DMA operation in the POC could be at least 10 to 1, maybe better, depending on how efficiently I write the driver. As I don't have the luxury of a real DMA controller, I have to make the 65C816 pretend to be one. This is more complicated than it might appear at first blush. A real DMA controller can function independently of the MPU once the latter has programmed it. In the context of the 53C94, all a DMA controller has to do is handshake and move bytes. It doesn't have to worry about IRQs or what to do about them. Not so with the '816, hence the greater programming complexity. Working out what is needed requires a detailed understanding of how the 53C94 behaves.
The 'C94 is intended to automate the SCSI bus protocol and thus has been given a fair amount of intelligence. Ergo the chip has a number of registers that have to be programmed at boot time, while others have to be programmed before or during a SCSI transaction. The key registers are:
- Destination ID. The SCSI ID of the device that is to be accessed is written here before each operation. As the 'C94 is designed to connect to the "narrow" (8-bit) SCSI bus, the target ID ranges from $00 to $07 inclusive. Traditionally, the HBA is assigned SCSI ID $07, as that ID has the highest priority on the bus. Hence seven devices, in addition to the HBA, may be attached to the bus, these devices using IDs $00 through $06.
- FIFO. The 16-deep FIFO is the data conduit between the MPU's data bus and the SCSI bus. During a read operation, data coming in off the SCSI bus is deposited in the FIFO and eventually copied to RAM. During a write operation, data from RAM is written into the FIFO and eventually placed on the bus so it can be read by the target device. The details of how the 'C94 moves bytes between the FIFO and SCSI bus are of no concern to the MPU—the controller's intelligence insulates the MPU from the bus protocol details.
- Command. The 'C94 is told what to do by writing a command opcode into this register. Many commands may be PIO or DMA, the latter indicated by setting bit seven of the opcode. The difference comes in how data movement occurs between the FIFO and RAM, as well as how the 'C94 must be set up before data flow can occur.
- DMA transfer count. Prior to using DMA to move bytes between the FIFO and RAM, the 'C94 must be told how many bytes to expect, up to 64 KB maximum. It uses this information to sequence the DMA handshake and also to report an error if the requested amount of data is not the same as the amount actually moved. If PIO is used to transfer between the FIFO and RAM, this counter is not used.
Internally the DMA transfer counter consists of two sets of registers, one which stores the most recent count setting and another that down-counts with each DMA handshake. In order to start the DMA transfer an appropriate DMA command must be written into the command register. Doing so will copy the previously set DMA count into the down-counter and the 'C94 will start the DMA handshake. On each completed DMA handshake, the count will decrement. When the down-count reaches zero the 'C94 will discontinue the DMA handshake and interrupt the MPU.
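To make the latch/down-counter relationship concrete, here is a minimal host-side model in C. It is only a sketch of the behavior described above, not driver code: the structure, the function names and the idea of treating the chip as a C struct are all assumptions made for illustration, and the only detail taken from the text is that bit seven of the opcode selects DMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_DMA_BIT 0x80u   /* bit seven of the opcode selects DMA operation */

/* Hypothetical model of the counter pair; not the chip's real register map. */
typedef struct {
    uint16_t count_latch;   /* most recent count written by the MPU */
    uint16_t down_counter;  /* working copy, decremented on each handshake */
    bool     dma_active;    /* true while the handshake is being sequenced */
    bool     irq_pending;   /* set when the down-counter reaches zero */
} c94_model;

/* Writing the count only updates the latch; nothing starts yet. */
void c94_set_count(c94_model *c, uint16_t nbytes)
{
    c->count_latch = nbytes;
}

/* Writing a command with bit seven set copies the latch into the
   down-counter and starts the DMA handshake. */
void c94_write_command(c94_model *c, uint8_t opcode)
{
    if (opcode & CMD_DMA_BIT) {
        c->down_counter = c->count_latch;
        c->dma_active   = true;
        c->irq_pending  = false;
    }
}

/* One completed DMA handshake: decrement, then stop and interrupt at zero. */
void c94_handshake(c94_model *c)
{
    assert(c->dma_active);          /* handshaking outside a transfer is a bug */
    if (--c->down_counter == 0) {
        c->dma_active  = false;
        c->irq_pending = true;
    }
}
```

In this model, setting a count of 512 and then writing a DMA command leaves the latch untouched while the down-counter runs from 512 to zero, at which point the interrupt flag is raised.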
The DMA handshake is a hardware sequence that the DMA controller utilizes to communicate with the 'C94's FIFO. Two 'C94 signals are involved:
- DREQ. This is an active high output from the 'C94, whose meaning depends on the direction of data transfer. During a read operation, DREQ will be true when the 'C94 has data for the DMA controller. During a write operation, DREQ will be true when the 'C94 is able to accept data from the DMA controller.
- /DACK. This is an active low input to the 'C94. When true, the 'C94 will connect the FIFO to the data bus so it can be read or written. Each toggle of /DACK will decrement the DMA transfer counter. Due to the way in which the 'C94 operates, /DACK must never be asserted unless either /RD (read data) or /WD (write data) is likewise asserted; the transfer counter can get out of sync with the SCSI bus if this dictum is not observed. Also, /DACK and /CS (chip select) are mutually exclusive. /CS is used for general register access (including FIFO access during PIO operation), whereas /DACK only accesses the FIFO. Operation is undefined if /CS and /DACK are simultaneously asserted.
The handshake sequence is as follows:
- Test the state of DREQ. If it is false, the 'C94 is not ready, so keep testing until it is ready.
- Assert /DACK. As soon as /DACK is asserted read from or write to the FIFO.
- Increment buffer pointer. Go back to the first step.
The above is repeated for as many bytes as are to be transferred. When the full count has been transferred, or some other significant event occurs, the 'C94 will interrupt the MPU.
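The loop structure can be sketched in C against a simulated device. Everything here is hypothetical scaffolding: `sim_c94`, `dreq_is_true()` and `dack_read_fifo()` stand in for hardware that the real driver would reach through the '816's bus, and the "ready after N polls" behavior is invented purely so the loop has something to wait for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 'C94 as seen through DREQ and /DACK. */
typedef struct {
    const uint8_t *source;      /* bytes "arriving" from the SCSI bus */
    size_t         remaining;   /* bytes the device still has to hand over */
    unsigned       polls;       /* DREQ polls seen since the last byte */
    unsigned       ready_after; /* DREQ goes true after this many polls */
} sim_c94;

/* First step: sample DREQ. False means the device is not ready yet. */
static bool dreq_is_true(sim_c94 *dev)
{
    if (dev->remaining == 0)
        return false;
    return ++dev->polls > dev->ready_after;
}

/* Second step: assert /DACK (with /RD) and take one byte from the FIFO. */
static uint8_t dack_read_fifo(sim_c94 *dev)
{
    dev->polls = 0;
    dev->remaining--;
    return *dev->source++;
}

/* Pseudo-DMA read of count bytes into buf. There is no timeout here, so
   the loop spins forever if DREQ never goes true. */
size_t pseudo_dma_read(sim_c94 *dev, uint8_t *buf, size_t count)
{
    size_t i = 0;

    assert(dev != NULL && buf != NULL);
    while (i < count) {
        while (!dreq_is_true(dev)) {
            /* keep testing until the device is ready */
        }
        buf[i] = dack_read_fifo(dev);
        i++;                    /* third step: advance the buffer pointer */
    }
    return i;
}
```

A write would look the same with the data direction reversed. Note that the inner wait has no exit other than DREQ going true, which matters as soon as the device stops responding.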
An interesting "gotcha" exists in this scenario and it has to do with the basic nature of the SCSI bus protocol. Once the 'C94 has selected the target device, sent a valid command descriptor block (CDB) to it and the target has accepted the command, the target assumes control of the SCSI bus and thereafter dictates which operations are to be performed. Usually the target will switch the bus to the data-in (read) or data-out (write) phase, either of which will cause the 'C94 to interrupt the MPU. Upon servicing the interrupt, the MPU would execute code that would load the 'C94's DMA transfer counter with the number of bytes to move and then start the DMA handshake sequence described above. Data should start flowing. So far, so good.
The problem is what happens in the event the target device stalls for some reason, e.g., it suffers an internal failure that prevents it from communicating with the 'C94. The 'C94 has no way of knowing this, as it can only react to the current state of the SCSI bus. With a dead target, no changes will occur on the bus and the 'C94 will endlessly wait. Meanwhile, since there isn't any bus activity, the DMA controller will also be twiddling its thumbs, as the 'C94 won't assert DREQ until there's some data to process. Since no data is moving, the transfer counter isn't decrementing and therefore the 'C94 will not interrupt the MPU. Everything comes to a screeching halt.
Fortunately, there is a solution. In almost any system that would avail itself of SCSI, there is a jiffy interrupt that is driven by a hardware timer. For example, in a UNIX-like system, the jiffy IRQ is used to update the software clock, select jobs to run and the like. UNIX also has an alarm function that can be used to interrupt a process. Therefore the solution to the stalled bus problem is to set an alarm that will break the DMA handshake loop should the 'C94 IRQ not occur in a reasonable amount of time. Since a stalled bus will most likely be caused by failed hardware or a cable becoming unplugged, the alarm interrupt would vector the MPU to an error handler that can report higher up in the operating system.
POC also has a jiffy IRQ, which is generated by the Dallas watchdog timer at 10 millisecond intervals. The jiffy IRQ drives a 32 bit uptime counter, as well as a 16 bit down-counter that can be used for programmed time delays. I was able to add some code to the BIOS ROM that makes it possible to harness the down-counter to activate an alarm function. The alarm is set up as follows:
longa ;16 bit .A
lda #seconds ;alarm time in seconds from now
ldx #<vector ;alarm vector LSB
ldy #>vector ;alarm vector MSB
jsr alarm ;set the alarm
In the above, VECTOR points to code that will be executed when the alarm goes off. Once the alarm has been set, the down-counter will decrement at one second intervals. When the counter reaches zero, the BIOS interrupt service routine (ISR) will replace the address that was pushed to the stack when the interrupt was serviced with VECTOR. Upon executing the RTI at the end of the ISR, the MPU will resume execution at VECTOR. The alarm automatically cancels itself when it goes off. It can be "manually" canceled by calling ALARM with zero in the 16 bit accumulator.
So my SCSI driver will have to set an alarm as part of the DMA transfer setup, just in case the bus stalls. If the transfer is successful the alarm would be canceled. The exact amount of alarm time needed is something I probably will have to determine by experimentation.
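The control flow can be imitated on a host machine in C, with `setjmp`/`longjmp` playing the part of the rewritten RTI address. This is an analogy rather than the BIOS mechanism itself: the names are hypothetical, time is simulated by calling `jiffy_tick()` once per failed DREQ poll, and the figure of 100 jiffies per second is inferred from the 10 millisecond jiffy period.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

#define JIFFIES_PER_SECOND 100u  /* inferred from the 10 ms jiffy period */

static jmp_buf  alarm_vector;    /* where execution resumes if the alarm fires */
static uint16_t alarm_seconds;   /* down-counter; zero means no alarm is set */
static unsigned jiffy_count;     /* jiffies since the last whole second */

/* Set the alarm, or cancel it by passing zero. */
static void alarm_set(uint16_t seconds)
{
    alarm_seconds = seconds;
    jiffy_count   = 0;
}

/* Stand-in for the jiffy ISR: counts seconds down and, at zero, diverts
   execution to the alarm vector. The alarm cancels itself by reaching zero. */
static void jiffy_tick(void)
{
    if (alarm_seconds == 0)
        return;
    if (++jiffy_count < JIFFIES_PER_SECOND)
        return;
    jiffy_count = 0;
    if (--alarm_seconds == 0)
        longjmp(alarm_vector, 1);
}

typedef enum { XFER_OK, XFER_STALLED } xfer_result;

/* Move count bytes from a simulated target that delivers only
   bytes_available of them before going silent. */
xfer_result guarded_transfer(unsigned bytes_available, unsigned count,
                             uint16_t timeout_seconds)
{
    unsigned moved = 0;

    assert(timeout_seconds != 0);   /* zero would leave the loop unguarded */

    if (setjmp(alarm_vector) != 0)
        return XFER_STALLED;        /* arrived here via the alarm */

    alarm_set(timeout_seconds);
    while (moved < count) {
        while (moved >= bytes_available)
            jiffy_tick();           /* DREQ stays false: the bus has stalled */
        moved++;                    /* one completed handshake */
    }
    alarm_set(0);                   /* success: cancel the alarm */
    return XFER_OK;
}
```

Calling `guarded_transfer(512, 512, 2)` completes normally, while `guarded_transfer(100, 512, 2)` stalls after 100 bytes and returns `XFER_STALLED` once two simulated seconds have elapsed. In both cases the alarm counter ends at zero.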
The most complicated part of the driver will be in the SCSI ISR. When the 'C94 interrupts there are four major conditions that have to be accounted for in the ISR:
- Successful operation. This condition will occur after each successful command to the 'C94. The meaning of "success" depends on context, so there are additional status bits that can be examined to determine what it was that was successful.
- Bus service request. This interrupt occurs when the target device changes the bus phase, and is generally paired with a successful operation IRQ (but not always—the target could change the bus phase to status-in because of an error). As I earlier noted, following selection and receipt of a valid CDB, the target will change the bus phase to data-in or data-out, depending on the direction of data transfer. When this happens, the 'C94, which will be waiting for the target to respond to the command, will interrupt with a bus service request.
- Selection timeout. According to the ANSI standard, the target device has a maximum of 250 milliseconds to respond to selection. If that time elapses without a response the 'C94 will interrupt with a selection timeout. This would constitute a "device not present" error.
- Gross error. The most recent operation can't be completed due to a hardware fault or because the command was not acceptable to the 'C94. For example, if the 'C94 is operating as an initiator (as it would be in the POC) and a target command is given to it, it will generate a gross error interrupt. Another possible error would be if a previously selected target unexpectedly disconnected from the bus. Yet another would be if somehow the direction of DMA was opposite of the direction of data flow (most likely the result of a logic error in the driver).
In each case, the ISR would not only have to recognize the condition, it would have to vector the MPU to the correct code block for that condition. The vectoring will be accomplished by tinkering with the RTI address on the stack. Obviously, monkeying with the stack in an ISR can be fraught with danger. If a mistake is made in the code, most likely the machine will crash and leave little or no evidence of what happened.
Therefore, my first tack in developing this driver will be to write a version that uses DMA access but no IRQs. Each of the code blocks will have to look at the 'C94's interrupt status register and decide what to do for each condition. Since this is all foreground code, it should be easier to debug. Once I have a suitable program flow worked out, I can rig up the IRQ stuff with reasonable confidence that it will work. I already have the alarm part working, so the rest is up to the driver.
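As a rough picture of what that foreground decision logic might look like, here is a small classifier in C for the four conditions listed earlier. The flag values are placeholders invented for this sketch and do not reflect the 'C94's actual bit assignments, which would have to come from the data sheet. The order of the tests (errors first, then bus service, then plain success) is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, NOT the chip's real bit positions. */
enum {
    ST_SUCCESS     = 0x01,  /* successful operation */
    ST_BUS_SERVICE = 0x02,  /* target changed the bus phase */
    ST_SEL_TIMEOUT = 0x04,  /* no response to selection */
    ST_GROSS_ERROR = 0x08   /* hardware fault or unacceptable command */
};

static_assert(ST_GROSS_ERROR <= 0xFF, "flags must fit in an 8-bit status byte");

typedef enum {
    ACT_NONE,           /* nothing pending; keep polling */
    ACT_NEXT_STEP,      /* command finished; continue the sequence */
    ACT_SERVICE_PHASE,  /* examine the new bus phase, e.g. set up DMA */
    ACT_NO_DEVICE,      /* report "device not present" */
    ACT_ABORT           /* bail out to the error handler */
} action;

/* Decide what the foreground code should do for a given status byte. */
action classify(uint8_t status)
{
    if (status & ST_GROSS_ERROR) return ACT_ABORT;
    if (status & ST_SEL_TIMEOUT) return ACT_NO_DEVICE;
    if (status & ST_BUS_SERVICE) return ACT_SERVICE_PHASE; /* often paired with success */
    if (status & ST_SUCCESS)     return ACT_NEXT_STEP;
    return ACT_NONE;
}
```

Because bus service is tested before plain success, a status byte with both flags set is handled as a phase change. A phase change is the case that needs immediate attention, whether it signals the start of data flow or a switch to status-in after an error.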