I've been playing around with the SCSI driver in an effort to make it smaller and faster. Right now I am running the 53C94 controller in PIO mode, which is the only practical mode with the current hardware. PIO is slow...period. The theoretical transfer speed limit with the MPU running at 8 MHz is about 75 KB/sec, and I'm up to about 60 KB/sec (faster if I kill IRQs during a transfer), well below the asynchronous transfer rate of the SCSI bus, which is 3 or 5 MB/sec, depending on the device. The C94 also has a DMA mode, the use of which would substantially reduce the amount of overhead associated with reading and writing each byte. Unfortunately, DMA is not practical with any 65xx system, as no compatible DMA controller apparently exists. However, it appears that it may be possible to set up a quasi-DMA mode with some hardware trickery.
Like most I/O silicon, the C94 has a /CS (chip select) input, R/W inputs (separate, in this case) and some address lines. As is typical, you put an address pattern on A0-A3 (16 possible registers), assert /CS and the MPU is connected to the desired register. One of those registers is a 16-byte FIFO, which is the conduit between the SCSI bus and the MPU's data bus. Bytes written into the FIFO by the MPU are sent to the bus when the C94 is told to send them, and bytes coming in off the bus from the SCSI hardware (e.g., a disk) are collected in the FIFO and made available to the MPU when the C94 is told to receive them. In PIO mode, the C94 has to be told to send/receive for each byte, using a command sequence that takes between 150 and 200 clock cycles to complete, and that's on top of the code needed to actually read from or write to the FIFO. A standard hard drive block is 512 bytes, so it is easy to see where all the processing time is going (76,800 cycles best case per block).
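The arithmetic behind that per-block figure is easy to check. Here is a quick sketch in Python; the function and constant names are made up for illustration, the cycle counts are just the rough estimates quoted above, and only the per-byte command overhead is counted, not the code that actually touches the FIFO.

```python
BLOCK_SIZE = 512          # bytes in a standard hard drive block
CLOCK_HZ = 8_000_000      # MPU clock, 8 MHz

def overhead_cycles(cycles_per_byte, block_size=BLOCK_SIZE):
    """Clock cycles spent on the per-byte command sequence for one block."""
    return cycles_per_byte * block_size

def overhead_ms(cycles_per_byte, clock_hz=CLOCK_HZ):
    """The same overhead expressed in milliseconds at the given clock."""
    return overhead_cycles(cycles_per_byte) * 1000 / clock_hz

best = overhead_cycles(150)    # low end of the 150-200 cycle estimate
worst = overhead_cycles(200)   # high end of the estimate
```

Here best works out to the 76,800 cycles mentioned above; plugging in a different cycle estimate shows how quickly the per-byte overhead dominates.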
The FIFO is different from the other C94 registers in that there are two ways to access it. It appears as register $02 when /CS is asserted. However, it is also possible to make the FIFO appear by asserting a separate C94 input called /DACK and not asserting /CS. In such a case, the bit pattern on the address bus is completely ignored and the FIFO is connected to D0-D7, just as it would be if $02 appeared on the address bus and /CS had been asserted.
Unfortunately, the 53C94 data sheet that I followed in designing the host adapter and writing the driver does a poor job of explaining how this all works. Since I couldn't understand what the author of the data sheet was trying to explain, I settled for using PIO mode. However, I recently came across a copy of the data sheet for the C94's immediate ancestor, the NCR 53C90, which does a much better job of explaining the controller's DMA features and how it responds to the DMA control input and output. So now I'm revisiting my host adapter design to see if I can use the DMA features to improve performance. My vague thinking at this point is to somehow make the MPU seem to be a DMA controller.
I already mentioned the /DACK input, which, when asserted, causes the FIFO to connect to the data bus without regard to what is on the address bus. /DACK has a companion output, DREQ, which, when asserted by the C94, tells external hardware that data is waiting in the FIFO during read mode, or that there is room in the FIFO for another byte during write mode. So, the theory in my little dinosaur brain goes, if I can get the MPU to see when DREQ has been asserted, then the MPU could assert /DACK and grab a byte from the FIFO and store it in RAM (read mode), or grab a byte from RAM and put it in the FIFO (write mode). Since the SCSI bus runs at a high rate of speed (faster than the MPU can physically move bytes to and from the FIFO), and since DREQ changes state as fast as the C94 and SCSI bus can operate, the performance limit would now be a matter of how tightly I can code the I/O loop that polls DREQ, toggles /DACK and services the FIFO.
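To make the idea concrete, here is a toy model of that polling loop in Python for the read direction. It is only a sketch: FifoModel, tick, dack_read and read_block are invented names, the FIFO depth is an assumed parameter, and none of the real C94 command handling or timing is represented.

```python
from collections import deque

class FifoModel:
    """Toy stand-in for the C94 FIFO in read mode (hypothetical, not chip-accurate)."""

    def __init__(self, incoming, depth=16):   # depth is an assumed value
        self._bus = deque(incoming)   # bytes still to arrive off the SCSI bus
        self._fifo = deque()
        self._depth = depth

    def tick(self):
        """Let the SCSI side deliver one byte if the FIFO has room."""
        if self._bus and len(self._fifo) < self._depth:
            self._fifo.append(self._bus.popleft())

    @property
    def dreq(self):
        """True while at least one byte is waiting in the FIFO."""
        return bool(self._fifo)

    def dack_read(self):
        """Model asserting /DACK with a read strobe: pop one FIFO byte."""
        return self._fifo.popleft()

def read_block(chip, count, max_polls=100_000):
    """Poll DREQ and move bytes into a RAM buffer until count bytes arrive."""
    ram = bytearray()
    for _ in range(max_polls):
        if len(ram) == count:
            return bytes(ram)
        chip.tick()
        if chip.dreq:
            ram.append(chip.dack_read())
    raise TimeoutError("gave up waiting on DREQ")

data = bytes(range(256)) * 2          # one 512-byte block of test data
block = read_block(FifoModel(data), len(data))
```

The loop body is the part that matters: check DREQ, strobe /DACK, store the byte. On the real hardware, the cycle count of that body would set the transfer rate.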
Toggling /DACK to get access to the FIFO isn't too difficult. What I would need is some glue logic that would assert /CS if the C94 is selected and the register number on A0-A3 is anything other than $02. The same logic would assert /DACK if the C94 is selected and the register number is $02. It can be done with discrete gates, but a PLD would be much faster. However, for proof of concept testing, I could do it with some 74ABT or 74F logic, which wouldn't add too much propagation delay.
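The decode rule just described is small enough to write out as a truth function. This is a Python sketch of the logic only, not a PLD equation or a gate-level design; decode, FIFO_REG and table are illustrative names, and it looks at A0-A3 alone.

```python
FIFO_REG = 0x02   # FIFO register number given above

def decode(selected, addr):
    """Return (cs_n, dack_n); both are active low, so False means asserted."""
    reg = addr & 0x0F                      # A0-A3 pick one of 16 registers
    fifo_access = selected and reg == FIFO_REG
    cs_n = not (selected and not fifo_access)
    dack_n = not fifo_access
    return cs_n, dack_n

# Truth table for every register number while the C94 is selected
table = {reg: decode(True, reg) for reg in range(16)}
```

In this model /CS and /DACK are never asserted at the same time, which is the property the glue logic has to guarantee.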
DREQ is more problematic. If my POC unit were powered by the 65C02, I could tie DREQ to the MPU's SOB (set overflow) input through an inverter, which would be easily monitored with a BVC or BVS branch on the overflow flag. However, the 816 doesn't have SOB. So I would have to somehow make the DREQ output appear to be a bit on the data bus when a specific memory location is checked. It seems something like a 74ABT574 octal D-flop could do this, as it has a very short prop time and can be tri-stated when the MPU is not looking at DREQ. I think I could tie the 574's /OE input through an inverter to A4 (which is not connected to the C94) and when the C94 I/O range is selected with $10 on the address bus, the 574 would strobe the state of DREQ to, say, D7 for easy testing with a BIT instruction.
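Why D7? On the 65xx family, a BIT on a memory location copies bit 7 of the operand into the N flag (and bit 6 into V), so a BMI or BPL can follow directly. Here is a rough Python model of what the MPU would see; STATUS_OFFSET, read_status, bit_flags and fifo_ready are invented names, and the undriven lower data bits are simply assumed to read as zero.

```python
STATUS_OFFSET = 0x10   # A4 high: the offset suggested above for the DREQ buffer
DREQ_MASK = 0x80       # DREQ is presented on D7

def read_status(offset, dreq):
    """Byte seen by the MPU, or None when the buffer is not enabled."""
    if not offset & STATUS_OFFSET:
        return None                     # A4 low: buffer stays tri-stated
    return DREQ_MASK if dreq else 0x00  # D0-D6 assumed to read as zero

def bit_flags(operand):
    """N and V as a 65xx BIT on a memory operand would set them."""
    return {"N": bool(operand & 0x80), "V": bool(operand & 0x40)}

def fifo_ready(dreq):
    """What a BIT / BMI pair would decide: branch taken when N is set."""
    return bit_flags(read_status(STATUS_OFFSET, dreq))["N"]
```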
What would be really cool, but definitely a project for another time, would be to figure out how to rig up a 65C02 to act as a DMA controller. For now, I'm going to cogitate a bit and see if I can synthesize the DMA stuff with some gates and such.
Oh, one more thing, as Columbo would have said. It may be that watching DREQ is not necessary, as it appears the C94 and the SCSI bus would be waiting on the MPU, not the other way around.
-----
Edit: /DREQ is supposed to be DREQ.
Last edited by BigDumbDinosaur on Wed Oct 12, 2011 1:56 am, edited 1 time in total.