Would this circuit work on a 65xx?
Posted: Sun Dec 13, 2020 4:07 pm
by tokafondo
From Kiwi - a 68k Homebrew computer:
8 to 16 bit buffer interface
While the CPU, memory and most of the subsystems are connected with an 8 bit bus width, two subsystems use a 16 bit bus: the ATA/IDE interface and the Ethernet interface (CS8900a). To access a 16 bit device with an 8 bit CPU, each access has to be split: one 16 bit access is carried out as two consecutive 8 bit accesses. This works as follows.
To read a 16 bit word, the device is accessed with the first access. While it provides its data, the lower half (D0-D7) is read by the CPU while the upper half (D8-D15) is latched (U44). The second access just enables the output buffer of this latch.
To write a 16 bit word, the device is accessed with the second access. The data provided in the first access is latched (U45). With the next access, the latched data is provided as low byte (D0-D7), while the CPU data is provided as high byte (D8-D15).
As the MC68008 always accesses low byte first, word addressing modes with 16 bit are possible. In this way, the address line A0 can be used to determine the current state of an access. A0 is low for the first and high for the second access.
Can this be applied to the 6502/816, software and hardware wise?
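As a rough behavioural sketch (not the Kiwi schematic itself), the read and write sequences quoted above can be modelled in Python. The class and method names are hypothetical; only U44, U45 and the role of A0 come from the quoted description.

```python
class Device16:
    """Hypothetical 16-bit peripheral with one word-wide register."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF
        self.accesses = 0  # number of 16-bit accesses the device has seen

    def read16(self):
        self.accesses += 1
        return self.value

    def write16(self, word):
        self.accesses += 1
        self.value = word & 0xFFFF


class ByteToWordBridge:
    """Model of the 8-to-16-bit buffer: A0 = 0 is the first access, A0 = 1 the second."""

    def __init__(self, device):
        self.device = device
        self.read_latch = 0   # plays the role of U44 (holds D8-D15 after a read)
        self.write_latch = 0  # plays the role of U45 (holds the first written byte)

    def read8(self, a0):
        if a0 == 0:
            # First access: the device is really read; the CPU gets the low
            # byte and the high byte is latched.
            word = self.device.read16()
            self.read_latch = word >> 8
            return word & 0xFF
        # Second access: only the latch output is enabled.
        return self.read_latch

    def write8(self, a0, byte):
        if a0 == 0:
            # First access: the byte is only latched, the device is untouched.
            self.write_latch = byte & 0xFF
        else:
            # Second access: latched byte drives D0-D7, CPU byte drives D8-D15.
            self.device.write16(((byte & 0xFF) << 8) | self.write_latch)
```

In this model each pair of byte accesses results in exactly one 16-bit access at the device, which is the property the scheme relies on.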
Re: Would this circuit work on a 65xx?
Posted: Sun Dec 13, 2020 4:22 pm
by tokafondo
The S1D13513 Epson graphics chip (product brief) is almost the perfect companion chip for a computer based on a 65xx in 2020. And it's almost because it uses a 16 bit data bus. That's why I ask for this info.
Re: Would this circuit work on a 65xx?
Posted: Sun Dec 13, 2020 5:19 pm
by handyandy
Re: Would this circuit work on a 65xx?
Posted: Sun Dec 13, 2020 9:07 pm
by Dr Jefyll
8 to 16 bit buffer interface
[ etc. ]
This looks entirely workable to me. In fact, it's the same approach as what I posted in your other thread back in May!
Perhaps you overlooked that post. In any case, feel free to mention any outstanding questions you may have about how the scheme works.
The key point is this: a single instruction causes the CPU to generate two consecutive 8 bit accesses (either 2 consecutive reads or 2 consecutive writes). These are seen by the I/O device as one 16 bit read or write.
-- Jeff
Re: Would this circuit work on a 65xx?
Posted: Mon Dec 14, 2020 12:52 am
by BigDumbDinosaur
The key point is this: a single instruction causes the CPU to generate two consecutive 8 bit accesses (either 2 consecutive reads or 2 consecutive writes). These are seen by the I/O device as one 16 bit read or write.
Sounds like a job tailor-made for the 65C816. 
Re: Would this circuit work on a 65xx?
Posted: Mon Dec 14, 2020 3:26 am
by Dr Jefyll
Sounds like a job tailor-made for the 65C816. 
Yup, on an '816 you only need a single instruction to generate that 16-bit access. If you load or store a register that's configured to be 16-bit then the CPU will automatically break that into 2 consecutive 8-bit reads or 2 consecutive 8-bit writes. (And the hardware in the diagram translates those back to/from a single, 16-bit access.)
The lead post also mentions 6502, which is acceptable, too, but not as efficient. It will need a separate instruction for each of the two 8-bit chunks. For comparison, here are two versions of a simple example. (In both versions the I/O device sees only a single, 16-bit access. That's what the extra hardware is for.)
Code:
; '816 version. Assumes A is set as 16-bit.
LDA FancyDevice     ; load A reg with 8 bits from FancyDevice and 8 bits from FancyDevice+1
Code:
; 6502 version
LDA FancyDevice     ; load A reg with 8 bits from FancyDevice
LDY FancyDevice+1   ; load Y reg with 8 bits from FancyDevice+1
Re: Would this circuit work on a 65xx?
Posted: Mon Dec 14, 2020 5:44 am
by BigDumbDinosaur
Yup, on an '816 you only need a single instruction to generate that 16-bit access. If you load or store a register that's configured to be 16-bit then the CPU will automatically break that into 2 consecutive 8-bit reads or 2 consecutive 8-bit writes.
Your post reminds me of a yet-to-be-resolved hardware issue with POC V1.1, which was the version in which I added SCSI. The 53CF94 SCSI controller's DMA functions require that a 16-bit counter be loaded with the total number of bytes to be transferred by the next transaction. The count is set in two contiguous registers that are in big-endian order.
My plan was to load the (16 bit) accumulator with the 16-bit byte count, do an XBA to reverse endianness and then write the result to the CF94. The MSB would be written to the MSB register and on the next Ø2 cycle, the LSB would be written to the LSB register. Seems foolproof, eh?
Try as I might, I could not get it to work. Setting the count by writing the MSB to the MSB register as one instruction, followed by writing the LSB to the LSB register as a second instruction would work. Changing that working code to write the count as a 16-bit value consistently failed. At the time, I thought there had to be a timing bug involved, but never investigated it. It was around the time that my old Beckman scope went belly-up, so my ability to diagnose the problem was hobbled. Since SCSI was otherwise working fine I back-burnered the problem.
Having to split the write has no real performance implications, as it is only done once per SCSI bus transaction, no matter how much data is moved. So it has remained a low-priority matter. One of these days I'll revisit it...
Re: Would this circuit work on a 65xx?
Posted: Mon Dec 14, 2020 10:03 pm
by tokafondo
This looks entirely workable to me. In fact, it's the same approach as what I posted in your other thread back in May!
Perhaps you overlooked that post. In any case, feel free to mention any outstanding questions you may have about how the scheme works.
The key point is this: a single instruction causes the CPU to generate two consecutive 8 bit accesses (either 2 consecutive reads or 2 consecutive writes). These are seen by the I/O device as one 16 bit read or write.
-- Jeff
Thanks, you are right!! I'm asking again the same questions, more than half a year later.
In your code you'll need to stick to simple loads and stores such as LDA BIT STA STY etc. Instructions like that always deal with the low byte first. But R-M-W instructions (INC DEC etc) are a no-no -- a minor limitation to keep in mind. The write portion of a R-M-W instruction will deal with the high byte first (according to WDC's '816 datasheet) and thus won't work as expected with this circuit.
How difficult would it be to code things like games and demos following those rules? Having the circuit and the CPU handle 16 bits with some instructions doesn't mean it is actually useful if those coding limitations get in the way...
Re: Would this circuit work on a 65xx?
Posted: Mon Dec 14, 2020 11:17 pm
by Chromatix
It should be possible to rig up something to accept writes in either byte order, using just a few more gates. In this case the write "buffers" will both need to be transparent octal latches, with their load-enables driven directly by the CPU bus decoding, and their output-enables driven only for the second write of the pair. One bit of storage is needed to indicate whether the first write has occurred.
A similar arrangement could also be used to permit reads in either byte order, but this might not be necessary.
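The either-order idea can be sketched as a small Python model. This is a simplified reading of the description above, not a gate-level design: the class name, the `committed` list and the way the "first write has occurred" bit is toggled are all assumptions made for illustration.

```python
class EitherOrderWriteBridge:
    """Accepts the two byte writes of a 16-bit store in either address order."""

    def __init__(self):
        self.latch = [0, 0]      # latch[0]: byte for A0 = 0, latch[1]: byte for A0 = 1
        self.first_done = False  # the one bit of storage: has the first write occurred?
        self.committed = []      # 16-bit words presented to the (hypothetical) device

    def write8(self, a0, byte):
        # Each latch is loaded directly by its own address decode.
        self.latch[a0] = byte & 0xFF
        if self.first_done:
            # Second write of the pair: both latches now hold fresh data,
            # so the full word is driven to the device.
            self.committed.append((self.latch[1] << 8) | self.latch[0])
            self.first_done = False
        else:
            self.first_done = True
```

Note that in this simple model a stray single-byte write would leave the state bit out of step with later pairs; a real design would need some way to avoid or recover from that.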
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 1:21 am
by Dr Jefyll
[...] The count is set in two contiguous registers that are in big-endian order.
My plan was to load the (16 bit) accumulator with the 16-bit byte count, do an XBA to reverse endianness and then write the result to the CF94. The MSB would be written to the MSB register and on the next Ø2 cycle, the LSB would be written to the LSB register. Seems foolproof, eh?
BDD, I understand this is a low-priority problem, one which you rightly back-burnered in favor of more important issues. But are those "two contiguous registers" indeed in Big-Endian order?
I gather the Count register is actually 24 bit, with the 3 bytes appearing at (in order of increasing significance) Addresses $00, $01 and $0E. Would I be correct in concluding it's $00 and $01 to which you're writing with a single instruction? I'd say the relation between those two is Little-Endian, given that the less-significant byte is stored at the lower address.
Am I missing something? I tend to suspect your XBA is what created the problem. (I know, I know... premature optimization is tough to resist!)
-- Jeff
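A small Python sketch of the byte-order argument above, assuming (as suggested in this post) that register $00 holds the least-significant count byte and $01 the next one. The helper names are hypothetical, and the 24-bit extension at $0E is ignored.

```python
def xba(a):
    """Swap the two bytes of a 16-bit accumulator value, as XBA does."""
    return ((a & 0xFF) << 8) | ((a >> 8) & 0xFF)


def sta16(regs, addr, a):
    """16-bit store: low byte goes to addr, high byte to addr + 1."""
    regs[addr] = a & 0xFF
    regs[addr + 1] = (a >> 8) & 0xFF


def device_count(regs):
    """Count as the device would interpret it with the LSB at $00 and the next byte at $01."""
    return regs[0x00] | (regs[0x01] << 8)


# A 512-byte transfer, stored directly and stored after an XBA.
plain = {}
sta16(plain, 0x00, 0x0200)

swapped = {}
sta16(swapped, 0x00, xba(0x0200))
```

Under that assumption the plain 16-bit store already puts each byte in the right register, while the byte-swapped store would make the device see a count of 2 instead of 512.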
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 1:23 am
by Dr Jefyll
It should be possible to rig up something to accept writes in either byte order, using just a few more gates.
Yes -- thanks, Chromatix.
tokafondo, as Chromatix says, the coding limitation can be eliminated with only a small increase in circuit complexity. On the other hand, the difficulties resulting from the coding limitation are also small. That's because...
(a): in regard to an I/O device, it's pretty unlikely you'll want to use an INC DEC ROL ROR ASL or LSR (these are the Read-Modify-Write instructions) in the first place. (Edit: TSB and TRB might be somewhat useful, though.) And,
(b): it's easy to simulate the effect of these instructions anyway. For example, ...
Code:
INC DeviceRegister ; a Read-Modify-Write instruction
can be replaced by
Code:
LDA DeviceRegister
INC A
STA DeviceRegister
I understand your goal is to build a system around the S1D13513 graphics chip, which is a very fancy device indeed! I'm sure it'll present you with some challenges but this Read-Modify-Write limitation is a minor matter.
To implement the 16-bit bus interface, do you plan to use discrete logic (eg, 74_574 etc) or will you use some sort of PLD? In the latter case a small increase in complexity (to beat the small limitation) may come for free.
-- Jeff
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 2:06 am
by tokafondo
tokafondo, as Chromatix says, the coding limitation can be eliminated with only a small increase in circuit complexity. On the other hand, the difficulties resulting from the coding limitation are also small. That's because...
(a): in regard to an I/O device, it's pretty unlikely you'll want to use an INC DEC ROL ROR ASL or LSR (these are the Read-Modify-Write instructions) in the first place. And,
(b): it's easy to simulate the effect of these instructions anyway. For example, ...
Code:
INC DeviceRegister ; a Read-Modify-Write instruction
can be replaced by
Code:
LDA DeviceRegister
INC A
STA DeviceRegister
Curious: Wikipedia says, citing the book Programming the 65816: Including the 6502, 65C02, and 65802, that
When register sizes are set to 16 bits, memory access will access two contiguous bytes of memory, at the cost of one extra clock cycle. Furthermore, a read-modify-write instruction, such as ROR <addr>, when used while the accumulator is set to 16 bits, will affect two contiguous bytes of memory, not one. Similarly, all arithmetic and logical operations will be 16-bit operations.
I can't confirm or deny this as I'm no coder at all, but I just wanted to put it here for you to see.
I understand your goal is to build a system around the S1D13513 graphics chip, which is a very fancy device indeed! I'm sure it'll present you with some challenges but this Read-Modify-Write limitation is a minor matter.
To implement the 16-bit bus interface, do you plan to use discrete logic (eg, 74_574 etc) or will you use some sort of PLD? In the latter case a small increase in complexity (to beat the small limitation) may come for free.
-- Jeff
Well, as I have zero experience programming a PLD, it would be easier for me to throw a bunch of 74-series chips onto the board at the cost of space... I could try to manage the S1D13781 chip that I currently have this way too, but I already have it working at 8 bits so this 16 bit thing will have to wait until the next project.
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 2:42 am
by Chromatix
Yes, that is true, RMW instructions operating solely on memory are affected by the M flag, just as ALU instructions involving the Accumulator are. The wrinkle is that read and write operations must each take 2 cycles to transfer 16 bits over an 8-bit bus, and for the interface circuit first proposed, the order of those half-transfers is important. Reads are always low address first, but writes are sometimes low address first (eg. STA) and sometimes high address first (eg. TSB, TRB).
A circuit which assumes low address first (as above) and thus latches reads on the low address and commits writes on the high address, can still be used with the 8-bit versions of the R-M-W instructions, provided they are always used on both bytes in the correct sequence. However, that won't always give the correct results without some finagling. A circuit which commits writes on the low address would work for 16-bit RMW instructions but fail on plain 16-bit stores.
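To make the ordering problem concrete, here is a hedged Python model of a bridge that latches on the low address and commits writes on the high address, driven by three access patterns: a 16-bit store (low byte first), a 16-bit memory INC (write portion high byte first, as described in this thread), and the load / increment / store replacement. All names are hypothetical, and internal or dummy CPU cycles are not modelled.

```python
class LowFirstBridge:
    """Latches reads on A0 = 0 and commits writes on A0 = 1."""

    def __init__(self, word):
        self.word = word & 0xFFFF  # the 16-bit device register
        self.read_latch = 0
        self.write_latch = 0

    def read8(self, a0):
        if a0 == 0:
            self.read_latch = self.word >> 8
            return self.word & 0xFF
        return self.read_latch

    def write8(self, a0, byte):
        if a0 == 0:
            self.write_latch = byte & 0xFF
        else:
            self.word = ((byte & 0xFF) << 8) | self.write_latch


def load16(bridge):
    """16-bit load: low address first, then high address."""
    lo = bridge.read8(0)
    hi = bridge.read8(1)
    return (hi << 8) | lo


def store16(bridge, value):
    """16-bit store such as STA: low byte first, then high byte."""
    bridge.write8(0, value & 0xFF)
    bridge.write8(1, (value >> 8) & 0xFF)


def inc16_rmw(bridge):
    """16-bit INC on memory: reads low then high, writes high then low."""
    value = (load16(bridge) + 1) & 0xFFFF
    bridge.write8(1, (value >> 8) & 0xFF)  # commits with a stale low byte
    bridge.write8(0, value & 0xFF)         # only latched, never committed


def inc16_load_store(bridge):
    """The LDA / INC A / STA replacement suggested earlier in the thread."""
    store16(bridge, (load16(bridge) + 1) & 0xFFFF)
```

Starting from a register value of $1234, the modelled R-M-W sequence leaves the device holding the new high byte paired with whatever was previously in the write latch, whereas the load / increment / store sequence ends with $1235.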
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 2:46 am
by BigDumbDinosaur
BDD, I understand this is a low-priority problem, one which you rightly back-burnered in favor of more important issues. But are those "two contiguous registers" indeed in Big-Endian order?...
Reading your reply had me go digging back through the code I was testing way back when (in 2012, to be more precise). My recollection was imperfect, but then that is why you put comments in your source files. 
My code was treating those registers as being in little-endian order and my mention of the XBA instruction was erroneous. So the 16-bit write to registers $00 and $01 should have worked. I left a terse comment in the SCSI driver source code about the problem to remind me why the count setup was being done a byte at a time.
I gather the Count register is actually 24 bit, with the 3 bytes appearing at (in order of increasing significance) Addresses $00, $01 and $0E. Would I be correct in concluding it's $00 and $01 to which you're writing with a single instruction? I'd say the relation between those two is Little-Endian, given that the less-significant byte is stored at the lower address.
Correct on all counts (sorry—couldn't resist that). Register $0E was reserved in the older 53C90 and 53C94 controllers, which could only do 64KB transfers. That register defaults to $00 in the 53CF94 to maintain compatibility if not touched. I don't use it because I can never transfer more than 64KB with the present POC hardware. When I finally build a unit with extended RAM...who knows?
Interesting bit of history about the 53xx9x series of SCSI controllers: they were originally designed for use in the Motorola 68K-powered minicomputers built by NCR in the 1980s and early 1990s. The 68K is little-endian, of course, so the word-size registers in the 53CF94 are little-endian as well. I'm going to revisit this to see if it works correctly in POC V1.2. Older versions of POC were a little sloppier with timing and didn't have wait-stating.
Re: Would this circuit work on a 65xx?
Posted: Tue Dec 15, 2020 2:59 am
by Chromatix
Actually the 68K is very much big-endian.