At an 8 MHz SPI clock, you can theoretically get 1 MB/s into or out of the card, at 8 clock cycles per byte. In practice you'll be limited by how fast the CPU can move that data into a useful place in RAM, so the SPI interface will have to wait for the CPU to catch up. It's a good place to be.
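To put rough numbers on that, here is a small Python sketch of the arithmetic. The function names are made up for illustration, and the 8 MHz CPU clock and 14 cycles per byte for the copy loop are pure assumptions, not measurements of any real routine.

```python
def spi_bytes_per_second(spi_clock_hz):
    # One byte needs eight SPI clock cycles.
    return spi_clock_hz // 8

def effective_bytes_per_second(spi_clock_hz, cpu_clock_hz, cpu_cycles_per_byte):
    # The slower of the two sides sets the real transfer rate.
    spi_rate = spi_bytes_per_second(spi_clock_hz)
    cpu_rate = cpu_clock_hz // cpu_cycles_per_byte
    return min(spi_rate, cpu_rate)

# Raw SPI rate at 8 MHz
print(spi_bytes_per_second(8_000_000))
# Hypothetical 8 MHz CPU spending 14 cycles per byte in its copy loop
print(effective_bytes_per_second(8_000_000, 8_000_000, 14))
```

With those assumed figures the CPU loop, not the SPI hardware, is the bottleneck.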
You would program the ATF750CL to perform the parallel <-> serial conversion with correct SPI and 65xx bus signalling. Nothing more, nothing less. With a nominal Tpd of 15ns, it's plenty fast enough for the job. Write a byte to it, eight bits get sent to the SD card along with eight clock pulses, and the 8 bits sent by the card at the same time are then ready for reading. Very simple.
In CUPL it's a 9-state machine (idle, then eight active states, then back to idle). Eight of the output pins sit on the 65xx data bus; the other two are SPI clock and MOSI (and MISO is an input). In the idle state, you're sensitive to /CE, /OE and /WE, which indicate when you need to activate the output pins (.OE term), or latch them as inputs into the shift register and advance to the first active state. In each active state, the clock gets passed through to the card, the most significant bit of the shift register is presented on MOSI, and the correct clock edge both advances the shift register (pulling in the value on MISO in the process) and the active state.
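As a rough software model of the state machine described above (Python rather than CUPL, and no claim that it matches how the logic would actually fit into the ATF750CL), the sketch below steps through the eight active states. The function name and arguments are invented for illustration, and it assumes MSB-first transfer with one shift per active state.

```python
def spi_exchange(tx_byte, miso_bits):
    """Model one byte exchange: idle -> 8 active states -> idle.

    tx_byte   -- value latched from the data bus on a write
    miso_bits -- the 8 bits the card drives on MISO, first bit first
    Returns (bits seen on MOSI, byte left in the shift register).
    """
    shift = tx_byte & 0xFF                  # latch bus data into the shift register
    mosi_trace = []
    for state in range(8):                  # the eight active states
        mosi_trace.append((shift >> 7) & 1) # MSB is presented on MOSI
        # Clock edge: shift left and pull the MISO value in at the bottom.
        shift = ((shift << 1) | (miso_bits[state] & 1)) & 0xFF
    return mosi_trace, shift                # back in idle, received byte ready to read

mosi, received = spi_exchange(0xA5, [1, 1, 0, 0, 0, 0, 1, 1])
print(mosi, hex(received))
```

After the eighth state the shift register holds exactly the byte the card sent, which is what a subsequent read from the bus would return.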
There are a few auxiliary signals that the SD card may need to see, but which the ATF750 doesn't have enough output pins to drive by itself. They'll be low-frequency, so you can bit-bang them in the normal way. The most important one is /SS or /CS, depending on naming convention. When not actively accessing the card, deselect it to save power. You'll need to have it selected for it to respond. The card socket may provide a card presence detect signal, which you should be able to read, and maybe a write-protect signal, likewise.
You would then need to talk SD card protocol over that interface. That's all software, and you'd need to do it anyway if you were bit-banging, just more slowly. IIRC there's a page or two on the web detailing practical experience with implementing it. There are enough differences between SDSC, SDHC and SDXC cards that I'd advise you to get a small clutch of bog-standard 2GB SD cards, which use the oldest and simplest version of the protocol.
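To give a flavour of the software side, here is a minimal Python sketch of how a single SD command frame is built in SPI mode: a byte holding the start bits and command index, a 32-bit big-endian argument, then a 7-bit CRC followed by a stop bit. The function names are my own, and this covers only the frame layout, not the initialisation sequence or response handling.

```python
def crc7(data):
    """CRC-7 with polynomial x^7 + x^3 + 1, as used on SD command frames."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):        # feed bits MSB first
            in_bit = (byte >> bit) & 1
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if in_bit ^ top:
                crc ^= 0x09
    return crc

def sd_command(index, argument):
    """Build the 6-byte frame for command `index` with a 32-bit argument."""
    body = bytes([0x40 | (index & 0x3F)]) + argument.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])   # CRC7 plus stop bit

print(sd_command(0, 0).hex())       # CMD0 (GO_IDLE_STATE)
print(sd_command(8, 0x1AA).hex())   # CMD8 (SEND_IF_COND)
```

Each of the six bytes would be written to the SPI interface in turn, with /CS held low for the duration.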
Finally, don't forget that SD cards are 3.3V devices. If the rest of your machine is 5V, you'll need to insert a level shifter and provide a correct power supply. Not doing so will blow up the card.
65c816 address decoding help
Re: 65c816 address decoding help
Thank you very much for the help! It seems like this is the best solution. One question, though: how should I activate the /CE, /OE, and /WE lines? And are they all necessary? I'm assuming they'd be input pins.
- DerTrueForce
- Posts: 483
- Joined: 04 Jun 2016
- Location: Australia
/CE is the chip enable, or chip select. This is generated by the decoding circuitry.
/RD and /WR are the read and write lines. These are Intel-style signals, but they're easy to generate from the 6502's phase-2 and R/W. They're the same signals your ROM and RAM almost certainly use.
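The usual way to derive those two strobes can be modelled in a few lines of Python. This is only a truth-table sketch with an invented function name; it assumes R/W is high for a read and that both outputs are active-low, and it ignores all timing.

```python
def qualify(phi2, rw):
    """Return (rd_n, wr_n), both active-low, from PHI2 and R/W (1 = read)."""
    rd_n = 0 if (phi2 and rw) else 1        # /RD low only for a read while PHI2 is high
    wr_n = 0 if (phi2 and not rw) else 1    # /WR low only for a write while PHI2 is high
    return rd_n, wr_n

for phi2 in (0, 1):
    for rw in (0, 1):
        print(phi2, rw, qualify(phi2, rw))
```

In hardware that is two NAND gates and an inverter, or two product terms in a PLD.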
DerTrueForce wrote:
/CE is the chip enable, or chip select. This is generated by the decoding circuitry.
/RD and /WR are the read and write lines. These are Intel-style signals, but they're easy to generate from the 6502's phase-2 and R/W. They're the same signals your ROM and RAM almost certainly use.
Right. Since the Phi2 clock is an input to the SPI interface anyway (so that the SPI clock can be generated from it), you could also use /CE and R/W to provide a true 65xx style interface. That would look like the one provided by the 6551 and 6522.
Skylie33 wrote:
It seems more convenient to use the ATF750CL as I don't think I'd need more than one SPI device. However, I'm not very well informed when it comes to how SD cards work and how I'd program the ATF750CL to read/write the data.
Curt J. Sampson - github.com/0cjs
cjs wrote:
I cannot recommend strongly enough doing a cheap bit-bang SPI interface and getting it working before designing and building custom hardware to assist you with SPI (unless perhaps that custom hardware is a microcontroller-level system that does all the work for you). Premature optimization without knowing exactly what needs optimizing and, even more importantly, what (due to the protocol details) can't be optimized has a high likelihood of leading to a hardware assist that doesn't work. Don't make your SPI interface another Commodore 1541!
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
cjs wrote:
BigDumbDinosaur wrote:
cjs wrote:
Hm! So that sounds like yet another reason to move the direct page to cover your I/O addresses when loading data, and then run the load routine in the bank you're loading. That would fix this problem, would it not?
Quote:
I was responding to your earlier comment:
What I understood you to be saying here is that to load data from the address used for input from your device, say, $C012, you expected one would be using absolute long LDA $00C012, a 4 byte/5 cycle instruction. (I don't actually see any indirection here, though; the address is being used as given, not loaded from another address.) I proposed replacing that with LDA $12 with DP set to $C000 (2 bytes/3 cycles).
BigDumbDinosaur wrote:
Something to be aware of is reading data from a fixed address with a 65C816, as would be the case with disk I/O, will involve long indirection if the data is going into or coming out of a different bank than the one in which the I/O device is located. Indirection of any kind costs clock cycles because it involves additional internal steps in the MPU. Any 24-bit load or store will incur a one clock cycle penalty for each access.
Quote:
I'm not seeing any issue with the target location of the data transfer, so long as you're willing to limit single transfers to 64K or less: just load an index register with $10000 - length, set the operand of your STA instruction to destaddr - ($10000 - length), and loop until the index hits 0. Yes, this requires self-modifying code, but it's pretty innocuous as far as self-modifying code goes.
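The index arithmetic in that trick is easy to get wrong, so here is a small Python model of it. It is only a sketch: the function name is invented, memory is a dictionary, and it assumes 16-bit index registers and that long indexed addressing adds the index to the full 24-bit base, carrying into the bank byte. The patched operand is the destination address minus the starting index value.

```python
def copy_with_wrapping_index(dest, data):
    """Model STA long,X with X counting up from $10000 - length to zero."""
    length = len(data)
    assert 0 < length <= 0x10000
    x = (0x10000 - length) & 0xFFFF     # starting value of the 16-bit index
    base = (dest - x) & 0xFFFFFF        # operand patched into the STA instruction
    memory = {}
    i = 0
    while True:
        memory[(base + x) & 0xFFFFFF] = data[i]   # STA base,X
        i += 1
        x = (x + 1) & 0xFFFF                      # INX
        if x == 0:                                # BNE falls through when X wraps
            break
    return base, memory

base, memory = copy_with_wrapping_index(0x021000, bytes([1, 2, 3]))
print(hex(base), {hex(a): v for a, v in memory.items()})
```

The first store lands on the destination address and the last one lands on destination + length - 1, with the loop ending on the wrap to zero, so no compare instruction is needed.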
Quote:
Also, it means you need not worry about the DBR if you don't want to; STA seems to be the same number of cycles for absolute and long indexed X, according to the WDC book.
As for absolute indexed and absolute indexed long using the same number of cycles, that would be expected. If a 16-bit address is specified, DB has to be used to construct the base address, to which the index is then added. If a 24-bit address is specified, the cycle that would have read DB will instead be used to fetch the MSB of the address.
Quote:
Quote:
I went through this exercise when I was designing the SCSI and multi-channel UART drivers for my POC V1 units.... Pointing DP at hardware not only proved to be of no value in performance, it resulted in a lot of hoop-jumping in order to get at things such as indices and pointers that were needed by the driver.
x86? We ain't got no x86. We don't NEED no stinking x86!
At least on the '816 you have an alternate place to stick a pointer - the stack. The stack-relative-indirect post-indexed addressing mode results in a 16-bit address to which the DBR is prepended (before indexing), and takes 7 cycles (or 8 in 16-bit mode) to complete.
This compares with the indirect-long post-indexed addressing mode which takes one less cycle, provided the DPR is page-aligned. In general the stack pointer is not page-aligned and the programmer cannot easily arrange for it to be so, so the optimisation of skipping the extra address addition cycle in that case isn't provided.
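For anyone who wants to see how the two addressing modes arrive at their effective addresses, here is a Python sketch. The function names and the dictionary standing in for bank 0 memory are illustrative assumptions; cycle counts are not modelled.

```python
def stack_relative_indirect_indexed(mem, s, dbr, offset, y):
    """Effective address for (offset,S),Y: 16-bit pointer on the stack, DBR as bank."""
    ptr_addr = (s + offset) & 0xFFFF                      # pointer lives in bank 0
    ptr = mem[ptr_addr] | (mem[(ptr_addr + 1) & 0xFFFF] << 8)
    return (((dbr << 16) | ptr) + y) & 0xFFFFFF           # DBR prepended, then Y added

def direct_indirect_long_indexed(mem, d, offset, y):
    """Effective address for [offset],Y: 24-bit pointer in the direct page."""
    ptr_addr = (d + offset) & 0xFFFF
    ptr = (mem[ptr_addr]
           | (mem[(ptr_addr + 1) & 0xFFFF] << 8)
           | (mem[(ptr_addr + 2) & 0xFFFF] << 16))
    return (ptr + y) & 0xFFFFFF

stack = {0x01F1: 0x00, 0x01F2: 0x20}                      # pointer $2000 on the stack
print(hex(stack_relative_indirect_indexed(stack, 0x01F0, 0x02, 1, 0x10)))
dpage = {0x0010: 0x00, 0x0011: 0x20, 0x0012: 0x02}        # pointer $022000 in direct page
print(hex(direct_indirect_long_indexed(dpage, 0x0000, 0x10, 0x10)))
```

Both calls resolve to the same location; the difference is where the pointer is kept and whether the bank comes from the DBR or from the pointer itself.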
- BigDumbDinosaur
Chromatix wrote:
At least on the '816 you have an alternate place to stick a pointer - the stack. The stack-relative-indirect post-indexed addressing mode results in a 16-bit address to which the DBR is prepended (before indexing), and takes 7 cycles (or 8 in 16-bit mode) to complete.
Skylie33 wrote:
I'm sure the FPGA for my video generation would have some logic space left for that [SPI hardware], but I'm not sure I'll go that route just yet...
As for video generation, here is a VGA interface with all timing and sync in 5 1/2 Spartan3 slices...
https://www.fpgarelated.com/showarticle/42.php
Add a counter and memory, and you still have thousands, if not tens of thousands, of slices left to do something else.
In theory, there is no difference between theory and practice. In practice, there is. ...Jan van de Snepscheut