
All times are UTC




PostPosted: Sat Apr 04, 2020 12:32 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
You should expect hardware to give you access to the raw blocks of data stored on device, if anything. The filesystem should be handled in software.

FAT isn't a very complex filesystem. Handling it shouldn't slow you down at all.
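
To give a sense of how little is needed to get started, here is a minimal Python sketch (not 6502 code, and not something posted in this thread) that pulls out of a FAT32 boot sector the few fields needed to locate the data area. The field offsets are the standard BPB ones; the function names and dictionary keys are made up for illustration, and the sector numbers are relative to the start of the partition.

```python
import struct

def parse_fat32_boot_sector(sector):
    """Extract the BPB fields needed to find the FAT and the data area."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot sector signature")
    # Offsets 11..16: bytes/sector, sectors/cluster, reserved sectors, FAT count.
    bytes_per_sector, sectors_per_cluster, reserved_sectors, num_fats = \
        struct.unpack_from("<HBHB", sector, 11)
    # FAT32 only: sectors per FAT at offset 36, root directory cluster at 44.
    (sectors_per_fat,) = struct.unpack_from("<I", sector, 36)
    (root_cluster,) = struct.unpack_from("<I", sector, 44)
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "first_fat_sector": reserved_sectors,
        "first_data_sector": reserved_sectors + num_fats * sectors_per_fat,
        "root_cluster": root_cluster,
    }

def cluster_to_sector(bpb, cluster):
    """Clusters are numbered from 2, so cluster 2 is the first data sector."""
    return bpb["first_data_sector"] + (cluster - 2) * bpb["sectors_per_cluster"]
```

Everything else (following cluster chains through the FAT, walking directory entries) builds on these two calculations, and the hardware only ever has to deliver raw 512-byte sectors.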


PostPosted: Sat Apr 04, 2020 12:38 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
Chromatix wrote:
You should expect hardware to give you access to the raw blocks of data stored on device, if anything. The filesystem should be handled in software.

FAT isn't a very complex filesystem. Handling it shouldn't slow you down at all.

Well, if you say so. I'm not very knowledgeable when it comes to FAT, but I do have some documentation for it that I plan to read up on. I wonder what sort of IC would give me the raw blocks of data, though.


PostPosted: Sat Apr 04, 2020 12:40 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
An SD card is just an IC in a user-friendly package. Use one of those instead of messing around with USB.


PostPosted: Sat Apr 04, 2020 12:41 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
Well yes; sure. That's the plan. But bit-banging the protocol isn't going to give me fast enough speeds.


PostPosted: Sat Apr 04, 2020 12:43 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Then build a simple hardware interface that is fast enough. You'll need a shift register and a counter.


PostPosted: Sat Apr 04, 2020 12:44 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
Chromatix wrote:
I would probably map hardware to bank $01 or $FF on an '816 system, or possibly some other spare bank, whichever is convenient. You can still access it from emulation mode, by using the long addressing modes - but on an '816 the first thing you should do out of reset is select native mode anyway. Then it's straightforward to load the DBR for whichever bank is convenient when accessing hardware, and use long or indirect-long addressing for the other.

Yep, and it is also a scenario I have investigated.

Given the natural order of things with the '816, running an operating system kernel in something other than bank $00 should be feasible as long as I/O hardware also appears there. That said, if the kernel's data structures (e.g., buffers) are in a different bank than the I/O hardware you're back to indirection unless you are prepared to constantly diddle with DB, which is a slow process.

I'm envisioning a design in which the body of the kernel runs in bank $01. The front ends of the ISRs would be in high bank $00 RAM ($00FC00-$00FFFF would be enough, and could be write-protected to prevent "accidents") and would start by immediately long jumping back into bank $01 for processing. Although a long jump is a 6-cycle instruction, it would only happen once per interrupt, which is tolerable. The ISR preamble would save machine state and set DB to the kernel's bank ($01), which means only 16-bit accesses would be needed to touch hardware and manage kernel data structures. Hence little of bank $00 would be consumed by code.

Quote:
That means you can reserve the more precious bank zero for direct-page and stack RAM, ISRs and boot code.

You can set up a lot of direct pages in bank $00, especially considering the actual direct page usage of many functions is only a handful of bytes. Ditto for stacks. While bank $00 is a busy place, it does have a lot of room if most of it is RAM. It's all a matter of planning.

Quote:
The reset handler would copy the ROM into the RAM, then unmap the ROM. This permits using a slow ROM without much performance penalty, and you can load additional software from another form of storage that's easier to update.

Essentially you are describing the ROM shadowing that is common in PCs. As the glue logic to accomplish this is somewhat involved, it's a good application for a CPLD.
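
As a rough illustration of the copy-then-unmap sequence, here is a small Python model. It assumes the common arrangement in which reads come from ROM while it is mapped but writes always land in the RAM underneath; that detail, and all the names, are assumptions made for the sketch rather than anything specified in this thread.

```python
class ShadowedRegion:
    """Address range where ROM overlays RAM until it is unmapped."""

    def __init__(self, rom_image):
        self.rom = bytes(rom_image)
        self.ram = bytearray(len(rom_image))
        self.rom_mapped = True          # state immediately after reset

    def read(self, addr):
        # Reads see ROM while it is mapped, RAM afterwards.
        return self.rom[addr] if self.rom_mapped else self.ram[addr]

    def write(self, addr, value):
        # Assumption: writes always go to the underlying RAM.
        self.ram[addr] = value & 0xFF

    def unmap_rom(self):
        # Stands in for writing a control bit in the glue logic.
        self.rom_mapped = False


def reset_handler(region):
    """Copy every ROM byte onto the RAM beneath it, then switch ROM out."""
    for addr in range(len(region.rom)):
        region.write(addr, region.read(addr))
    region.unmap_rom()
```

After `reset_handler` runs, the same addresses return the same bytes, but they now come from RAM, which is both faster and patchable.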

Quote:
You should expect hardware to give you access to the raw blocks of data stored on device, if anything. The filesystem should be handled in software.

FAT isn't a very complex filesystem. Handling it shouldn't slow you down at all.

...and it doesn't even have to be FAT. A roll-your-own filesystem is within the capability of any competent assembly language programmer.

Skylie33 wrote:
Well, if you say so. I'm not very knowledgeable when it comes to FAT, but I do have some documentation for it that I plan to read up on. I wonder what sort of IC would give me the raw blocks of data, though.

If you have a working USB port you have the capability to connect a thumb drive that is formatted with a FAT32 filesystem. There's your random access, mass storage.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sat Apr 04, 2020 12:53 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
Chromatix wrote:
Then build a simple hardware interface that is fast enough. You'll need a shift register and a counter.

I thought shift registers were unidirectional, so wouldn't I need 2 if I wanted bidirectional communication?


PostPosted: Sat Apr 04, 2020 1:30 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Look up the datasheet for the 74HC299 - that can both load and store its internal register on common pins, and is tri-state so you can put it directly on the data bus. Connect the output at one end to the MOSI line of the SPI device, and the input at the opposite end to the MISO line. Then you just need to generate appropriate control signals including the SPI clock. At the very least you can do that part with a GAL, or part of one, if you don't feel like trawling through TI's 74-series part catalogue.

Most bidirectional SPI devices will read and write data simultaneously, so with this you would write to the register, wait 8 cycles for it to be sent and the register refilled with the read data, then read the register to see what came back. For multi-byte transactions just repeat the process.
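
The write/wait/read exchange can be modelled in a few lines of Python. This is only a behavioural sketch, not a pin-accurate model of the 74HC299: it assumes MSB-first shifting, treats the SPI device as a second 8-bit shift register, and uses a 3-bit counter to detect the end of the byte. All names are invented for the example.

```python
def spi_exchange(master_byte, device_byte):
    """Simulate one full-duplex SPI byte transfer.

    Returns (byte left in the master's register, byte left in the device).
    """
    master = master_byte & 0xFF
    device = device_byte & 0xFF
    counter = 0                      # 3-bit counter of SPI clock pulses
    while True:
        mosi = (master >> 7) & 1     # bit leaving the master's output end
        miso = (device >> 7) & 1     # bit leaving the device
        # Both registers shift on the same clock; each takes in the other's bit.
        master = ((master << 1) | miso) & 0xFF
        device = ((device << 1) | mosi) & 0xFF
        counter = (counter + 1) & 0x07
        if counter == 0:             # wrapped after eight clocks: byte done
            return master, device


def spi_transfer(tx_bytes, device_reply):
    """Multi-byte transaction: repeat the single-byte exchange."""
    return bytes(spi_exchange(t, r)[0] for t, r in zip(tx_bytes, device_reply))
```

After eight clocks the two registers have simply swapped contents, which is why the CPU can read its reply from the same register it wrote the outgoing byte to.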

You can also program a small CPLD, like the ATF750CL, to do the whole thing in one package; the 22V10 doesn't have enough state bits to contain both an 8-bit shift register and a 3-bit counter, but the 750 does and the control logic should be very simple (to handle a single device in just one SPI mode).


PostPosted: Sat Apr 04, 2020 2:22 am

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 730
Location: Tokyo, Japan
First, as I mentioned, I'm not too familiar with the 65816, so feel free to correct any misperceptions I have about its addressing modes or whatever.

BigDumbDinosaur wrote:
cjs wrote:
Hm! So that sounds like yet another reason to move the direct page to cover your I/O addresses when loading data, and then run the load routine in the bank you're loading. That would fix this problem, would it not?

...While accessing an I/O register as a direct page location does eliminate one clock cycle per access—assuming DP points at a page boundary—...

Um, I think two cycles per access in the situation we were talking about, right? I was responding to your earlier comment:
BigDumbDinosaur wrote:
Something to be aware of is reading data from a fixed address with a 65C816, as would be the case with disk I/O, will involve long indirection if the data is going into or coming out of a different bank than the one in which the I/O device is located. Indirection of any kind costs clock cycles because it involves additional internal steps in the MPU. Any 24-bit load or store will incur a one clock cycle penalty for each access.

What I understood you to be saying here is that to load data from the address used for input from your device, say, $C012, you expected one would be using absolute long LDA $00C012, a 4 byte/5 cycle instruction. (I don't actually see any indirection here, though; the address is being used as given, not loaded from another address.) I proposed replacing that with LDA $12 with DP set to $C000 (2 bytes/3 cycles).

I'm not seeing any issue with the target location of the data transfer, so long as you're willing to limit single transfers to 64K or less: just load an index register with $10000 - length, set the operand of your STA instruction to destaddr - ($10000 - length), and loop until the index hits 0. Yes, this requires self-modifying code, but it's pretty innocuous as far as self-modifying code goes. Also, it means you need not worry about the DBR if you don't want to; STA seems to take the same number of cycles for absolute and long indexed X, according to the WDC book.
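
For anyone who wants to check the address arithmetic, here is a Python model of that loop. It assumes 16-bit index registers and a 24-bit address space that wraps at $FFFFFF, and it takes the STA operand to be the destination address minus the starting index value, so that the first store lands exactly on the destination. The function name and the dict used as memory are just for illustration.

```python
def copy_with_wrapping_index(memory, dest_addr, data):
    """Model of STA base,X / INX / BNE with X counting up to zero."""
    length = len(data)
    if not 1 <= length <= 0x10000:
        raise ValueError("a single transfer is limited to 64K")
    x = (0x10000 - length) & 0xFFFF        # starting value of the index register
    base = (dest_addr - x) & 0xFFFFFF      # operand patched into the STA
    start_x = x
    i = 0
    while True:
        memory[(base + x) & 0xFFFFFF] = data[i]   # STA base,X
        i += 1
        x = (x + 1) & 0xFFFF                      # INX
        if x == 0:                                # BNE falls through at zero
            break
    return base, start_x
```

For a 3-byte transfer to $021000, X starts at $FFFD and the patched operand is $011003; the stores then hit $021000 through $021002 and the loop exits when X wraps to zero, with no separate compare instruction needed.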

That said, when I look at the actual old and new code itself:

Code:
    .loop   LDA $00C012     ; 5 cycles
            STA (zp),Y      ; 6 cycles
            INY             ; 2 cycles
            BNE .loop       ; 3 cycles
                            ; total: 16 cycles

    .loop   LDA $12         ; 3 cycles
            STA $xxxxxx,X   ; 5 cycles
            INX             ; 2 cycles
            BNE .loop       ; 3 cycles
                            ; total: 13 cycles: 20% speedup

It's only about a 20% speedup on the bulk data transfers themselves, which may or may not be worth it, depending on how much other overhead you've got, whether you're doing multi-block transfers, and so on.
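
To put rough numbers on that, the arithmetic can be sketched in Python. The 8 MHz clock is an assumption for the example, and the result is a best-case figure for the inner loop alone: it ignores waiting on the device, per-block setup, and any filesystem overhead.

```python
def bytes_per_second(clock_hz, cycles_per_byte):
    """Best-case transfer rate if the copy loop is the only cost."""
    return clock_hz / cycles_per_byte

CLOCK_HZ = 8_000_000   # assumed CPU clock, not taken from the code above
OLD_CYCLES = 16        # LDA long / STA (zp),Y / INY / BNE
NEW_CYCLES = 13        # LDA dp / STA long,X / INX / BNE

old_rate = bytes_per_second(CLOCK_HZ, OLD_CYCLES)
new_rate = bytes_per_second(CLOCK_HZ, NEW_CYCLES)
time_saved = 1 - NEW_CYCLES / OLD_CYCLES        # fraction of loop time removed
throughput_gain = OLD_CYCLES / NEW_CYCLES - 1   # fraction of extra bytes per second

print(f"old loop: {old_rate:,.0f} bytes/s")
print(f"new loop: {new_rate:,.0f} bytes/s")
print(f"time saved: {time_saved:.2%}, throughput gain: {throughput_gain:.2%}")
```

Under those assumptions the old loop tops out at 500,000 bytes/s and the new one at roughly 615,000 bytes/s, so "about 20%" sits between the 18.75% reduction in loop time and the roughly 23% gain in throughput.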

Quote:
I went through this exercise when I was designing the SCSI and multi-channel UART drivers for my POC V1 units.... Pointing DP at hardware not only proved to be of no value in performance, it resulted in a lot of hoop-jumping in order to get at things such as indices and pointers that were needed by the driver.

If the driver needed a bunch of indices and pointers, yeah, you'd want the zero page pointing at those. I was talking about the above just for the transfer itself. And I suppose I had a bit of 6809 on the brain, where you have a few more registers and addressing modes for helping to handle this kind of thing. (E.g., you could push a list of transfer descriptors containing block numbers and other information on the user stack, easily index into individual descriptors relative to the user stack pointer for getting the information about each transfer, and pop them off as you work through your transfer list.)

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Sat Apr 04, 2020 4:39 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
I'd prefer to avoid using any more programmable logic devices. I'm already using an FPGA as a video card, and a GAL for address decoding logic. Is there no available IC for either SD or USB that provides a simple interface?


PostPosted: Sat Apr 04, 2020 4:57 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I'd have to work out the details, but I think with pure 74-series logic you can do it with 3 chips. One shift register, one counter, and one set of gates. Four chips at the outside. Maybe figuring out those details can go in another thread.

There is also a way to rig up a 6522 to interface SPI, not quite as fast as dedicated logic but still much faster than pure bit-banging. Unfortunately it uses the same pin (CB2) for shift-register input and output data, so you would need to add an external shift register (wired to one of the VIA's parallel ports) to collect the read data, while the 6522 itself drives the write data and the SPI clock. That is still a two-chip solution, though one of the chips is a bit on the big side.

But I'd also say that if you're already using a GAL for one thing, then you have everything you need to use a second one for something else. Most electronics parts vendors give discounts for volume orders, too (even if that volume is only ten, instead of one).


PostPosted: Sat Apr 04, 2020 5:18 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
Chromatix wrote:
But I'd also say that if you're already using a GAL for one thing, then you have everything you need to use a second one for something else. Most electronics parts vendors give discounts for volume orders, too (even if that volume is only ten, instead of one).


I suppose you have a good point there. The question becomes: do I swap out the 22V10 for an ATF750CL as well, even though I technically don't need it? I suppose it could be useful if I later find that I'd like more logic for address decoding.


PostPosted: Sat Apr 04, 2020 6:05 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
For address decoding, the ATF750 probably won't hold any advantages. The number of output pins is the same, and you'd be using only combinatorial logic, in which the total resources of the ATF750 are almost the same as the 22V10. Unless you *also* use it to latch the bank address, in which case the ability to store that value in "buried" nodes would give you more output pins, thus the ability to delete some other discrete chips.

The major advantage is with the amount of state the chip can hold in registers, and that would let you implement a complete, simple SPI interface in one chip *if* you used the ATF750. But if you accept using a 74HC299 shift register externally, you could probably drive two such interfaces with a single 22V10. That's three chips for two SPI ports that can run in parallel.

It's your machine, of course. I'm just enumerating the possibilities.


PostPosted: Sat Apr 04, 2020 6:28 am

Joined: Tue Mar 31, 2020 3:40 am
Posts: 33
It seems more convenient to use the ATF750CL, as I don't think I'd need more than one SPI device. However, I'm not very well informed when it comes to how SD cards work and how I'd program the ATF750CL to read/write the data. Are there any resources you know of that could assist me with that, and what kind of speeds should I expect to get with an 8 MHz (possibly 10 MHz, but I haven't found a fast enough ROM for that) 65816?


PostPosted: Sat Apr 04, 2020 7:04 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
(It's feeling to me like it might be worth a new thread for SPI and SD card purposes. We seem to be oscillating between wanting simple hardware and wanting high performance - there are surely tradeoffs to be made, which means figuring out what matters most.)

