Turnkey get-started-fast solution

GARTHWILSON · Post by **GARTHWILSON** » Fri Apr 06, 2007 6:45 pm

I'm not sure what you're calling "hold times," but hold times are never 80ns long. 10ns is common. Anyway, LCDs are usually too slow to put directly on the 65 bus. Go through a VIA (i.e., a 6522) and you won't even have to think about timings, just the sequence of signals, which will be controlled in software. You might think it's slower to go through the VIA, and in a way that's true; but overall the computer will be much faster if it is not held down to the slow speeds of common LCDs, RTCs, etc.

kc5tja · Post by **kc5tja** » Fri Apr 06, 2007 6:47 pm

Modern 65816s are all the 14MHz part, so unless you can still purchase the 2MHz model of 65816, you will find that the hold time maxes out at 10ns.

However, look at what it's saying: the hold time required of the LCD is 2-55ns. The CPU would deliver 80ns (in reality, more like 10ns due to the more modern fabrication processes). The CPU, assuming you have a real 2MHz part, would meet the requirements of the LCD.

Since you'll likely have a much faster part (even if you drive it slower), then there are still a variety of options available to you.

First, and the easiest possible solution, is to drive the LCD through a VIA chip. Use one of the I/O ports as the data bus, and the other port for bus control signals. Here, you will manually control the E line in software. This will allow you software-defined control over the LCD bus. However, it's slower. There will be no use of the MVN/MVP instructions to block-transfer graphics, for example.

Second, you could write the data to be written to a 74LS373 or compatible 8-bit register. Then, you could write to an I/O port that, via gating in the address decoder, drives the E line. For example:

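Code:

LDA #byte_to_write      ; value destined for the LCD
STA lcd_write_latch     ; park it in the 74LS373 so it stays driven
LDA #some_dummy_value   ; the value is ignored; only the access matters
STA lcd_memory_address  ; the address decoder turns this write into the E strobe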

The advantage of this approach is that the data in the latch remains asserted even after E goes high. The disadvantage is, while faster than the VIA approach, you still don't have full CPU bandwidth to graphics memory.

Finally, another approach is to just plain use a monostable multivibrator of some kind to bring RDY low for a single clock cycle's duration -- that is, to introduce a wait state. This will allow you direct CPU access to the video memory, but you'll top out at one half the CPU's maximum throughput, except for MVN and MVP instructions, where each byte transferred will take 8 cycles instead of 7 (since only one of those cycles involves actually writing to the LCD).
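The cost of that wait state can be put into numbers. The following is only a rough sketch: the 1MHz clock is an assumed value chosen for illustration, and `effective_rate` is a hypothetical helper, not anything from a datasheet.

```python
def effective_rate(clock_hz, cycles_per_byte, waited_cycles, wait_states=1):
    """Bytes per second when `waited_cycles` of the cycles spent on each
    byte are each stretched by `wait_states` extra clocks."""
    total_cycles = cycles_per_byte + waited_cycles * wait_states
    return clock_hz / total_cycles

CLOCK_HZ = 1_000_000  # assumed clock, for illustration only

# Back-to-back LCD accesses: every cycle is stretched, so throughput halves.
plain = effective_rate(CLOCK_HZ, cycles_per_byte=1, waited_cycles=0)
stretched = effective_rate(CLOCK_HZ, cycles_per_byte=1, waited_cycles=1)

# MVN/MVP: 7 cycles per byte, and only the one write to the LCD is stretched.
block_move = effective_rate(CLOCK_HZ, cycles_per_byte=7, waited_cycles=1)

print(plain, stretched, block_move)
```

The block move loses only one cycle in eight, which is why it suffers far less from the wait state than code that touches the LCD on every cycle.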

There is a variety of other ways I haven't covered (clock skewing, dynamic clock frequency adjustment, data bus delay lines, quadrature clocks, mutual synchronization of multiple clocks (doable in only 4 D-flipflops, I might add), etc), but I'm trying to find the simplest possible potential solutions.

If you're interested in using multiple clock synchronization, look through this forum for where I talk about bus synchronization. I forget specific links now, but it was a pretty hot topic sometime last year.

NOTE: ADVANCED MUSINGS...

I should point out that these problems all exist on other processor architectures too. This is why the 68000 has a fully asynchronous bus handshake system. The Intel processors nowadays use a quasi-asynchronous system for I/O and a fully-synchronous, pipelined, burst-mode interface for synchronous DRAM and SRAM interfaces. The idea is that different modes of operation work for different classes of devices.

The 6502 bus architecture is simple, and some might say it is too simple. However, looking at all the bus controller chips and whatnot for the Intel-style bus that include wait-state generators, one would think, "Gee, if it's on a chip, wait state generators must be complex pieces of logic!" In reality, they're dreadfully simple. A wait-state generator for the 6502 consists of two 2-bit shift registers wired in a certain configuration. It takes only 2 74ACT74 chips and a small number of 2-input gates to implement. Total cost is less than $0.30 depending on where you get your chips from. Dedicating expensive chip real-estate to that simple a circuit would not be justified.

Another aspect to consider is that, if you're working with lots of different speed peripherals, you're going to have different sets of wait-states. The Intel-bus solutions don't generally handle this gracefully, at least not that I've seen, as long as you rely on those on-chip wait state generators.

So which bus is simpler? You're pulling your hair out over the 6502 bus now. If you used the Z-80, you might get your project running sooner. But, in terms of actual cost, I think you'll find the 6502 is substantially easier to work with. And it's substantially faster -- an 8MHz 6502 bus transfers data at 8MB/s (assuming no wait states). A Z-80 can pull off at best 2.6MB/s, and an 8085 only 2MB/s.

To transfer data faster than this, you'll start looking into wider buses instead of faster CPU speeds. A 6502 at 4MHz will compare favorably with a Z-80 at 8MHz. Thus, a 16-bit wide bus at 4MHz will still support a transfer speed of 8MB/s.
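The 6502-style figures above follow from one transfer per clock cycle. A minimal sketch of that arithmetic, with `bus_bandwidth` as a hypothetical helper name:

```python
def bus_bandwidth(clock_hz, bus_width_bits, cycles_per_transfer=1):
    """Peak bytes per second for a bus that moves one bus-width word
    every `cycles_per_transfer` clock cycles (no wait states)."""
    return clock_hz * (bus_width_bits // 8) / cycles_per_transfer

eight_bit_8mhz = bus_bandwidth(8_000_000, 8)      # 8-bit 6502 bus at 8MHz
sixteen_bit_4mhz = bus_bandwidth(4_000_000, 16)   # 16-bit bus at 4MHz

print(eight_bit_8mhz, sixteen_bit_4mhz)
```

Both cases come out at the same 8MB/s, which is the point of going wider rather than faster. A bus that needs several clocks per access can be modelled by raising `cycles_per_transfer`.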

What this boils down to is one of philosophy. In terms of software, the 6502 bus is the Unix philosophy -- that is, "worse is better." The CPU's bus is just adequate enough to allow the CPU to talk with peripherals. It doesn't say how easily it can, but it can, and that's all that matters. The CPU's bus is optimized for what it has to do. Intel's bus (and to a large extent, Motorola's) has the philosophy of being as correct as possible. It's a nice goal to strive for, but how can you be correct for all possible people?

The result is greater complexity and more difficulty in understanding all the different modes and sub-modes of the bus. Intel-style buses have pretty sophisticated timing requirements that are not immediately obvious from reading the timing diagrams, for example. After _RD and _WR are negated, how long until the next cycle begins? That is critical knowledge which, if you read peripheral datasheets, can be anything, from 0ns to 100ns. How does _RD interact with _CS? If you read the datasheets of RAM chips, you'll find at least 3 different interactions. Etc.

I've also pulled my hair out when working with the 65816. Just search this forum for "Kestrel 1" and you'll see the problems I encountered very early on. But given a choice between the 6502 bus and the 68000 bus today, I'd definitely pick the 6502 bus. The fully synchronous design really is simpler, even if you don't see the simplicity up-front.

OK, I'll get off the soap-box now.

faybs · Post by **faybs** » Fri Apr 06, 2007 7:07 pm

You guys are a gold mine

So, it sounds like my one mistake was in how I read table 5-2 of the 65816 datasheet - I was assuming the columns in it refer to the speed you're running the CPU at, not its speed grade (Garth, the hold times I was referring to are labelled TAH and TDHR - if you look at the "2MHz" column in that table, you'll see that they're 40ns, so I just doubled it again for 1MHz). Since WDC only sells the 14MHz part nowadays and I'm planning on using 5V, that means I should only ever refer to the 14MHz column?

My other mistake seemingly was taking the sample circuit in the LCD controller's datasheet at face value - I know to never do that with code samples in programming manuals, but somehow my judgement failed me in this case

fachat · Post by **fachat** » Fri Apr 06, 2007 7:51 pm

faybs wrote:

@kc5tja
OK, now I know I'm doing something wrong, because I can't seem to find more than a handful of devices that have compatible timings with these CPUs. For example, for my initial design, I was going to use a 65C816 with 32KB RAM (to get the new opcodes but not have to deal with the multiplexed bank byte) and I thought it would be neat to have a graphic LCD for a display, so I got the datasheets for a 320x240 intelligent module from www.crystalfontz.com (http://www.crystalfontz.com/products/32 ... T_v1.1.pdf). However, when I looked at the timing diagrams in the datasheet, even the module's "6800 mode" (using phi2 as E) cannot be used with a 65C816, because if I run the '816 at 1MHz, which is required to meet one timing parameter (the minimum time E can be high is 500ns), then the maximum hold times are too small (it requires between 2-55ns hold when reading, and the 816's setup and hold times at 1MHz would be, extrapolating from the datasheet, 80ns). The only solution in my mind to fix the problem was to add a state machine using flipflops or a CPLD, and I'm frankly not ready for that complexity yet. Considering that there's even a diagram in the datasheet showing the LCD controller connected directly to a 6800 and that the 6800 and 65xx families are supposed to be very similar, that doesn't seem right to me. I've encountered the same sort of problems with pretty much all non-65xx family chips I've looked at - I either need to run the CPU at a really low speed (sub MHz) or some parameters match but others don't. Judging from your comment above, I shouldn't be getting anywhere near hitting these sort of problems though. What am I doing wrong?

Well, I think you don't need to extrapolate setup times from higher speeds to lower speeds, but take them as they are. After all, if a signal is triggered by Phi2 going low, and is available after 10ns at 8MHz, why should it not be there after 10ns at 1MHz? Or when some data is latched at Phi2 going low, why should it take longer at slower speeds? Setup and hold times are a function of internal path lengths and capacitances that have to be charged and discharged - and those differ between ICs rated for different speeds, not with the speed they actually run at. At least that is how I calculate.

I have, however, also come upon peripherals that are too slow for the 6502, even at 1MHz. I used this http://www.6502.org/users/andre/icap/rdy.html circuit to hold the CPU's RDY line until the data was taken.

Looking at the datasheets you linked, the "6800 mode" seems the correct mode - the 6800 has the same bus interface as the 6502 (due to its history, but that is a different story :-)
On page 64 of the PDF, there is the timing sheet for M6800 mode that proves this.
What you have to care about is that the address lines and select lines are set up during Phi2 (E) low such that they are valid "t1" resp. "t2" before Phi2 going high and held "t5" resp. "t6" after Phi2 going low. Note that "RD#" is the Phi2 input in M6800 mode. During writes the data must be valid "t3" before Phi2 going low and held "t7" after Phi2 going low. During reads the LCD provides valid data at least "t9" after Phi2 going high - and that must be before the point in time needed by the CPU to latch the data before Phi2 goes low again. The LCD holds the data for another "t8", which must be longer than the CPU requires as hold time.

Note that as "t9" is measured from Phi2 going high, and the CPU setup time is measured from Phi2 going low, you can calculate the maximum speed by adding those two values, adding a safety margin, and using the result as the width of a half clock cycle (namely the part where Phi2 is high).

Looking at the timing values on the next page of that datasheet, t1=t2=5ns and t5=t6=7ns. Address setup is not critical, and even the address hold time of 7ns is met by the Rockwell[1] 65C02@4MHz, which has 15ns.

Write data setup time is t3=2Ts+5, where Ts is the "System clock", so assuming that is 40MHz (which is 2/3 of the max input clock), this is 2*25+5=55ns, i.e. the CPU must provide the data 55ns before Phi2 goes low (less if Ts is less, i.e. the system clock is faster). This you have to compare with the 6502 setup time.
My Rockwell datasheet says the CPU gives valid data 55ns after Phi2 goes high (for a 4MHz rated CPU, but 200ns for a 1MHz rated R65C02). So the minimum time Phi2 must be high is 55ns+55ns+safety margin, say 15ns, =125ns, so writes should work smoothly up to 4MHz (which has a Phi2 high time of 125ns with a 1:1 clock high/low ratio).
Input data must be held t7=5ns after Phi2 falling, which should be ok (my Rockwell datasheet says 30ns write data hold time for CPUs rated up to 4MHz).
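The write budget above can be redone for other values of Ts or other CPU grades with a few lines. This is only a sketch of the arithmetic in this post; the function names are made up, and the default numbers are the Rockwell and Ts=25ns figures quoted here, not values for the 65816.

```python
def lcd_write_setup_ns(ts_ns):
    """t3 = 2Ts + 5ns: data must be valid this long before Phi2 falls."""
    return 2 * ts_ns + 5

def min_phi2_high_for_write_ns(cpu_data_delay_ns, ts_ns, margin_ns=15):
    """Shortest Phi2-high time that still satisfies the LCD's write setup.

    cpu_data_delay_ns: delay from Phi2 rising until the CPU's write data
    is valid (55ns for the 4MHz Rockwell part quoted above).
    """
    return cpu_data_delay_ns + lcd_write_setup_ns(ts_ns) + margin_ns

def max_clock_mhz(phi2_high_ns):
    """Fastest clock for a given Phi2-high time, assuming a 1:1 duty cycle."""
    return 1000.0 / (2 * phi2_high_ns)

high_ns = min_phi2_high_for_write_ns(cpu_data_delay_ns=55, ts_ns=25)
print(high_ns, max_clock_mhz(high_ns))

def write_hold_ok(cpu_write_hold_ns, lcd_t7_ns=5):
    """The CPU must keep data on the bus at least t7 after Phi2 falls."""
    return cpu_write_hold_ns >= lcd_t7_ns
```

With a different LCD system clock, only `ts_ns` changes; with a different CPU grade, only `cpu_data_delay_ns` and the hold figure change.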

On the read side the data is provided by the LCD t9=4Ts+20ns=4*25+20=120ns after Phi2 going high. This data has to be valid before the CPU requires it - on the R65C02 for 4MHz that is 30ns setup time before Phi2 going low again. Adding those two values alone gives 150ns, which is too long for a 4MHz clock (remember, 125ns of Phi2 high), but well in the range of 2MHz (assuming a 4MHz CPU used at 2MHz; for a 2MHz-rated CPU you have to redo the computations with the correct data for that CPU). Data is held by the LCD for as little as 2ns after Phi2 going low. This is indeed too short for the Rockwell 65C02 read data hold time of 10ns, no matter the frequency.
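The same read-side check can be written down as a small sketch. The function names are hypothetical, and the numbers are the ones used in this post (Ts=25ns, Rockwell 4MHz part), so they need replacing for a real 65816 design.

```python
def lcd_read_access_ns(ts_ns):
    """t9 = 4Ts + 20ns: delay from Phi2 rising until the LCD's data is valid."""
    return 4 * ts_ns + 20

def phi2_high_ns(clock_mhz):
    """Phi2-high time for a clock with a 1:1 high/low ratio."""
    return 1000.0 / (2 * clock_mhz)

def read_setup_ok(clock_mhz, ts_ns, cpu_setup_ns):
    """LCD data must arrive at least cpu_setup_ns before Phi2 falls."""
    return lcd_read_access_ns(ts_ns) + cpu_setup_ns <= phi2_high_ns(clock_mhz)

def read_hold_ok(lcd_hold_ns, cpu_hold_ns):
    """The LCD must keep data valid at least as long as the CPU needs it.
    This check does not depend on the clock frequency at all."""
    return lcd_hold_ns >= cpu_hold_ns

print(read_setup_ok(4, ts_ns=25, cpu_setup_ns=30))   # 150ns needed, 125ns available
print(read_setup_ok(2, ts_ns=25, cpu_setup_ns=30))   # 150ns needed, 250ns available
print(read_hold_ok(lcd_hold_ns=2, cpu_hold_ns=10))
```

The hold check is the one that fails regardless of frequency, which is why slowing the clock alone does not fix it on paper.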

Yet, not all is lost. Here's some dark magic.

The CPU calculates the timings from its Phi2 input, while the LCD calculates them from its Phi2 input. If those inputs differ - e.g. by a delay of 10ns in Phi2 due to a driver between the CPU signal and the LCD signal - the read data arrives 10ns later as seen by the CPU (as the LCD sees the triggering rising edge of Phi2 10ns later than the CPU), which costs 10ns of setup margin, but the hold time that the CPU sees is also 10ns larger.
This also changes the write timing. The LCD sees the write data become valid sooner after its own Phi2 rises (the data is valid earlier compared to the Phi2 the LCD sees), but it also sees a shorter write data hold time (which would be ok in this case, as the write data hold is 30ns minus 10ns Phi2 delay, which is still larger than the write data hold time the LCD requires).

If the data bus is also separated, say by a bus driver like the 74LS245, you have to add its delay as well. It delays the write data and lengthens the write hold time the LCD sees, and it also delays the read data and lengthens the read hold time the CPU sees.

Note that you have to check the whole range of the driver delays, from minimum to maximum to make sure it works all the time.
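A rough way to check a Phi2 delay over its whole range is to compute all four margins at the minimum and the maximum delay. The sketch below assumes only the clock to the LCD is delayed (no data bus buffer), uses the Ts=25ns and Rockwell 4MHz-part numbers from this post as defaults, and uses made-up function names.

```python
def margins(skew_ns, phi2_high_ns,
            t9=120, cpu_rd_setup=30,   # LCD read access, CPU read setup
            t8=2, cpu_rd_hold=10,      # LCD read hold, CPU read hold
            cpu_wr_delay=55, t3=55,    # CPU write data delay, LCD write setup
            cpu_wr_hold=30, t7=5):     # CPU write hold, LCD write hold
    """Timing margins in ns when the LCD's Phi2 lags the CPU's by skew_ns.
    A negative value means that requirement is violated."""
    return {
        # LCD data appears skew + t9 after the CPU's rising edge.
        "read_setup": phi2_high_ns - (skew_ns + t9) - cpu_rd_setup,
        # LCD releases data t8 after its own (late) falling edge.
        "read_hold": (skew_ns + t8) - cpu_rd_hold,
        # The LCD's falling edge comes skew later, giving write data more time.
        "write_setup": (phi2_high_ns + skew_ns) - cpu_wr_delay - t3,
        # The CPU drops write data relative to its own, earlier falling edge.
        "write_hold": cpu_wr_hold - skew_ns - t7,
    }

def works_over_range(min_skew_ns, max_skew_ns, phi2_high_ns):
    """Every margin is linear in the skew, so checking both ends is enough."""
    return all(m >= 0
               for skew in (min_skew_ns, max_skew_ns)
               for m in margins(skew, phi2_high_ns).values())

print(margins(0, 250))               # no delay: read hold is short by 8ns
print(works_over_range(10, 20, 250)) # a 10..20ns driver at 2MHz
```

With these particular numbers the usable window is bounded below by the read hold requirement and above by the write hold requirement, so a driver with a wide min-to-max spread can fail at either end.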

Hope this helps
André

[1] Note: I'm using the Rockwell data here as this is the manual I currently have on my desk. You will have to use the appropriate numbers for the 65816

fachat · Post by **fachat** » Fri Apr 06, 2007 8:00 pm

kc5tja wrote:

Modern 65816s are all the 14MHz part, so unless you can still purchase the 2MHz model of 65816, you will find that the hold time maxes out at 10ns.

However, look at what it's saying: the hold time required of the LCD is 2-55ns. The CPU would deliver 80ns (in reality, more like 10ns due to the more modern fabrication processes). The CPU, assuming you have a real 2MHz part, would meet the requirements of the LCD.

The LCD provides a hold of 2-55ns during a read - it does not require any hold time during a write. 2ns, however, are too fast for a 65C02 to latch the data.

Quote:

OK, I'll get off the soap-box now. ;-)

Nice musings, though :-)

André

GARTHWILSON · Post by **GARTHWILSON** » Fri Apr 06, 2007 8:08 pm

Quote:

2ns, however, are too fast for a 65C02 to latch the data.

Bus capacitance however will hold the data plenty long after the originator of the data goes high-Z; so it still works consistently when it doesn't seem like it would if you only look at data sheets.

fachat · Post by **fachat** » Fri Apr 06, 2007 8:11 pm

GARTHWILSON wrote:

Quote:

2ns, however, are too fast for a 65C02 to latch the data.

Bus capacitance however will hold the data plenty long after the originator of the data goes high-Z; so it still works consistently when it doesn't seem like it would if you only look at data sheets.

Even if loaded with 10 or more inputs?

Ok, data bus inputs should all be high impedance when not selected, but address lines for example?

André

GARTHWILSON · Post by **GARTHWILSON** » Fri Apr 06, 2007 8:12 pm

Sure. The inputs are part of the capacitance (maybe around 5pF each) that holds the data.
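As a back-of-the-envelope check of that claim, a floating bus behaves like a capacitor discharged by input leakage, so the time to droop by a given voltage is roughly C*dV/I. The 5pF per input is the figure from this thread; the 1uA leakage per input and the 1V allowable droop are assumptions picked for illustration, not datasheet values for any particular part.

```python
def bus_hold_time_ns(n_inputs, pf_per_input, leak_ua_per_input, droop_v):
    """Approximate time a floating bus line keeps its level, in ns.

    Models the line as n_inputs equal capacitors being discharged by
    n_inputs equal leakage currents: t = C * dV / I.
    """
    c_farads = n_inputs * pf_per_input * 1e-12
    i_amps = n_inputs * leak_ua_per_input * 1e-6
    return c_farads * droop_v / i_amps * 1e9

# Assumed values: 10 inputs, 5pF and 1uA each, 1V of allowable droop.
print(bus_hold_time_ns(10, 5, 1, 1))
```

Under these assumptions the line holds for microseconds, against a 10ns requirement. Note that in this simple model the number of inputs cancels out, since each input adds both capacitance and leakage; a device that actively drives the line would change the picture entirely.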

faybs · Post by **faybs** » Fri Apr 06, 2007 8:50 pm

Thanks, Andre. Ts, for the particular LCD module that crystalfontz sells, is 100ns - they run the LCD controller at 10MHz. It's mentioned somewhere near the beginning of the datasheet, but it's not easy to spot.

I'm leaning towards Garth and kc5tja's suggestion of using some sort of port or latch to interface the LCD though; as Garth mentioned, the LCD is going to be a fair bit slower than the CPU, I'm currently a lot more comfortable with software solutions to problems (that may change as I get more experienced with hardware), and finally the inability to use MVP/MVN doesn't really come into it because the LCD panel only exposes one address line - the main CPU doesn't have direct access to vram.

This is proving to be quite educational! I think I'll go back to lurking and let you guys continue with the discussion

8BIT · Post by **8BIT** » Fri Apr 06, 2007 8:56 pm

kc5tja wrote:

You mentioned programming PALs this way as well. How is this done? It's always bothered me that I could never find the low-level details needed to program one of these things.

I was not very clear on that. I have the EEPROM programmer that is connected to my SBC-2. I also bought a cheap universal programmer from eBay that can program certain 16V8's as well as most EPROMs. I can do the Atmel 16V8D, but not the Atmel 16V8C for instance. I also had a Lattice 16V8 that worked.

The only homebrew solution that I've found is the GALBLAST package. It includes a software driver (DOS based I think). There are some low-level docs on this website:
http://www.armory.com/~rstevew/Public/P ... kMe1st.htm

You can look over my EEPROM programmer hardware and code on my website. http://sbc.rictor.org/io/28256.html

Daryl

I got my programmer

GARTHWILSON · Post by **GARTHWILSON** » Fri Apr 06, 2007 9:27 pm

Andre, I see what you're saying about the address bus. I was thinking it was tri-stated for a short time between the end of T(AH) and the end of T(ADS). There have always been minor problems on the data sheet, and I assumed the time difference there was spent in tri-state. Looking at it on my almost-too-slow oscilloscope, it is not showing to be that way; but I do have a 65c802 on the board and don't seem to have a 65c02 setup handy to look at. According to the spec.s, the address can supposedly start changing after the 10ns T(AH); but I've found the spec.s to be overly conservative. In fact, the circuit they show for latching the high address byte on the '816 severely violates the timing diagrams they showed a few years ago if you operate at the higher speeds (12-22MHz), even though we know the processor did indeed work at these speeds. I just looked at the 816's timing diagram in the data sheet they have posted now, and it is really messed up!

Anyway, I've never had trouble with the hold time being inadequate, and I've made several home-made 6502-based computers and one that's been flying in hundreds of aircraft for the last 12 years without any problems. There was another commercial product I designed the computer for in the late 1980's that had more non-65xx stuff hanging on the bus, and that one worked fine too, although only a few beta versions were made for field testing, and a poor packaging design made it too expensive to manufacture, so the product never really made it to market.

faybs, the LCD's microcontroller's running at 10MHz does not mean the host processor's interface to it can go anywhere near that fast. Even if you could write directly to the microcontroller's memory or processor registers, its read and write cycle may be slower at 10MHz than the 6502's is at 2MHz.

faybs · Post by **faybs** » Fri Apr 06, 2007 9:57 pm

Quote:

faybs, the LCD's microcontroller's running at 10MHz does not mean the host processor's interface to it can go anywhere near that fast. Even if you could write directly to the microcontroller's memory or processor registers, its read and write cycle may be slower at 10MHz than the 6502's is at 2MHz.

Right, the reason I mentioned that is because a few of the values in the LCD controller's timing diagram are based off the clock period; for example, the minimum time E must be active is 5Ts, which for that particular LCD module (clocked at 10MHz) means 500ns.
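With Ts=100ns the Ts-dependent limits work out as follows. This is only a sketch: the 5Ts minimum for E is the figure from this post, the 4Ts+20 and 2Ts+5 formulas are the ones André quoted earlier in the thread, the 1:1 duty cycle is an assumption, and the function names are made up.

```python
TS_NS = 100.0  # LCD controller clocked at 10MHz

def e_min_high_ns(ts_ns):
    """Minimum time E (Phi2) must stay active: 5Ts."""
    return 5 * ts_ns

def max_phi2_mhz(min_high_ns, duty_high=0.5):
    """Fastest Phi2 that keeps the high phase at least min_high_ns long.
    duty_high is the fraction of the period Phi2 spends high (assumed 1:1)."""
    return duty_high * 1000.0 / min_high_ns

def read_access_ns(ts_ns):
    """t9 = 4Ts + 20ns, as quoted earlier in the thread."""
    return 4 * ts_ns + 20

def write_setup_ns(ts_ns):
    """t3 = 2Ts + 5ns, as quoted earlier in the thread."""
    return 2 * ts_ns + 5

print(e_min_high_ns(TS_NS), max_phi2_mhz(e_min_high_ns(TS_NS)))
print(read_access_ns(TS_NS), write_setup_ns(TS_NS))
```

So with this module the E pulse width, not the data setup, is what pins a directly connected CPU to about 1MHz; an asymmetric clock or a stretched cycle would change `duty_high` rather than the 500ns requirement itself.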

GARTHWILSON · Post by **GARTHWILSON** » Sat Apr 07, 2007 12:38 am

Daryl, WDC's website says they're still sampling the 65c51 version that they already know has one bug which they describe there. I suppose they're hoping people will find and tell them about any remaining ones before they fix the design and do a production run of wafers. I didn't get to it soon enough to talk to them this week on the phone.

So, maybe the board could be made with a choice of UARTs. The only two I've used are the 6551 and the MAX3100, the latter in a 14-pin DIP with SPI/Microwire interface that would need four VIA bits, bits which could also be used for other things at the same time as long as the UART's select line is not made true when the other 3 lines are toggled for something else. The 14-pin DIP outline could go inside the 28-pin DIP outline, so it doesn't take extra board space. Mike likes the 16450 or 16550 UARTs which have a different bus interface and come in 40-pin DIPs.

8BIT · Post by **8BIT** » Sat Apr 07, 2007 4:41 am

GARTHWILSON wrote:

Daryl, WDC's website says they're still sampling the 65c51 version that they already know has one bug which they describe there. I suppose they're hoping people will find and tell them about any remaining ones before they fix the design and do a production run of wafers. I didn't get to it soon enough to talk to them this week on the phone.

I emailed WDC back in May 2006. I asked about the 65C51 and was told of the bugs. I offered to test one or two of their samples. After describing my SBC's to them, they asked me if they could evaluate one of my boards. Somehow, my request for samples turned into their request for material from me. I offered to sell them one at my cost, but they did not respond further on the matter.

Seems the progress in bringing these to market is very slow. Good luck in trying to get a sample from them.

Daryl

faybs · Post by **faybs** » Sun Apr 08, 2007 7:51 am

One thing we could use is the DLP-USB245M. It's a small module that takes parallel 8 bit data and outputs it from a USB peripheral port; from Windows or Linux it just looks like a serial port. It's not too cheap ($25 from mouser.com) but it seems to meet a lot of the criteria people have been discussing. The datasheet is at http://www.dlpdesign.com/usb/dlp-usb245m13.pdf.

I was thinking that a neat way to make this SBC simple yet versatile would be to make it behave like a BBS - all communication with it is over a single serial port (RS232 or USB). User interfacing can be done using ANSI/VT100 escape sequences (if it's good enough for emacs it should be good enough for anything we may need), and binary transfers for e.g. loading and saving programs can be done using kermit or xmodem. The end user will be able to make relatively complex user interfaces and transfer data to and from his big box, all over one simple link. If we make the software a bit more complex we could even multiplex data packets, so that you can e.g. interact with the keyboard and have data scrolling on the screen while it's being transferred from the big box. What do you guys think?