All times are UTC




Posted: Sun Mar 16, 2014 4:20 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
cr1901 wrote:
BigEd, in another thread, Garth mentioned you do work on the Visual 6502 project... do you know how the 6502/816 is capable of delaying the assertion of bus signals? For example, in the '816, the Bank Address becomes valid just before the rising edge of PHI2. How does the '816 make sure that it loads Bank Address so far into the negative half of PHI2? Does it use a latch (where the output becomes valid without the presence of a clock edge signal) in addition to gate propagation delay, instead of a flip-flop or otherwise sequential memory element (where output only becomes valid on a clock transition)?

I don't know when Ed will be able to respond, so I'll jump in. (Being 9 hours ahead of me, he's probably sawing logs right now.) He and I and Jeff have been discussing timings, and Jeff is about to publish some super helpful GIFs on this very thing, which he has worked some repeating-motion-picture magic on. (Nobody said to keep it a secret, so I hope it's ok for me to say this.) [Edit: Done, at viewtopic.php?f=4&t=2909.] Getting the bank address out late in the low half of phase 2 is not the point. Rather, the maximum amount of time it can take to serve it up after phase 2 falls is long enough that, at the upper clock frequency limit, it lands late in that half of the cycle. At low clock speeds, it will be ready after only a small percentage of the phase-2-low time, proportionately very early in the cycle. Again though, in reality, the bank address seems to be served up much sooner than the guaranteed limit.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Sun Mar 16, 2014 4:26 am

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
Ignore everything after the italics- I screwed up... hold time is AFTER the clock transition for a flip-flop, setup time is BEFORE.

I was thinking setup time was "the maximum time allowed for the external hardware to get the value on the bus and stable", and hold time as "the time that the external hardware is required to keep the bus stable before clocking in data."

The thing is, Garth, even if the Bank Address is relatively early at lower clock speeds, the issue I'm having is that putting the Bank Address on the bus depends on the read/write data being sampled and stored properly beforehand. I am having trouble visualizing how the '816 can switch from "Read/write data state" to "switch bus direction state" to "output bank address state" without so much as a clock transition to switch between states. I can't easily visualize what combinational circuitry tells the '816: "Okay, read data has been stored and hold time has elapsed, time to output the bank address."


Posted: Sun Mar 16, 2014 9:40 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10943
Location: England
To address the first question - the timing of the Bank Address outputs - the reality is that the '816 drives those outputs as soon as it can after falling phi2. That is, they are not driven prior to the rising edge, but driven after the falling edge. It's just that (according to the worst-case figures in the datasheet), "as soon as it can" is not terribly quick, and the 30ns figure given is getting close to the 35ns minimum phi1 time. If you're not in worst case situations, and if you're running slower - say 12MHz - then the BA output might be at 20ns and the rising phi2 at 41ns or so.
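The arithmetic behind those figures can be sketched roughly as follows. This is only an illustration, assuming a 50% duty cycle clock; the function name is made up for the example, and the 30 ns default is the worst-case figure quoted above rather than a value checked against any particular datasheet revision.

```python
def phi2_low_margin_ns(clock_mhz, t_bas_ns=30.0, duty_high=0.5):
    """Return (phi2_low_time_ns, margin_ns) for a given clock frequency.

    Assumes PHI2 is low for (1 - duty_high) of each period and that the
    bank address appears t_bas_ns after the falling edge of PHI2.
    """
    period_ns = 1000.0 / clock_mhz
    low_ns = period_ns * (1.0 - duty_high)
    return low_ns, low_ns - t_bas_ns


# Compare a slow clock, the 12 MHz case from the post, and 14 MHz.
for mhz in (1, 12, 14):
    low, margin = phi2_low_margin_ns(mhz)
    print(f"{mhz:>2} MHz: PHI2 low for {low:6.1f} ns, "
          f"{margin:6.1f} ns left after worst-case bank address")
```

Passing `t_bas_ns=20.0` models the "not worst case" situation described above, where the bank address might already be out at 20 ns.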

For the second question, it may be that you momentarily confused the setup constraint on a flop's input with the clock-to-Q propagation time for its output. I know this is easily done because I've done it myself. For a flop to work reliably, the input needs to be static for a critical window which starts a little before and ends a little after the clock edge. Those times give rise to the setup constraint and the hold constraint. But because the clock seen by the flop might be a little delayed (or even advanced) compared to the clock which is the reference to the timings, it's possible to have a zero hold time on a datasheet.

So, we see a worst case 10ns hold time (constraint), and a 30ns worst case clock-to-Q time (property). As they are both worst case, I'm not sure we can even say that the bank address won't be out any earlier than 10ns.

Hope this helps
Ed


Posted: Sun Mar 16, 2014 9:13 pm

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
Your explanations are helping to relieve my confusion and starting to jar my memory from 3 to 4 years ago. We're getting there, so thanks for the help so far. Unfortunately, I still have questions.

Quote:
To address the first question - the timing of the Bank Address outputs - the reality is that the '816 drives those outputs as soon as it can after falling phi2. That is, they are not driven prior to the rising edge, but driven after the falling edge. It's just that (according to the worst-case figures in the datasheet), "as soon as it can" is not terribly quick, and the 30ns figure given is getting close to the 35ns minimum phi1 time. If you're not in worst case situations, and if you're running slower - say 12MHz - then the BA output might be at 20ns and the rising phi2 at 41ns or so.
Still, the processor itself controls the hold time for RWB and the address bus after the negative edge of PHI2 has elapsed... how does it "know" that 10ns has elapsed and it's "time to release the address bus, and time to tell the memory or I/O to release the data bus"? (Preliminary answer, according to these lecture slides: buffers. Before this point in my life I've never had to design a circuit where hold time was a concern or would need to change before the next clock edge. For example, a ripple counter, provided the clock speed is reasonable, won't have these problems.)


Quote:
It may be that you momentarily confused the setup constraint on a flop's input with the clock-to-Q propagation time for its output. I know this is easily done because I've done it myself.
Indeed... and glad to know I'm not the only one who has done so.


Quote:
and a 30ns worst case clock-to-Q time (property)
I'm sorry, where is this listed? Are you talking about tBAS, which says it's 33ns? Also, if Wikipedia is to be believed, setup time for a flip-flop is "the minimum amount of time that the bus should be stable on a flip-flop input before the value is sampled on the clock edge".

The timing diagram in the image above seems to use setup time as "The maximum amount of time it takes for the bus to stabilize" for tBAS and tADS, and then goes and uses setup time as "The minimum amount of time that the bus should be stable before the value is sampled on the next clock edge" for tDSR. However, the datasheet appears to be consistent in this regard, since tBAS and tADS are given maximums and tDSR is only given as a minimum. I presume that if you know one, you can calculate the other based on the half-period of PHI2. The tBAS and tADS maximums don't appear to describe clock-to-Q times... or am I mistaken?


Posted: Mon Mar 17, 2014 12:33 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
I doubt that it really clocks the address out and takes up to 33ns for it to appear at the pins, as if it were just a clock-to-Q time. I don't think it's a matter of it "knowing" when it's time to remove the old R/W and address and when it's time to put the new ones on; it seems that instead the clock edge starts a process of getting to the next set of states, and it takes up to 33ns for things to get fetched and make their way through all the logic gates inside the chip and reach the outside world. tADS and tAH are on the same line of the timing diagram representing R/W and A0-A15 too; so the transition for both, from valid state to valid state, is guaranteed to be confined to the time between tAH and tADS.

tBAS and tADS however are saying, "We might take up to this amount of time. We probably won't take that much; but just in case, please make your circuit so it doesn't require it sooner;" while the tDSR is saying, "Please give us at least this amount of time so we can be sure to latch in the right data. We probably won't need that much; but just in case, please give us at least that much."



Posted: Mon Mar 17, 2014 1:38 am

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
GARTHWILSON wrote:
I don't think it's a matter of it "knowing" when it's time to remove the old R/W and address and when it's time to put the new ones on; it seems that instead the clock edge starts a process of getting to the next set of states, and it takes up to 33ns for things to get fetched and make their way through all the logic gates inside the chip and reach the outside world.


The only part that's really worrying me about what you say here is that if the clock signal just 'starts' a process of combinational logic to get to the next state, it's theoretically possible that the data on the bus will not have been sampled by the internal flip-flop properly (where the value sampled != the value on the bus), by the time R/W is de-asserted and a new address is starting to be put on the bus. I think it's called "hold time violation," and while it appears that most '816 hardware can tolerate them (going back to what you say about "probably not needing 10ns to sample"), it seems that there's nothing to prevent incorrectly sampling the data bus just based on the current internal circuitry conditions (i.e. R/W could de-assert and change the data bus before the flip-flop properly sampled the data bus).


Posted: Mon Mar 17, 2014 2:55 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
Quote:
by the time R/W is de-asserted and a new address is starting to be put on the bus.

The things on the bus will need time to respond to R/W also, meaning they won't take the data away that quickly. If there's any time period that the bus is not driven, its capacitance will definitely hold the last state for a long time.[*] The only concern is what Jeff was getting at, at viewtopic.php?f=4&t=2438&p=24250#p24250, in the topic "Managing the 65816 multiplexed bus."

In contrast, the 6502 doesn't drive the bus at all during the phase-2-low period, and I have never had a 6502 breadboard or PC board creation fail to work on first try.

[*]When I was setting up the testing for my 4Mx8 SRAM module, I found, by accident, that bus capacitance was holding the data for a millisecond, or a million nanoseconds, in the absence of anything driving the bus, since everything sitting on it was CMOS input loads. It might have gone much longer, but I didn't try. It was just a curious passing observation.
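A back-of-envelope way to see why an undriven CMOS bus can hold its state that long is the relation t = C * dV / I for a capacitance discharged by a small leakage current. The sketch below uses purely assumed numbers (50 pF of bus capacitance, 100 nA of total leakage, 2 V of tolerable drift); none of them were measured on the board described above.

```python
def bus_hold_time_s(capacitance_f, leakage_a, allowed_drift_v):
    """Time for a floating node to drift by allowed_drift_v.

    Uses t = C * dV / I, treating the leakage current as constant.
    """
    return capacitance_f * allowed_drift_v / leakage_a


# Assumed, illustrative values only: 50 pF bus, 100 nA leakage, 2 V drift.
t = bus_hold_time_s(50e-12, 100e-9, 2.0)
print(f"floating bus holds its level for roughly {t * 1e3:.2f} ms")
```

With these assumed values the result is on the order of a millisecond; lower leakage or higher capacitance stretches it further.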



Posted: Mon Mar 17, 2014 5:21 am

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
Alright, I'm having a bit of trouble wrapping my head around this, but that's okay for now... we'll get there LOL.

And what about the 65816 data bus itself switching direction (since that's an internal operation depending on the CPU)? If the process of generating the correct inputs and outputs only occurs after a clock pulse is generated (in other words, purely combinatorial logic), then it's also possible for the bus to switch direction before the hold time is satisfied.

As I understand this, the hold time is for the CPU to finish properly sampling the value on the bus, not necessarily because it actually takes that long for the combinational logic for the current negative half-cycle of PHI2 to propagate to the output.

Of course, I could completely misunderstand the timing diagram. Is it possible the data hold time's minimum constraint of 10ns is due to the time it actually takes for the combinatorial logic for bus-switching to propagate to the transceiver? And that the flip-flop itself doesn't actually require a 10ns hold time? Of course in this case, 10ns would have to be the MAXIMUM hold time for the flip-flop, because after 10ns, there are no guarantees that the data bus will be an input anymore :P. If that is the case, I wonder what the true hold time for the internal flip-flop is.

And sure enough, the hold time/bank address times could be wrong- BDD correctly says that if these times were correct, reliable 20 MHz operation wouldn't be possible... but it is/was.


Posted: Mon Mar 17, 2014 12:21 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10943
Location: England
It's best to distinguish two types of numbers in datasheets and timing diagrams:
Properties are statements about what the chip will do.
Constraints are requirements that the outside circuit should meet.

In both cases, we'll see a range between best case and worst case, or we'll be given worst case.

When we say numbers in the datasheet might be wrong, we mean that the worst-case properties are worse than we've seen in practice: the chip is not so slow to take actions as WDC say it might be. We mean that worst-case constraints are tighter than we've seen in practice: our designs which violate those constraints nonetheless seem to work.
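One way to keep the two kinds of numbers straight is to write the check down explicitly: a property of the driving device has to satisfy a constraint of the receiving device. The sketch below is only an illustration of that bookkeeping; the class name, field names and the example figures are hypothetical and are not taken from any datasheet.

```python
from dataclasses import dataclass


@dataclass
class HoldCheck:
    # Property of the driver: minimum time it keeps data valid after the edge.
    driver_hold_min_ns: float
    # Constraint of the receiver: time it needs data held after the edge.
    receiver_hold_req_ns: float

    def margin_ns(self) -> float:
        """Positive margin means the property satisfies the constraint."""
        return self.driver_hold_min_ns - self.receiver_hold_req_ns

    def ok(self) -> bool:
        return self.margin_ns() >= 0


# Hypothetical figures: a driver that holds for 12 ns feeding a receiver
# that requires 10 ns, and a faster driver that only holds for 5 ns.
for check in (HoldCheck(12.0, 10.0), HoldCheck(5.0, 10.0)):
    print(check, "margin:", check.margin_ns(), "ns, ok:", check.ok())
```

A design that "violates the datasheet but works" is one where the worst-case numbers give a negative margin while the actual parts, at actual voltage and temperature, still leave a positive one.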

There's good reason for both of those: datasheets have to give the worst case for the worst combination of voltage, temperature and process variation. It's perfectly possible that the chip fabrication has never produced chips which give the worst variations of process tolerances. The manufacturer is within their rights to produce lot after lot of chips which are better than spec. As for temperature, we usually don't run everything as hot as is allowed. As for voltage, our supply rails are probably not right on the edge of spec.

Again, note that a micro such as the 816 or 6502 does not generally have any substantial combinatorial circuits intermediating between clocks and outputs. Outputs generally come more or less directly from latches or flops. Inputs arrive more or less directly at latches or flops. The delays we see are the delays of the pad drivers, the parasitic effects of chip leads and pin loading, and delays in the internal clock distribution. (If outputs are driven from combinatorial circuits, they might glitch. If they come from clocked elements, they will transition just once in a clock cycle - supposing there is no conflict from other external drivers on the same bus)

Also note that busses don't change direction as such: one driver switches off and another driver switches on, in some order. Nothing ever happens instantaneously. When a driver switches on, it takes time to start moving the voltage, and if all drivers are off it takes time for the bus to move on its own (TTL inputs float high, CMOS inputs don't.)

Hope this helps.
Ed

Edit: I should mention there's another effect with the timing of flops, at least with the circuits I'm familiar with. As the input comes later and later, and changes from meeting to violating the setup constraint, before the point at which the flop fails, the clock-to-Q propagation delay starts increasing. So the spec for setup time is chosen by the supplier to produce an acceptable clock-to-Q. You might find that you can violate setup, and your circuit is quite happy with the resultant push-out of the timing of Q. Not that this is best practice, but my point is that these are interdependent, not independent, numbers in the spec.


Posted: Mon Mar 17, 2014 1:54 pm

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
Regarding the pin/parasitic capacitance, etc... I think I'm simply going to defer to the CPLD manufacturer's timing diagrams for now. As interested as I am in how to model circuits incorporating those measurements, I feel my energy is for now best spent elsewhere.

Now, onto the gist of why I'm asking these questions... BDD mentioned a point that it might be possible to do mem-to-mem xfer in a single 6502 bus cycle... I personally want to see if that is possible at 14MHz (maybe even up to 20MHz in the future)... so that the theoretical xfer rate is the clock speed. This means that I have to emulate certain portions of the 65816 bus cycle, including the hold times on the data input/output. However, I think for the time being, I'm just going to make this simple on myself and focus on just getting the thing to work.

One of my goals was to make the controller fit into a 40-pin DIP/44-pin PLCC. Yes, I know this will never be available in DIP form- the point was to have a reasonable pinout that is contemporary to the time period from which the 6502/816 originated. However, it would make timing easier to not multiplex signals together, especially when implementing one-cycle mem-to-mem. And besides- 68k was a 64-pin DIP, and the NEC Ethernet controller was > 40 pins. So I guess it's not a huge deal to use a 68-pin PLCC.

For I/O devices it's easier; as long as each device has a dedicated I/O channel, I/O and memory access happen concurrently, so one xfer per bus cycle is completely feasible. Ditto with block-fill. Only mem-to-mem would take 2 bus cycles, and even then, one could justify using mem-to-mem in place of MVP/MVN.
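To put rough numbers on that, here is a small sketch of the theoretical transfer rates, assuming one byte moves per transfer and taking the cycle counts from the paragraph above (one bus cycle for I/O and block fill, two for mem-to-mem). The function name is made up for the example, and real throughput would be lower once setup and bus arbitration are counted.

```python
def dma_rate_bytes_per_s(clock_hz, cycles_per_byte):
    """Theoretical peak rate: one byte every cycles_per_byte bus cycles."""
    return clock_hz / cycles_per_byte


CLOCK_HZ = 14_000_000  # the 14 MHz target mentioned above

for name, cycles in (("I/O <-> memory", 1), ("block fill", 1), ("mem-to-mem", 2)):
    rate = dma_rate_bytes_per_s(CLOCK_HZ, cycles)
    print(f"{name:<15} {cycles} cycle(s)/byte -> {rate / 1e6:.1f} million bytes/s")
```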


My final question regarding "hold times", however... how do I make sure that my circuit obeys the hold times required for the I/O devices and memory circuitry? Do I just rely on the DMA input/output signals taking a finite amount of time to respond to the next clock edge and not worry about timing violations (i.e. things happening too fast, such as R/W being de-asserted before the hold time for the destination is satisfied)?


Posted: Mon Mar 17, 2014 2:13 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10943
Location: England
At my level of competence, yes, you just assume that the writing device leaves its outputs in place long enough for the reading device to be happy about the data not changing too soon. And you assume that no other device will put new data on the bus so soon that it disturbs the reading device. If you arranged that all devices can drive the bus only during phi2, you're safe. But that is by no means the only way to tackle this - nor is it the only safe way. (We often see statements made about the role of phi2 which are not accurate. Phi2 is one possible demarcation between the early part and the late part of the cycle.)

I can see how a device-to-memory DMA can transfer data in a single cycle, similarly vice-versa. For memory-to-memory, you must need at least two memory banks, one to read and one to write, or you need fast enough memory to perform a read access and then a write access within a single CPU cycle. You may or may not need to supply independent addresses to the two memory banks. Or is there some simplification I've missed?

Cheers
Ed


Posted: Mon Mar 17, 2014 3:40 pm

Joined: Wed Feb 05, 2014 7:02 pm
Posts: 158
Quote:
For memory-to-memory, you must need at least two memory banks, one to read and one to write, or you need fast enough memory to perform a read access and then a write access within a single CPU cycle. You may or may not need to supply independent addresses to the two memory banks. Or is there some simplification I've missed?

No, that's more-or-less the gist of it. I will need to supply two independent addresses to the bus... if the 8237 is any indication, a temporary holding register is required. So I need to know whether a memory-to-memory xfer in one clock cycle, both read and write, can be done within '816 timing and memory timing specs.

One other caveat: Like the '816, the bank registers in this DMA controller should be able to temporarily increment. However, any attempt to overlap source with destination (which should be detected using a simple 24-bit subtraction internally) will cause the controller to refuse to perform the xfer.
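As a software model of that overlap check, something like the following might do. It is a sketch only, assuming both regions have the same length and that addresses wrap at 24 bits; the names are illustrative, and a CPLD implementation would of course be a subtractor and comparator rather than Python.

```python
ADDR_MASK = 0xFFFFFF  # 24-bit address space


def regions_overlap(src, dst, length):
    """Return True if [src, src+length) and [dst, dst+length) share an address.

    Uses 24-bit modular subtraction, so regions that wrap past $FFFFFF
    are handled the same way as ones that do not.
    """
    if length <= 0:
        return False
    forward = (dst - src) & ADDR_MASK   # distance from src up to dst
    backward = (src - dst) & ADDR_MASK  # distance from dst up to src
    return forward < length or backward < length


# Adjacent 256-byte blocks do not overlap; shifting one by a byte does.
print(regions_overlap(0x010000, 0x010100, 0x100))
print(regions_overlap(0x010000, 0x0100FF, 0x100))
```

The controller would refuse the transfer whenever this returns true.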


Posted: Mon Mar 17, 2014 9:58 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8411
Location: Midwestern USA
BigEd wrote:
At my level of competence, yes, you just assume that the writing device leaves its outputs in place long enough for the reading device to be happy about the data not changing too soon. And you assume that no other device will put new data on the bus so soon that it disturbs the reading device. If you arranged that all devices can drive the bus only during phi2, you're safe. But that is by no means the only way to tackle this - nor is it the only safe way. (We often see statements made about the role of phi2 which are not accurate. Phi2 is one possible demarcation between the early part and the late part of the cycle.)

Plus, one must consider the pin-to-pin propagation time of the CPLD. Registered logic that is clocked by Ø2 will not affect outputs immediately upon a clock phase change. So some signal persistence following a clock phase change would "naturally" occur. Of course, the advertised prop time is a maximum, but it's doubtful that the actual prop time will be much faster than the spec.

Quote:
I can see how a device-to-memory DMA can transfer data in a single cycle, similarly vice-versa. For memory-to-memory, you must need at least two memory banks, one to read and one to write, or you need fast enough memory to perform a read access and then a write access within a single CPU cycle. You may or may not need to supply independent addresses to the two memory banks. Or is there some simplification I've missed?

I think your analysis is correct. I had ruminated "out loud" about the possibility of doing a single cycle transfer, but that was based on the notion that the source and destination banks were the same. In practice, I don't think that it is possible, since it's not practical to make a complete address bus change on each half-cycle of Ø2.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Mar 17, 2014 10:02 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8411
Location: Midwestern USA
cr1901 wrote:
One other caveat: Like the '816, the bank registers in this DMA controller should be able to temporarily increment. However, any attempt to overlap source with destination (which should be detected using a simple 24-bit subtraction internally) will cause the controller to refuse to perform the xfer.

MVN and MVP cannot increment banks. Any such copy is limited to the address space of the specified banks. If the copy is such that the bank bits (bits 16-23) would have to be incremented, the address in one or both of the index registers will wrap and copying will carry on within the same bank(s). My recommendation is to not increment banks, as that is contrary to how the MPU would operate if doing the copy via MVN/MVP.
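The wrap behaviour can be modelled in a few lines. This is a simplified sketch of an incrementing copy confined to one bank, as described above; the function name is made up, and it models only the address sequence, not the instruction's register or cycle details.

```python
def in_bank_addresses(bank, start_offset, count):
    """Yield the 24-bit addresses touched by an incrementing in-bank copy.

    The 16-bit offset wraps from $FFFF to $0000; the bank byte never changes.
    """
    for i in range(count):
        offset = (start_offset + i) & 0xFFFF
        yield ((bank & 0xFF) << 16) | offset


# Four bytes starting two bytes below the top of bank $01.
for addr in in_bank_addresses(0x01, 0xFFFE, 4):
    print(f"${addr:06X}")
```

The sequence stays in bank $01 and wraps to offset $0000 instead of moving on to $020000, which is the behaviour a DMA controller would have to mimic to match the MPU.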



Posted: Mon Mar 17, 2014 10:16 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
BigDumbDinosaur wrote:
I had ruminated "out loud" about the possibility of doing a single cycle transfer, but that was based on the notion that the source and destination banks were the same. In practice, I don't think that it is possible, since it's not practical to make a complete address bus change on each half-cycle of Ø2.

I had imagined (perhaps incorrectly) that cr1901 was talking about disabling the 573's outputs and having the DMAC put out the whole 24 bits of address with no multiplexing. That way the DMAC would not be limited by bank boundaries.


