
All times are UTC




Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 32 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
PostPosted: Fri Feb 04, 2000 4:50 pm 
Well, it's reassuring that the DIP40 part is not totally gone. The data sheet has a place for it, but the entry has been left blank, suggesting it's not a valid orderable part.

I might want to hotrod one of my old multi-board systems some day. Right now they work at 4 MHz with a speed switch to 1 MHz for the slow sections of memory and I/O.

I'm curious . . . what do you do with the 6522's aside from parallel I/O? Have you considered doing the serial stuff in firmware rather than hardware? That might save some power, not to mention size and cost. I guess you don't need to worry about size and cost so long as you use the VIC. What's more, so long as you use the VIC, you can't speed the circuit up, can you?

Uli

P.S. - I had the odd recollection that the VIC used the 6510 rather than a 6502. Isn't that the case? I guess I'll have to check my schematics. Uli


PostPosted: Sat Feb 05, 2000 12:06 pm 
>Then I replaced the two 6522's with 65C22's, and it also worked fine, and saved even more power (the 65C22's are not WDC, they are another obscure brand, but I found them on the web).

I remember reading somewhere in the forum discussions that 65C22s do not have an open-collector IRQ line! I'd like to confirm that and sort out the issue.

Where can I find the complete datasheets for the 65C22/65SC22?


PostPosted: Sat Feb 05, 2000 11:43 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
>Right now they work at 4 MHz with a speed switch to 1 MHz for the slow sections of memory and I/O.

I would like to see the schematic for your speed switch. I recently designed one on paper, but I've not yet tried it on a breadboard.

>I'm curious . . . what do you do with the 6522's aside from parallel I/O?

On the VIC, they still have their original function of keyboard scanning. Some of the I/O pins are used for robot-related functions (for example, the two CB2's are used as PWM outputs for motor speed control).

>Have you considered doing the serial stuff in firmware rather than hardware?

I'm not doing anything serial with the VIC board.

>...so long as you use the VIC, you can't speed the circuit up, can you?

No, and it doesn't matter. There's nothing going on that would work any better at a higher speed.

>P.S. - I had the odd recollection that the VIC used the 6510 rather than a 6502. Isn't that the case?

That would be the C-64. The VIC uses a 6502A. It only clocks it at 1 MHz, but I think they used an "A" because the 6560 "VIC chip" needs to steal the bus during phase 2.

Pete


PostPosted: Sat Feb 05, 2000 11:46 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
Yes, a few weeks back I posted that note in this forum after I noticed in the WDC data sheet that there was no OC output on the IRQ. WDC says it was for a "speed" issue, but I think they just screwed up...

Pete


PostPosted: Sun Feb 06, 2000 4:15 am 
Well, I'd think they'd let the VIC chip steal the bus during Phase-1. That's what the Apple-][ did with its video memory, because the sequential refresh of the display would then also refresh the DRAMs. That also is why the video memory was segmented the way it was. By the way ... doesn't VIC stand for video interface controller? What does this do in a robot?

The speed switch I am most familiar with (the one I remember the best) used two unused gates on the CPU board to make a multiplexer. The first version, from when I only had 32 of the 2147's I needed to make one of my 16K memory boards fast, selected the 4 MHz tap from the countdown chain, while a 1 on either of the two top addresses made the mux select the 1 MHz tap. Since addresses settle well into phase-1, the system doesn't glitch. This was really slick, since the code sections of the assembler or the editor and, in fact, the XPL0 compiler and i^2L interpreter lived in that bottom 16K, so the machine really seemed to run at 4 MHz most of the time even though only 1/4 of the memory map was fast memory. Now, of course, I can put a single 32-pin 15 ns chip in the system and let two processors share it with 75% of the memory bandwidth to spare. That memory sharing is an interesting notion.
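
As a rough illustration of the address-decoded speed switch described above, the selection logic can be modelled in a few lines of Python. This is only a sketch: the function names are made up for the example, and it assumes a 16-bit address bus where A15 and A14 are the "two top addresses". The real circuit was two spare gates wired as a multiplexer, not software.

```python
# Sketch of the speed-switch decision: bottom 16K fast, everything else slow.
# Assumption: 16-bit address bus, A15 and A14 are the "two top addresses".

FAST_MHZ = 4
SLOW_MHZ = 1


def clock_tap_mhz(address):
    """Return the clock tap (in MHz) the mux would select for this address."""
    a15 = (address >> 15) & 1
    a14 = (address >> 14) & 1
    # A 1 on either top address line selects the slow tap.
    return SLOW_MHZ if (a15 or a14) else FAST_MHZ


def fast_fraction():
    """Fraction of the 64K address map that runs from the fast tap."""
    fast = sum(1 for a in range(0x10000) if clock_tap_mhz(a) == FAST_MHZ)
    return fast / 0x10000
```

With these assumptions only addresses $0000-$3FFF get the 4 MHz tap, which matches the "only 1/4 of the memory map was fast" remark.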

I once used five processors to drive a video card. Four of them were used for drawing a 256x192 quadrant of the image, while the remaining one (a bigger but not as fast device) performed the computations needed to partition the drawing of each line. There were four separate 2K ram chips fast enough to allow alternate phase access. This was not as simple as it seems because of the data hold time requirements of most RAM devices. When you share the memory you need the processors to be running with a nominally 60/40 duty cycle, high/low. You must create two complementary 60/40 clocks from the same reference in order to have the 40% (still barely in spec for the 6502) cycle as the high phase of the clock used as phase-0. That leaves time for data hold on each processor's clock cycle. The way to do this is to use a 5x clock, a biquinary counter like the 74196 or 7490 to divide the oscillator by 5, which produces a 60/40 duty cycle on the divide-by-5 output. One precise way to generate the complement of that clock is to run it through a 4-bit SIPO shift register (7495 or 74164) using the 4th tap and a 10x clock to shift the phase. There are probably cleaner ways by now, but this scheme indicates what has to be done.
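
To make the 60/40 arithmetic concrete, here is a small model of a divide-by-5 output. It is a sketch under stated assumptions: the counter is taken to step 0,1,2,3,4 and the clock is taken from the bit of weight 2, which is high in two of the five states. Which pin of a 7490 or 74196 carries that waveform should be checked against the data sheet; the function names are invented for the example.

```python
# Model of a divide-by-5 counter output, one sample per oscillator period.
# Assumption: the counter steps 0,1,2,3,4 and the clock is taken from the
# bit of weight 2, which is high in states 2 and 3 only.


def divide_by_5_bit(periods):
    """Return the weight-2 bit of a modulo-5 counter for `periods` clocks."""
    return [((n % 5) >> 1) & 1 for n in range(periods)]


def duty_cycle(wave):
    """Fraction of samples that are high."""
    return sum(wave) / len(wave)


def phase_times_ns(osc_hz):
    """Return (short phase, long phase) in ns for a clock of osc_hz / 5."""
    period_ns = 1e9 / osc_hz
    return 2 * period_ns, 3 * period_ns
```

For example, a 5 MHz oscillator gives a 1 MHz clock whose phases are 400 ns and 600 ns; inverting the waveform swaps which phase is the high one.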

Couldn't you use any parallel port bit to time your PWM? Imagine you had a processor running at 10 MHz instead of 1, wouldn't that allow you to do more, control the PWM more closely, and perform more tasks aside from that? Now I realize that you've found this to be adequate, but if you had no VIA's wouldn't the thing work as well from HCMOS parallel port bits contained in a '574? By virtue of the symmetrical outputs of the HC574, you'd already get closer control of the pulse width. The 6522 is latched rather than clocked, so its data propagates for an unknown time before being latched. What's more, you can't operate the 6522 very fast.

I'm not saying it's not possible to do the job well WITH a 6522, but I'd say you need to consider that, given more time relative to the task at hand by virtue of the faster processor speed, you can do more and gain resolution on every task you have.

Uli


PostPosted: Sun Feb 06, 2000 4:50 am 
You can use a 4 MHz oscillator and divide it down to get 1 MHz.

Toshi


PostPosted: Sun Feb 06, 2000 9:03 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
>By the way ... doesn't VIC stand for video interface controller? What does this do in a robot?

It makes NTSC video, just like in a VIC-20. I hook it up to a monitor so that I see the BASIC code. It's very convenient to do the high-level functions with the built-in BASIC, and do low-level or time-critical stuff in native code. While I'm working on BASIC code, I have the keyboard plugged in - when the robot is ready to 'run', the keyboard and monitor won't be there.

>The speed switch I am most familiar with (the one I remember the best) used two unused gates on the CPU board to make a multiplexer.

It seems like with that scheme, you would have to be careful about when it switched from 'fast' to 'slow' -- you can't jump into the middle of a slow cycle arbitrarily. What I designed (untested) is something that *always* does one complete slow cycle before going back to a fast cycle. The slow/fast decision is made during phase 1 of each 'fast' cycle - if the CS is going to something slow, then the current cycle becomes slow for one complete cycle.
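
The per-cycle decision described above can be sketched as a simple timing model. The 4:1 ratio is an assumption borrowed from the 4 MHz / 1 MHz figures earlier in the thread, and the function names are invented for the example; it says nothing about the actual gates involved.

```python
# Sketch of per-cycle clock stretching: every access to a slow device takes
# one complete slow cycle; everything else takes one fast cycle.
# Assumption: a slow cycle lasts 4 fast periods (4 MHz fast, 1 MHz slow).

SLOW_RATIO = 4


def elapsed_fast_periods(accesses):
    """accesses: iterable of booleans, True when the chip select is slow."""
    total = 0
    for is_slow in accesses:
        # The decision is made once, at the start of the cycle, and the
        # cycle then runs to completion at the chosen speed.
        total += SLOW_RATIO if is_slow else 1
    return total


def effective_mhz(accesses, fast_mhz=4.0):
    """Average cycle rate over a sequence of accesses."""
    accesses = list(accesses)
    return fast_mhz * len(accesses) / elapsed_fast_periods(accesses)
```

Because each cycle is either wholly fast or wholly slow in this model, there is never a switch in the middle of a slow cycle.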

>Couldn't you use any parallel port bit to time your PWM?

The 6522 timer does exactly what I need in hardware, with no software babysitting required. I can control both the frequency and duty cycle within a certain range, and it stays that way until I change it. To turn off the motor, I set a "0% duty cycle".

There's no need for any better resolution - there are 17 possible duty cycles, and a whole bunch of possible freqs.

Pete


PostPosted: Sun Feb 27, 2000 2:00 pm 
I recently considered the ready availability of 64k x 8 srams and 64K x 8 EPROMS on old '486 boards taking up space in the closet. These parts are typically of 15 or 20 ns speed and could therefore be shared between two maximum-speed processors, both running at full speed. I considered this (1) because it offered some timing challenges and (2) because it opens the door for using one processor as a dedicated I/O processor while the other focuses on the assigned task. It occurs to me also, that the same strategy could be applied using more processors, allowing one to provide video timing, etc, while another does debug functions, and a third might be dedicated to synchronized I/O.

Most of my CPU circuits have been driven from a high speed oscillator divided down to whatever the application required. Now, with really fast srams available at low cost, it occurs to me that the phase-0 clock can be quite asymmetrical, therefore allowing a clock circuit to clock multiple processors, arbitrating the memory access window by generating a long low phase and a very short high phase, during which it accesses shared resources.

Consider for a moment the function of a 74HC4017 "Johnson Counter" circuit. If you clock this baby at 20 MHz, you get ten mutually exclusive positive-going outputs of 50 ns duration. That's essentially a "one-hot-one" circuit. During each positive-going strobe, you could clock a processor, thereby providing a phase-0 clock to it to the exclusion of all the others. That allows you to put ten processors to work in a common environment without "stepping" on one another.
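
A quick simulation makes the mutually exclusive strobes easier to picture. This is a behavioural sketch only, not a model of the 74HC4017's internals; the function names are invented, and the `length` parameter is an assumption added so the same code covers fewer than ten processors.

```python
# One-hot strobe generator in the style of a decade counter with decoded
# outputs. `length` is the number of outputs in the repeating sequence.


def strobes(clock_count, length=10):
    """Yield one tuple of `length` output levels per input clock."""
    for n in range(clock_count):
        active = n % length
        yield tuple(1 if i == active else 0 for i in range(length))


def strobe_timing(clock_hz, length=10):
    """Return (strobe width in ns, per-processor cycle rate in Hz)."""
    width_ns = 1_000_000_000 / clock_hz
    return width_ns, clock_hz / length
```

With a 20 MHz input and ten outputs this gives 50 ns strobes, and each processor sees one strobe per ten input clocks, i.e. a 2 MHz cycle rate.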

These days, processors are cheap, and labor is costly. In an environment such as this one, you can separate tasks in a way requiring minimal effort to avoid collisions. It does require some effort to allocate stack space, but an adder across the high 8 address bits could fix that readily enough. Interprocess communication could be through a page of registers allocated for semaphoring. You are not limited to using ten processors with such a scheme, because you simply take an unused output and use it to reset the counter asynchronously, thereby shortening the sequence without altering the clock's high phase as seen by each processor.
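
The "adder across the high 8 address bits" idea can be sketched as follows. The per-CPU offsets in `PAGE_OFFSETS` are purely hypothetical values chosen for illustration, as is the function name. The point is only that the 6502's fixed stack page ($0100-$01FF) lands in a different physical page for each processor.

```python
# Sketch of relocating each processor's pages by adding a per-CPU offset
# to the high address byte. PAGE_OFFSETS is a hypothetical assignment.

PAGE_OFFSETS = {0: 0x00, 1: 0x10, 2: 0x20}


def physical_address(cpu_id, address):
    """Map a CPU's 16-bit address into the shared memory."""
    high = (address >> 8) & 0xFF
    low = address & 0xFF
    # 8-bit adder across the high byte; the carry out is discarded.
    high = (high + PAGE_OFFSETS[cpu_id]) & 0xFF
    return (high << 8) | low
```

Note that a plain adder like this relocates every page, not just the stack, so a page meant to be shared for semaphoring would need to bypass it or be reached at a different address by each processor.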

I've considered this for such applications as generating video display timing with one processor while manipulating the display memory contents with another, and determining the nature of these contents with yet another.

How does that look to you?

Uli


PostPosted: Sun Feb 27, 2000 11:01 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
>I've considered this for such applications as generating video display timing with one processor while manipulating the display memory contents with another, and determining the nature of these contents with yet another.

That's basically what happens in a VIC-20: the 6560 is really a dedicated CPU that knows how to generate video timings. On boot-up, the main CPU tells it the address of the video RAM. The 6502 has the same RAM in its address space, but accesses it 'out of phase' with the 6560.

Your Johnson counter idea is intriguing, but there's a point of diminishing returns, where the hardware is just too complicated to be practical, relative to what it can actually do.

Pete


PostPosted: Sun Mar 05, 2000 4:09 pm 
Well, there's nothing complicated about it. To clock the various processors sharing a single RAM, e.g. a 64Kx8 sram as you commonly find sitting in the closet as cache on an old '486 board, or any other devices fast enough to handle the resulting cycle rate, you feed each processor a different output from the 74HC4017 as Phase-0. As a result, you'll get a low phase, corresponding to the Phase-1 part of the clock cycle, that's quite long, and you get a high phase that's half the Johnson counter's input clock width. All the outputs on a '4017 are mutually exclusive, with no overlap at all, hence, any signals gated with the resulting clock should work fine as an arbiter between the processors.

There's nothing complicated about that is there? You still generate a set of read and write strobes in the usual way, and, in fact, you can play some interesting tricks with the output from the 'HC4017 and the bus strobe logic, so you combine the various clocks and processor signals by a wired-or scheme so you automatically generate a single set of strobes to the various shared resources. Naturally, the addresses will have to be muxed in some way and the various processors will all have to have their data busses buffered so the shared and exclusive resources don't step on one another.

Uli


PostPosted: Sun Mar 05, 2000 10:58 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
[snip]
>There's nothing complicated about that is there?

[snip]
>Naturally, the addresses will have to be muxed in some way and the various processors will all have to have their data busses buffered so the shared and exclusive resources don't step on one another.

I rest my case! ;-)
It's all very do-able, but it's a lot of chips and wires!

Pete


PostPosted: Sun Mar 05, 2000 11:04 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
Question: What goes on in a multi-processor PC box?
Where does the hardware between the CPU's overlap?
How do they share the load?

Pete


PostPosted: Sun Mar 05, 2000 11:18 pm 
If you mean that it's a lot of hassle wiring up the data and address buses, I doubt you'll get away from that. I agree entirely that a sharing scheme requires more work than one with a single processor, but it requires less work than two systems with a single processor each, and if the code is intelligently written, though there's less than twice the work, there's more than twice the processing power.

Actually, I've been fooling around with a way to make it one chip and very few wires. Maybe 35 or 50 if the RAM is external.

The 650x core is so thrifty of silicon that one can conceive of an implementation that lives in a CPLD. RIGHT! a CPLD, not an FPGA. The 650x core consists of a register set and an ALU, an instruction decoder, a control sequencer (state machine) and a data path multiplexer. The data paths can be common tristate buses, however, so there's not so much silicon required to make them. A lot of gates can be saved by virtue of the fact the ALU is used to operate on the addresses as well as the data, thereby eliminating the need for all those gates used in a synchronous counter. The ALU really is just an adder/subtractor and the shift/rotate functions can be achieved with a multiplexer. Multiplexers are easily built in CPLD's as they're just a sum-of-products construct. Consequently they require only one product term per register, multiplied by the number of bits, namely 8 sum terms.
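
The sum-of-products multiplexer mentioned above can be written out bit by bit. This is a sketch for illustration, not CPLD source code: the function name and the one-hot `selects` list are assumptions. Each output bit is the OR of one AND term per source register.

```python
# Sum-of-products multiplexer: each output bit is the OR of
# (select_i AND register_i bit), one product term per source register.


def sop_mux(selects, registers, width=8):
    """selects: one-hot list of 0/1 values; registers: list of ints."""
    out = 0
    for bit in range(width):
        # One product term per register for this output bit.
        terms = [sel & ((reg >> bit) & 1) for sel, reg in zip(selects, registers)]
        # The sum (OR) of the product terms drives the output bit.
        out |= (1 if any(terms) else 0) << bit
    return out
```

For an 8-bit data path with three source registers, that is three product terms feeding each of eight OR terms.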

The latter device type is register rich but MUX poor and the lower-cost devices don't have much in the way of tristate resources for you to use as part of a multiplexing scheme for data path requirements. However, the XILINX 9500 CPLDs are pretty cheap and readily available via DIGIKEY, among others. Cypress has some pretty low-cost devices too. Maybe it would be worth rolling our own 650x+peripherals on a single chip. The resources with which to do this are not expensive and the devices are almost all in-system programmable, hence can be reprogrammed with a simple cable from the parallel port of your PC.

Uli.


PostPosted: Sun Mar 05, 2000 11:45 pm 
The simplest way to look at it is to consider what happens when you have a memory shared between a video refresh circuit and a processor. That happens in almost all the classic microcomputers of the '80's, including the IBM PC. The VIC, for one example, had a circuit, not unlike that used in many other systems, that generated refresh addresses to the common memory block at such a rate that the refresh could take place, but on the opposite clock phase from that on which the processor accessed memory. One processor, one refresh circuit, no problem . . . so long as the access windows are confined to opposite clock phases. If you have a number of processors sharing a common memory block, or anything else, the clock phases can be used to arbitrate whose turn it is to hit the common resource. If you have two processors with a 1 MHz clock, they need simply be operated from opposite phases of the oscillator. Look upon this as simply dividing the master 2 MHz clock by two and driving the processors with the output of the flipflop that divides it. One processor gets the Q output as its Phase-0 while the other gets /Q as its Phase-0. Since the internal delays are more or less equal, the phases won't overlap. There's a way to make certain there's no contention, but we're not there yet.
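
The divide-by-two arrangement can be modelled in a few lines. It is an idealised sketch: propagation delays are ignored, so it only shows the logical relationship between Q and /Q, and the function name is invented for the example.

```python
# Idealised toggle flip-flop: Q and /Q as the two processors' Phase-0 clocks.
# Propagation delays are ignored in this model.


def two_phase(master_edges):
    """Return a list of (Q, notQ) pairs, one per rising master-clock edge."""
    q = 0
    out = []
    for _ in range(master_edges):
        q ^= 1  # toggle flip-flop: divides the master clock by two
        out.append((q, q ^ 1))
    return out
```

In the model the two outputs are never high together, which is what lets each processor own the shared resource during its own high phase.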

Each of the two processors is operating at its full bandwidth without waits all the time, and so long as the access time of the shared resource, memory perhaps, is sufficiently shorter than the Phase-2 window, everything will be fine. The two processors are in no way aware of one another. Of course, if they have to communicate, they can semaphore one another using the shared memory.
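
One simple way the semaphoring could look is a one-byte mailbox guarded by a flag. This is a sketch under assumptions, not something taken from the thread: `FLAG` and `DATA` are hypothetical locations in the shared block, and it relies on there being exactly one producer and one consumer. Since each processor's access sits in its own clock phase, individual byte reads and writes never collide, and because only the producer sets the flag and only the consumer clears it, no atomic read-modify-write is needed.

```python
# Single-producer, single-consumer mailbox in shared memory.
# FLAG and DATA are hypothetical addresses in the shared block.

FLAG = 0x00
DATA = 0x01


def send(shared, value):
    """Producer side: returns False if the mailbox is still full."""
    if shared[FLAG]:
        return False
    shared[DATA] = value & 0xFF   # write the payload first
    shared[FLAG] = 1              # then raise the flag
    return True


def receive(shared):
    """Consumer side: returns the byte, or None if the mailbox is empty."""
    if not shared[FLAG]:
        return None
    value = shared[DATA]
    shared[FLAG] = 0              # hand the mailbox back to the producer
    return value
```

A short usage example: with `shared = bytearray(2)`, a `send(shared, 0x42)` followed by `receive(shared)` hands the byte across, and a second `send` before the `receive` is refused rather than overwriting the data.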

If they are to become aware of one another, it becomes a committee based operation, and efficiency goes down. It's like a corporate engineering team. If you are given a job to do and do it by yourself, it's easy ... you just "do" it. However, if it's a big job and has to get done in a limited time, your boss may decide that several of you must work together. This means the tasks can be divided, but, alas, there have to be those occasional team meetings, so you can make sure you're all on the same page. Sometimes you spend more time coordinating with the others than doing your job.

It's like that with multiprocessor systems as well. If the tasks are to be shared by several processors, they have to be in communication with one another to some extent just to monitor progress.

Perhaps you can understand that I have been in the corporate situation I mentioned.

If your processors and the shared resources are fast enough to allow multiple processors to take a swipe at memory every cycle, you can use an approach like what I described earlier, using a "Johnson counter", e.g. MC74HC4017, to generate short active (phase 0) phases and long idle phases to various processors during each cycle. They then operate more or less transparently to one another. It's like having email. Instead of going to a meeting, the data you need just magically appears in your mailbox and you can process it as you need to.

If you have overlapping access, it can be done in the same way as the two-processor tactic, with the exception that you have to coordinate whose turn it is to jump in. There has to be significant prioritization of access to the shared resource and things become a lot more burdensome. That's just like the big committee vs the small team.

Give it some thought, it really does work like that.

Uli


PostPosted: Mon Mar 06, 2000 9:34 pm 
Offline

Joined: Fri Aug 30, 2002 3:06 pm
Posts: 124
Location: Colorado
Yes, all that is understood.

What I asked is: How *specifically* do multi-CPUs work in a PC?

Do they share RAM on different clock timings?
Do they share only certain parts of RAM, or all of it?
Is there some multi-port RAM involved?
Which CPU talks to peripherals, and who decides?

I'm asking because there might be something to be learned from the folks that have already made 'real-world' systems with multi CPUs.

Pete

