6502.org • View topic - 100MHz TTL 6502: Here we go!

All times are UTC

100MHz TTL 6502: Here we go!

Windfall

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Thu Oct 22, 2020 2:39 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands

ttlworks wrote:

Die Devices sells bare chip dies

Do they offer a free, teeny tiny soldering iron, when they ship their dies ? :wink:

Drass

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Thu Oct 22, 2020 4:23 pm

Joined: Sun Oct 18, 2015 11:02 pm
Posts: 428
Location: Toronto, ON

ttlworks wrote:

Maybe not relevant for this project:

Die Devices sells bare chip dies, 74AUC2G53 is on their list.
Würth Elektronik is able to bond bare chip dies to PCBs.

That's so cool! Maybe we can create our own SMD 74AUC283 on a little PCB with tiny castellations.

Btw, I like the 74AUC2G53 replacement for a 74'151 equivalent. Thanks.

Looks like in the right hands, this 74AUC2G53 is a real powerhouse. :!:

_________________
C74-6502 Website: https://c74project.com

ttlworks

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Fri Oct 23, 2020 5:25 am

Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1393

Windfall wrote:

Do they offer a free, teeny tiny soldering iron, when they ship their dies ? :wink:

I think they don't.
To me, it looks like sort of an "if you have to ask for the price, you know you can't afford to buy it" business.

Drass wrote:

That's so cool! Maybe we can create our own SMD 74AUC283 on a little PCB with tiny castellations.

Nah, just build sort of an improved AM29203 with 74AUC, then sell it to other TTL CPU hobbyists.

Attachment: am29203.png

BTW: Würth Elektronik is also able to integrate SMD components into PCBs.

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Fri Oct 23, 2020 2:39 pm

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

Drass wrote:

Looks like in the right hands, this 74AUC2G53 is a real powerhouse. :!:

Some time ago, I studied the viability of using similar analog switches to build a relay-logic-style processor. It looked feasible to me, and it used far fewer components than an equivalent processor built from discrete-transistor logic gates.

It's a pity these things do not come in larger packages, such as quad 2:1 (SPDT) switches with independent control inputs, as is the case for 1:1 switches. Not even dual 2:1 switches seem to be available, as far as I know. However, there's always the possibility of pairing 74xx3125 with 74xx3126 quadruple 1:1 switches to get 2:1 functionality with fewer components overall.
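The pairing idea can be sketched as a small behavioural model in Python. It assumes the 3125-type switch conducts when its enable is low and the 3126-type conducts when its enable is high, so that a single shared select line closes exactly one of the two. The function names are made up for illustration.

```python
def switch_3125(a, oe_n):
    """1:1 switch assumed to conduct when its enable is LOW; None means open."""
    return a if oe_n == 0 else None


def switch_3126(a, oe):
    """1:1 switch assumed to conduct when its enable is HIGH; None means open."""
    return a if oe == 1 else None


def spdt(in0, in1, sel):
    """2:1 switch: both enables share `sel`, and the two outputs are tied together."""
    y0 = switch_3125(in0, sel)   # closed when sel == 0
    y1 = switch_3126(in1, sel)   # closed when sel == 1
    return y0 if y0 is not None else y1
```

With `sel` low the pair passes `in0`, and with `sel` high it passes `in1`, which is the SPDT behaviour being asked for.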

ttlworks

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Fri Oct 23, 2020 3:09 pm

Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1393

joanlluch wrote:

Not even dual 2:1 switches seem to be available, as far as I know.

The problem is that analog switches like the 74HC4053 (triple individually controlled SPDT) are too slow to be useful when trying to build a fast CPU.

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Fri Oct 23, 2020 3:24 pm

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

ttlworks wrote:

joanlluch wrote:

Not even dual 2:1 switches seem to be available, as far as I know.

The problem is that analog switches like the 74HC4053 (triple individually controlled SPDT) are too slow to be useful when trying to build a fast CPU.

Yes, that's the point. There are single fast 4:1 and 8:1 switches, as well as multiple 1:1 switches with both common and separate control inputs, but no fast 2:1 switches with the same or a similar pin layout as, say, a 74HC4053.

Dr Jefyll

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Fri Oct 23, 2020 4:20 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3353
Location: Ontario, Canada

joanlluch wrote:

[...] but no fast 2:1 switches with the same or a similar pin layout as, say, a 74HC4053.

FWIW, there are faster 4053's than the HC version. Check out 74LV4053A, 74VHC4053A and the MAX4619. These are triple SPDT switches which preserve the 4053 pinout. (They're not as fast as the single-switch AUC part we're using, though.)

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html

Drass

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 2:40 am

Joined: Sun Oct 18, 2015 11:02 pm
Posts: 428
Location: Toronto, ON

Let's now take a closer look at the pipeline in this design. The objective is to reduce the cycle-time while keeping the cycle-count fixed. The critical path in the CPU falls squarely on the ALU, and associated pre- and post-processing. Rather than cramming all this into one cycle, the basic strategy is to push pre-processing to the prior cycle, and post-processing to the next. This allows the ALU to have the whole cycle to itself, giving us the headroom we need to boost the clock-rate.

Attachment: 8EE48AA7-1063-4FE1-9FE5-2F36299202AF.jpeg

Pre-processing here refers to the work required to set up the inputs to the ALU with appropriate values. That seemingly innocuous task takes a surprising amount of time -- we have to fetch microcode, decode control signals, select source values and output-enable the appropriate registers. Post-processing, on the other hand, refers to updating the status flags and writing to the destination register. Rebalancing this workload around the ALU, we end up quite naturally with a four-stage pipeline, as follows:

Attachment: 0A20172C-7EE4-4D9D-96DE-9DFC021A9BF8.jpeg

We have FETCH, DECODE, EXECUTE and WRITEBACK -- the idea is to perform a roughly equal amount of work at each stage and then to pass the baton to the next. Along the way, we capture intermediate results in pipeline registers. Specifically, we have the Microinstruction Register (MIR) after the FETCH stage, we have ALUA, ALUB and ALUC registers at the ALU inputs and we have the R register at its output. The FTM (Flags To Modify) and RTM (Registers To Modify) registers direct the WRITEBACK stage regarding which flags and destination register to update. (More on the WRITEBACK stage below.)
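To make the baton-passing concrete, here is a toy Python sketch in which each pipeline register simply takes over what the previous stage held on every clock edge. The dictionary keys are simplifications of my own: `ALU_IN` stands in for ALUA, ALUB and ALUC together, and `DONE` is a made-up slot for whatever WRITEBACK retires.

```python
def step(pipe, new_uop):
    """Advance every pipeline register by one clock edge."""
    return {
        "MIR": new_uop,          # FETCH result
        "ALU_IN": pipe["MIR"],   # DECODE result (stands in for ALUA/ALUB/ALUC)
        "R": pipe["ALU_IN"],     # EXECUTE result
        "DONE": pipe["R"],       # what WRITEBACK retires this cycle
    }


pipe = dict.fromkeys(["MIR", "ALU_IN", "R", "DONE"])
for uop in ["u0", "u1", "u2", "u3"]:
    pipe = step(pipe, uop)
```

After four edges the first item has travelled through all four stages while three more are in flight behind it.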

Memory operations using "flow-through" synchronous RAMs are a good fit for this arrangement. A key feature of these RAMs is that we can clock an address into the RAM's internal registers and then read the data value from its outputs before the next clock edge occurs. The ADL and ADH registers allow the pipeline to work in this same way with asynchronous peripherals. For writes, there is also the WE register and a Data Output Register (DOR).
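The flow-through behaviour can be illustrated with a toy Python model: the address is captured on the clock edge, and the data appears on the outputs within the same cycle. The class and method names are hypothetical, and real parts have additional control pins and timing constraints that are ignored here.

```python
class FlowThroughRAM:
    """Toy model: address registered on the clock edge, data flows out before the next edge."""

    def __init__(self, size=65536):
        self.mem = [0] * size
        self.addr_reg = 0            # internal address register

    def clock(self, addr, we=False, data_in=0):
        """Rising edge: capture the address (and the write data, if any)."""
        self.addr_reg = addr
        if we:
            self.mem[addr] = data_in & 0xFF

    def data_out(self):
        """Combinational read path, valid later in the same cycle."""
        return self.mem[self.addr_reg]


ram = FlowThroughRAM()
ram.clock(0x1234, we=True, data_in=0xA9)   # write cycle
ram.clock(0x1234)                          # read cycle: address clocked in
value = ram.data_out()                     # data available before the next edge
```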

Attachment: D50672AF-FBC0-4152-8AC2-424CA09C1F35.jpeg

As we've discussed before, the ALU features a "recirculate" path to allow the result to be fed back into its inputs. This is done during address calculation, for example, when the ALU result is immediately required in the next cycle. Memory reads are also recirculated, as either ALU operands or addresses to be used in the next cycle.

Attachment: E7B259A6-3F7B-478E-B5C7-D5FBF8A343C2.jpeg

The WRITEBACK stage calculates the flags based on the ALU result, updates the P register according to the FTM, and writes the result to a destination register according to the RTM.
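A rough Python sketch of that step follows. It uses the standard 6502 bit positions for N, V, Z and C, but the rest is assumption: that carry and overflow are captured alongside R, that FTM is a bit mask over P, and that RTM can be modelled as a register name.

```python
N, V, Z, C = 0x80, 0x40, 0x02, 0x01   # 6502 P-register bit masks


def writeback(r, carry, overflow, p, ftm, rtm, regs):
    """Return the new P value and update `regs` in place.

    r               : 8-bit ALU result held in the R register
    carry, overflow : ALU status outputs captured with R (assumed)
    ftm             : mask of the flags to modify
    rtm             : name of the destination register, or None
    """
    flags = 0
    if r & 0x80:
        flags |= N
    if r == 0:
        flags |= Z
    if carry:
        flags |= C
    if overflow:
        flags |= V
    new_p = (p & ~ftm & 0xFF) | (flags & ftm)   # only FTM-selected flags change
    if rtm is not None:
        regs[rtm] = r
    return new_p
```

For example, a result of zero with carry set and `ftm = N | Z | C` sets Z and C, clears N, and leaves every other bit of P alone.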

Attachment: 16FC68AD-EC89-4E0D-8C9C-6E79EC7FE30A.jpeg

One important thing to highlight is that the WRITEBACK stage writes to registers using a mid-cycle rising clock-edge (PHI2 rising edge). Meanwhile, registers are always sampled at the end of the cycle (PHI1 rising edge). This discipline ensures that we always get an up to date value when a given register is being read and written to in the same cycle. For example, the P register may be updated in the same cycle that a branch test is being executed. Delaying the branch test until the second half of the cycle ensures that the branch test evaluates correctly.
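The ordering within a cycle can be sketched in Python as below. This is only an illustration of the write-then-sample discipline, using BEQ (which tests the Z flag, bit 1 of P) as the branch; the function name and the way a pending write is passed in are my own simplifications.

```python
def run_cycle(p, pending_p_write):
    """One clock cycle: write at the PHI2 rising edge (mid-cycle), sample at the PHI1 rising edge (end)."""
    # First half of the cycle: P still holds the old value.
    # PHI2 rising edge: WRITEBACK commits its pending update, if any.
    if pending_p_write is not None:
        p = pending_p_write
    # Second half: the branch logic sees the updated P and settles.
    # PHI1 rising edge: the branch decision is sampled.
    beq_taken = bool(p & 0x02)
    return p, beq_taken
```

Calling `run_cycle(0x00, 0x02)` reports the branch as taken, because the Z flag written mid-cycle is already visible when the test is sampled at the end of the cycle.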

Attachment: FE96582E-0EDA-4322-B47B-08E82FCF4BBC.jpeg

Beyond allowing enough time to calculate the flags, a separate WRITEBACK stage allows the R register to neatly buffer the ALU from the rest of the CPU's internal registers (and the added bus capacitance they would impose). There are over ten destinations for the ALU output, all of which would add unnecessary delay to the ALU's critical path were they connected directly (10 loads x 3pF per load x 50Ω + 6" trace delay = 2.5ns).
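The estimate in parentheses can be reproduced in a few lines of Python. The RC part follows directly from the figures given; the trace part assumes a propagation speed of roughly 6 inches per nanosecond, a common rule of thumb for PCB traces that is not stated in the post.

```python
loads = 10
c_per_load_pf = 3.0          # pF per load
r_ohms = 50.0                # ohms
trace_inches = 6.0
inches_per_ns = 6.0          # assumed propagation speed (rule of thumb)

rc_ns = loads * c_per_load_pf * r_ohms / 1000.0   # pF x ohm = ps, so divide by 1000
trace_ns = trace_inches / inches_per_ns
total_ns = rc_ns + trace_ns
```

Under that assumption the RC term contributes 1.5 ns and the trace 1 ns, which matches the 2.5 ns total quoted.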

Finally, we should note that the DECODE stage must receive a fresh instruction every cycle in order for the pipeline to function smoothly. To begin with, FETCH retrieves a new opcode from main memory (or simply generates a BRK on a CPU reset) and feeds it to the DECODE stage via the Instruction Register (IR).

Attachment: 7D710D00-CA4C-4635-A91A-B0BB487A09C9.jpeg

Thereafter, FETCH will retrieve microinstructions associated with that opcode from the microcode store, one per cycle, and feed them to the DECODE stage via the Microinstruction Register (MIR).

Attachment: 60D1FE67-C0B6-4E71-B174-46F6CF50E679.jpeg

Once we reach the end of the current opcode, a new opcode is fetched and the sequence repeats. The DECODE stage, meanwhile, always delivers appropriate control signals for downstream pipeline stages, whether by decoding the opcode in one cycle or a microinstruction in another.
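The alternation between opcode and microinstructions can be sketched as a Python generator. The microcode table contents are invented placeholders rather than the project's actual microcode; only the opcodes themselves (0xA5 for LDA zero page, 0xEA for NOP) are real 6502 values.

```python
def fetch_stream(program, microcode):
    """Yield what FETCH hands to DECODE on each cycle.

    program   : iterable of opcodes as fetched from memory
    microcode : dict mapping opcode -> list of follow-on microinstructions
                (hypothetical contents)
    """
    for opcode in program:
        yield ("IR", opcode)                  # first cycle: the opcode itself
        for uop in microcode.get(opcode, []):
            yield ("MIR", uop)                # later cycles: one microinstruction each


microcode = {0xA5: ["read_zp_addr", "read_operand"], 0xEA: ["idle"]}
stream = list(fetch_stream([0xA5, 0xEA], microcode))
```

DECODE therefore sees something new on every cycle, sometimes via IR and sometimes via MIR.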

And that's it. We'll take a look at how this pipeline executes cycle-accurate 6502 instructions in a future post. For now, the main thing to note is that this is a relatively simple pipeline that still packs a punch in terms of performance. By way of comparison, the critical path on this pipeline is about 20ns long (50MHz) as compared to 50ns (20MHz) on the C74-6502 -- that's assuming similar components in both cases; i.e., AC logic for the ALU and CBT logic for tri-state buffers. The hope of course is that faster components and further optimizations (like the FET Switch Adder) will enable us to double the clock-rate yet again and reach the 100MHz milestone. Only time will tell whether we'll manage to get there.
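As a quick sanity check on those figures, the maximum clock rate is just the reciprocal of the critical path. A minimal Python sketch (the helper name is made up):

```python
def max_clock_mhz(critical_path_ns):
    """Highest clock rate whose period still covers the critical path."""
    return 1000.0 / critical_path_ns


pipelined = max_clock_mhz(20)   # this pipeline
c74 = max_clock_mhz(50)         # C74-6502
target = max_clock_mhz(10)      # what 100MHz would require
```

So reaching 100MHz means bringing the critical path down from about 20ns to about 10ns.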

Cheers for now,
Drass

P.S. Many thanks to Dr Jefyll for helping to clarify and edit this description. It is much better for it. Thanks Jeff!

_________________
C74-6502 Website: https://c74project.com

Last edited by Drass on Tue Oct 27, 2020 1:32 pm, edited 1 time in total.

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 7:52 am

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

Hi Drass,

That's interesting.

I have a question on the "writeback" stage and the flow of the pipeline. You explain that by performing register writes mid-cycle and then reading them at the end of the cycle, you avoid data hazards in the pipeline. However, I think this only works (possibly) because the 6502 uses two cycles anyway to complete instructions. So in fact you have a two-step gap between the fetch-decode-execute-writeback sequence of one cycle and the next. In other words, the next instruction fetch happens while the current cycle is in the execute stage, not while it is in the decode stage, as would be the case for a standard pipelined RISC processor. Is this right, or am I missing something fundamental here?

I mean, you have this:

Code:

 Fetch   | Decode  | Execute | Writeback
                     Fetch   | Decode  | Execute | Writeback
                                         Fetch   | Decode  | Execute | Writeback
                                                             Fetch   | Decode  | Execute | Writeback 

As opposed to this:

Code:

 Fetch   | Decode  | Execute | Writeback
           Fetch   | Decode  | Execute | Writeback
                     Fetch   | Decode  | Execute | Writeback
                               Fetch   | Decode  | Execute | Writeback
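The two diagrams differ only in how many stage slots separate successive fetches. A small Python sketch that generates either layout, with `issue_interval` as a made-up parameter name for that spacing:

```python
STAGES = ["Fetch", "Decode", "Execute", "Writeback"]


def pipeline_rows(n, issue_interval):
    """Return one text row per instruction; each stage occupies a 10-character slot."""
    line = " | ".join(f"{s:<7}" for s in STAGES).rstrip()
    return [" " * (10 * i * issue_interval) + line for i in range(n)]


two_step = pipeline_rows(4, 2)   # the first diagram
one_step = pipeline_rows(4, 1)   # the second diagram
```

Printing the rows with `print("\n".join(two_step))` reproduces the staggered layout in which each Fetch lines up under the previous Execute.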

Thanks
Joan

ttlworks

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 8:14 am

Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1393

Drass wrote:

Now this gives me quite a headache.

Drass, since the ALU input latches will be edge triggered 74AUC16374 chips or such...
have you considered building the registers with transparent latches like 74AUC16373 ?

When using transparent latches for the registers (transparent during PHI2=HIGH),
the register data inputs wouldn't have to be stable/valid before the rising edge of PHI2.
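The difference can be shown with a small behavioural model in Python. The class names are made up, and the models ignore setup and hold times; they only capture "sample at the edge" versus "follow while enabled".

```python
class EdgeTriggered:
    """'374-style register: captures D only at the rising clock edge."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, clk, d):
        if clk == 1 and self._prev_clk == 0:
            self.q = d
        self._prev_clk = clk
        return self.q


class Transparent:
    """'373-style latch: Q follows D while the enable is high, holds when it is low."""

    def __init__(self):
        self.q = 0

    def step(self, enable, d):
        if enable == 1:
            self.q = d
        return self.q


ff, latch = EdgeTriggered(), Transparent()
# (PHI2, data): the data only becomes valid AFTER the rising edge of PHI2.
for phi2, d in [(0, 0), (1, 0), (1, 5), (0, 5)]:
    ff.step(phi2, d)
    latch.step(phi2, d)
```

In this run the edge-triggered register misses the late value, while the transparent latch ends up holding it once PHI2 falls.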

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 9:27 am

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

ttlworks wrote:

If I may share what I learned from developing my processor: I found, to my surprise, that calculating the ALU result flags takes longer than I had anticipated. The Z flag is surprisingly expensive to have. In my case, in addition to the Z, C and V flags, I have to compute at some point a result based on a condition code (EQ, NE, LT, GT, GE, and so on), which is common in many processors and adds even more delay. I think the latter is not needed on a 6502, but in any case, it looks like the seemingly innocuous task of computing the flags really does take some time to complete. Thus it also looks to me that just half a cycle for write-back might not be enough for what's required, and that, as Dieter suggests, we might need to allow some incursion into the second half of the cycle to make this affordable.

BigEd

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 10:05 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10799
Location: England

On the face of it, computing Z should be no worse than a carry-chain problem, and indeed the inputs to the Z function arrive earliest from the LSB and latest from the MSB, so Z might only need to be a gate-delay behind C. (I say this, knowing that computing Z often does seem to be a time-consuming thing. So I'm interested in why theory and common practice differ.)

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 11:14 am

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

Hi Ed,

BigEd wrote:

From my limited experience of processor design, acquired essentially from reading books and from my adventure developing a processor architecture (on paper), I can say that your statement is only true in the context of ripple-carry ALUs. Indeed, the Z flag can be computed with the same delay as the carry flag by just comparing each bit with the previous one.

However, once we apply carry-lookahead circuits or carry-skip strategies, the Z flag starts to add a proportionally significant delay to the total. This is because, apparently, there's no way to look ahead for the Z flag before the result is available. The Z flag must be computed from the result, so any delay there adds directly to the total ALU delay. If the ALU data width is significant, say 16 or 32 bits, then implementing the Z flag circuit with standard 74xx ICs requires cascading them up to 3 or 4 levels deep.
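The number of levels can be estimated with a short Python sketch, assuming the zero-detect is built as a plain reduction tree of gates with a fixed number of inputs each. The function name and the choice of fan-in are illustrative.

```python
def zero_detect_levels(width, fan_in):
    """Levels of gates needed to reduce `width` result bits to a single Z signal."""
    levels = 0
    signals = width
    while signals > 1:
        signals = -(-signals // fan_in)   # ceiling division
        levels += 1
    return levels
```

With 4-input gates this gives 2 levels for 16 bits and 3 levels for 32 bits, while an 8-bit result fits in a single 8-input gate.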

Still, from a conceptual point of view your comment seems quite fair. So I would also be interested to know whether there really is no way to compute the Z flag ahead of (or in parallel with) the result.

BigEd

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 11:29 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10799
Location: England

I suspect even in more sophisticated ALUs, the LSB results come sooner. So it might be an advantage to structure the Z logic to take that into account, with the MSB bits having a shallower logic depth.
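A rough timing model in Python illustrates the point. It compares a balanced reduction tree against a chain ordered LSB first, given per-bit arrival times. The arrival times and gate delay below are invented numbers, and both structures use idealised 2-input gates.

```python
def balanced_tree_time(arrivals, gate_delay):
    """Pairwise reduction: every input passes through the same number of gates."""
    level = list(arrivals)
    while len(level) > 1:
        nxt = [max(level[i:i + 2]) + gate_delay
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])     # odd input passes through unchanged
        level = nxt
    return level[0]


def skewed_chain_time(arrivals, gate_delay):
    """Chain ordered LSB first: the latest (MSB) input passes through only one gate."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + gate_delay
    return t


staggered = [0, 1, 2, 3, 4, 5, 6, 7]    # bit i arrives at time i (made-up units)
simultaneous = [0] * 8
```

With the staggered arrivals and a gate delay of 1, the chain finishes at time 8 and the balanced tree at time 10; with simultaneous arrivals the balanced tree wins, 3 against 7. Which structure is better therefore depends on how skewed the result bits really are.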

(Not just logic depth: sometimes a many-input logic gate will in practice react faster to some inputs than others, faster to one sense of transition than the other. I don't think TTL spec sheets tend to show this, but the timing models used within chips do, AFAIR.)

joanlluch

Post subject: Re: 100MHz TTL 6502: Here we go!

Posted: Tue Oct 27, 2020 12:38 pm

Joined: Thu Apr 11, 2019 7:22 am
Posts: 40

BigEd wrote:

Makes sense!
