All times are UTC
 Post subject: 6509 dissection: IDKFA
PostPosted: Thu Dec 16, 2021 8:34 am 
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1431
Main thread: 6509 dissection

Another dissection brought to you by Frank Wolf and ttlworks.

Since we have dissected everything around the 6509 CPU core, it's now time to go for the core itself,
and then to put everything together with the results from the previous 6509 dissections.

6509 peripherals (MMU) dissection is here. //Main thread.
6509 pads\pins RES#, NMI#, IRQ#, RDY, SYNC dissection is here.
6509\8501 clock generator dissection is there.

Note:
For consistency with Frank's notation, low_active signals are named foo#, not /foo.

Orientation for all the chip pictures: PHI1(in) is North.


PostPosted: Thu Dec 16, 2021 8:35 am
Eagle 6.4 schematics for my schematic pictures in this thread,
in case somebody needs them.

Note: KiCad is supposed to be able to import these schematics.
Unfortunately, it doesn't seem to be possible to disable the 'name' and 'value' layers in KiCad schematics,
so making my schematics look nice and clean in KiCad will require some work, sorry.

Attachment:
6509r7_idkfa_schematics.zip [1.02 MiB]
Downloaded 34 times


Last edited by ttlworks on Tue Jun 18, 2024 6:31 am, edited 6 times in total.

PostPosted: Thu Dec 16, 2021 8:37 am
A picture of the 6509 silicon, with the interesting areas marked.
The dark green areas in the picture were covered by the previous 6509 dissections.

Attachment:
6509r7_orientation.png
6509r7_orientation.png [ 130.05 KiB | Viewed 2956 times ]


Just as a reference, another picture of the 6509 silicon without the markings.

Attachment:
6509r7_small.png
6509r7_small.png [ 1.35 MiB | Viewed 2956 times ]


Bus systems inside the 6509 CPU core:
(omitting instruction register and peripherals)

Attachment:
6509_core_bus.png
6509_core_bus.png [ 71.03 KiB | Viewed 2537 times ]


Last edited by ttlworks on Fri Jan 14, 2022 1:39 pm, edited 3 times in total.

PostPosted: Thu Dec 16, 2021 8:39 am
To us, it appears that the CPU core in the 6509 and 8501 is more or less identical to the 6502 from a logic design point of view,
just with some variations in the chip layout.

The only difference we were able to spot is in "16) PHI1# driver".
Two PHI1# metal traces are going from "16) PHI1# driver" into "17) control latch\driver".
In the 6502, IIRC, we have two inverting super buffers fed by PHI1, separately driving each of the PHI1# traces.
In the 6509\8501 we have just one (bigger) inverting super buffer fed by PHI1, driving both PHI1# traces.

Like the 6530, the 6502 is an incredibly compact beast, and it's sometimes difficult to tell
where one function block ends and the next function block starts when just staring at the silicon.
It's very impressive that the designers were able to get two manually routed chips of that size and complexity
done and running within only a year.

The 6502 is a work of true art\craftsmanship, and the designers pulled every trick possible
to make it as compact as possible, because the size of a chip directly affects the sales price of a chip.
As in "one square millimeter of processed silicon is more expensive than one square millimeter of processed gold".

;---

Now about said tricks:

Before the 6502, the control circuitry of a CPU was just a big lump of random logic.
The 6502 uses a PLA (fed by the instruction Byte and the sequencer), with some random logic attached to it.

The sequencer is split into two parts.
One part (below the East end of the PLA) does the steps T0 (ALU data operation) and T1 (instruction fetch),
the other part (below the West end of the PLA) does the steps T2..T5.

When the interrupt logic detects RES#\NMI#\IRQ#, it forces the Byte from an instruction fetch to $00,
and $00 is instruction BRK, a software interrupt.
While the rest of the CPU executes the faked BRK instruction, the interrupt logic blocks PC increment
and "injects" the interrupt vector into the CPU internal address bus.
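A minimal Python sketch of the two effects just described, assuming a simplified one-call-per-fetch model. The function name `fetch_opcode` and its arguments are made up for this illustration and say nothing about the real timing inside the chip.

```python
BRK = 0x00  # opcode of the BRK instruction


def fetch_opcode(memory, pc, interrupt_pending):
    """Return (opcode, next_pc) for one instruction fetch.

    Hypothetical model: with an interrupt pending, the Byte read from
    memory is replaced by $00 (BRK) and the PC increment is blocked.
    """
    opcode = memory[pc]
    if interrupt_pending:
        return BRK, pc                  # faked BRK, PC stays where it was
    return opcode, (pc + 1) & 0xFFFF    # normal fetch, PC moves on


memory = {0x1000: 0xEA}                 # a NOP at $1000
normal = fetch_opcode(memory, 0x1000, interrupt_pending=False)
forced = fetch_opcode(memory, 0x1000, interrupt_pending=True)
```

Because PC is not incremented, the interrupted instruction at $1000 is still the one that gets fetched after the interrupt handler returns.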

The next thing is, that BRK has one more step than the rest of the instructions: T6.
The T6 state flipflop is not part of the sequencers, on the chip it almost looks like an integral part of the interrupt logic.
Step T6 is triggered from the PLA after step T5 during a BRK instruction.
Also, the PLA does not check T6 state, so what has to be done during step T6 is defined in the random logic area.

RMW (read/modify/write) operations, that's incrementing/decrementing/shifting memory,
"fork" from the sequencer by using a flipflop "step chain" controlled by RDY,
everything related to this is deep inside the random logic area.

In my opinion, it all was intentionally built that way to minimize chip space.

;...

It is normal to have the FETs of a single logic gate "spread over half of the chip",
and it is normal that it turns out to be an AND/NOR combination gate.

It is normal to have transparent latches and NOR based RS flipflops which don't appear as such at first sight.

It also is normal to encounter logic design tricks like that one:

Attachment:
6509_mux_evil.png
6509_mux_evil.png [ 8.73 KiB | Viewed 2955 times ]


Last edited by ttlworks on Thu Dec 16, 2021 9:07 am, edited 1 time in total.

PostPosted: Thu Dec 16, 2021 8:42 am
Note, that this thread is about how the 6509 core works at logic design level in a simplified way.

A detailed transistor level explanation would be a lot bigger, and a lot more confusing.

After the dissection, I checked my schematics at transistor level again,
and I also compared my schematics to the emu-russia "Breaking NES Book", Revision 1.
Greetings to 'org' and his team: great job, nicely done.
//Thread about said book is here

Of course, "I see no errors in my schematics" is different from "there are no errors in my schematics",
take care.


I had no capacity for a detailed understanding of the structure of the 6502 pipeline,
which would be necessary for understanding the sequencer,
which would be necessary for drawing a detailed state diagram for the sequencer,
which would be necessary for checking what the random logic does during a specific instruction,
which would be necessary for checking if the random logic is drawn correctly.

So there still is enough work to do for the experienced reader with too much spare time on his\her\its hands.

;---

First a small picture, to tell what in the cheat sheet goes where.

Note, that the location of the functional blocks in the picture does not directly correspond
with the physical location of the functional blocks in the silicon.

Attachment:
6509_idkfa_small.png
6509_idkfa_small.png [ 175.38 KiB | Viewed 2955 times ]


Now for the big cheat sheet: be warned it really is big.
It has some intrinsic beauty, but be warned that staring at it for too long is unhealthy: please take a break from time to time.

Attachment:
6509_idkfa.zip [1.28 MiB]
Downloaded 32 times


Last edited by ttlworks on Tue Jun 18, 2024 6:32 am, edited 6 times in total.

PostPosted: Thu Dec 16, 2021 8:44 am
1) ADL0..ADL2 constant generator

Interrupt vectors are at the upper end of memory.

The internal address bus lines of the CPU are precharged to 1 during PHI2.
//External address lines change later during PHI1.

The interrupt logic controls three FET switches for pulling the internal address bus lines ADL0..2 to GND
according to the interrupt vector, for more details see "23) interrupt logic".

Control signals for these three FET switches are named:
0>ADL0, 0>ADL1, 0>ADL2.

The three FET switches are located at the North West side of the mill,
and physically they almost look like an integral part of the A0..2 address bus latches\drivers.
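To see why three pulldown switches are enough, here is a small Python sketch. It assumes the standard 6502 vector addresses ($FFFA/B for NMI, $FFFC/D for RES, $FFFE/F for IRQ and BRK) and a bus that reads $FF when nothing pulls it down. The helper names are made up for this example; only the signal names 0>ADL0..0>ADL2 come from the schematics.

```python
def adl_pulldowns(vector_low_byte):
    """Which of 0>ADL0, 0>ADL1, 0>ADL2 must be active (True = pull to GND)."""
    return tuple(((vector_low_byte >> bit) & 1) == 0 for bit in range(3))


def adl_after_pulldown(pulldowns):
    """Value on ADL0..7 after precharge to 1 and the selected pulldowns."""
    adl = 0xFF                           # all lines precharged to 1
    for bit, active in enumerate(pulldowns):
        if active:
            adl &= ~(1 << bit) & 0xFF    # FET switch pulls this line to GND
    return adl


VECTORS = {"NMI": 0xFFFA, "RES": 0xFFFC, "IRQ/BRK": 0xFFFE}

table = {}
for name, address in VECTORS.items():
    for offset in (0, 1):                # low Byte, then high Byte of the vector
        low = (address + offset) & 0xFF
        table[(name, offset)] = adl_pulldowns(low)
```

All six vector Bytes sit in $FFFA..$FFFF, so ADL3..7 can simply stay at their precharged 1 and only the lowest three lines ever need a switch to GND.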

//We now travel from West to East through the mill of the CPU.

Attachment:
si6509_1_cg_adl0_adl2.png
si6509_1_cg_adl0_adl2.png [ 24.36 KiB | Viewed 2955 times ]

Attachment:
6509_1_adl0_adl2.png
6509_1_adl0_adl2.png [ 7.21 KiB | Viewed 2955 times ]


PostPosted: Thu Dec 16, 2021 8:45 am
2) Y register

Y register is located East from the A0..7 latches\drivers in the silicon.

Y register cells are half_static (and very compact).
The Bits are refreshed during PHI2,
and read/written during PHI1# =0 (which is different from PHI2 =1).

Note, that PHI1 and PHI2 are non_overlapping clock signals.
//unlike the 6502, the 6509 has no internal clock generator.

Note also that a FET pulling an internal bus line to GND has to sink
the current through the pullup resistor (FET) inside the register cell to GND.

So trying to figure out what exactly happens during "illegal" NMOS 6502 instructions
when more than one register might connect to a bus line may give you some more grey hairs,
because you would have to take the resistance of all the pullup resistors (FETs)
plus all the FETs switching to GND at a bus line into account.

In my opinion that's not a clean design, but it sure saved some chip space.
Also, when PHI1 is stretched for too long, the data in the register gets lost.

Attachment:
si6509_2_y.png
si6509_2_y.png [ 8.08 KiB | Viewed 2955 times ]

Attachment:
6509_2_y.png
6509_2_y.png [ 24.63 KiB | Viewed 2955 times ]


PostPosted: Thu Dec 16, 2021 8:46 am
3) X register

X register is located East from the Y register in the silicon.

Circuitry is identical to "2) Y register".

Attachment:
si6509_3_x.png
si6509_3_x.png [ 8.5 KiB | Viewed 2955 times ]

Attachment:
6509_3_x.png
6509_3_x.png [ 25.84 KiB | Viewed 2955 times ]


PostPosted: Thu Dec 16, 2021 8:48 am
4) S, the stackpointer

Stackpointer is located East from the X register in the silicon.

S Register cells are a nicer design than X,Y register cells, but they take more chip space.

The pullup resistors (FETs) for precharging the internal SB (special bus) lines
during PHI2 are an integral part of the S register.

Attachment:
si6509_4_s.png
si6509_4_s.png [ 15.08 KiB | Viewed 2955 times ]

Attachment:
6509_4_s.png
6509_4_s.png [ 43.68 KiB | Viewed 2955 times ]


PostPosted: Thu Dec 16, 2021 8:49 am
5) ALU_in

ALU_in is located East from the Stackpointer in the silicon.

For understanding how the ALU works, it was necessary to break it into three parts.

For convenience, I had named the part where the bus signals enter the ALU just ALU_in.
Which isn't exactly correct, because it also handles things like AND, OR, shift right.

Three bus signals enter ALU_in:
ADL, the internal A0..7 address bus,
DB, the data bus from/to the "15) data latches" (which connect to the external data bus),
SB, the special bus.

One ALU input is either ADL, DB, or inverted DB.
The other ALU input is either SB, or 0.

ALU_in generates NAND (open collector with pullup) plus NOR (open collector with pullup)
for both ALU inputs, these signals are used later for the ALU carry chain, the BCD detection\correction,
also for "9) V evaluation" (overflow detection that is).

The ALU result bus Q# is low_active,
FETs controlled by 'AND' are switching the NAND output signals on the Q# bus,
FETs controlled by 'OR' are switching the NOR output signals on the Q# bus.

Shifting/rotating right just switches the NAND output signals in a different order to the Q# bus,
except for Q7# which is pulled to VCC.
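The following Python sketch models ALU_in at Bit level under a few assumptions: both ALU inputs are plain 8 Bit integers, Q# is simply the complement of the result, and "a different order" for shift right means that the NAND of Bit n+1 lands on Q#(n). That last mapping is my reading of the text, and all function names are made up for the example.

```python
MASK = 0xFF


def alu_in(a, b):
    """Per-Bit NAND and NOR of the two ALU inputs."""
    nand = ~(a & b) & MASK
    nor = ~(a | b) & MASK
    return nand, nor


def q_n_and(a, b):
    nand, _ = alu_in(a, b)
    return nand                      # 'AND': NAND outputs go onto Q#


def q_n_or(a, b):
    _, nor = alu_in(a, b)
    return nor                       # 'OR': NOR outputs go onto Q#


def q_n_shift_right(a, b):
    nand, _ = alu_in(a, b)
    return (nand >> 1) | 0x80        # NAND of Bit n+1 onto Q#(n), Q7# pulled to VCC


def q(q_n):
    """High_active result Q from the low_active bus Q#."""
    return ~q_n & MASK


and_result = q(q_n_and(0xCC, 0xAA))
or_result = q(q_n_or(0xCC, 0xAA))
shifted = q(q_n_shift_right(0x81, 0x81))
```

Since Q# is low_active, a NAND on Q# reads as AND on Q, and a NOR on Q# reads as OR on Q. With Q7# tied high, Q7 is always 0 after a shift right, which matters later in ALU_out.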

Attachment:
si6509_5_ALU_in.png
si6509_5_ALU_in.png [ 19.49 KiB | Viewed 2955 times ]

Attachment:
6509_5_alu_in.png
6509_5_alu_in.png [ 59.18 KiB | Viewed 2955 times ]


PostPosted: Thu Dec 16, 2021 8:53 am
ALU carry chain is located East from ALU_in in the silicon.

The ALU carry chain features the usual inverted/non_inverted carry scheme for binary adders.
For more details, start reading here.

FET switches controlled by 'EOR' place an XNOR (open collector with pullup) of both ALU inputs on the Q# bus.
FET switches controlled by 'SUM' place the inverted adder sum on the Q# bus.

;---

6) ALU carry chain, Bit 0,2,4,6:
low_active carry input, high active carry output.
The NAND from "5) ALU_in" generates a carry, if both ALU inputs are 1.
The NOR from "5) ALU_in" kills carry propagation, if both ALU inputs are 0.
Note, that "8) BCD detection" reads the high_active Bit 4 input carry from inside the carry chain.

Attachment:
si6509_7_alu0246.png
si6509_7_alu0246.png [ 15.13 KiB | Viewed 2954 times ]

Attachment:
6509_alu_0246.png
6509_alu_0246.png [ 57.6 KiB | Viewed 2954 times ]


;---

7) ALU carry chain, Bit 1,3,5,7:
high_active carry input, low_active carry output.
The NOR from "5) ALU_in" propagates a carry, if one or both ALU inputs are 1.
The NAND from "5) ALU_in" generates a carry, if both ALU inputs are 1.
DC3 is an additional carry input of carry chain Bit 3, generated by "8) BCD detection".
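A Bit-level Python sketch of the alternating carry scheme from 6) and 7), assuming binary mode only (DC3 and the BCD detection are left out). The function `carry_chain` and the variable `wire`, which stands for the physical level on the carry line between two stages, are made up for this example.

```python
def carry_chain(a, b, carry_in):
    """Return (q_n_sum, q_n_eor, carry_out) for 8 Bit inputs a and b."""
    q_n_sum = 0
    q_n_eor = 0
    wire = 1 - carry_in                  # Bit 0 sees a low_active carry input
    for bit in range(8):
        x = (a >> bit) & 1
        y = (b >> bit) & 1
        nand = 1 - (x & y)               # from ALU_in: 0 means "generate"
        nor = 1 - (x | y)                # from ALU_in: 1 means "kill"
        if bit % 2 == 0:
            cin = 1 - wire               # low_active carry input
            wire = (1 - nand) | ((1 - nor) & cin)    # high_active carry output
        else:
            cin = wire                   # high_active carry input
            wire = nand & (nor | (1 - cin))          # low_active carry output
        q_n_eor |= (1 - (x ^ y)) << bit              # 'EOR': XNOR onto Q#
        q_n_sum |= (1 - (x ^ y ^ cin)) << bit        # 'SUM': inverted sum onto Q#
    carry_out = 1 - wire                 # Bit 7 delivers a low_active carry
    return q_n_sum, q_n_eor, carry_out


q_n_sum, q_n_eor, carry_out = carry_chain(0x7F, 0x01, 0)
result = ~q_n_sum & 0xFF
```

Alternating the polarity saves one inverter per Bit in the carry path: each stage is a single inverting gate, and the next stage is simply built to expect the opposite polarity.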

Attachment:
si6509_7_alu157.png
si6509_7_alu157.png [ 15.23 KiB | Viewed 2954 times ]

Attachment:
si6509_7_alu3.png
si6509_7_alu3.png [ 17.43 KiB | Viewed 2954 times ]

Attachment:
6509_7_alu1357.png
6509_7_alu1357.png [ 69.29 KiB | Viewed 2954 times ]


PostPosted: Thu Dec 16, 2021 8:54 am
8) BCD detection

BCD detection is located East from the ALU carry chain in the silicon.

For decimal ADC\SBC, it checks if the result of Bit 0..3 (respectively Bit 4..7)
would not be inside the valid range $0..$9 for BCD (binary coded decimal).

It does so by sensing some of the signals generated in 5), 6), 7).
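For orientation, here is the textbook "detect a nibble above 9, then add 6" scheme for decimal ADC as a Python sketch. This is not the gate-level circuit from the patent: the chip derives the detection from the NAND\NOR\carry signals instead of comparing a finished sum, and SBC is not covered here. The function name `bcd_add` is made up for the example.

```python
def bcd_add(a, b, carry_in):
    """Add two packed BCD Bytes, return (result, carry_out)."""
    lo = (a & 0x0F) + (b & 0x0F) + carry_in
    if lo > 9:                       # detection for Bit 0..3
        lo += 6                      # correction, produces a carry into Bit 4
    hi = (a >> 4) + (b >> 4) + (lo >> 4)
    if hi > 9:                       # detection for Bit 4..7
        hi += 6                      # correction, produces the decimal carry out
    result = ((hi & 0x0F) << 4) | (lo & 0x0F)
    return result, hi >> 4


total, carry = bcd_add(0x58, 0x46, 1)    # decimal 58 + 46 + 1
```

The example adds decimal 58, 46 and a carry of 1, which gives 105: the result Byte holds the digits 05 and the carry holds the hundreds digit.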

For more info, see patent US3991307A.
For an article about the concept, start reading here.

Attachment:
si6509_8_bcd_detect.png
si6509_8_bcd_detect.png [ 58.41 KiB | Viewed 2954 times ]

Attachment:
6509_8_bcd_detect.png
6509_8_bcd_detect.png [ 223.07 KiB | Viewed 2429 times ]


Last edited by ttlworks on Mon Feb 07, 2022 9:14 am, edited 1 time in total.

PostPosted: Thu Dec 16, 2021 8:56 am
9) V evaluation, overflow evaluation related to the V flag that is.

It is located at the South end of the ALU carry chain in the silicon.

By definition, overflow when adding two signed binary numbers occurs
when the result has a corrupted sign Bit in the MSB (most significant Bit).

When adding/subtracting two signed binary numbers, this is the case when:
a carry goes into the MSB and no carry comes out of the MSB,
or when no carry goes into the MSB and a carry comes out of the MSB.

For more details about the concept of overflow detection, see here.

;---

In the NMOS 6502, it's implemented like this:

If no carry goes out of Bit 6 and into Bit 7 (the MSB),
and both ALU inputs for Bit 7 are 1 (what would generate a carry going out of Bit 7),
there is an overflow.

If a carry goes out of Bit 6 and into Bit 7,
and both ALU inputs for Bit 7 are 0 (what would kill/inhibit a carry going out of Bit 7),
there is an overflow.
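The two rules above can be written down directly and checked against the arithmetic definition of signed overflow. A Python sketch, with made-up function names, assuming binary mode:

```python
def v_eval(a7, b7, carry_into_bit7):
    """High_active AVR from the two Bit 7 inputs and the carry out of Bit 6."""
    generate = a7 & b7                   # both inputs 1: carry goes out of Bit 7
    kill = (1 - a7) & (1 - b7)           # both inputs 0: no carry out of Bit 7
    return ((1 - carry_into_bit7) & generate) | (carry_into_bit7 & kill)


def signed_overflow(a, b, carry_in):
    """Reference: does a + b + carry_in leave the range -128..127?"""
    def signed(value):
        return value - 256 if value & 0x80 else value

    total = signed(a) + signed(b) + carry_in
    return int(total < -128 or total > 127)


def avr(a, b, carry_in):
    """AVR for two 8 Bit ALU inputs, using only Bit 7 and the Bit 6 carry."""
    carry_into_bit7 = ((a & 0x7F) + (b & 0x7F) + carry_in) >> 7
    return v_eval(a >> 7, b >> 7, carry_into_bit7)


example = avr(0x7F, 0x01, 0)             # 127 + 1 overflows
```

When the two Bit 7 inputs differ, the carry just propagates through Bit 7, so carry in and carry out are equal and no overflow is possible. That is why only the "both 1" and "both 0" cases need to be decoded.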

Note, that the high_active AVR output signal from the V evaluation goes through an inverter
and becomes the low_active AVR# signal which goes to the V flag.
The inverter is located between "17e) control signal latches\drivers" and "27) random 3" (random logic area).

Attachment:
si6509_9_v_eval.png
si6509_9_v_eval.png [ 8.58 KiB | Viewed 2953 times ]

Attachment:
6509_9_v_eval.png
6509_9_v_eval.png [ 37.15 KiB | Viewed 2953 times ]


PostPosted: Thu Dec 16, 2021 8:58 am
10) ALU_out

ALU_out is located East from "8) BCD detection" in the silicon.

The low_active ALU result bus Q# is sampled by a latch during PHI2.
The output of that latch drives a FET.
Similar to a 7405 type open collector inverter,
said FET does pull (or not pull) a pullup resistor (FET) to GND,
giving us the high_active ALU result Q.

From there, we have FETs switching Q to ADL bus or SB bus (or to no bus at all).

However, while the FETs switching Q0..7 to ADL0..7 all have a common output enable 'ADD>ADL',
for the FETs switching Q to SB it's different:

'ADD>SB' switches Q0..6 to SB0..6,
'ADD>SB7' switches Q7 to SB7.

Point is, that the SB bus is precharged to 1 by pullup FETs during PHI2, including SB7.
And that for "shift right" operation, the ALU output Q7 would be 0. //See "5) ALU_in".

So during "shift right",
if 'ADD>SB7' is 1, SB7 would be 0 (taken from Q7),
if 'ADD>SB7' is 0, SB7 would be 1 (left at its precharged level).

ROR makes creative use of 'ADD>SB7' depending on the C Flag,
for getting the state of the C flag into SB7.
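A Python sketch of that trick, assuming that during ROR the control signal 'ADD>SB7' is driven as the inverse of the C flag. That polarity is inferred from the description above, not read off the schematic, and the function names are made up for the example.

```python
def sb7_level(add_to_sb7, q7):
    """SB7 is precharged to 1 and only goes low if a 0 on Q7 is switched onto it."""
    precharged = 1
    if add_to_sb7:
        return precharged & q7
    return precharged


def ror(value, carry):
    """Rotate right through carry, return (result, new_carry)."""
    q = (value >> 1) & 0x7F              # shift right: Q7 is always 0
    add_to_sb7 = 0 if carry else 1       # assumed: 'ADD>SB7' = NOT C during ROR
    sb7 = sb7_level(add_to_sb7, (q >> 7) & 1)
    result = (sb7 << 7) | (q & 0x7F)     # SB0..6 come from Q0..6 via 'ADD>SB'
    return result, value & 1             # Bit 0 of the operand becomes the new C


with_carry = ror(0x02, 1)
without_carry = ror(0x01, 0)
```

So no extra multiplexer is needed for feeding C into Bit 7: not connecting Q7 to the precharged SB7 line is what produces the 1.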

Attachment:
si6509_10_alu_out.png
si6509_10_alu_out.png [ 13.86 KiB | Viewed 2952 times ]

Attachment:
si6509_10_alu_out7.png
si6509_10_alu_out7.png [ 14.1 KiB | Viewed 2952 times ]

Attachment:
6509_10_alu_out.png
6509_10_alu_out.png [ 29.33 KiB | Viewed 2952 times ]


PostPosted: Thu Dec 16, 2021 9:01 am
11) BCD correction

BCD correction is located East from "10) ALU_out" in the silicon.
Basically, the game is about making creative use of XOR (XNOR) gates.

It is controlled by "8) BCD detection".

For more info, see patent US3991307A.
For an article about the concept, start reading here.
//I already had mentioned this in "8) BCD detection".

Note, that a BCD detection during ADC for Bit 0..3 would go into the ALU carry chain Bit 4.

BCD correction also taps into the Q outputs of the ALU.
I think that's a smart move, because SB bus capacitances plus Q>SB (Q>SB7) switch impedances
would cause a delay of the data on SB relative to Q, and because it gives a more compact chip layout.


The ALU result Q is put on the SB bus,
and from there it goes through the BCD correction before entering A (the accumulator).

This sure results in a compact chip layout, this also sure results in an incorrect Flag evaluation,
because the N,Z Flag evaluation does check the _input_ of the BCD correction circuitry, not the _output_.

Obviously, a compact chip layout was more important to the designers than correct flag evaluation in decimal mode.

Attachment:
si6509_11_bcd_correction.png
si6509_11_bcd_correction.png [ 75.45 KiB | Viewed 2951 times ]

Attachment:
6509_11_bcd_correction.png
6509_11_bcd_correction.png [ 327.04 KiB | Viewed 2951 times ]

