6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Posted: Sun Apr 21, 2019 11:38 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Is the architecture of the TK20 known? Might be some interesting tactics in there. I suppose two of the nibble-wide '181s would be for the ALU, and two could be for index calculation (although as we know, the 6502 uses the ALU for that too.) I wouldn't really expect to use a '181 for incrementing. Maybe two for stack pointer increment and decrement? Or maybe for BCD adjustment??


Posted: Sun Apr 21, 2019 1:16 pm
Joined: Sat Aug 19, 2017 1:42 pm
Posts: 35
Location: near Karlsruhe, West-Germany
BigEd wrote:
Is the architecture of the TK20 known?

Not to me :-( The only detail I have is the code of the pre-boot firmware ROM. But I don't understand the soft switches.

AFAIR (I visited Schaetzle & Bsteht twice in 1987 and 1988) the bottleneck was RAM access. I think it was the address decoding and the bank switching which was typical for the Apple II family.

My most important question is the meaning of the soft switches: $C0F8/$C0F9, $C0FA/$C0FB, and $C044 .. $C04F. My board was "compiled" to sit in slot 7, which means soft switches at $C0Fx. But the $C04x soft switches are a bit strange.
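
For readers not familiar with the Apple II: the $C0Fx addresses follow the usual slot convention, where slot n gets 16 device-select addresses starting at $C080 + n * $10. A minimal Python sketch of that mapping (the function names are made up for illustration, nothing here is taken from the DC65 firmware):

```python
# Apple II slot convention: slot n owns $C080 + n*$10 .. $C08F + n*$10.
# Function names are hypothetical, for illustration only.

def slot_io_base(slot: int) -> int:
    """First device-select address of a slot (0..7)."""
    if not 0 <= slot <= 7:
        raise ValueError("slot must be 0..7")
    return 0xC080 + slot * 0x10

def slot_of_switch(addr: int):
    """Slot that owns a device-select address, or None if outside $C080-$C0FF."""
    if 0xC080 <= addr <= 0xC0FF:
        return (addr - 0xC080) >> 4
    return None

for addr in (0xC0F8, 0xC0FA, 0xC044):
    print(hex(addr), slot_of_switch(addr))
```

With this mapping, $C0F8 .. $C0FB land in slot 7 as expected, while $C044 .. $C04F are outside the slot device-select range altogether, which fits the remark that they look strange.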

Regards
Ralf


Posted: Mon Apr 22, 2019 1:44 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
RalfK wrote:
In 1987 I bought an accelerator board for my Apple IIe.

Woot ! :mrgreen:
Somebody has that hardware.
Somebody really has that hardware !

Ralf, _thanks_ for posting the pictures, I hadn't expected to see them in my lifetime.

;---

That's a high density PCB layout, and to me the PCBs look multilayer.

Looks like the registers were implemented by using 74ACT374 chips.
6* 74F181 4 Bit ALU. Maybe arranged as an 8 Bit ALU for data calculations and a 16 Bit ALU for address calculations ?

17 PALs.
4* Cypress CY7C291, 35ns 2k*8 PROM.
1* MMI 83S881, a (30ns?) 1k*8 PROM.
This implies that the designers might have known more tricks than we know now when it comes to decoding the 6502 instruction set.

;---

Aha: the advertisement says:
"3 million instructions per second". Wikipedia says the 6502 has 0.43 MIPS per MHz.
"100ns cycle time", this would be 10MHz. 3 MIPS at 0.43 MIPS per MHz gives ca. 6.98 MHz. Hmm...
"80kB of 55ns on board RAM", but in the picture we see 9* M5M5165P-70L (9* 8k*8 = 72kB, 70ns).
There probably were different revisions/versions of the PCBs, so the hardware in the pictures just doesn't fit the advertisement, that's normal. :)
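
The numbers from the advertisement can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic in the post; the 0.43 MIPS per MHz figure is the Wikipedia value quoted above, and the function names are invented for this example:

```python
# Cross-check of the advertisement figures quoted in the post.

def clock_mhz_from_cycle_ns(cycle_ns: float) -> float:
    """Clock frequency in MHz for a given cycle time in ns."""
    return 1000.0 / cycle_ns

def equivalent_6502_mhz(mips: float, mips_per_mhz: float = 0.43) -> float:
    """Clock a stock 6502 would need to reach the given MIPS figure."""
    return mips / mips_per_mhz

print(clock_mhz_from_cycle_ns(100))        # 100 ns cycle time
print(round(equivalent_6502_mhz(3.0), 2))  # "3 million instructions per second"
print(9 * 8)                               # kB of RAM visible in the picture
```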

"Fully 6502 compatible", so it seems to have decimal mode, but we can't tell for sure if it supports the UFOs (instructions UnForeseen by the designers).
"Optional extension for the 65C02 instruction set", so it only does the NMOS6502 instruction set, and running the 65C02 instruction set might require some additional hardware.

;---

RalfK wrote:
I visited Schaetzle & Bsteht twice in 1987 and 1988

Starting to wonder if Heinz Schaetzle and Herbert Bsteh are still alive, and if there would be a chance to drag them into our forum. ;)
//Drass, any plans for building a chess computer ?


Posted: Mon Apr 22, 2019 1:47 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Oh, with 72k of RAM, I wonder if the machine has special handling for zero page, and/or page one, to offer a 16-bit wide path for pointers and stacked addresses? And with a 16-bit ALU for address calculation, as you suggest, no need for extra cycles on page crossing.


Posted: Mon Apr 22, 2019 2:22 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
Maybe it makes a copy of the external ROM which is located in the host computer at power_on, this would explain the 72kB RAM.

From the PCB layout, we have 4* 74181 plus only a few registers on one PCB, and 2* 74181 plus some more registers on the other PCB.

But saying anything more about it would be highly speculative until somebody has reverse_engineered the PCBs into schematics
and until we have a read_out of the GALs and PROMs...


Posted: Mon Apr 22, 2019 3:18 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
Now for tossing in something different:

Somewhere up in the thread, we had a few words about how difficult the 6502 instruction decoding is,
and this had brought me to the idea of playing the "what if" game:

6502 has a 16 Bit address bus and an 8 Bit data bus, and that's what adds a lot to the complexity of the 6502.
What if the address bus _and_ the data bus for a "6502 styled CPU" would both be 16 Bit ?
Had tinkered for a week with this idea, and here are the results:

[Attachment: m16_mill.png]


PC is a 16 Bit counter, all the other registers are just registers.
The blue shaded registers are not visible to the end user.
The blue bus line is the internal data bus of the CPU.

Now for the state machine diagram:

[Attachment: m16_state_diagram.png]


No more zero page addressing, but all the other addressing modes seem to be there
(just had to add stack relative data and pointers, you sure can imagine how to implement
stack relative pointer with Y register as an offset).

For Bxx false, the ALU does nothing in step 5 while the next instruction word is fetched,
for Bxx true the 16 Bit immediate value fetched in step 0 is added to PC in step 5.

Compare this to a rapid transit map:

Image

In step 5, the train is at the station, and "something" happens to the cargo (data).

Adding PEA, PEI and PER to the state machine diagram is left as a homework assignment to the experienced reader.

Point is: we now have a 16 Bit instruction word,
and if one would be using mostly the lower part of the instruction word for controlling the sequencer (telling it which steps to skip),
while using most of the upper part of the instruction word for controlling ALU, registers and flags in step 5,
the layout of the opcode map somehow would resemble a bit the layout of the opcode map of the 6502.
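
To make the idea of splitting the instruction word a bit more concrete, here is a small Python sketch. The field layout is purely an assumption for illustration (a clean 8/8 split, steps numbered 0 to 5, one skip bit each for steps 1 to 4); the actual design is only defined by the diagrams above:

```python
# Hypothetical split of a 16 Bit instruction word:
#   low byte  -> sequencer field, bit i set = skip step i+1 (steps 1..4)
#   high byte -> control field for ALU, registers and flags in step 5
# Step 0 (fetch) and step 5 (execute) are assumed to always run.

def split_instruction_word(word: int):
    seq_field = word & 0xFF
    exec_field = (word >> 8) & 0xFF
    return seq_field, exec_field

def steps_to_run(seq_field: int):
    steps = [0]
    for step in range(1, 5):
        skip = (seq_field >> (step - 1)) & 1
        if not skip:
            steps.append(step)
    steps.append(5)
    return steps

seq, exe = split_instruction_word(0xA50F)
print(hex(seq), hex(exe), steps_to_run(seq))
```

In this toy encoding, an instruction whose sequencer field skips all of steps 1 to 4 takes exactly two machine cycles, which matches the two cycle minimum discussed further down.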

For a 65C02 (and for our TTL CPU), a state machine flow diagram would be at least 4 times as big,
and a lot more complicated because a 16 Bit address has to be calculated with an 8 Bit ALU,
one needs to test for edge cases like page crossings etc.
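
The page crossing problem can be sketched in Python as follows. This is a behavioural model only, with invented function names, and not a description of any particular hardware: an 8 Bit ALU has to add the low byte first and then fix up the high byte when a carry occurs, while a 16 Bit ALU does it in one pass.

```python
# Indexed address calculation: 8 Bit ALU in two passes vs. 16 Bit ALU in one.

def indexed_address_8bit(base: int, index: int):
    """Return (address, page_crossed) using two 8 Bit additions."""
    lo = (base & 0xFF) + (index & 0xFF)     # first pass: low byte
    carry = lo >> 8                         # carry out means page crossing
    hi = ((base >> 8) + carry) & 0xFF       # second pass: fix up high byte
    return (hi << 8) | (lo & 0xFF), bool(carry)

def indexed_address_16bit(base: int, index: int) -> int:
    """Same calculation with a 16 Bit adder, no special case needed."""
    return (base + index) & 0xFFFF

print(indexed_address_8bit(0x12F0, 0x20))
print(hex(indexed_address_16bit(0x12F0, 0x20)))
```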

When trying to implement an instruction decoder\sequencer for the 65C02 functionality plus some extras
by using 74138 decoders, 74151 multiplexers and logic chips, I think this would take more than 150 chips in total,
limiting the speed of a design which has a 2 level pipeline to maybe less than 10 MHz.

;---

Downside of the state machine flow diagram is, that all of the instructions take at least two memory words and two machine cycles.
If one would add a little bit of logic which detects single word instructions in step 0 and, if that is the case, prevents PC+ from being written into PC at the end of cycle 0
(like in the NMOS 6502 and in our TTL CPU), single word instructions would work, but they would still take two machine cycles.

Questions to Drass:

First:
We currently seem to generate the control signals for step 0 by hardware.
If _all_ of the microcode sequences would contain that [PC+] read for step 0, is there a chance for simplifying this circuitry ?

Second:
We now do ALU operation and flag evaluation in the EXECUTE machine cycle.
In which cycle is the status register P pushed on the stack when the CPU responds to an interrupt ?
In theory, it might be possible to do just the ALU operation in the EXECUTE machine cycle, while latching the ALU outputs into a temporary register
for doing the flag evaluation and the status register update in the machine cycle that follows EXECUTE.
But I think this would complicate implementing cycle exact BXX false.


Posted: Mon Apr 22, 2019 5:07 pm
Joined: Fri Mar 31, 2017 5:10 pm
Posts: 8
Location: Paris, France
ttlworks wrote:
<musings>

It almost looks like ttlworks is reinventing RISC :mrgreen:

Good luck people, but don't forget to KISS ! 8)


Posted: Mon Apr 22, 2019 5:15 pm
Joined: Sat Aug 19, 2017 1:42 pm
Posts: 35
Location: near Karlsruhe, West-Germany
ttlworks wrote:
Maybe it makes a copy of the external ROM which is located in the host computer at power_on, this would explain the 72kB RAM.

That's nearly right. The Apple II and II+ have 12kB of ROM from $D000 to $FFFF. It depends on versions and models which kind of software is in there. The monitor ROM or Autostart, or which Basic. In the Apple IIe there are 16kB of ROM from $C100 to $FFFF.

The code from the (slow) motherboard ROMs is copied to the fast RAM on the DC65. The code for copying is in the pre-boot ROM (the 2716 seen in the pics). The Transwarp accelerator from Applied Engineering (a real 65C02 or 65802 at 3.6 MHz) does the same, for example. I disassembled this code too. But I knew this before, because the ROM code (e.g. the Basic interpreter) also runs at full speed. After the copy loop, the fast pseudo ROM seems to be disabled for writing.
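
The copy-then-protect behaviour described above can be modelled in a few lines of Python. This is only a sketch under assumptions: the class and function names are invented, and the real pre-boot code and its soft switches are not reproduced here.

```python
# Model of a ROM shadow: copy the slow ROM into fast RAM, then write-protect it.
# All names are hypothetical; this is not the DC65 pre-boot code.

class ShadowRam:
    def __init__(self, start: int, end: int):
        self.start = start
        self.end = end
        self.data = bytearray(end - start + 1)
        self.write_enabled = True

    def write(self, addr: int, value: int) -> None:
        # Writes are silently ignored once the shadow is protected.
        if self.write_enabled and self.start <= addr <= self.end:
            self.data[addr - self.start] = value & 0xFF

    def read(self, addr: int) -> int:
        return self.data[addr - self.start]

def shadow_copy(ram: ShadowRam, read_slow_rom) -> None:
    """Copy every ROM byte into the fast RAM, then disable writing."""
    for addr in range(ram.start, ram.end + 1):
        ram.write(addr, read_slow_rom(addr))
    ram.write_enabled = False

def size_kb(start: int, end: int) -> float:
    return (end - start + 1) / 1024

print(size_kb(0xD000, 0xFFFF))  # Apple II / II+ ROM range
print(size_kb(0xC100, 0xFFFF))  # Apple IIe ROM range
```

Note that $C100 to $FFFF is strictly 15.75 kB, so the "16kB" for the Apple IIe is a rounded figure.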

The other questions:
Mr Schaetzle and Mr. Bsteh seem to live at their former town near Stuttgart in the southwest of Germany, 70km from me. I'm planning to contact them when I start to reassemble my Apple IIe. I found their address, and it seems that they are busy with electronics development.

The ad and the versions of that accelerator board for use in an Apple II:
- 10.0 MHz, optionally 12.5 MHz: 100 ns cycle time means 10 MHz, 12.5 MHz -> 80 ns. They told me that my board has "selected" chips. That means: these are 70 ns types, but they are faster.
- Apple II: 80kB are 64kB plus 12kB for the pseudo ROM, but they use 8kB RAM chips
- Apple IIe: 144kB are 128kB plus 16kB for the pseudo ROM
- "Einfache Bedienung durch Menu-Steuerung" means simple operation via a configuration menu, or something like this. But you need no config. Such a menu does not exist.
- Instruction set NMOS or CMOS: yes, I had the choice :-) The CMOS instructions ran well.
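
The cycle times and RAM sizes in the list above can be reproduced with a short Python sketch. The function names are invented for this example, and the rounding up to whole 8 kB chips is my reading of the remark about the chip size:

```python
import math

def cycle_time_ns(clock_mhz: float) -> float:
    """Cycle time in ns for a given clock in MHz."""
    return 1000.0 / clock_mhz

def ram_fitted_kb(main_kb: int, pseudo_rom_kb: int, chip_kb: int = 8) -> int:
    """RAM actually fitted when only whole chips of chip_kb can be used."""
    chips = math.ceil((main_kb + pseudo_rom_kb) / chip_kb)
    return chips * chip_kb

print(cycle_time_ns(10.0), cycle_time_ns(12.5))
print(ram_fitted_kb(64, 12))    # Apple II version
print(ram_fitted_kb(128, 16))   # Apple IIe version
```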

I mostly ran the UCSD p-System with my own BIOS and interpreter, which were highly optimized for CMOS instructions. The acceleration was at least 15x compared to the normal 1 MHz Apple IIe, just for code. I mentioned the speed of the RAM disks in 1987: 500 or 333 kB/sec. A standard Apple II or IIe equipped with two or three floppy drives takes 8.5 hours to run the assembler for a new BIOS and p-code interpreter. Using the RAM disks and the DC65 board, the same ran in 2.5 minutes :-)
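
As a quick worked example of what these two run times mean as an overall speedup (RAM disks and DC65 combined), here is the arithmetic in Python; the function name is made up for illustration:

```python
def speedup(before_hours: float, after_minutes: float) -> float:
    """Overall speedup factor from a run time in hours to one in minutes."""
    return before_hours * 60.0 / after_minutes

# 8.5 hours with floppies vs. 2.5 minutes with RAM disks and the DC65
print(speedup(8.5, 2.5))
```

The result is far above the 15x quoted for code alone, so most of the gain in this case comes from replacing the floppy accesses with RAM disks.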

Plans for the future: a high speed serial card based on technology of the 1980s. The main goal is to replace floppies. I also need a server for the file system via serial line. No technical problem :-) It's just my time. The serial speed will be 921kBaud. The 65C02 code is ready but yet untested. This kind of file system access should also work with the 3.6MHz Transwarp but limited to 460kBaud.

The decimal mode with the DC65: the UCSD p-System does not use this mode, so I never tested it. But I'm planning to do that since I learnt some tricky pieces of code in the last years.

Regards
Ralf


Last edited by RalfK on Mon Apr 22, 2019 5:28 pm, edited 1 time in total.

Posted: Mon Apr 22, 2019 5:25 pm
Joined: Sat Aug 19, 2017 1:42 pm
Posts: 35
Location: near Karlsruhe, West-Germany
ttlworks wrote:
Somebody has that hardware.

Yes, I do :-) Do you know other owners of that hardware prepared for the chess computers or the Apple II?

I'm not sure if the hardware is ok after years in the storage. I'm preparing this Apple IIe carefully.

Would you like to get the pics direct from the scanner (10MB per pic)? Or would you like to get the code (disassembled or binary)?

Regards
Ralf


Posted: Tue Apr 23, 2019 1:28 am
Joined: Sun Oct 18, 2015 11:02 pm
Posts: 428
Location: Toronto, ON
ttlworks wrote:
We currently seem to generate the control signals for step 0 by hardware.
If _all_ of the microcode sequences would contain that [PC+] read for step 0, is there a chance for simplifying this circuitry ?
There’s a bit of logic which kicks in during the FetchOperand cycle (cycle after FetchOpcode). It looks at the opcode and makes decisions as follows:

1) If the opcode is a single byte-opcode, PC + 1 is inhibited
2) If the opcode is a single-cycle NOP, the currently executing FetchOperand micro-instruction is changed on the fly to a FetchOpcode, and a FetchOperand is scheduled for the following cycle.
3) If the opcode is a branch, it evaluates the branch condition. If false, it triggers a FetchOpcode to execute in the next cycle, and schedules a FetchOperand to execute in the cycle after that. This is the “Branch Exit”, which is invoked when a branch is not taken.

FetchOperand is an all-zeroes micro-instruction, and FetchOpcode has only two control bits high, so both microinstructions are trivial to synthesize on the fly. One flip-flop is used to trigger a FetchOpcode in the next cycle, and another to schedule FetchOperand thereafter. So, one 74AC74 does the trick. The real gate count is all about decoding the opcode. This is one reason I suggested using a RAM to decode the opcode in the new pipeline. Another alternative is using something like a PAL, 16V8 would do.
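
The three rules above can be written down as a small behavioural model in Python. This is a sketch under assumptions, not the C74-6502 logic itself: the input flags stand in for the opcode decoding, and the field names are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FetchOperandDecision:
    inhibit_pc_increment: bool = False          # rule 1
    current_becomes_fetch_opcode: bool = False  # rule 2
    fetch_opcode_next_cycle: bool = False       # rule 3 ("Branch Exit")
    fetch_operand_in_cycles: Optional[int] = None  # when FetchOperand is scheduled

def decide(single_byte: bool, single_cycle_nop: bool,
           is_branch: bool, branch_condition: bool) -> FetchOperandDecision:
    """Model of the decisions made during the FetchOperand cycle."""
    d = FetchOperandDecision()
    if single_byte:
        d.inhibit_pc_increment = True
    if single_cycle_nop:
        # Current micro-instruction is swapped for FetchOpcode on the fly,
        # FetchOperand follows in the next cycle.
        d.current_becomes_fetch_opcode = True
        d.fetch_operand_in_cycles = 1
    elif is_branch and not branch_condition:
        # Branch not taken: FetchOpcode next cycle, FetchOperand after that.
        d.fetch_opcode_next_cycle = True
        d.fetch_operand_in_cycles = 2
    return d

print(decide(single_byte=False, single_cycle_nop=False,
             is_branch=True, branch_condition=False))
```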

Quote:
In theory, it might be possible to do just the ALU operation in the EXECUTE machine cycle, while latching the ALU outputs into a temporary register
for doing the flag evaluation and the status register update in the machine cycle that follows EXECUTE.
But I think this would complicate implementing cycle exact BXX false.
I think the flag evaluation can be done after the EXECUTE cycle while still maintaining cycle accuracy. The flags-logic itself is pretty efficient, so we can clock the P register with PHI2 in the cycle following the ALU operation. (I called this the WRITE_FLAGS stage). The flags would then be available for the second half of that cycle, and can be tested then.

This is the critical point. The branch test occurs during the FetchOperand cycle of a branch instruction, but it happens in the second half of the cycle. The flags would also be evaluated during that cycle, but in the first half. They can be ready in time for the branch test. I’ll post up the timing analysis on this when I get a chance. It’s very tight, but I believe it can be done.

_________________
C74-6502 Website: https://c74project.com


Posted: Wed Apr 24, 2019 4:24 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
Drass wrote:
There’s a bit of logic which kicks in during the FetchOperand cycle (cycle after FetchOpcode).

For the microcode sequences of all 6502 instructions, microcode ROM places a "read from [PC+]" into the control pipeline during an instruction fetch for the next cycle (which would be step 0).
Additional hardware forces a "read from PC+" into the control signals during step 0, overriding the microcode.
And I just can't remember why we had added that hardware...

Drass wrote:
This is the critical point. The branch test occurs during the FetchOperand cycle of a branch instruction, but it happens in the second half of the cycle. The flags would also be evaluated during that cycle, but in the first half.

Aha: you are out to write the flags in the middle of the cycle "step 0".
I think that there is an alternative for writing the flags at the end of that cycle, but it would add some chips to the design:

[Attachment: flags_pipelined.png]


Drass wrote:
The MT15 carry chain looks great, and it’s much faster for 16 bits. Does it help with an 8-bit ALU?

Spent some thoughts on this by tinkering with other ALU designs for a week, unfortunately it had turned out that they all would be slower (regarding the worst case timing).
Hmm... with 8ns 512kB RAMs, one sure could build a fast 8 Bit adder, but initializing that RAM after power up would require some additional circuitry.

My first TTL CPU had two EPROMs working as 4 Bit ALUs...


Last edited by ttlworks on Fri Apr 26, 2019 2:25 pm, edited 2 times in total.

Posted: Wed Apr 24, 2019 4:25 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
whygee wrote:
Good luck people, but don't forget to KISS ! 8)

Aha: build a time machine and tell Chuck Peddle about it in 1974. Now it's too late. ;)


Posted: Wed Apr 24, 2019 4:31 pm
Joined: Fri Nov 09, 2012 5:54 pm
Posts: 1392
RalfK wrote:
Would you like to get the pics direct from the scanner (10MB per pic)? Or would you like to get the code (disassembled or binary)?

Don't worry: the pictures you had posted are good enough for identifying the chips on the PCBs.
I assume that the two PCBs were sold without any schematics...

RalfK wrote:
That's nearly right. The Apple II and II+ have 12kB of ROM from $D000 to $FFFF.

Sorry about that: I grew up with Commodores, but I'm not familiar with Apples.

RalfK wrote:
Mr Schaetzle and Mr. Bsteh... [snip] it seems that they are busy with electronics development.

Wow: after all these years they still do hardware design. It isn't unusual to go for a different job after 20 years+ of doing hardware design for a living. Nice.

RalfK wrote:
Plans for the future: a high speed serial card based on technology of the 1980s.

Which UART chip are you going to use ?


Posted: Thu Apr 25, 2019 2:15 am
Joined: Sun Oct 18, 2015 11:02 pm
Posts: 428
Location: Toronto, ON
ttlworks wrote:
For the microcode sequences of all 6502 instructions, microcode ROM places a "read from [PC+]" into the control pipeline during an instruction fetch for the next cycle (which would be step 0).
Additional hardware forces a "read from PC+" into the control signals during step 0, overriding the microcode.
And I just can't remember why we had added that hardware...
I think you’re referring to hardware that synthesizes a FetchOperand after a FetchOpcode. The mechanism is only used for Branches and Single Cycle NOPs. For branches, the microcode reflects the Branch-Taken AND page-crossing conditions. Branch-Exit inserts a FetchOpcode/FetchOperand sequence into the instruction stream and ignores the microcode when either condition is not met. We do the same unconditionally for Single Cycle NOPs, but in the current cycle rather than the next. Either way, the microcode is ignored.
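
For reference, the cycle counts that a cycle-exact branch has to reproduce can be sketched in Python. The counts are the documented 6502 branch timings (2 cycles not taken, 3 taken, 4 taken with page crossing); the function itself is only an illustration with made-up names and does not model the microcode or the Branch-Exit hardware.

```python
def branch_cycles(pc_next: int, offset: int, taken: bool) -> int:
    """Cycle count of a 6502 relative branch.

    pc_next is the address of the instruction following the branch,
    offset is the raw operand byte (two's complement).
    """
    if not taken:
        return 2                      # Branch Exit: condition false
    if offset >= 0x80:
        offset -= 0x100               # sign-extend the operand byte
    target = (pc_next + offset) & 0xFFFF
    page_crossed = (target & 0xFF00) != (pc_next & 0xFF00)
    return 4 if page_crossed else 3   # microcode path covers the 4 cycle case

print(branch_cycles(0x1002, 0x10, False))
print(branch_cycles(0x1002, 0x10, True))
print(branch_cycles(0x10F0, 0x20, True))
```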

Quote:
I think that there is an alternative for writing the flags at the end of that cycle, but it would add some chips to the design:
Yes, I think either approach works. Thanks for suggesting this! I’ll refer back to it when the time comes.

Btw, the new microcode is coming along. Needs more work, but I’m beginning to see how it’s going to play out. So far, so good.

Cheers.

_________________
C74-6502 Website: https://c74project.com


Posted: Thu Apr 25, 2019 11:19 am
Joined: Sat Aug 19, 2017 1:42 pm
Posts: 35
Location: near Karlsruhe, West-Germany
ttlworks wrote:
I assume that the two PCBs were sold without any schematics...

Yes.

ttlworks wrote:
Sorry about that: I grew up with Commodores, but I'm not familiar with Apples.

That's ok :-) My first contact with PCs was a PET 2000. But later at the university we had a lot of Apple IIs, and so I bought my first one. Two PETs are now waiting to be cleaned and repaired.

I just mentioned these addresses for readers like you, to help in understanding this TTL CPU within the world of Apples.

ttlworks wrote:
Which UART chip are you going to use ?

6850, because its limit is 1 Mb/sec on the serial line. The 8530 is a bit oversized.

In 1988 I soldered this during a rainy weekend (or so :-) ) using some chips which were available at that moment (no LS161):

Image
Image

The base was an EPROM card AP64e with a 6821. Several pins are similar to the 6850. That card was good for 38400 and 57600 Baud depending on the oscillator.
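
How the oscillator sets the baud rate on a 6850 can be sketched in Python. The 6850 divides its transmit/receive clock by 1, 16 or 64 (selected in the control register); which divide ratio was used on this card is not stated above, so the divide-by-16 default below is an assumption, and the function names are invented:

```python
# 6850 ACIA: baud rate = clock / divide ratio, with ratio 1, 16 or 64.
DIVIDE_RATIOS = (1, 16, 64)

def acia_baud(clock_hz: int, divide: int) -> float:
    if divide not in DIVIDE_RATIOS:
        raise ValueError("6850 supports divide ratios 1, 16 and 64 only")
    return clock_hz / divide

def clock_for_baud(baud: int, divide: int = 16) -> int:
    """Oscillator frequency needed for a baud rate (divide-by-16 assumed)."""
    if divide not in DIVIDE_RATIOS:
        raise ValueError("6850 supports divide ratios 1, 16 and 64 only")
    return baud * divide

print(clock_for_baud(38400), clock_for_baud(57600))
print(acia_baud(921600, 1))
```

Under that assumption, the 921.6 kHz clock that gives 57600 Baud in divide-by-16 mode would give 921600 Baud in divide-by-1 mode, which stays below the 1 Mb/sec limit mentioned above.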

Regards
Ralf

