Posted: Fri Feb 19, 2010 10:08 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
I don't think anyone's trying to have an argument; I think it's just a disconnect. By '6502' we think we know what we mean, but once we start changing the architecture there are a variety of things we might mean.

One question is whether or not the CPU is clocking faster than the main memory. A cache only makes sense(*) if there's a cost in going to main memory. An access by a fast CPU to cached zero-page still takes extra cycles compared to register access, but it's less time than accessing main memory.

In custom chips (as opposed to FPGAs) there is indeed a time cost associated with distance, and associated with size. The latency to cache, and to more distant larger caches, is certainly greater than latency to registers or to nearer cache. Here's a nice recent exploration and discussion.

Your average enormous x86 processor takes some care to prefetch and to model sequential strided accesses. See this patent for example. Imagine verifying that.

Back to earth though, if we compare TXA with LDA $55, we have (on 6502)
1. fetch (TXA)
2. execute (A is updated)
1. fetch (next instruction)
versus
1. fetch (LDA)
2. operand ($55)
3. read (zero page is read)
1. fetch (next instruction, while A is updated)
So a zero-page load has to take 3 cycles because it has to read 3 bytes: the opcode, the operand and the data. That's 3 memory accesses, which in the cached case is an opportunity for 3 cache hits, and that could be very significant.
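To put rough numbers on that comparison, here is a minimal Python sketch. It assumes a very simple model in which every memory access costs a fixed number of fast-CPU clocks: `cache_cost` on a hit, `memory_cost` on a trip to main memory. Those names and the figures 1 and 4 are made up for illustration and are not measurements of any real design.

```python
# Cycles per instruction as listed above; "internal" marks a cycle that
# does not need a byte from memory in this simplified model.
TXA = ["fetch opcode", "execute (internal)"]
LDA_ZP = ["fetch opcode", "fetch operand", "read zero page"]


def memory_accesses(cycles):
    """Count the cycles that need a byte from memory."""
    return sum(1 for c in cycles if "internal" not in c)


def cost(cycles, access_cost, internal_cost=1):
    """Total clocks if every memory access costs `access_cost` clocks."""
    n_mem = memory_accesses(cycles)
    n_int = len(cycles) - n_mem
    return n_mem * access_cost + n_int * internal_cost


cache_cost, memory_cost = 1, 4  # assumed, illustrative figures only

for name, cycles in (("TXA", TXA), ("LDA $55", LDA_ZP)):
    print(name,
          "uncached:", cost(cycles, memory_cost),
          "all hits:", cost(cycles, cache_cost))
```

With these assumed costs the three accesses of LDA $55 dominate its run time, which is why the hit rate on opcode and operand fetches matters as much as the hit on the zero-page read itself.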

(Again, absurdly complex x86 machines will cache pre-decoded instructions, such that 'LDA $55' would be a single fetch from a warm cache. I assume we're not thinking of implementing that.)

(*) Edit: OK, perhaps with additional complexity, the write back of results to cache could share a cycle with the fetch of next instruction. By 'cache' we'd normally mean a write is (eventually) written to memory. Perhaps the confusion arises because the proposal is more like an overlay: zero-page stays private to the FPGA and writes are never made to that page of main memory? There's still a question of operand fetch.


Last edited by BigEd on Fri Feb 19, 2010 10:21 am, edited 1 time in total.

Posted: Fri Feb 19, 2010 10:15 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
Ruud wrote:
In the basic version the EPROM has 14 inputs:
- 4 clock bits
- 8 opcode bits
- 1 bit telling whether a branch should be taken or not
- 1 bit for telling there is a Reset/IRQ/NMI going on
The eight outputs are connected to the clock inputs, Output Enable, latch inputs etc. of the various TTL ICs. So far I need NINE EPROMs for the instruction decoder, not counting two for the ALU and two for some other trick work. Dieter's TTL-CPU uses only four, and I wonder why so few?


Nice explanation! The instruction decode PLA in the 6502 also has lots of outputs: the PLA is followed by random (unstructured) logic to control the many pass transistors in the datapath, and to control the ALU. If you have a control+datapath design, with a similar set of registers, muxes and ALU capability, I think you're bound to have a similarly large number of control signals, so you'll need a very wide EPROM (or a collection of EPROMs).

I haven't looked into Dieter's design files. But if he uses the ALU to increment PC he certainly has a very different datapath, and must surely take more cycles in many cases. It's a space versus time tradeoff.
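As a rough illustration of the EPROM decoder in the quote above, the Python sketch below packs the 14 inputs into one address and works out the resulting sizes. The bit order and the names `eprom_address`, `STEP_BITS` and so on are assumptions for the example; the quote only gives the number of bits of each kind.

```python
STEP_BITS, OPCODE_BITS = 4, 8  # "4 clock bits" and "8 opcode bits"


def eprom_address(step, opcode, branch_taken, interrupt):
    """Pack the 14 decoder inputs into one EPROM address.

    The bit order (step in the low bits, interrupt flag on top) is an
    assumption; any fixed order works as long as the wiring matches.
    """
    assert 0 <= step < (1 << STEP_BITS)
    assert 0 <= opcode < (1 << OPCODE_BITS)
    addr = step
    addr |= opcode << STEP_BITS
    addr |= int(branch_taken) << (STEP_BITS + OPCODE_BITS)
    addr |= int(interrupt) << (STEP_BITS + OPCODE_BITS + 1)
    return addr


ADDRESS_BITS = STEP_BITS + OPCODE_BITS + 2  # 14 inputs in total
WORDS_PER_EPROM = 1 << ADDRESS_BITS         # 16384 bytes per EPROM
CONTROL_SIGNALS = 9 * 8                     # nine EPROMs, eight outputs each

print(ADDRESS_BITS, WORDS_PER_EPROM, CONTROL_SIGNALS)
print(hex(eprom_address(step=2, opcode=0xA5, branch_taken=False, interrupt=False)))
```

Every EPROM sees the same 14-bit address, so adding control signals only means adding EPROMs in parallel; the depth of each one stays at 16384 bytes.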


Posted: Fri Feb 19, 2010 8:40 pm
Joined: Sun Feb 13, 2005 9:58 am
Posts: 85
I found this diagram very clear: http://www.weihenstephan.org/~michaste/pagetable/6502/6502.jpg (from another topic discussed in this forum). It shows that PC = PC + 1 is not done via the ALU.
I've been working on it for just a month; in my opinion it's the key to developing a good TTL (or whatever you like) clone.


Posted: Sat Feb 20, 2010 9:05 pm
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
ptorric wrote:

According to that picture the 6502 has a 21*130 ROM. Mine is 14*64 (maybe 72, I'm working out an idea).
ptorric wrote:
I've been working on it for just a month; in my opinion it's the key to developing a good TTL (or whatever you like) clone.

Keep us informed, please!

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sat Feb 20, 2010 10:19 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
If you've got 72 control signals, that matches the diagram!

That "ROM" isn't a ROM, it's a PLA. Better to think of the "ROM"+logic as 14 inputs and 72 outputs - which sounds like exactly what you have!

(A 14-input 72 output ROM would of course be enormous: the 6502's PLA exploits regularity in the instruction encoding to pull off the low transistor count. You can bet those 130 intermediate signals were chosen very carefully.)
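To show how lopsided that comparison is, here is a small back-of-envelope calculation in Python. The PLA figure counts one crosspoint per (input line, product term) pair using the 21*130 dimensions read off the diagram; that is a deliberate simplification and leaves out the logic that follows the PLA.

```python
INPUTS, OUTPUTS = 14, 72

rom_words = 2 ** INPUTS         # one word per input combination
rom_bits = rom_words * OUTPUTS  # every word stores all 72 outputs

# Dimensions taken from the diagram discussed above; treating the PLA as
# a plain grid of crosspoints is an assumption made for the comparison.
pla_lines, pla_terms = 21, 130
pla_crosspoints = pla_lines * pla_terms

print(rom_bits, pla_crosspoints, rom_bits // pla_crosspoints)
```

Under these assumptions the full ROM needs a few hundred times more storage sites than the PLA grid has crosspoints.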


Posted: Sun Feb 21, 2010 8:11 am
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hello everyone,

Yesterday I started writing a program in Pascal that should calculate the output of every EPROM for every given situation. I first calculated the value of every output when at rest. This means that I only have to take care of the active elements.

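const
  cb00 : byte = $AC; { standard (resting) output for 1st EPROM }

var
  b00 : byte;
.
.
b00 := b00 or 2;    { reset step counter: set bit 1 }
b00 := b00 and $DF; { reset IRQ flipflop: clear bit 5 only ($DF = not $20); a plain "and $20" would clear every other bit }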
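The "resting value plus active elements" idea in the fragment above can be sketched in Python as follows. The bit assignments are hypothetical and chosen only to match the two lines of the fragment, on the assumption that bit 1 idles low and bit 5 idles high in the resting value $AC.

```python
REST = 0xAC  # resting output of the first EPROM, from the fragment above

# Hypothetical bit assignments, chosen only to match the fragment.
STEP_RESET_BIT = 1    # idles low, so asserting it means setting the bit
IRQ_FF_RESET_BIT = 5  # idles high, so asserting it means clearing the bit


def assert_signal(value, bit):
    """Flip one control bit away from its current level."""
    return value ^ (1 << bit)


def eprom_byte(active_bits):
    """Start from the resting byte and assert only the listed signals."""
    value = REST
    for bit in active_bits:
        value = assert_signal(value, bit)
    return value


print(hex(eprom_byte([])))
print(hex(eprom_byte([STEP_RESET_BIT, IRQ_FF_RESET_BIT])))
```

Flipping relative to the resting byte means the generator never has to know whether a given signal is active high or active low; that knowledge lives entirely in the resting value.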

Using only an 8-bit internal data bus I need 8 cycles to perform an IRQ or BRK :( IMHO things go wrong at the moment I load the PC with the value $FFFE. I now need two steps; if it could be done in one, I would have my seven steps. But one step means moving 16 bits, not 8.
I used to have a circuit that could do it, but it required an extra 16-bit internal bus. Dropping that bus caused the extra step.

Is there anybody willing to have a look at this and come up with some ideas, please? Thanks!

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sun Feb 21, 2010 9:13 am
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hello everyone,

Because of a dead car battery we have to skip church, so I have some time left to expand on the above message. As said before, I now need 8 cycles to perform an IRQ, NMI or BRK instead of the 7 mentioned in the data sheets:

1) Load Instruction decode with 'opcode'
2) Save HB PC to Stack { PC = Program Counter }
3) Save LB PC to Stack
4) Save Status register to Stack
5) Load LB TPC with $FE { TPC = Temporary PC }
6) Load HB TPC with $FF
7) Load LB new address into PC
8) Load HB new address into PC
9) Reset counter etc. -> becomes 1) of 1st instruction of IRQ/NMI routine.

As I also said, the only way IMHO to save a step is to combine steps 5) and 6). That is only possible by feeding the TPC with the address $FFFE in one go, but that means other or extra hardware. Ideas are very welcome!
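The step count above can be written down as data, which makes the effect of merging the two TPC loads easy to check. This Python sketch only restates the list; the step wording and the helper name `merge_vector_setup` are made up for the example, and it says nothing about how the merged load would be built in hardware.

```python
# The steps listed above. Step 9 is left out because it is already the
# first cycle of the interrupt routine.
EIGHT_BIT_BUS = [
    "load instruction decoder with 'opcode'",
    "push PC high byte",
    "push PC low byte",
    "push status register",
    "load TPC low with $FE",
    "load TPC high with $FF",
    "load PC low from vector",
    "load PC high from vector",
]


def merge_vector_setup(steps):
    """Fold the two 8-bit TPC loads into one 16-bit load of $FFFE."""
    merged = []
    for step in steps:
        if step.startswith("load TPC low"):
            merged.append("load TPC with $FFFE (16 bits at once)")
        elif step.startswith("load TPC high"):
            continue  # absorbed into the 16-bit load above
        else:
            merged.append(step)
    return merged


print(len(EIGHT_BIT_BUS), len(merge_vector_setup(EIGHT_BIT_BUS)))
```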

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sun Feb 21, 2010 10:00 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
Have you had a look at Beregnyei Balazs' giant schematic?

It's not quite complete - and perhaps not quite correct - but it does explain a few things.

In particular, the vector values are forced onto the address bus by a few gates right next to the pins: the machine doesn't load PCH and PCL with the vector addresses. As things are generally pre-charged high, that just means conditionally pulling down a few of the low bits. Have a look half-way up the right-hand side.

In fact I think we may have mentioned this before: on the block diagram these pulldowns are in the box OPEN DRAIN MOSFETS.
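The arithmetic behind "conditionally pulling down a few of the low bits" can be sketched in Python. This models only the resulting addresses, not the actual circuit; the table `PULLDOWNS` and the function name are assumptions made for the example.

```python
PRECHARGED = 0xFFFF  # every address line starts out high

# Low address bits pulled down for the first (low) byte of each vector.
PULLDOWNS = {
    "NMI": 0b101,    # $FFFA
    "RESET": 0b011,  # $FFFC
    "IRQ": 0b001,    # $FFFE
}


def vector_address(source, high_byte=False):
    """Address driven onto the bus for one byte of the given vector."""
    mask = PULLDOWNS[source]
    if high_byte:
        mask &= ~0b001  # release bit 0 to reach $FFFB / $FFFD / $FFFF
    return PRECHARGED & ~mask


for source in PULLDOWNS:
    print(source,
          hex(vector_address(source)),
          hex(vector_address(source, high_byte=True)))
```

Only the three lowest address bits ever differ from the precharged value, so no 16-bit constant has to travel over an internal bus to reach a vector.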


Posted: Sun Feb 21, 2010 8:47 pm
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hello everyone,

My idea was to build the TTL-6502 on 4, maybe 5, Eurocards. By rearranging the design I managed to solve my problem of getting from 8 to 7 steps.
BigEd's remarks made me have a look at those schematics again, and I realised that my "Static Addresses EPROMs" emulate the pull-down FETs. His words also triggered another idea. From the very beginning of my design I loaded the NMI/Reset/IRQ vector ($FFFA/C/E) into a counter and incremented the counter to get the second byte of the routine's address. Outputting the second address ($FFFB/D/F) is something my "Static Addresses EPROMs" can do as well!

I'm back on track again!

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Fri Mar 05, 2010 6:16 pm
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hello everyone,

As promised I uploaded the schematics and updated my TTL-6502 page: http://www.baltissen.org/newhtm/ttl6502.htm

You can find the schematics as PNG at http://www.baltissen.org/zip/6502-png.zip

You can find the schematics as Eagle files at http://www.baltissen.org/zip/6502-sch.zip

If you don't understand something, please feel free to ask me about it. Any comment is welcome!

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sat Mar 13, 2010 12:33 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
Hi Ruud
Thanks for posting your schematics.

I noticed that Steve Chamberlin blogged (in 2007) about the development of his Big Mess o' Wires project, and drew up a table of his cycle counts.

As a matter of interest, do you have any mismatches any more with the 6502?

(There's also an interesting post about the difficulty of re-using the ALU for address calculation.)

(Document on cycle-by-cycle timings from the VICE team)

Cheers
Ed


Posted: Mon Mar 15, 2010 8:54 am
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
BigEd wrote:
do you have any mismatches any more with the 6502?

Not anymore. So far I have solved all timing problems. For some instructions, like TAX, TYA, etc., I only need one cycle. But I ran into a hardware problem (see the "ORA ($xx,X)" thread) that I need to solve first. If it can only be solved the way I think, then as a bonus I can speed up other instructions as well.

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Thu Mar 18, 2010 7:23 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8541
Location: Southern California
On the first page of this topic, I wrote:
Quote:
Not too long ago there was something about a home-made 6502 on very nicely done multilayer boards that plug together, with rows of T1 LEDs which I assume were for status and debugging. It was not wire-wrapped. It was very impressive, even though I think the author said he never really did finish it. I can't find it now but maybe this will jog someone's memory for what to put in a search string. If the author can be found, he would probably be happy to give you all the info to give you a head-start so he can see it come to reality.

I found it. It's at http://web.whosting.ch/dieter/trex/trex.htm . It was not the 6502 I was thinking of, but a very impressive 32-bit TTL CPU nevertheless. The board size appears to be 6U VME.

He also has this article on implementing the V flag:
http://web.whosting.ch/dieter/v_flag/v_0.htm

and this one on BCD in TTL:
http://web.whosting.ch/dieter/bcd/bcd_0.htm

Edit, 7/24/12: This was Dieter Mueller's work, and he now has updates at http://6502.org/users/dieter/index.htm .

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Thu Mar 18, 2010 7:52 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10980
Location: England
That is nice. And, I see he's used line-segments on the Names layer in Eagle to draw a block diagram: maybe it was tedious to do, but it serves the purpose. I've completely failed to find a drawing program which offers fat hollow arrows for busses.

Wonderful hobby timeline too.


Posted: Sat Apr 10, 2010 10:03 am
Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hello everyone,

The hardware of the TTL-6502 enables me to execute quite a few instructions faster than the original 6502. For example: some instructions, like branches, need an extra cycle when a page boundary is crossed. The TTL-6502 doesn't. The question is: is that bad?

My VIC-20 runs without any problem with various 65xx's like the 6502, 65C02 and the 65816 (AFAIK they are cycle exact anyway). But I cannot see any reason why a slightly faster 6502 could hurt in one way or another.

Your opinion is appreciated!

But FYI: with a bit more effort I think I can make it cycle exact.
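For reference, the branch timing in question can be sketched in Python. The first function follows the documented NMOS 6502 rule (2 cycles not taken, 3 taken, 4 taken across a page boundary). The second is only a hypothetical model of a core without the page-cross cycle; the count of 3 for a taken branch there is an assumption, not a figure from the TTL-6502 schematics.

```python
def branch_cycles_6502(pc, offset, taken):
    """Cycle count of a relative branch on the NMOS 6502.

    `pc` is the address of the branch opcode and `offset` the signed
    8-bit displacement, measured from the following instruction.
    """
    if not taken:
        return 2
    next_pc = (pc + 2) & 0xFFFF
    target = (next_pc + offset) & 0xFFFF
    crossed = (next_pc & 0xFF00) != (target & 0xFF00)
    return 4 if crossed else 3


def branch_cycles_no_penalty(pc, offset, taken):
    """Hypothetical core that never spends a page-cross cycle."""
    return 3 if taken else 2


for pc, offset in ((0x1000, 0x10), (0x10F0, 0x20), (0x1000, -4)):
    print(hex(pc), offset,
          branch_cycles_6502(pc, offset, True),
          branch_cycles_no_penalty(pc, offset, True))
```

The two models differ only for taken branches that cross a page, so software that counts cycles in a timing loop is affected only if such a branch sits inside the loop.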

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org


