6502.org Forum

All times are UTC




 Post subject: Re: Minimalistic CPU
PostPosted: Mon Jul 23, 2012 7:34 pm 

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1043
Location: near Heidelberg, Germany
Dajgoro wrote:
My issue is that I don't know how to connect a component inside a process, or something like that.


If I get your question right, you need to use signals in the process, and connect the signals to the component outside the process.

André

_________________
Author of the GeckOS multitasking operating system, the usb65 stack, designer of the Micro-PET and many more 6502 content: http://6502.org/users/andre/


Post subject: Re: Minimalistic CPU
PostPosted: Mon Jul 23, 2012 7:36 pm

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
I updated my previous post.
I'll try to explain the concept behind this CPU attempt.
There are four 4-bit registers (r0, r1, r2, r3), but they can be combined into two 8-bit registers.
The 4-bit registers are selected with the values A and B shown in the sketch. Some instructions require pointers, though, and for those the registers are combined into 8-bit registers, with the pointer bit selecting which of the two is addressed. The ALU is 8 bits wide, but it can also work with only 4 bits. The idea is that in the fetch phase it loads the program counter, increments it by 1, and stores it back; doing so should save 4 bits on the program counter increment circuit.
The data bus is also 8 bits wide, and instructions are 8 bits. The idea was to have 4-bit and 8-bit load/store instructions, but that depends on how much I can shrink the design to fit in the XC9572.
The other idea that should make this CPU different is that it should be able to work as a bit slice. I still need to implement this, but the idea is to add outputs that share the flags, cin and cout from the ALU, plus an external program counter increment and overflow output. So by putting two of these CPUs together, you should get a proper 8-bit CPU with a 16-bit address space, and so on, but it would work in an awkward way...
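
The bit-slice idea can be sketched roughly as follows. This is only a behavioural model in Python, not HDL. The names `slice_add`, `add8_from_slices` and `increment_pc` are made up for illustration, and it assumes each slice exposes nothing more than a carry-in and a carry-out. Treating the program counter increment as two 4-bit steps is one possible reading of the fetch phase described above, not a confirmed detail of the design.

```python
def slice_add(a, b, cin):
    """One 4-bit slice: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + (cin & 1)
    return total & 0xF, total >> 4


def add8_from_slices(x, y):
    """Chain two 4-bit slices into an 8-bit adder by wiring cout to cin."""
    lo, carry = slice_add(x & 0xF, y & 0xF, 0)
    hi, carry = slice_add(x >> 4, y >> 4, carry)
    return (hi << 4) | lo, carry


def increment_pc(pc):
    """Hypothetical fetch-phase update: 8-bit PC incremented in two 4-bit steps."""
    lo, carry = slice_add(pc & 0xF, 0, 1)   # low nibble + 1
    hi, _ = slice_add(pc >> 4, 0, carry)    # high nibble + carry from low nibble
    return (hi << 4) | lo
```

The only coupling between the two slices is the single carry wire, which is what would let two identical parts sit side by side.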


Post subject: Re: Minimalistic CPU
PostPosted: Mon Jul 23, 2012 7:39 pm

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
fachat wrote:
Dajgoro wrote:
My issue is that I don't know how to connect a component inside a process, or something like that.


If I get your question right, you need to use signals in the process, and connect the signals to the component outside the process.

André


Yes, but I don't see how the ALU will then have time to calculate anything. The signals become visible only after the process block is done, which means after phase 1 goes to 0. Then phase 2 comes and expects a result from the ALU, so where does the ALU get time to calculate anything? The other issue is that I would then need input latches for the ALU too, and that means a lot more macrocells...
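
A toy model may help with the timing question. In VHDL, a signal assigned inside a process takes its new value when the process suspends, and a combinational component wired to that signal then re-evaluates on its own. The Python sketch below mimics that behaviour. The class `TwoPhaseModel` and its members are invented for illustration, the 4-bit add is only a stand-in for the real ALU, and `op_a`/`op_b` stand for whatever register outputs feed the ALU.

```python
class TwoPhaseModel:
    """Toy model: registers update on phase edges, the ALU is purely combinational."""

    def __init__(self):
        self.op_a = 0      # signal driven by the phase-1 process
        self.op_b = 0      # signal driven by the phase-1 process
        self.result = 0    # register written by the phase-2 process

    def alu_out(self):
        # Combinational ALU outside any process: its output always follows
        # the current operand signals (4-bit add used as a stand-in).
        return (self.op_a + self.op_b) & 0xF

    def phase1(self, a, b):
        # The new operand values become visible once this "process" finishes.
        self.op_a, self.op_b = a & 0xF, b & 0xF

    def phase2(self):
        # The ALU has had the whole gap between the two phases to settle.
        self.result = self.alu_out()


cpu = TwoPhaseModel()
cpu.phase1(5, 9)
cpu.phase2()
print(cpu.result)  # 14
```

In this picture the ALU does its work in the interval between the end of phase 1 and the edge of phase 2, so the question becomes whether that interval is longer than the ALU's propagation delay.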


Post subject: Re: Minimalistic CPU
PostPosted: Tue Jul 24, 2012 3:32 am

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Dajgoro:

Andre Fachat makes a good suggestion regarding the inverter before the ALU to implement subtraction.

I was in the process of writing a missive on the CPLD architecture, and lost it. I won't go back and reconstruct it, but I do suggest a thorough familiarization with the architecture of a CPLD. It is best thought of as a set of macrocells, each consisting of an AND array feeding an OR gate which feeds into a configurable FF. The macrocells generally consist of 18 cells. There are several advantages to the CPLD over a RAM-based FPGA, but the generally accepted distinguishing feature is the ability of a CPLD to implement wider logic functions in fewer logic levels. Ignoring the fact that most FPGA logic cells have sub-nanosecond performance, fewer logic levels will result in less delay.

The disadvantage that a CPLD has with respect to an FPGA is the lack of FFs and routing resources. Vendors have worked hard to include routing resources in CPLD architectures, and in most cases there are no constraints on pin placement. However, the basic architecture of a CPLD logic cell is still that of the 22V10 PAL first introduced by Monolithic Memories and AMD in the early 80s. That means that the CPLD places its emphasis on the AND-OR array in order to implement wide logic functions in two levels of logic, which a RAM-based FPGA may require as many as 10-20 levels to implement.

That advantage is not as significant as you'd like, particularly when implementing counters, ALUs, and bidirectional busses. Consider the number of terms required by each bit in a simple counter. The first bit is a simple toggle of the output, i.e. an inverter. The second bit is a bit more complicated, but it is at least two inputs wide. As the counter grows, the number of inputs to each stage grows and is equal to its bit number. Simple functions don't make effective use of the AND-OR array, which can't generally be shared with functions requiring more OR terms. Complex functions such as counters greater than 8 bits in width require more OR terms than are generally available, and so must be expanded in some manner in order that they may be correctly implemented.
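
The growth of the counter logic can be illustrated with a short Python sketch. It assumes a plain binary up-counter built from D-type registers with each bit written as a flat sum of products, which is only one of several ways a fitter might map it. The function names are hypothetical.

```python
def d_type_cost(n):
    """(inputs, product terms) for bit n (0-based) as a D-type sum of products.

    D_n = Q_n XOR (Q_0 & ... & Q_{n-1}) expands to n terms that hold the bit
    while some lower bit is 0, plus one term that sets it, and it depends on
    bits 0..n.
    """
    return n + 1, n + 1


def next_state_from_toggles(state, width):
    """Advance the counter using only the per-bit toggle rule."""
    nxt = 0
    for n in range(width):
        lower_all_ones = all((state >> k) & 1 for k in range(n))
        bit = (state >> n) & 1
        nxt |= (bit ^ lower_all_ones) << n
    return nxt


# Check the toggle rule against ordinary +1 arithmetic for a 4-bit counter.
WIDTH = 4
assert all(next_state_from_toggles(s, WIDTH) == (s + 1) % (1 << WIDTH)
           for s in range(1 << WIDTH))

print([d_type_cost(n)[1] for n in range(10)])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Under these assumptions the ninth bit (index 8) is the first to need more than 8 product terms, which lines up with the remark about counters wider than 8 bits.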

A CPLD generally has an OR gate that allows it to accept 8 AND terms. In terms of your minimal CPU, this means that you need to ensure that your logic functions are 8 (or whatever the architecture provides) terms wide or less. If not, the synthesizer will require the use of terms imported from adjacent cells, which will generally limit the logic function that that cell can implement.

I am not a particular fan of either VHDL or Verilog (better) because they tend to hide the architecture from the designer. Generally this allows the designer to focus on the more important architectural elements of the design. However, I've yet to work with a synthesizer that informs the designer of the effects his/her code has with respect to the underlying architecture.

I misunderstood the thrust of your earlier post. The block diagram that you posted earlier clears up your objectives very well. The issue I see for you is that you have an external bidirectional bus that you want to implement. It's been my past experience with the 95144, 95108, and 9536 parts that you have to keep the number of (internal/external) busses down to a minimum. The inter-macrocell routing resources are at a premium in these parts. Within a macrocell, the routing resources are quite extensive, and you can readily implement any function that stays within the OR width limit and the cell limit of the macrocell, i.e. 8 and 18, respectively. (In the 9572XL, there are 54 inputs into each macrocell of 18 cells, and 18 outputs from the macrocell into the fast interconnect matrix.)

Thus, from your block diagram, I think that you will need to try and limit the ALU to fit into a single macrocell, or 18 logic cells. Similarly, the register bank should fit into a macrocell. The AR and PC will need to be implemented in the remaining macrocells. The register bank, 4x4 or 16 logic cells, should be fairly easy to constrain to fit as suggested. It appears that you want the PC and AR to be 8-bit registers. If my count is correct, you can expect to require 16 registers (logic cells) for the register array, and 16 registers for the PC/AR combination. This leaves approximately 36 cells (two macrocells) for the ALU and the sequencer. I am assuming that the 4 cells left over in the other macrocells will not be available.
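
The cell budget above works out as follows. This is just the arithmetic from the paragraph written as a Python sketch, using "macrocell" and "cell" in the same sense as the post. The variable names are illustrative, and the assumption that each 16-cell group occupies its own macrocell is taken from the text.

```python
BLOCK_SIZE = 18          # cells per macrocell, as described above
BLOCKS = 4               # 4 x 18 = 72 cells in total for the XC9572
TOTAL = BLOCK_SIZE * BLOCKS

register_bank = 4 * 4    # four 4-bit registers
pc_and_ar = 8 + 8        # 8-bit PC plus 8-bit AR

# Each 16-cell group is assumed to occupy its own macrocell, stranding 2 cells.
stranded = 2 * (BLOCK_SIZE - 16)
left_for_alu_and_sequencer = TOTAL - register_bank - pc_and_ar - stranded

print(TOTAL, stranded, left_for_alu_and_sequencer)  # 72 4 36
```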

The key to reducing the number of cells required for the ALU is to minimize the functions you require. For example, an ADD instruction is vital, but subtraction is not because it can be accomplished by addition with the complement of one input. In addition, AND and OR functions are vital but NAND and NOT are not. I would recommend XOR instead.
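
Subtraction by adding the complement looks like this in a small Python model of a 4-bit datapath. The helper names `add4` and `sub4` are made up for the example. It uses the two's complement identity a - b = a + (NOT b) + 1, with the "+ 1" supplied through the carry input.

```python
MASK = 0xF  # 4-bit datapath


def add4(a, b, cin=0):
    """4-bit adder: returns (sum, carry out)."""
    total = (a & MASK) + (b & MASK) + cin
    return total & MASK, total >> 4


def sub4(a, b):
    """a - b done as a + (NOT b) + 1, reusing the adder."""
    return add4(a, ~b & MASK, 1)


print(sub4(9, 3))  # (6, 1): carry out of 1 means no borrow
print(sub4(3, 9))  # (10, 0): wrapped result, carry out of 0 means borrow
```

The only extra hardware over a plain adder is the inverter on one operand and control of the carry input.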

Thus, I recommend that you reconsider your instruction set with the idea of minimizing the ALU to fit within 8-12 cells: four cells for ADD; four cells for AND, OR, and XOR; and four cells for ASR, ROL. After this is achieved, I would then consider simply adding an increment (INC) instruction for PC management.
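
For reference, a behavioural Python sketch of such a reduced ALU is given below. It is a model for checking expected results, not a statement about how the operations would map to cells. The function name `alu`, the string opcodes, and the choice to make ROL rotate through carry (as on the 6502) are assumptions made for the example.

```python
MASK = 0xF  # 4-bit operands


def alu(op, a, b=0, cin=0):
    """Return (result, carry) for one 4-bit ALU operation."""
    a &= MASK
    b &= MASK
    if op == "ADD":
        total = a + b + cin
        return total & MASK, total >> 4
    if op == "AND":
        return a & b, 0
    if op == "OR":
        return a | b, 0
    if op == "XOR":
        return a ^ b, 0
    if op == "ASR":  # arithmetic shift right: sign bit is kept, bit 0 -> carry
        return (a >> 1) | (a & 0x8), a & 1
    if op == "ROL":  # rotate left through carry: cin -> bit 0, bit 3 -> carry
        return ((a << 1) | cin) & MASK, a >> 3
    raise ValueError(f"unknown op {op!r}")
```

Note that `alu("XOR", a, 0xF)` gives the bitwise NOT of `a`, which is one reason a separate NOT operation can be dropped.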

Following that effort, I would look to defining the load and store instructions. I recommend that you have two loads, load from memory (LD) and load constant (LDC), and a single store (ST) instruction.

I see where you desire to have a general register based architecture, which is laudable. But for such a minimal CPU, I recommend that you reconsider an accumulator based architecture. The 65C02 architecture is a good example. If there were exchange instructions instead of transfer instructions, then the architecture would be more flexible. The problem with your register based architecture in the small space of the 9572 is the number of logic cells that will be needed to construct the number of multiplexers required for the two operand architecture that your block diagram implies. The implicit address of the accumulator means that only a single multiplexer is generally required on the ALU inputs since the accumulator is always one of the inputs.

Furthermore, I did not see a subroutine call and return mechanism. Without it, I can't think that you will be satisfied with your CPU. (Even the microprogram controller that I used in implementing MAM65C02 core (see github) implements a subroutine/return mechanism, even though the MAM65C02 microcode did not make use of that capability. The point is that subroutines are the key feature of any useful computer/controller.) If you opt for an accumulator based architecture, then I recommend reducing the register bank from 4 registers to 2 registers, and using the 8 cells gained to provide at least a single subroutine level.

With all of that said, the sequencer may be a significant challenge to get to fit into the remaining 24 logic cells, and I've not even accounted for the PSW that will be more than likely necessary to allow you to extend your machine to support multi-word arithmetic operations. (One thought would be to create a BCD mode adder since carry would be matched to the ALU operand size of your machine.)
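
One reading of the BCD thought is that a 4-bit ALU holds exactly one decimal digit, so the carry between slices is also the decimal carry. A minimal Python sketch of a single BCD digit adder under that reading follows. The name `bcd_digit_add` is hypothetical, and inputs are assumed to be valid digits 0 to 9.

```python
def bcd_digit_add(a, b, cin=0):
    """Add two BCD digits (0-9) plus carry in; returns (digit, carry out)."""
    total = a + b + cin
    if total > 9:
        total += 6          # skip the six unused codes 0xA..0xF
        return total & 0xF, 1
    return total, 0


print(bcd_digit_add(7, 5))     # (2, 1), i.e. decimal 12
print(bcd_digit_add(9, 9, 1))  # (9, 1), i.e. decimal 19
```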

Andre's suggestion of an implementation in an XC95108 would be worth considering. As I stated above, you need to keep the number of busses to a minimum in order to successfully implement the processor in such a small CPLD. It is a good challenge, and I would recommend its completion because it would certainly hone your HDL skills and give you a high degree of satisfaction upon its completion.

_________________
Michael A.


Post subject: Re: Minimalistic CPU
PostPosted: Wed Jul 25, 2012 12:47 am

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
I agree with the idea of using NOT gates for subtraction, and with the points about the ALU. But the problem is that I don't know how to implement that in VHDL; that is what I was asking before. About the register based architecture: at first I was considering an accumulator based architecture, but I quickly found out that some things would get messy. Pointers, for example. Here I can use the registers as pointers, whereas in an accumulator based structure I would need to implement a state machine that loads the pointer from an external location, and the instruction scheme would not be so simple. When I decided to try this approach, I found that every instruction would take the same number of cycles, and no instruction would require special operations.
About the stack (call) issue, I was thinking about that, but it seems I am out of space for a proper stack... And there is a store instruction; actually, load and store are one instruction, with a bit that defines whether it is a load or a store.
As for the 108, I see it sold on eBay for $29, but shipping is $155. Since these 5V Xilinx CPLDs are rare, it is not likely that I am going to get one any time soon...


Post subject: Re: Minimalistic CPU
PostPosted: Wed Jul 25, 2012 4:37 am

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Again. Save often.

I am not particularly comfortable with VHDL. I will give your problem some thought and put a simple example together for you. However, it will be at least the weekend before I'll have some time to dedicate to it.

I realize that you are just getting started, and that VHDL is the more popular HDL in Europe. From the perspective of language complexity, VHDL is a bit more complex, and generally has a longer learning curve than Verilog. If you are familiar with C, and can keep SW constructs out of your HW source, then I recommend Verilog for a faster start. VHDL has some very powerful constructs for creating parameterized components and libraries of components which are not available in Verilog. Verilog's parameterization capabilities are powerful in their own way, but libraries are simply a collection of files.

The biggest stumbling block that I've found between VHDL and Verilog is that in VHDL there's an abstraction between the operations supported by a particular signal type and the operations of a logic type. For example, in most VHDL implementations, you will find a counter implemented with a signal having an integer type, which supports add, subtract, and comparison operators. However, leaving the signal as an integer type will not synthesize, but will result in an error message, unless a type conversion is performed to cast the integer variable to a signal having a standard logic vector type. There is a purpose to all these contortions, but I no longer subscribe to them as I once did.

I prefer working in Verilog because these type casting operations are not required. Verilog does have its own quirks that produce a number of errors and which can be difficult to identify and correct. In either language, you will have to demonstrate a certain level of attention to detail unless you are okay with spending a significant amount of time fiddling and debugging HDL code.

From a Verilog perspective, you can certainly learn a lot from Arlet Ottens' 6502 core, and the work that EEyE and Arlet have posted on github. In particular, Arlet's implementation of the 6502 ALU is particularly elegant and you can use it as a good example for portions of your minimal CPU. Arlet's implementation of the 6502 controller is one I would recommend to anyone. It is a very clean example of a well designed and implemented state machine. My implementation of the 65C02 ALU is also available on github, but my core's controller is based on an entirely different principle. Its implementation is not at all suitable for your CPLD implementation, but Arlet's implementation would be a fine model on which to base your CPU's controller. (I am sure that there are others on the forum that have also posted good code that you can use; I've only really looked in depth at Arlet Ottens' core, and its derivatives from EEyE and BigEd.)

If you've not received an example from me by Sunday, post a query on this thread or send me a PM as a reminder.

_________________
Michael A.


Post subject: Re: Minimalistic CPU
PostPosted: Wed Jul 25, 2012 5:02 am

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
Thanks for the help. The reason I use VHDL is that I had it in my college classes, so I already know the basics. The most complex thing I've done was a simple binary calculator with a state machine.


Post subject: Re: Minimalistic CPU
PostPosted: Mon Jul 30, 2012 2:54 am

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
MichaelM wrote:
Dajgoro:

Perhaps I'm off track here, but I think EEyE's correct about your ucf file syntax. You appear to be using a pin label syntax I've come to expect from BGA packages, not the TQFP or PLCC packages that I expect you to have purchased.

I have been working this weekend toward that CPLD minimal CPU implementation we discussed earlier in the week. The ALU portion is synthesizing and fitting into 27 macrocells. I got to this point using my 10.1i SP3 toolset without setting any constraints. Seeing your and EEyE's posts, I went into the tool and had it generate the ucf for the pins it chose to see the syntax. The following is a snippet from the ucf file produced by ISE 10.1 for an XC9572-7PC84 CPLD.

Code:
#PINLOCK_BEGIN

#Sun Jul 29 19:34:48 2012

NET "CE"             LOC =  "S:PIN70";
NET "Clk"            LOC =  "S:PIN9";


In the case of these pin locks, I see the syntax I would expect from a PLCC-84 package. In other words, I see a Pxx or Pinxx syntax instead of an Ax syntax.

I haven't used CPLDs in a long time. FPGAs have been much more applicable to my projects, and the decoding functions I used to use them for have essentially been built into the microprocessors that I've been applying. It is curious to note a special option available for the part type: XC95*, which essentially allows the tool to select the CPLD into which the design fits. I'll see how that setting affects the fitting of a minimal CPU into a CPLD.

Seeing how the ALU adder expanded my estimate for the ALU's adder/subtractor, I am going to predict that the minimal CPU will likely overflow the XC9572, and may require a 108 macrocell (as suggested by Andre Fachat) or even the larger 144 macrocell device. Earlier today, I found WP214, "The TTL Burn-Rate of Xilinx CPLDs", on the Xilinx website. It provides some interesting estimates for the macrocell requirements of standard TTL MSI components/functions.

When I have completed a minimal CPU in an XC95xxx component, and gotten a functional testbench running for it, I'll post it to github and send you a link.

As I said above, the ALU portion is coded and synthesizing, and I'm currently designing the control logic and execution state machine. I've defined an instruction set based on some of my current FPGA work. I'll wait until it's complete before claiming success, but given the synthesis and fitting results for the ALU logic, there's a good chance of that for the instruction set and CPU architecture I've defined.


I would prefer the 108 too, but as I said earlier it is very hard to find, not to mention in PLCC packages. Actually, I do have one that I got from eBay, and it was the last one. I planned to use it for the graphics module that I discussed a while ago, so I can only borrow it for tests.


Post subject: Re: Minimalistic CPU
PostPosted: Wed Aug 01, 2012 1:43 am

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
There are two solutions to this problem:
1. Use the 74LS181 as an external ALU, which is free since I already have one.
2. Buy the 108, plus the PCB adapter and pins...


Post subject: Re: Minimalistic CPU
PostPosted: Wed Aug 01, 2012 9:23 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I vote for the 108. That way you can experiment more with HDL.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Post subject: Re: Minimalistic CPU
PostPosted: Wed Aug 01, 2012 6:44 pm

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
Or I could buy the SMD 108 and the adapter and use it for the graphics module, and keep the PLCC for this project, where I can still add the external ALU if I wish.


Post subject: Re: Minimalistic CPU
PostPosted: Thu Aug 02, 2012 2:14 am

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
For clock generation, what do you recommend: the MC6875 or something else?


Post subject: Re: Minimalistic CPU
PostPosted: Thu Aug 02, 2012 2:28 pm

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
Yesterday I noticed an issue with the SMD version of the 108. You can see a picture of it on eBay:
http://www.ebay.com/itm/XC95108-10PQG10 ... 19d15e4015

And the only adapters that I found look like this:
http://cgi.ebay.com/ws/eBayISAPI.dll?Vi ... 1117510239

The issue is that the adapters are made for ICs with a square footprint, but the 108 is not square. So I would need another solution for soldering it.


Post subject: Re: Minimalistic CPU
PostPosted: Thu Aug 02, 2012 2:55 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Gonna be tough finding a cheap adapter of that type. I've seen them for around $50...

Is the 84-pin PLCC version of the '108 too small for your project?

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Post subject: Re: Minimalistic CPU
PostPosted: Thu Aug 02, 2012 4:02 pm

Joined: Mon Aug 08, 2011 2:48 pm
Posts: 808
Location: Croatia
ElEctric_EyE wrote:
Is the 84-pin PLCC version of the '108 too small for your project?


No, actually I would like to have it, but it is hard to find, and expensive.

