6502.org Forum

All times are UTC

Posted: Fri Oct 29, 2010 6:00 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
Hi Ruud
Another point about how many LUTs your design might use: the 6502 has an ALU and an incrementer for the PC. All the address indexing and stack pointer adjustment is done by the ALU. In NMOS, muxing is quite cheap, and so are tristate multi-master buses. In an FPGA, a bus mux might be free if the mux is simple and sits just in front of a register, or it might take up dedicated slices if not. Also, it would be easy to use many adders: for address indexing, for stack adjustment, even for inc/dec operations. This would increase the slice count, although it might be a tactic for a faster machine. It might also be easier to write, to debug, and to modify.
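
To make the trade-off concrete, here is a minimal Python model (not HDL, and not taken from any of the cores discussed) of two ways to form an indexed address: two passes through one shared 8-bit ALU, roughly as the 6502 does, versus one dedicated 16-bit adder. The function names and the pass counts are illustrative assumptions.

```python
def alu8(a, b, carry_in=0):
    """One pass through an 8-bit adder: returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def indexed_shared_alu(base, index):
    """base + index using the single 8-bit ALU twice (low byte, then high byte)."""
    lo, carry = alu8(base & 0xFF, index & 0xFF)
    hi, _ = alu8(base >> 8, 0x00, carry)  # second pass only propagates the carry
    return (hi << 8) | lo, 2  # (address, ALU passes used)


def indexed_dedicated_adder(base, index):
    """base + index using a separate 16-bit adder in a single step."""
    return (base + (index & 0xFF)) & 0xFFFF, 1


if __name__ == "__main__":
    print(indexed_shared_alu(0x12F0, 0x20))       # (4880, 2), i.e. address 0x1310
    print(indexed_dedicated_adder(0x12F0, 0x20))  # (4880, 1)
```

Both give the same address; the difference is one small adder plus muxing and an extra step, against a wider adder that finishes in one step.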

Cheers
Ed


Posted: Sat Oct 30, 2010 12:37 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
Hi EE
have you got Arlet's slice count right? The LUT count and slice count are generally somewhat related.

If you think it's worth the effort, you could perhaps check the synthesis report for these cores: it may well tell us number of add/subtracts or something else interesting.

I've got an old report here for a T65 build which says:
Code:
 Summary:
     inferred   1 Counter(s).
     inferred 145 D-type flip-flop(s).
     inferred  10 Adder/Subtractor(s).
     inferred   2 Comparator(s).
     inferred  56 Multiplexer(s).

and then later we see
Code:
HDL Synthesis Report

Macro Statistics
# ROMs                                                 : 2
 4x13-bit ROM                                          : 1
 4x2-bit ROM                                           : 1
# Adders/Subtractors                                   : 21
 16-bit adder                                          : 2
 16-bit subtractor                                     : 1
 5-bit subtractor                                      : 2
 6-bit adder                                           : 2
 6-bit subtractor                                      : 2
 7-bit adder                                           : 3
 7-bit subtractor                                      : 1
 8-bit adder                                           : 3
 8-bit addsub                                          : 1
 8-bit subtractor                                      : 1
 9-bit adder                                           : 3
# Counters                                             : 2
 26-bit up counter                                     : 1
 3-bit up counter                                      : 1
# Registers                                            : 93
 1-bit register                                        : 82
 2-bit register                                        : 2
 3-bit register                                        : 1
 4-bit register                                        : 1
 8-bit register                                        : 7
# Comparators                                          : 4
 3-bit comparator equal                                : 1
 3-bit comparator not equal                            : 1
 5-bit comparator greater                              : 2
# Multiplexers                                         : 40
 1-bit 4-to-1 multiplexer                              : 24
 1-bit 8-to-1 multiplexer                              : 1
 2-bit 32-to-1 multiplexer                             : 1
 2-bit 8-to-1 multiplexer                              : 5
 24-bit 4-to-1 multiplexer                             : 1
 3-bit 4-to-1 multiplexer                              : 6
 8-bit 6-to-1 multiplexer                              : 1
 8-bit 8-to-1 multiplexer                              : 1


Not sure what's the best way of digesting that - or if it's worthwhile - but if, say, Ruud was interested in comparing his implementation with another, he might look at those tables.

(The synth report is probably found in a file *.syr)
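
One possible way of digesting the table is to total the bit widths per macro category, so two cores can be compared on a few numbers. The Python sketch below assumes the "Macro Statistics" layout is exactly as quoted above (other XST versions may format it differently); the function name and the sample text are only illustrative.

```python
import re
from collections import defaultdict

# A few lines copied from the report quoted above, used as sample input.
SAMPLE = """\
# Adders/Subtractors                                   : 21
 16-bit adder                                          : 2
 16-bit subtractor                                     : 1
 5-bit subtractor                                      : 2
# Counters                                             : 2
 26-bit up counter                                     : 1
 3-bit up counter                                      : 1
"""

HEADER = re.compile(r"^#\s*(?P<name>.+?)\s*:\s*(?P<count>\d+)\s*$")
ITEM = re.compile(r"^\s*(?P<width>\d+)-bit\s+(?P<kind>.+?)\s*:\s*(?P<count>\d+)\s*$")


def tally_macro_bits(text):
    """Return {category: (instances, total_bits)} for the 'N-bit ...' lines.

    Lines that do not start with a plain 'N-bit' width, such as
    '4x13-bit ROM', are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    category = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            category = header.group("name")
            continue
        item = ITEM.match(line)
        if item and category is not None:
            count = int(item.group("count"))
            totals[category][0] += count
            totals[category][1] += count * int(item.group("width"))
    return {name: tuple(values) for name, values in totals.items()}


if __name__ == "__main__":
    for name, (instances, bits) in tally_macro_bits(SAMPLE).items():
        print(f"{name}: {instances} instances, {bits} bits in total")
```

Total bits is only a rough proxy for area, but it separates "many narrow adders" from "a few wide ones" better than the instance count alone.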

Cheers
Ed


Posted: Sat Oct 30, 2010 7:57 pm

Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hallo Ed,
BigEd wrote:
(The synth report is probably found in a file *.syr)

No *.syr found. I'm using Xilinx ISE Webpack. What are you using?

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sat Oct 30, 2010 8:15 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
Same software. Is it hidden in a subdir? If you're using the GUI, maybe there's a way to request or suppress a synthesis report.

In my case, I ran using the command line, with something like

Code:
xst -ifn T65.xst -intstyle xflow -ofn ./T65.syr


Maybe the report doesn't normally have that suffix, maybe it's encoded in my scripts (which I've inherited). Sorry. It should be there somewhere. If you have a *.xst file perhaps it specifies the report with '-ofn'?
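
If the report is hiding in a subdirectory, a short script can look for it. This Python sketch searches a project tree for *.syr files and, separately, pulls the name given after '-ofn' out of a command line like the one above. It assumes the report keeps the .syr suffix, which may not hold for every setup, and both function names are made up for the example.

```python
import shlex
from pathlib import Path


def find_synthesis_reports(root="."):
    """Return every *.syr file under root, including subdirectories."""
    return sorted(Path(root).rglob("*.syr"))


def report_from_command(command):
    """Return the argument following -ofn in an xst command line, or None."""
    args = shlex.split(command)
    for flag, value in zip(args, args[1:]):
        if flag == "-ofn":
            return value
    return None


if __name__ == "__main__":
    print(report_from_command("xst -ifn T65.xst -intstyle xflow -ofn ./T65.syr"))
    for path in find_synthesis_reports("."):
        print(path)
```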

Ed


Posted: Sun Oct 31, 2010 1:13 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
BigEd wrote:
Hi EE
have you got Arlet's slice count right? The LUT count and slice count are generally somewhat related....


Thank you for double checking. I see what you're talking about. I re-ran Arlet's 2 files and came up with the same result. He does mention on his site that BCD mode does not work. Maybe this accounts for the discrepancy?

BigEd wrote:
... I've got an old report here for a T65 build ...


What version(s) of ISE?

I, too, am looking for more comparative information that can be extracted from the reports, especially info relating to max speed. I won't be able to devote time to it until next week...

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Sun Oct 31, 2010 7:33 am

Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Ruud wrote:
No *.syr found.
Stupid me. I started rewriting RB65, haven't used ISE for almost two weeks and cleaned the directory. For details about the rewriting see André's 65k thread and later on my own RB65 one.

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sun Oct 31, 2010 11:41 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
ElEctric_EyE wrote:
BigEd wrote:
Hi EE
have you got Arlet's slice count right? The LUT count and slice count are generally somewhat related....


Thank you for double checking. I see what you're talking about. I re-ran Arlet's 2 files and came up with the same result. He does mention on his site that BCD mode does not work. Maybe this accounts for the discrepancy?


Hi EE - your summary.jpg for Arlet shows 276 occupied slices, but your table says 465. Is your summary.jpg out of date or have you copied the wrong number?

On the topic of looking at more detail of the complexity of a design, I see in your jpg files there's a table of blue links to reports at the bottom of the page, the first of which is 'Synthesis Report' - hopefully that's the same as the *.syr file I use.

Cheers
Ed


Posted: Sun Oct 31, 2010 12:05 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10797
Location: England
I've updated my tabulation (running it in parallel with EE's, hope that's not a problem) and I see that Arlet's is the only design which uses LUTs as 16-bit RAMs - this could be a reason why the slice count is by far the smallest.
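
For readers unfamiliar with the idea: a 4-input LUT is physically a 16 x 1-bit memory addressed by its inputs, so the same cell can act as logic or as a tiny RAM. The Python model below only illustrates that point; the class names and the byte-wide memory built from eight LUTs are assumptions for the example, not a description of Arlet's source.

```python
class Lut4:
    """A 4-input LUT modelled as 16 one-bit cells addressed by its inputs."""

    def __init__(self, init=0):
        self.cells = [(init >> i) & 1 for i in range(16)]

    def read(self, addr):
        return self.cells[addr & 0xF]

    def write(self, addr, bit):
        # Only meaningful when the LUT is configured as RAM.
        self.cells[addr & 0xF] = bit & 1


class Ram16x8:
    """Eight LUTs side by side give a 16-entry, 8-bit-wide memory."""

    def __init__(self):
        self.bits = [Lut4() for _ in range(8)]

    def write(self, addr, value):
        for i, lut in enumerate(self.bits):
            lut.write(addr, (value >> i) & 1)

    def read(self, addr):
        return sum(lut.read(addr) << i for i, lut in enumerate(self.bits))


if __name__ == "__main__":
    and4 = Lut4(init=0x8000)  # as logic: output is 1 only when all inputs are 1
    print(and4.read(0b1111), and4.read(0b0111))
    regs = Ram16x8()          # as RAM: eight LUTs hold sixteen bytes
    regs.write(2, 0xA5)
    print(hex(regs.read(2)))
```

Holding the same sixteen bytes in flip-flops would take 128 of them plus a read mux, which is one plausible way a design using LUT RAM ends up reporting fewer slices.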

The Xilinx documentation contains descriptions of some particular coding styles which will cause the tool to apply particular implementations.

Another point or two on density:
    How you encode control information will affect the complexity of the decode. One-hot, two-hot, binary, Gray code, etc.

    Whether you decode all bits of an encoding, or just the bits which distinguish interesting cases, will make a difference. (You'll get an effect like the 6502's undocumented behaviour of illegal opcodes.)
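
The second point can be shown with a toy example. The Python sketch below uses a made-up 4-bit opcode set (hypothetical, not real 6502 encodings) and compares a full decode, which checks every bit, with a partial decode, which tests only the one bit that distinguishes each operation.

```python
# Toy 4-bit opcodes, one bit per operation (hypothetical, not real 6502 encodings).
DEFINED = {0b0001: "LOAD_A", 0b0010: "LOAD_X", 0b0100: "ADD"}


def full_decode(opcode):
    """Compare every bit: undefined opcodes select nothing."""
    name = DEFINED.get(opcode & 0xF)
    return [name] if name else []


def partial_decode(opcode):
    """Test only the bit that distinguishes each operation; bit 3 is ignored."""
    return [name for pattern, name in DEFINED.items() if opcode & pattern]


if __name__ == "__main__":
    for opcode in (0b0001, 0b0011, 0b1100):
        print(f"{opcode:04b}", full_decode(opcode), partial_decode(opcode))
```

For the three defined opcodes both decoders agree, but the partial decoder needs a single bit test per operation. The price is that the undefined opcode 0011 fires both loads at once, loosely similar to how some undocumented 6502 opcodes combine the effects of two documented ones.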

Cheers
Ed


Posted: Sun Oct 31, 2010 11:33 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
BigEd,

I see where I've copied the wrong info regarding Arlet's core... My apologies.

Integrated your table into the original post! :D

There is one piece of data missing. In case there is an update to a Core, we need to add a column titled "Last Core Updated Here" or something similar.

The data I posted was on all Cores available as of 10-26-10.


-EyE

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Fri Nov 12, 2010 8:41 pm

Joined: Thu Nov 11, 2010 10:14 pm
Posts: 9
Hi,

My name is Mike and I took over the T65 core a few years ago.
I've just got CVS access to opencores, so I will push back some changes.

The latest version is at www.fpgaarcade.com in the library.

You'll also find a very elegant table-based 6502 core there; my 68000 core in development uses a similar technique.

I would agree the T65 is hard to follow, but it sort of evolved. It is highly accurate, tested against a real chip and "finished" so you shouldn't need to do much work on it :)

The size is bigger than it could be, but FPGAs are so big now that it has not been an issue. The T80 core has an option for some LUT RAMs which makes it a bit smaller.

I'm working with the visual6502 netlist and back converting this to VHDL now. I will optimise this and see how small/fast we can get it.

Best,
Mike


Posted: Fri Nov 12, 2010 10:51 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Mike,

Have you done any speed comparisons, or tried to push the max speed of the T65 core on any Xilinx FPGAs in a real-world test?
I've done comparisons here as far as resources used, and pinouts, but I would've liked to do a "relative" speed comparison as well...

I do believe I used your core, from www.fpgaarcade.com, for comparison here.

What happened to Daniel Wallner? I guess I should edit the original post here!

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Fri Nov 12, 2010 11:33 pm

Joined: Thu Nov 11, 2010 10:14 pm
Posts: 9
Hi,
No, I haven't really pushed it.
I did achieve 40MHz with no problem in a Virtex4 - I am sure it would go quite a bit faster on modern FPGAs.

I don't know what happened to Daniel, we were in regular contact, then a few years ago I never heard from him again.

Best
Mike


Posted: Sat Nov 13, 2010 3:46 pm

Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hallo Mike,

fpgaarcade wrote:
I would agree the T65 is hard to follow, but it sort of evolved.
I mentioned before that T65 was difficult to understand, certainly for a newbie. So I made my own core: http://www.baltissen.org/vhdl/RB65.vhd. The original used twice the resources T65 needed in my FPGA. I now need about 85% of my original.
By studying your code I am also starting to understand what you mean by "evolved". I have been thinking about "evolving" my design as well, but for the moment I will keep it as it is, as this has one big advantage: I can easily add new opcodes or change the old ones.

Quote:
but FPGAs are so big now it has not been an issue.
This is a good reason to evolve my core only when needed.

Quote:
The T80 core ...
I don't know if you are familiar with the 1541Ultimate-II http://1541ultimate.net/content/index.php. Now it can be used as a 1541 drive, but one of the ideas I have is to turn it into a CP/M computer like the Z80 Second Processor for the Acorn Atom, http://en.wikipedia.org/wiki/BBC_Micro_expansion_units. I could write my own code or, with your permission of course, use yours. In that case I have to find out how it can be fitted into my design.
The design is no secret: Z80, RAM, ROM and a PIO-like interface. This "PIO" communicates with a 6522-like interface that can be seen by the Commodore 64.

One question: T80 is (AFAIK) cycle exact. The Z80 needs 3 or 4 cycles for every byte it processes. But I don't need this cycle-exactness. Isn't it possible to tweak T80 in such a way that it only needs one, like the 6502 does?
The C64 provides an 8 MHz colour clock, which would mean we are dealing with at least a 24 MHz equivalent machine. And hopefully one of the onboard crystals, 24 or 50 MHz, can be used instead, resulting in a 72 or 150 MHz machine!
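
The arithmetic behind these figures, spelled out as a short Python sketch. It takes the lower bound of 3 cycles per byte from the question above and assumes the modified core really does reach one byte per clock, which is the open question; the function name is just for illustration.

```python
def equivalent_z80_mhz(core_clock_mhz, z80_cycles_per_byte=3):
    """Clock a real Z80 would need to match a core that handles one byte per cycle."""
    return core_clock_mhz * z80_cycles_per_byte


if __name__ == "__main__":
    for clock in (8, 24, 50):
        print(clock, "MHz core ->", equivalent_z80_mhz(clock), "MHz Z80 equivalent")
```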

Many thanks in advance!

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



Posted: Sun Nov 14, 2010 7:13 pm

Joined: Thu Nov 11, 2010 10:14 pm
Posts: 9
Ruud,

The T80 can certainly be sped up, but not easily in its current state.
It's designed to reflect the (guessed) underlying hardware so all through the MCode section you will see MCycle used to tell it what to do on each cycle.

It would be better to branch the code, flatten the microcode and then pipeline it to get a much faster design. Not a trivial task, however...

Best,
Mike


Posted: Sun Nov 14, 2010 9:21 pm

Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
Hallo Mike,
fpgaarcade wrote:
It would be better to branch the code, ...
In that case I think I'll start from scratch and use the same structure as used with RB65. But that will take some time as I'm not that familiar with the Z80 anymore.

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org


