All times are UTC
PostPosted: Sun Oct 08, 2017 7:29 am 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
This popped up about a week ago from MicroCore Labs, the same person/company that released the MCL86 (a small microsequenced 8088 core).

They've now got a 6502 core working.
The MCL65 is cycle accurate and has hardware level compatibility.

490 LUTs on a Xilinx Spartan-3
252 LUTs on a Xilinx Spartan-7

Initial announcement.
https://microcorelabs.wordpress.com/201 ... 5-working/

Many updates on MicroCore Labs blog.
https://microcorelabs.wordpress.com/

MCL65 Microsequencer-based 6502 running on Commodore VIC-20
https://www.youtube.com/watch?v=wYNedT1Yaow

MCL65 6502 FPGA core running Defender on Atari 2600
https://www.youtube.com/watch?v=p-HXVTrug9c

MCL65 6502 FPGA core running ProDOS on Apple II+
https://www.youtube.com/watch?v=yImo7vDnefI

Many more videos on YouTube.
https://www.youtube.com/channel/UC9B3Ta ... 9jg/videos


PostPosted: Sun Oct 08, 2017 5:57 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Very interesting! (32-bit microsequenced engine uses 2 block RAMs presumably for microcode.) Looks like the idea is to keep the code proprietary and license it commercially?

Quote:
The next project will probably be to see how many MCL65 cores I can stuff inside of a Spartan-7 FPGA. If I can time-share sixteen cores inside of one physical core, have two cores share microcode and program RAMS, and instantiate 32 of these blocks, then I could potentially reach 1024 cores in this modest FPGA. Stay tuned!

After that I may investigate implementing a super-scalar microsequencer which can issue more than two instructions per clock cycle.


As a size reference, this was the result of Electric Eye's fitting adventures on a spartan 2:
Code:
        flops  slices   LUTs RAM16 HDL      Notes
A2601     138     467    840    0  vhdl     by retromaster
Syntiac   144     564   1063    0  vhdl     by Peter Wendrich
RB6502    146    1005   1942    0  vhdl     by Ruud Baltissen (work in progress)
cpu.v     155     276    474    8  verilog  by Arlet Ottens
sprow     160     667   1224    0  vhdl     by Robert Sprowson (enhanced Free6502)
T65       162     547    985    0  vhdl     by Daniel Wallner et al
bc6502    179     544    951    0  verilog  by Rob Finch
6502_tc   293    1076   1995    0  vhdl     by Jens Gutschmidt
65c02_tc  317    1318   2460    0  vhdl     by Jens Gutschmidt
MyCPU     325    1612   2980    0  vhdl     by Dennis Kuschel, inspired by 6502


PostPosted: Sun Oct 08, 2017 8:00 pm 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
BigEd wrote:
Very interesting! (32-bit microsequenced engine uses 2 block RAMs presumably for microcode.) Looks like the idea is to keep the code proprietary and license it commercially?

Quote:
The next project will probably be to see how many MCL65 cores I can stuff inside of a Spartan-7 FPGA. If I can time-share sixteen cores inside of one physical core, have two cores share microcode and program RAMS, and instantiate 32 of these blocks, then I could potentially reach 1024 cores in this modest FPGA. Stay tuned!

After that I may investigate implementing a super-scalar microsequencer which can issue more than two instructions per clock cycle.
Yes, it appears to be a proprietary/commercial core; the same was true for the MCL86 8088 core. It provides an interesting optimization goal for anyone developing a new core, though.
The time-share idea sounds interesting, though he'll have to do something about the stack page: 32 time-shared cores all trying to manipulate the stack, hmmm.

The super-scalar microsequencer sounds interesting, and is something I've put some thought into in the past. The tricky bits will be separating the opcodes from the instruction queue (made tricky by the 6502's multi-byte instructions), and the handling of self-modifying code.
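
To illustrate why the multi-byte instructions make opcode separation awkward, here is a minimal Python sketch. The length table covers only a handful of well-known NMOS 6502 opcodes, and the names `OPCODE_LENGTH` and `split_instructions` are made up for this example; it is not taken from any existing core. The point is that each instruction boundary depends on the opcode before it, so the scan is inherently serial.

```python
# Byte lengths for a few documented NMOS 6502 opcodes.
# Deliberately partial: an illustration, not a full decode table.
OPCODE_LENGTH = {
    0xEA: 1,  # NOP
    0xE8: 1,  # INX
    0x60: 1,  # RTS
    0xA9: 2,  # LDA #imm
    0x85: 2,  # STA zp
    0xAD: 3,  # LDA abs
    0x8D: 3,  # STA abs
    0x4C: 3,  # JMP abs
}


def split_instructions(code):
    """Split a byte sequence into per-instruction tuples of bytes.

    The start of instruction N+1 is only known once the opcode of
    instruction N has been looked up, so the loop cannot skip ahead.
    """
    instructions = []
    pc = 0
    while pc < len(code):
        length = OPCODE_LENGTH.get(code[pc])
        if length is None:
            raise ValueError(f"opcode ${code[pc]:02X} is not in the sample table")
        instructions.append(tuple(code[pc:pc + length]))
        pc += length
    return instructions


# LDA #$01 ; STA $0200 ; INX ; JMP $0600
program = bytes([0xA9, 0x01, 0x8D, 0x00, 0x02, 0xE8, 0x4C, 0x00, 0x06])
for insn in split_instructions(program):
    print(" ".join(f"{b:02X}" for b in insn))
```

A superscalar front end would need to do this boundary finding for several instructions per clock, and self-modifying code means bytes already split this way can go stale.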

Quote:
As a size reference, this was the result of Electric Eye's fitting adventures on a spartan 2:
Code:
        flops  slices   LUTs RAM16 HDL      Notes
cpu.v     155     276    474    8  verilog  by Arlet Ottens
Arlet's core may be smaller,̶ ̶a̶t̶ ̶l̶e̶a̶s̶t̶ ̶i̶n̶ ̶L̶U̶T̶S̶,̶ ̶t̶h̶o̶u̶g̶h̶ ̶e̶v̶e̶n̶ ̶t̶a̶k̶i̶n̶g̶ ̶t̶h̶e̶ ̶S̶p̶a̶r̶t̶a̶n̶-̶7̶'̶s̶ ̶l̶a̶r̶g̶e̶r̶ ̶b̶l̶o̶c̶k̶r̶a̶m̶s̶ ̶i̶n̶t̶o̶ ̶a̶c̶c̶o̶u̶n̶t̶,̶ ̶t̶h̶e̶ ̶M̶C̶L̶6̶5̶ ̶h̶a̶s̶ ̶a̶ ̶5̶0̶%̶ ̶a̶d̶v̶a̶n̶t̶a̶g̶e̶ ̶i̶f̶ ̶I̶'̶m̶ ̶d̶o̶i̶n̶g̶ ̶t̶h̶e̶ ̶n̶u̶m̶b̶e̶r̶s̶ ̶r̶i̶g̶h̶t̶.


Last edited by Cray Ze on Sun Oct 08, 2017 8:21 pm, edited 1 time in total.

PostPosted: Sun Oct 08, 2017 8:03 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
(Might be worth noting that Arlet's core doesn't use block RAMs - the RAM16 column is a Xilinx trick whereby a single LUT can be used as a 16 bit RAM, which is very dense and fast.)
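
As a rough behavioural model of that trick, the Python sketch below treats one 4-input LUT as a 16 x 1-bit memory and then totals the storage implied by the RAM16 column. The class name `Lut16x1Ram` is invented for the example and it models only the read/write behaviour, not the real primitive's ports or timing.

```python
class Lut16x1Ram:
    """Toy model of one 4-input LUT used as a 16 x 1-bit RAM."""

    def __init__(self):
        self.bits = 0  # 16 storage bits packed into one integer

    def write(self, addr, bit):
        addr &= 0xF  # four address inputs select one of 16 cells
        self.bits = (self.bits & ~(1 << addr)) | ((bit & 1) << addr)

    def read(self, addr):
        return (self.bits >> (addr & 0xF)) & 1


ram = Lut16x1Ram()
ram.write(5, 1)
print(ram.read(5), ram.read(4))

RAM16_COUNT = 8                 # RAM16 column for cpu.v in the table above
total_bits = RAM16_COUNT * 16   # storage held in LUT RAM
BLOCK_RAM_BITS = 4 * 1024       # one Spartan-2 block RAM
print(total_bits, "bits of LUT RAM versus", BLOCK_RAM_BITS, "bits in one block RAM")
```

So the eight RAM16s add up to 128 bits, a small fraction of a single block RAM.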


PostPosted: Sun Oct 08, 2017 8:19 pm 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
Thanks Ed, a misinterpretation on my part; for whatever reason I assumed 16Kbit blockram. A bit of an oops, as I know the Spartan-2 blockrams are 4Kbit.
In that case, I'd say Arlet's core is smaller.


PostPosted: Sun Oct 08, 2017 8:24 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
(I just did a quick resynth of Arlet's core for the Spartan 6 family with the 6-input LUTs:
Code:
        flops  slices   LUTs RAM16 HDL      Notes
cpu.v     154     105    333    8  verilog  by Arlet Ottens synthed for Spartan 6

)


PostPosted: Sun Oct 08, 2017 8:44 pm 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
Interesting numbers. Arlet's core uses fewer Spartan-2 4-input LUTs than the MCL65 uses Spartan-3 4-input LUTs,
whereas the opposite seems to be going on with the 6-input LUTs.
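
The comparison can be put into numbers with a few lines of Python, using only the LUT counts quoted in this thread. Treat the ratios loosely: the two cores were synthesized for different families in each case, and the MCL65 figures do not include the block RAMs it is said to use.

```python
# LUT counts as quoted in this thread.
lut_counts = {
    "4-input": {"cpu.v": 474, "MCL65": 490},  # Spartan-2 vs Spartan-3
    "6-input": {"cpu.v": 333, "MCL65": 252},  # Spartan-6 vs Spartan-7
}


def mcl65_to_cpu_v_ratio(lut_kind):
    """Return MCL65 LUTs divided by cpu.v LUTs for the given LUT kind."""
    counts = lut_counts[lut_kind]
    return counts["MCL65"] / counts["cpu.v"]


for kind in lut_counts:
    print(f"{kind} LUTs: MCL65 uses {mcl65_to_cpu_v_ratio(kind):.2f}x the LUTs of cpu.v")
```

A ratio above 1 means the MCL65 is the larger of the two in LUTs, below 1 the smaller.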

Are the 6-input LUTs of the Spartan-7 more efficient than the 6-input LUTs of the Spartan-6? The Spartan-7 documentation makes a point of calling them "REAL 6-input LUTs".


PostPosted: Sun Oct 08, 2017 9:07 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
It looks to me like the Spartan 7 LUTs might be slightly more featureful, but not lots more. The Spartan 6 LUTs do seem to be a pair of 5-LUTs and a mux, which is going to affect speed but not, I think, function.
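
To show why a pair of 5-LUTs plus a mux loses nothing in function, here is a small Python check. It is only a functional sketch: the helper names `lut_lookup` and `lut6_from_two_lut5` are made up, and it says nothing about the real routing or timing of either family. The sixth input simply selects between two 5-input tables, which is enough to realize any 6-input truth table.

```python
import random


def lut_lookup(table, inputs):
    """Read a LUT: 'table' is a list of output bits, 'inputs' are 0/1 bits, LSB first."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return table[index]


def lut6_from_two_lut5(table64, inputs):
    """Evaluate a 6-input function as two 5-input LUTs and a 2:1 mux."""
    low_half = table64[:32]    # truth table when input 5 is 0
    high_half = table64[32:]   # truth table when input 5 is 1
    a = lut_lookup(low_half, inputs[:5])
    b = lut_lookup(high_half, inputs[:5])
    return b if inputs[5] else a   # the mux, driven by the sixth input


random.seed(0)
table = [random.randint(0, 1) for _ in range(64)]  # an arbitrary 6-input function
for n in range(64):
    inputs = [(n >> i) & 1 for i in range(6)]
    assert lut6_from_two_lut5(table, inputs) == lut_lookup(table, inputs)
print("all 64 input combinations match")
```

The extra mux stage is where a speed difference could come from, which fits the "speed but not function" reading above.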

See the two PDF docs:
https://www.xilinx.com/support/document ... /ug384.pdf
https://www.xilinx.com/support/document ... es_CLB.pdf

There's a lot to digest in there!

Arlet's core is a bit of a marvel. Maybe it's no great surprise that a more general purpose 32 bit pipelined machine takes more resources.


PostPosted: Sun Oct 08, 2017 10:07 pm 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
The 7 Series CLB pdf sums it up nicely under the "7 Series CLB Features" heading.
I think the biggest efficiency gain comes from "More routing between CLBs" and secondarily from "All slices support carry logic".

This is echoed in the following statement.
Quote:
The interconnect routing resources are increased in size, quantity, and flexibility
relative to the Virtex-6 FPGA family, improving the quality of automatic place and route
results.


PostPosted: Tue Oct 10, 2017 4:44 am 
Joined: Thu Oct 05, 2017 2:04 am
Posts: 62
Hi,

I planned to release the code for the MCL65 once I finished testing it on the vintage machines and had a chance to clean it up...

It's on my website now: http://www.microcorelabs.com/mcl65.html

Thanks,
-Ted


PostPosted: Tue Oct 10, 2017 5:09 am 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
MicroCoreLabs wrote:
Hi,

I planned to release the code for the MCL65 once I finished testing it on the vintage machines and had a chance to clean it up...

It's on my website now: http://www.microcorelabs.com/mcl65.html

Thanks,
-Ted
Hi Ted, thanks very much for dropping in with that information, I stand happily corrected. :)
Thank you even more for 'actually' releasing the source, I've seen projects in the past where the only thing ever released was the 'promise to release'.

Are you aware of Klaus's 6502 CPU test? It's very useful when it comes to verifying correct functionality.
https://github.com/Klaus2m5/6502_65C02_functional_tests


PostPosted: Tue Oct 10, 2017 5:36 am 
Joined: Thu Oct 05, 2017 2:04 am
Posts: 62
Yup, I tested my code against Klaus Dormann's test suite as well as a number of others in addition to my own individual opcode tests. I also used a logic analyzer to probe a real 6502 to observe the bus cycle activity for each addressing mode, interrupts, stack operations, and reset. Hopefully somebody will let me know if I missed anything, but I believe the core should be nearly bus and cycle exact to the original 6502.

I have also tested the core on actual vintage hardware such as the Commodore VIC-20, Atari-2600, and the Apple II+ to verify that the core is cycle accurate to the original 6502. The core is implemented using a "vertical" microsequencer, so is able to duplicate the exact bus cycles that the MOS 6502 performs for each opcode and interrupt type.

Thanks,
-Ted


PostPosted: Tue Oct 10, 2017 8:14 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Excellent Ted, thanks! And welcome!


PostPosted: Tue Oct 10, 2017 8:36 am 
Joined: Sat May 02, 2015 6:59 pm
Posts: 134
I see the acknowledgement of the 6502 page wrapping bug in the microcode_rom.hex file.
If I'm reading the microcode correctly, the MCL65 doesn't suffer from the bug, is this correct?
Code:
@4F4 1033_0000  // Wait for CLK to be high            -- Fetch ADH - 6502 page wrapping bug
@4F5 1034_0000  // Wait for CLK to be low
@4F6 30BF_FF00  // r0 <= data_in AND 0xFF00
@4F7 4701_0000  // PC = r0 OR r1
@4F8 4A7F_0000  // ADDRESS_OUT = PC
@4F9 4DFF_0003  // SYNC=1, RD_WR_n=1
@4FA 1000_0102  // Jump to main loop
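
For anyone wanting to poke at the listing with a script, the line format shown above is easy to pull apart. The Python sketch below assumes each line is `@<hex address> <hhhh>_<hhhh> // comment` and that the two 4-digit groups are the upper and lower halves of one 32-bit microcode word; that reading is a guess from the excerpt, and the function name `parse_microcode_line` is made up. It makes no attempt to decode what the fields mean.

```python
import re

# "@4F7 4701_0000  // PC = r0 OR r1" -> address, 32-bit word, comment
LINE_RE = re.compile(
    r"@([0-9A-Fa-f]+)\s+([0-9A-Fa-f]{4})_([0-9A-Fa-f]{4})\s*(?://\s*(.*))?"
)


def parse_microcode_line(line):
    """Return (address, word, comment) for one listing line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    address = int(match.group(1), 16)
    # Assumption: the two halves form one 32-bit word, upper half first.
    word = (int(match.group(2), 16) << 16) | int(match.group(3), 16)
    comment = (match.group(4) or "").strip()
    return address, word, comment


listing = """\
@4F6 30BF_FF00  // r0 <= data_in AND 0xFF00
@4F7 4701_0000  // PC = r0 OR r1
@4FA 1000_0102  // Jump to main loop
"""
for text in listing.splitlines():
    address, word, comment = parse_microcode_line(text)
    print(f"{address:03X}: {word:08X}  {comment}")
```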


PostPosted: Tue Oct 10, 2017 3:14 pm 
Joined: Thu Oct 05, 2017 2:04 am
Posts: 62
I have coded the MCL65 to "suffer" from the same bug... :)
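
For readers who haven't met it, and assuming the bug in question is the well-known NMOS 6502 JMP (indirect) one, the Python sketch below shows the behaviour: when the pointer sits at $xxFF, the high byte of the target is fetched from $xx00 of the same page instead of from the next page. The function name and memory contents are made up for illustration and are not taken from the MCL65 source.

```python
def jmp_indirect_target(memory, pointer, emulate_bug=True):
    """Return the target address of JMP (pointer).

    With emulate_bug=True the high-byte fetch wraps within the pointer's
    page, as on the NMOS 6502; with False it crosses into the next page.
    """
    lo = memory[pointer]
    if emulate_bug:
        hi_addr = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    else:
        hi_addr = (pointer + 1) & 0xFFFF
    hi = memory[hi_addr]
    return (hi << 8) | lo


memory = bytearray(0x10000)
memory[0x02FF] = 0x34   # low byte of the target
memory[0x0300] = 0x12   # high byte a bug-free CPU would read
memory[0x0200] = 0x56   # high byte the NMOS 6502 actually reads

print(f"with bug:    ${jmp_indirect_target(memory, 0x02FF):04X}")
print(f"without bug: ${jmp_indirect_target(memory, 0x02FF, emulate_bug=False):04X}")
```

A core that aims to be bus and cycle exact has to reproduce this, since existing software may depend on it.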

