6502.org http://forum.6502.org/ |
CPLD 6502 http://forum.6502.org/viewtopic.php?f=10&t=5418 |
Page 1 of 3 |
Author: | Arlet [ Mon Dec 31, 2018 1:42 pm ] |
Post subject: | CPLD 6502 |
Hello,

After a long pause, I decided to get back into 6502 hacking and implement an idea I've been toying with for a few years: using multiple small CPLDs to implement a 6502. My CPLD of choice was the Xilinx XC9572XL in the 44-pin TQFP package. My original plan was to use 5 or 6 of them, but somewhat to my own surprise, I was able to fit it into 4. It's a very tight fit, especially for the control logic and the ALU. At first, it seemed completely hopeless, watching the tools allocate big chunks of resources for the simplest expressions, but with a lot of experimenting and reading the fitter reports, I gradually gained an understanding of how to write the code so it would match the capabilities of the CPLDs and the tools.

A few years ago I tried something similar, but noticed that the CPLD was a very poor fit for bigger adders (mostly because there's no fast carry chain, and also because the AND-OR structure is not good for XOR operations), and gave up on the idea. But then a while ago, I was going through the datasheets for another project, and I noticed the nice dedicated XOR gate in each macrocell. I spent a few days going over different ways to turn that XOR into the centerpiece of the ALU.

I had several reasons for picking this particular type of CPLD. It's fairly easy to solder, even by hand; I had previous experience with it; and it has just enough resources to make this possible, but not so many as to make it simple, resulting in a very nice puzzle that has kept me busy for a while.

Everything builds with standard settings, optimized for speed, on ISE 14.7, except for a couple of KEEP attributes at strategic places. Also, automatic FSM extraction needs to be turned off, because it doesn't respect the ordering in overlapping casez clauses, introducing bugs in the control logic. I did run it with automatic FSM extraction once, copied the state encoding, and then turned it back off.
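To illustrate why clause order matters with overlapping casez clauses, here is a minimal Python sketch of first-match-wins decoding with don't-care bits. The bit patterns and the result labels are hypothetical and are not taken from the actual control logic; the only point is that a tool treating the clauses as unordered may select a different branch for the same input.

```python
def matches(pattern: str, value: int) -> bool:
    """Return True if the 8-bit value matches a casez-style pattern.

    '?' is a don't-care bit; '0' and '1' must match exactly.
    """
    bits = format(value, "08b")
    return all(p in ("?", b) for p, b in zip(pattern, bits))


def decode(clauses, value, default="DEFAULT"):
    """Emulate casez priority: the first matching clause wins."""
    for pattern, result in clauses:
        if matches(pattern, value):
            return result
    return default


# Hypothetical overlapping clauses: the specific pattern is listed first,
# so it takes priority over the more general pattern below it.
CLAUSES = [
    ("01001000", "SPECIFIC"),  # one exact bit pattern
    ("????1000", "GENERAL"),   # wider group that also covers the pattern above
]

in_order = decode(CLAUSES, 0x48)
reordered = decode(list(reversed(CLAUSES)), 0x48)
print(in_order, reordered)
```

With the clauses in their written order the input 0x48 selects the specific branch; with the order reversed, the general branch swallows it. That is the kind of silent behaviour change that can occur when ordering is not respected.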
I must say, the tools are pretty amazing when optimizing smallish logic functions, but tend to be clueless about how to deal with more complex stuff, such as deciding when to allocate another macrocell for a subexpression. I always check the fitter reports for extra variables (recognizable by their random-looking names with dollar signs). If there are any, I try to rewrite my code to avoid them. Also, the code generation for the built-in "+" expression isn't very good (except for adding/subtracting simple constants), so it's best avoided. Finally, optimizing for size sometimes makes the implementation bigger. I recommend always optimizing for speed, because you get more control over the result.

I made a small board containing the 4 CPLDs, plus an extra CPLD for UART/SPI, some SRAM and Flash, and 6 LED displays on a 74HC595 chain. My goal was to have it run at 10 MHz. It's currently running stable at 12 MHz. I've added a wait state for the flash, mainly to test the RDY logic, but also because flash is rather slow. I briefly tried it at 24 MHz but it crashed. I haven't tried to find the maximum speed, nor did I do any analysis of the longest path.

[Edit: I've added the source code for the extra CPLD as well. Even though it's not strictly part of the project, it can still be useful.]

Bootstrapping was done by writing a simple UART loader and hard-coding it into the I/O CPLD (which is connected to the full address and data bus). The UART loader reads 256 bytes over UART, writes them to memory, and jumps to the first instruction. From there, a secondary loader took more data from UART and wrote it to flash.
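As background to the remarks about adders, the "+" operator and XOR in an AND-OR array: each sum bit of an adder is the XOR of the two operand bits and the carry-in, and an n-input XOR cannot be simplified in two-level sum-of-products form, so it needs 2^(n-1) product terms. The Python sketch below checks that by brute force. The function names are made up for illustration, and it says nothing about how the actual ALU is written.

```python
from itertools import combinations


def xor_minterms(n: int) -> list:
    """Input combinations (as integers) for which an n-input XOR is 1."""
    return [v for v in range(2 ** n) if bin(v).count("1") % 2 == 1]


def can_merge(a: int, b: int) -> bool:
    """Two minterms combine into one product term only if they differ in exactly one bit."""
    return bin(a ^ b).count("1") == 1


def sop_terms_for_xor(n: int) -> int:
    """Product terms needed for an n-input XOR in two-level AND-OR form.

    Any two odd-parity inputs differ in at least two bits, so no pair of
    minterms can be merged and every minterm needs its own product term.
    """
    terms = xor_minterms(n)
    assert not any(can_merge(a, b) for a, b in combinations(terms, 2))
    return len(terms)


for n in (2, 3, 4):
    print(n, sop_terms_for_xor(n))
```

A 3-input XOR, which is what a single sum bit amounts to, already takes 4 product terms before any carry logic is counted. That gives a rough sense of why a dedicated XOR gate in the macrocell is so valuable here.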
Source code still needs to be cleaned up a bit, but I've made a repository on github: https://github.com/Arlet/cpld-6502

(By the way, if anybody is working in Verilog, I highly recommend the Verilator project: https://www.veripool.org/wiki/verilator
If I disable all output, it simulates the entire design for 100 million cycles in 21 seconds on my desktop PC, which is more than enough to run Klaus Dormann's verification program.) |
Author: | cbmeeks [ Mon Dec 31, 2018 1:46 pm ] |
Post subject: | Re: CPLD 6502 |
Very interesting project. I like how you labeled the different parts of the virtual CPU. |
Author: | Arlet [ Mon Dec 31, 2018 2:27 pm ] |
Post subject: | Re: CPLD 6502 |
Resource usage for each of the 4 modules:

ABL module:
Code:
Function    Mcells      FB Inps     Pterms      IO
Block       Used/Tot    Used/Tot    Used/Tot    Used/Tot
FB1         11/18       28/54       52/90       9/ 9*
FB2         15/18       41/54       80/90       9/ 9*
FB3         17/18       39/54       86/90       9/ 9*
FB4         14/18       42/54       79/90       7/ 7*
            -----       -----       -----       -----
            57/72       150/216     297/360     34/34

ABH module (still has a bit of room):
Code:
Function    Mcells      FB Inps     Pterms      IO
Block       Used/Tot    Used/Tot    Used/Tot    Used/Tot
FB1         16/18       33/54       85/90       9/ 9*
FB2         8/18        32/54       72/90       9/ 9*
FB3         5/18        26/54       22/90       8/ 9
FB4         13/18       33/54       85/90       7/ 7*
            -----       -----       -----       -----
            42/72       124/216     264/360     33/34

ALU module (only one macrocell left!):
Code:
Function    Mcells      FB Inps     Pterms      IO
Block       Used/Tot    Used/Tot    Used/Tot    Used/Tot
FB1         18/18*      35/54       49/90       8/ 9
FB2         18/18*      43/54       74/90       9/ 9*
FB3         18/18*      46/54       86/90       9/ 9*
FB4         17/18       36/54       42/90       7/ 7*
            -----       -----       -----       -----
            71/72       160/216     251/360     33/34

CTL module (lots of wide input functions, there still appear to be free macrocells, but they only have a few product terms. Most of the free product terms are in FB1, but that one only has 2 macrocells):
Code:
Function    Mcells      FB Inps     Pterms      IO
Block       Used/Tot    Used/Tot    Used/Tot    Used/Tot
FB1         16/18       31/54       69/90       9/ 9*
FB2         12/18       44/54       77/90       9/ 9*
FB3         11/18       22/54       88/90       8/ 9
FB4         14/18       19/54       82/90       7/ 7*
            -----       -----       -----       -----
            53/72       116/216     316/360     33/34 |
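The remark about the CTL module can be checked with a little arithmetic on the fitter numbers above. This is a small Python sketch that computes the free macrocells and free product terms per function block. The figures are copied from the CTL summary, while the variable and function names are only for illustration.

```python
# Used/total figures per function block, copied from the CTL fitter summary.
CTL = {
    "FB1": {"mcells": (16, 18), "pterms": (69, 90)},
    "FB2": {"mcells": (12, 18), "pterms": (77, 90)},
    "FB3": {"mcells": (11, 18), "pterms": (88, 90)},
    "FB4": {"mcells": (14, 18), "pterms": (82, 90)},
}


def free(block: dict, resource: str) -> int:
    """Return how many units of a resource are still unused in one block."""
    used, total = block[resource]
    return total - used


for name, block in CTL.items():
    print(name,
          "free macrocells:", free(block, "mcells"),
          "free product terms:", free(block, "pterms"))
```

FB3, for example, has 7 free macrocells but only 2 free product terms, while FB1 has 21 free product terms but only 2 free macrocells, which matches the description of where the remaining room is.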
Author: | Arlet [ Mon Dec 31, 2018 4:35 pm ] |
Post subject: | Re: CPLD 6502 |
Forgot to mention: the core is not cycle exact. It removes a couple of cycles in order to simplify the control logic (and who needs dummy cycles anyway?)
There is still a useless cycle in PHA/PLA where the instruction fetch is repeated. It is possible to remove it, but at the cost of a considerable increase in control logic complexity. |
Author: | Dr Jefyll [ Mon Dec 31, 2018 4:54 pm ] |
Post subject: | Re: CPLD 6502 |
Whoa, fun project! And I share your feelings about the dummy cycles -- who needs 'em! Speeding up a JSR/RTS pair by 33% is something I can appreciate. Same for the other speedups. ZP, X and (ZP,X) are very commonly used in Forth, so it's fun to contemplate that. And of course implied instructions such as INX and ROL A are ubiquitous in all contexts, not just Forth. Great work! Thanks for posting! -- Jeff |
Author: | Drass [ Mon Dec 31, 2018 5:20 pm ] |
Post subject: | Re: CPLD 6502 |
Great to see this Arlet. Welcome back! I agree with Jeff, looks like a lot of fun — especially working to a tight fit! Cheers. |
Author: | BigEd [ Mon Dec 31, 2018 7:44 pm ] |
Post subject: | Re: CPLD 6502 |
Wonderful project Arlet! I see the readme on github gives some more details of how you partitioned the design - thanks for that. https://github.com/Arlet/cpld-6502#readme |
Author: | Arlet [ Mon Dec 31, 2018 8:06 pm ] |
Post subject: | Re: CPLD 6502 |
I just added the schematic to github. https://github.com/Arlet/cpld-6502/blob ... matics.pdf |
Author: | Arlet [ Mon Dec 31, 2018 8:22 pm ] |
Post subject: | Re: CPLD 6502 |
Here's a simplified block diagram showing most of the interconnections. |
Author: | GaBuZoMeu [ Mon Dec 31, 2018 8:52 pm ] |
Post subject: | Re: CPLD 6502 |
Very nice project, indeed. Do you intend to implement some/all of the 65C02 opcodes as well? I have a problem with your link above: I get a "404". There is most likely a typo in your block diagram: ABL will serve for AB[7:0] and ABH for AB[15:8], I assume. Regards, Arne |
Author: | Arlet [ Mon Dec 31, 2018 9:00 pm ] |
Post subject: | Re: CPLD 6502 |
Thanks. Fixed the link and the diagram. I have PHX/PHY/PLX/PLY as well as BRA implemented. I tried to add INC/DEC A, but realized that the ALU doesn't have controls to perform that operation. Maybe I can still add the STZ and the (ZP) instructions. |
Author: | BigEd [ Mon Dec 31, 2018 9:14 pm ] |
Post subject: | Re: CPLD 6502 |
Very handy diagram - thanks! |
Author: | Chromatix [ Tue Jan 01, 2019 12:04 am ] |
Post subject: | Re: CPLD 6502 |
I wonder whether adding a signal to explicitly indicate dummy cycles is feasible in your design. The 65816 does this by holding both VDA and VPA low. |
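For reference, the 65816 signals the cycle type on its VDA and VPA pins, with both low meaning an internal (dummy) cycle. The Python sketch below simply tabulates that encoding; the function name and the label strings are chosen for illustration only.

```python
def cycle_type(vda: int, vpa: int) -> str:
    """Classify a 65816 bus cycle from its VDA and VPA outputs."""
    table = {
        (0, 0): "internal operation",
        (0, 1): "valid program address",
        (1, 0): "valid data address",
        (1, 1): "opcode fetch",
    }
    return table[(vda, vpa)]


for vda in (0, 1):
    for vpa in (0, 1):
        print(vda, vpa, cycle_type(vda, vpa))
```

An equivalent signal on this core would only need to flag the first case, since the question here is about marking dummy cycles rather than distinguishing program and data accesses.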
Author: | ElEctric_EyE [ Tue Jan 01, 2019 3:22 am ] |
Post subject: | Re: CPLD 6502 |
Great things are possible with ISE14.7! Welcome back... I think 2019 is going to be a great year! |
Author: | Arlet [ Tue Jan 01, 2019 6:19 am ] |
Post subject: | Re: CPLD 6502 |
Chromatix wrote:
I wonder whether adding a signal to explicitly indicate dummy cycles is feasible in your design. The 65816 does this by holding both VDA and VPA low.

The logic would be simple enough. Finding a spare pin on the CTL part would be a bit more of a challenge. Figuring out a way to remove the remaining dummy cycles may be easier than freeing up an I/O pin. |
All times are UTC |