6502.org Forum

 Post subject: Re: M65C02A Core
PostPosted: Wed Sep 07, 2016 3:24 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
I have re-released the M65C02A processor core on GitHub. The repository contains my latest efforts on this core. A number of the extended instructions have not been fully tested, but the base 6502/65C02 instructions pass Klaus Dormann's functional tests. I have updated the description to match the implemented instruction set.

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Wed Sep 07, 2016 6:40 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Thanks Michael - glad to see it back!


 Post subject: Re: M65C02A Core
PostPosted: Thu Sep 08, 2016 3:33 am 

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Very nice, Michael! I browsed your docs, and noticed that you have a very clear style of explaining how things work, with only a handful of distracting typos. What I didn't find was a description of the interrupt sub-system in your core. For selfish reasons (which you may be able to guess) I would like to request a link to (or a description of) how the M65C02A handles external interrupts. I can't read HDL yet, so the source isn't much help for me at this moment.

Thanks,

Mike B.


 Post subject: Re: M65C02A Core
PostPosted: Thu Sep 08, 2016 11:34 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
barrym95838:

barrym95838 wrote:
I browsed your docs, and noticed that you have a very clear style of explaining how things work, with only a handful of distracting typos.
Thanks. I am glad that you found some of the information helpful. I also see you ran into one of my pet peeves: typos. It seems that no matter how many times I proofread an email, memo, or report, a number of those little critters slip through. It is most aggravating to find that many of my writings contain several of those little beasties even after years of reading, updating, and re-reading them. :D

The interrupt subsystem is independent of the core, and that's why you didn't find anything in the basic core description documents. I'll be glad to put something together this weekend for you on this subject. It's an important component of a microcomputer system using this core, or any other core, so it should at least be included in the documentation as an appendix.

Until then, perhaps the following description of the process will suffice:

As for the core itself, it processes a single signal, INT, which when asserted/set at the completion of an instruction takes the microprogram down a special interrupt service microroutine. The external logic receives the interrupt mask bit from P, and drives INT accordingly. The external logic supplies the vector for the interrupt request/source.

The core's interrupt service microroutine asserts a signal to the external logic indicating that the interrupt request/source vector needs to be latched. This action takes place as the interrupt service microroutine is pushing the PC and P registers onto the stack. Since these activities require three memory cycles to complete, the external logic can resolve which interrupt request/source vector will be provided to the core during the cycle in which P is pushed to the stack.

Immediately upon completing the write of P to the stack, the interrupt service microroutine loads the memory address register with the interrupt request/source vector and fetches the low and high bytes of the interrupt service routine address from the processor's memory. Thus, to kick off the interrupt service routine, a JMP (vector) is essentially performed.

This architecture for the interrupt processing allows the external logic to be constructed to handle more interrupt request inputs than a typical 6502/65C02 provides. It also allows the priorities to be determined by the user/designer rather than by the core's internal logic. For 6502/65C02 compatibility, I've placed the interrupt vectors (RST, NMI, INT, and BRK) at the same locations as are typical for the 6502/65C02 processor family. However, since the vector is supplied by the external logic, these interrupt/trap service locations could have been placed anywhere in the address space.
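
To make the division of labour above concrete, here is a minimal Python sketch of the external logic, assuming a simple fixed-priority scheme. The class name, the source names, the extra UART0 vector at $FFF0, and the priority order are all hypothetical; only the NMI/RST/IRQ vector addresses are the usual 6502/65C02 ones. It is not taken from the M65C02A sources.

```python
# Hypothetical model of an external vectored interrupt controller.
# Names and priority order are illustrative assumptions only.

NMI_VECTOR = 0xFFFA   # usual 6502/65C02 NMI vector
RST_VECTOR = 0xFFFC   # usual 6502/65C02 reset vector
IRQ_VECTOR = 0xFFFE   # usual 6502/65C02 IRQ/BRK vector


class InterruptController:
    def __init__(self, sources):
        # sources: list of (name, vector, maskable), highest priority first
        self.sources = sources
        self.pending = set()
        self.latched_vector = None

    def request(self, name):
        self.pending.add(name)

    def _highest(self, i_mask):
        # Pick the highest-priority pending source that is not masked.
        for name, vector, maskable in self.sources:
            if name in self.pending and not (maskable and i_mask):
                return name, vector
        return None

    def int_signal(self, i_mask):
        """INT as driven to the core, given the I bit received from P."""
        return self._highest(i_mask) is not None

    def latch_vector(self, i_mask):
        """Called when the core signals that the vector must be latched."""
        name, vector = self._highest(i_mask)
        self.pending.discard(name)
        self.latched_vector = vector
        return vector


def fetch_handler_address(memory, vector):
    """Read the low byte, then the high byte, like JMP (vector)."""
    return memory[vector] | (memory[vector + 1] << 8)


vic = InterruptController([
    ("NMI", NMI_VECTOR, False),
    ("UART0", 0xFFF0, True),    # hypothetical extra source and vector
    ("IRQ", IRQ_VECTOR, True),
])
memory = {0xFFF0: 0x00, 0xFFF1: 0x90, 0xFFFE: 0x34, 0xFFFF: 0x12}

vic.request("IRQ")
vic.request("UART0")
masked = vic.int_signal(i_mask=True)      # both sources maskable: INT not asserted
unmasked = vic.int_signal(i_mask=False)   # INT asserted
vector = vic.latch_vector(i_mask=False)   # UART0 outranks IRQ in this table
handler = fetch_handler_address(memory, vector)
```

The point of the sketch is only that the priority table and the vector addresses live outside the core, so adding a source means editing the table rather than the processor.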

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Sat Mar 10, 2018 1:23 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
After many years, I decided to see if I could create a version of the SMRTool that I use on Windows as a cross-platform command line tool. One of my team wrote the original tool in C#, and after many attempts to get the source code unencumbered, I decided to just write my own version without the benefit of the source code for the original or the subsequent enhanced versions.

With that objective, I wrote a simple Python 3 script, in a very simplistic non-Pythonic manner, that reproduces the memory initialization files that I use for my micro-programmed 6502/65C02 processor cores. I've released it under a GPLv3 license on GitHub. As a result of the effort, the Python version of the tool flagged a few syntax errors that had passed through the Windows version. I also made a change in how the location counter is fitted into the branch address field that results in a difference between the two tools for the M65C02A_uPgm_ROM. The output of the Python version for the M65C02A_IDecode_ROM is identical to that of the C# version.

A more modular implementation will be the next part of this continuing effort. Its current form is very linear and does not use any classes, modules, or such. A better command line interface to select various outputs will follow. Once satisfied that the tool is producing good outputs, I'll also remove all of the print statements of intermediate results. (It operates much slower in a Windows DOS box than it does in a Linux terminal window with all of the screen output currently included in the program.)

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Tue Apr 03, 2018 3:28 am 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Hugh Aguilar wrote:
It is not clear to me what your goal with the M65c02A is --- what kind of applications are you expecting to use it for? --- considering that the 65c816 is largely ignored these days, why do you expect your M65c02A to gain traction?
The M65C02A is the result of my desire to produce a micro-programmed 65C02 as a demonstration of the technique. There are many ways to implement a processor. Arlet Ottens and others on this and other sites have demonstrated that it is very possible to use standard state machines in an HDL to implement the 6502, and that it is possible to implement its extended version, the 65C02, in a similar fashion. In fact, I think Arlet's 6502 implementation, and the extension provided by BigEd, Hoglet, and others to support the 65C02, is a very good example of the technique.

The original design utilizes a Programmable Logic Array (PLA) to implement the control logic. Defining the operation of a processor using a PLA is similar to implementing it using microprogramming except the logic requirements are much less with a PLA. Unfortunately, modern FPGAs do not implement the logic structure of a PLA. Some modern Complex Programmable Logic Devices (CPLDs) provide the necessary AND-OR logic arrays to simulate the logic structure of a PLA, but modern HDLs like Verilog and VHDL do not easily support the PLA logic structure.

I use microprogramming in much of my work, so I opted to use that specific technique in the implementation of my core as a challenge. The microprogrammed projects that I create for work use the technique in a variety of application domains. I have used it for an interface to EnDat 2.2 Heidenhain angle encoders, the image generator for an industrial printer, the USB interface controller for the industrial printer, a dual axis motor controller, and a programmable Residue Number System (RNS) signal processor.

The technique has particular applicability to Complex Instruction Set Computers (CISC). I saw the M65C02 core as a challenge for me to implement a processor using microprogramming that would implement all of the complex addressing modes as microsequences. In my work, I typically use 32-bit, stack-oriented ALUs based in part on my love of the HP Reverse Polish Notation (RPN) HP-41CV calculator from the 1980s, and the Inmos Transputer.

On several occasions I have opted to use the soft-core processor from my FPGA vendor of choice: the Xilinx MicroBlaze 32-bit soft-core processor. And several times Xilinx has obsoleted my designs by updating the processor core itself and the I/O structure it uses to access the FPGA resources. Of course it is possible to rebuild the Xilinx components to use the updated specifications, but all of my custom designed components would have to be changed to connect to the new processor core and I/O standard.

Therefore, one use of the M65C02A is as a medium performance soft-core processor where I control all of the design: the processor core, the I/O interface specification, and the custom I/O interfaces/components that provide the solution that I want. Another use for the core is as a research tool for various trade-offs. The general simplicity of the M65C02A core allows me to study issues such as memory management, cache memory, bus width, etc.

I have a 32/40-bit, medium performance, stack-based ALU that I use in many designs. I like the idea of running native 6502/65C02 code and seamlessly moving from the 8-bit domain to the 16-bit domain. The 65816 uses mode bits to provide 8/16-bit operations and index registers, but the M65C02A uses a different technique to accomplish the same thing in a manner that requires less programmer intervention: prefix bytes. It is a given that my approach with the M65C02A suffers from some loss in performance (both in speed and code density) relative to the 65816, but the M65C02A can slip in and out of 16-bit mode on an instruction-by-instruction basis, which is something that the 65816 cannot do. With an appropriately defined assembly language syntax, the M65C02A can hide many of the details related to the prefix bytes from the programmer.
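
As a rough illustration of the prefix-byte idea (not the actual M65C02A encoding), the Python sketch below decodes a byte stream in which a prefix widens only the instruction that follows it. `SIZE_PREFIX` and its value $FB are hypothetical placeholders; $A9 is the ordinary 6502 opcode for LDA #imm.

```python
# Illustrative decoder only: SIZE_PREFIX and its value are hypothetical
# stand-ins, not the M65C02A's actual prefix opcodes.
SIZE_PREFIX = 0xFB          # hypothetical "next operation is 16-bit" prefix
LDA_IMM = 0xA9              # ordinary 6502 opcode for LDA #imm


def decode(code):
    """Return a list of (mnemonic, operand_bits, value) tuples."""
    out = []
    pc = 0
    wide = False            # set by a prefix, cleared after one instruction
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == SIZE_PREFIX:
            wide = True
            continue
        if op != LDA_IMM:
            raise ValueError(f"opcode ${op:02X} not modelled in this sketch")
        if wide:
            value = code[pc] | (code[pc + 1] << 8)   # little-endian 16-bit
            pc += 2
            out.append(("LDA", 16, value))
        else:
            value = code[pc]
            pc += 1
            out.append(("LDA", 8, value))
        wide = False        # the prefix affects only the next instruction
    return out


program = bytes([LDA_IMM, 0x12,                      # 8-bit load
                 SIZE_PREFIX, LDA_IMM, 0x34, 0x12,   # 16-bit load, one extra byte
                 LDA_IMM, 0x56])                     # back to 8-bit, no mode switch
decoded = decode(program)
```

The extra prefix byte on the widened instruction is the code-density cost mentioned above; the third instruction returns to 8-bit operation without any explicit mode change.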

I am not developing the M65C02A other than for my satisfaction. Part of the process has illuminated a variety of trade-offs to me. In that sense, the M65C02A has been as much a project for my personal enjoyment as it is a project for studying various implementation approaches. Over time I've changed the microprogram structure and been able to improve its overall performance. The last few things that I've added, in particular, the register stacks, have proven to be useful primarily in mapping the Mak Pascal Compiler, originally developed for the 8086, to the 6502/65C02 architecture. I've added a number of instructions like PSH #imm, PSH/PUL abs, ADJ #imm (adjust stack pointer), and others in order to overcome some of the architectural limitations of the 6502/65C02 with respect to the classical stack frame model expected by Algol-like languages: Pascal, C, and others. There are many other instructions that could be added, like NEG, but most can be synthesized with short instruction sequences, and their frequency of occurrence in many programs is not high enough to warrant allocating scarce opcode space to them.
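
For NEG, one common 6502 idiom (my example, the post does not name a specific sequence) is EOR #$FF / CLC / ADC #$01, i.e. one's complement plus one. A small Python model of that sequence on an 8-bit accumulator, checked against two's-complement negation for every value:

```python
def neg_via_eor_adc(a):
    """Model EOR #$FF / CLC / ADC #$01 on an 8-bit accumulator."""
    a ^= 0xFF                         # EOR #$FF : one's complement
    carry = 0                         # CLC
    a = (a + 0x01 + carry) & 0xFF     # ADC #$01 (binary mode), keep 8 bits
    return a


# Compare against two's-complement negation for all 256 accumulator values.
results = {a: neg_via_eor_adc(a) for a in range(256)}
all_match = all(results[a] == (-a) & 0xFF for a in range(256))
```

Three short instructions cover the whole operation, which is the kind of case where spending an opcode is hard to justify.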

Dr Jefyll encouraged me to include an in-core FORTH VM. His KimKlone is a great demonstrator for how to expand the functionality of an existing processor. I took his advice and devised the FORTH VM module included in the M65C02A to support either the DTC or ITC FORTH models after spending some time getting an ITC fig-FORTH running on the M65C02, the predecessor to the M65C02A. From that exercise it was clear that the inner interpreter of fig-FORTH was taking an inordinate number of cycles managing the IP and W registers and executing the NXT function. In a DTC configuration, the FORTH VM in the M65C02A only requires about 3 cycles for NXT.
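
For readers unfamiliar with the two threading models, here is a generic textbook sketch of NEXT in Python. It is not a model of the M65C02A's FORTH VM and says nothing about its cycle counts; the memory layout and addresses are made up for the example.

```python
# Generic textbook model of the Forth inner interpreter (NEXT).
# Addresses and memory contents below are hypothetical.

def read16(mem, addr):
    """Little-endian 16-bit read."""
    return mem[addr] | (mem[addr + 1] << 8)


def next_itc(mem, ip):
    """Indirect-threaded NEXT: thread cell -> code field -> machine code."""
    w = read16(mem, ip)         # W = address of the word's code field
    ip += 2
    target = read16(mem, w)     # code field holds the machine-code address
    return ip, w, target


def next_dtc(mem, ip):
    """Direct-threaded NEXT: thread cell holds the machine-code address."""
    w = read16(mem, ip)
    ip += 2
    return ip, w, w             # jump straight to W


# A thread at $0200 pointing at $0300; in the ITC case $0300 is a code
# field whose contents ($0400) are the address of the machine code.
mem = {0x0200: 0x00, 0x0201: 0x03, 0x0300: 0x00, 0x0301: 0x04}

itc = next_itc(mem, 0x0200)
dtc = next_dtc(mem, 0x0200)
```

ITC needs one more 16-bit fetch per NEXT than DTC, and on a stock 6502 the IP and W bookkeeping is done in software with 8-bit operations, which is where the cycles go.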

All of the enhancements to the M65C02A have been added with the overall objective of maintaining compatibility with the 6502/65C02 processors. Any deviations from those processors present in my implementation can be easily corrected with changes to the microcode.

Finally, I am not trying hard to get the M65C02A accepted as commercially available silicon. I expect to use the core in products that I produce for my company or that I develop for others that can make use of its capabilities in a cost-effective manner. I have developed an 8-bit processor in response to Arlet's challenge on anycpu.org, a single cycle PIC16C5x-compatible core, and many other ALU cores and microprogrammed machines.

The M65C02A provides an opportunity to tailor the 6502/65C02 architecture to support Algol-like languages and FORTH in a more efficient manner. The objective is not to set the world on fire. There are many reasons to develop soft-cores, but commercialization of the cores as silicon implementations should not be a prime objective given the availability of high performance, low-cost processors like the ARM7TDMI-based LPC-series from NXP.

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Tue Apr 03, 2018 7:03 am 

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
MichaelM wrote:
The original design utilizes a Programmable Logic Array (PLA) to implement the control logic. Defining the operation of a processor using a PLA is similar to implementing it using microprogramming except the logic requirements are much less with a PLA. Unfortunately, modern FPGAs do not implement the logic structure of a PLA. Some modern Complex Programmable Logic Devices (CPLDs) provide the necessary AND-OR logic arrays to simulate the logic structure of a PLA, but modern HDLs like Verilog and VHDL do not easily support the PLA logic structure.

I thought PLAs were the same as PLDs such as the Lattice isp1048 PLD that the MiniForth was built on --- IIRC, that had AND and XOR rather than AND and OR.

I never heard of CPLDs --- I thought PLD technology was obsolete and everybody used FPGAs now.

Anyway, I don't know enough about that level of design to really comment on the subject.

MichaelM wrote:
On several occasions I have opted to use the soft-core processor from my FPGA vendor of choice: the Xilinx MicroBlaze 32-bit soft-core processor. And several times Xilinx has obsoleted my designs by updating the processor core itself and the I/O structure it uses to access the FPGA resources. Of course it is possible to rebuild the Xilinx components to use the updated specifications, but all of my custom designed components would have to be changed to connect to the new processor core and I/O standard.

I remember in 1994 that a lot of electrical engineers were mad at Motorola because Motorola kept coming out with a latest/greatest processor, and abandoning the users of their previous processors --- just tell their customers: "Well, rewrite your program!"
For example, the 6811 obsoleted the 6809 --- a lot of people liked the 6809 though, and there was quite a lot of 6809 software in use --- nobody liked being told to rewrite all of the 6809 software!
Because of this, a lot of people switched to the 8051-family --- the 8051 was arguably not as good technically as the Motorola chips, but it had the advantage that the users wouldn't get the rug pulled out from under them.
Later on, Motorola dropped out of the processor business --- their latest/greatest was ignored, and the 8051 is still used today --- so, apparently it is not a good idea to repeatedly pull the rug out from under your customers!

MichaelM wrote:
I am not developing the M65C02A other than for my satisfaction. Part of the process has illuminated a variety of trade-offs to me. In that sense, the M65C02A has been as much a project for my personal enjoyment as it is a project for studying various implementation approaches.

Having your own soft-core processor that you like is worthwhile --- you can get good at it --- you don't have to spend time learning somebody else's design, or accept somebody else's design decisions. :)

Also, there is the security issue. In 1994 at Testra they developed the MiniForth processor --- one reason they did this, rather than use a commercially available processor (I suggested the MC6812), is that Red China wouldn't be able to steal their motion-control board software --- their greatest fear was that somebody would come out with a motion-control board at half the cost that runs Testra's software and is stamped: "Made in China."

I don't think this is possible now though. In 1994 Testra could implement a processor on the Lattice isp1048 PLD, and nobody else could --- Testra had their own HDL --- everybody else was using LDL (Lattice Design Language) from Lattice that was inadequate.
Now everybody uses Verilog or VHDL --- it is no longer possible to write your own HDL because the internal details of the FPGA are not available --- that is what I was told.
Realistically, any HDL programmer could produce a TOYF given the specs. My plan is to not tell anybody what the opcode map is, and not give away the source-code to my assembler. They would have to figure this out for themselves by looking at the machine-code and comparing it to the source-code that generated it. This is more difficult than it seems! Both group-A and group-B have two undefined instructions. I can define these as NOP instructions, so I have 3 NOP instructions in each group, and have the assembler randomly choose which NOP to use when a NOP is needed. This is going to seriously complicate figuring out what the opcode map is! It is still possible, but it will take quite a while.

MichaelM wrote:
Finally, I am not trying hard to get the M65C02A accepted as commercially available silicon. I expect to use the core in products that I produce for my company or that I develop for others that can make use of its capabilities in a cost-effective manner.

Well, I'm expecting the TOYF to out-perform the MSP430, and to be less expensive --- the MSP430 is pretty much the leader in the 16-bit world.

I am interested in your M65c02A though. It might use a lower-cost FPGA than the TOYF --- that would make it especially interesting! :)
I don't think it is going to compare well in speed, but blazing speed is not always needed --- low cost is always needed!
Can you comment on the cost of the FPGA used by the M65c02A as compared to what would be required for an MSP430 soft-core? The M65c02A has fewer registers, and it has fewer instructions, and it is 8-bit rather than 16-bit, so I would expect the M65c02A to use a pretty inexpensive FPGA as compared to most other designs.

BTW: What do you think of the Parallax Propeller? That is an example of somebody coming out with his own processor and getting it accepted as commercially viable --- it is not taking the world by storm --- it is making a profit though.
If my TOYF were as successful as the Parallax Propeller, I would be pretty happy! :)
I have group-A group-B and group-M instructions executing in parallel --- that is pretty 8) I think --- by comparison, the Propeller has 8 processors running in parallel, but they each have their own memory, which is pretty complicated.
I would actually expect your M65c02A to be more efficient than the Parallax Propeller --- you have a lot of support for high-level languages --- the Propeller ISA seemed kind of prickly and hard-to-use, from what I gathered from the documentation, so even with 8 processors running in parallel, it might still be slower than the M65c02A assuming that the M65c02A Forth has minimal ISR entrance and exit code.


 Post subject: Re: M65C02A Core
PostPosted: Tue Apr 03, 2018 7:55 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
(Great overview and summary there Michael, thanks!)


 Post subject: Re: M65C02A Core
PostPosted: Sun Apr 22, 2018 4:56 pm 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Indeed - great summary and insight - it was a pleasure to read.

Thanks Michael.


 Post subject: Re: M65C02A Core
PostPosted: Sun Apr 22, 2018 6:37 pm 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 230
Location: Kent, UK
I just wanted to compliment you on the project. Excellent documentation; very professional. It's clear you do this for a living. One thing I was looking for, but didn't find (perhaps it's there and I just missed it) was synthesis results (like you have for your Mini-S), showing utilization and max frequency.


 Post subject: Re: M65C02A Core
PostPosted: Sun Apr 22, 2018 8:11 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Thank you for your compliments.

Per your request, a screen capture of the Module Level Utilization Report is included below. That is followed by the PAR report for the M65C02A targeting a Spartan 6.

The core is undergoing some changes, and at the moment, I can't vouch for its correctness. The project synthesized includes: 28kB Block RAM memory, 4kB Block RAM microcode memory, Vectored Interrupt Controller, User/Kernel Paged Memory Management Unit, Buffered SPI Master, two Buffered 16C550+ UARTs, and a placeholder External Memory Interface. Thus, the core is complete, but not yet fully tested. Some questions remain regarding whether the approach taken to implement the base-relative and stack-relative addressing modes using the native indexed by X addressing modes will suffice. A prior implementation allocated a significant number of opcodes to these addressing modes at the expense of the Rockwell instructions and general compatibility with the W65C02S. As currently implemented, the M65C02A implements the instruction set of the 65C02, and leaves out the WAI and STP instructions from the W65C02S. Those two instructions are replaced by two instructions, OSZ (Override X with S, plus Set operation/operand size) and OIS (Override X with S, plus Set operation/operand size, plus Add indirection to addressing mode). The microcode for the WAI/STP is included in the microprogram, and all that is required to put the instruction set of the M65C02A into compatibility with the W65C02S is a simple update of the microprogram files.

I am currently creating a model of the core in Python 3 for use with Mike Naberezny's Py65 toolset. Since Python is not my native language, it's taking a bit of effort on my part, and it's not going to be particularly Pythonic. One thing is clear, though, and that is that a significant amount of effort will be required for testing.
Attachment: Screenshot-2018-4-22 Module Level Utilization.png

Code:
Release 14.7 par P.20131013 (nt64)
Copyright (c) 1995-2013 Xilinx, Inc.  All rights reserved.

MORRISMA-P7510::  Sun Apr 22 14:39:42 2018

par -filter D:/XProjects/ISE14.7i/M65C02A/iseconfig/filter.filter -w -intstyle
ise -ol high -mt off M65C02A_map.ncd M65C02A.ncd M65C02A.pcf


Constraints file: M65C02A.pcf.
Loading device for application Rf_Device from file '6slx45.nph' in environment D:\Xilinx\14.7\ISE_DS\ISE\.
   "M65C02A" is an NCD, version 3.2, device xc6slx45, package fgg484, speed -3

Initializing temperature to 85.000 Celsius. (default - Range: 0.000 to 85.000 Celsius)
Initializing voltage to 1.140 Volts. (default - Range: 1.140 to 1.260 Volts)


Device speed data version:  "PRODUCTION 1.23 2013-10-13".



Device Utilization Summary:

Slice Logic Utilization:
  Number of Slice Registers:                   790 out of  54,576    1%
    Number used as Flip Flops:                 790
    Number used as Latches:                      0
    Number used as Latch-thrus:                  0
    Number used as AND/OR logics:                0
  Number of Slice LUTs:                      1,946 out of  27,288    7%
    Number used as logic:                    1,853 out of  27,288    6%
      Number using O6 output only:           1,664
      Number using O5 output only:               3
      Number using O5 and O6:                  186
      Number used as ROM:                        0
    Number used as Memory:                      87 out of   6,408    1%
      Number used as Dual Port RAM:             84
        Number using O6 output only:            60
        Number using O5 output only:             0
        Number using O5 and O6:                 24
      Number used as Single Port RAM:            3
        Number using O6 output only:             3
        Number using O5 output only:             0
        Number using O5 and O6:                  0
      Number used as Shift Register:             0
    Number used exclusively as route-thrus:      6
      Number with same-slice register load:      4
      Number with same-slice carry load:         2
      Number with other load:                    0

Slice Logic Distribution:
  Number of occupied Slices:                   510 out of   6,822    7%
  Number of MUXCYs used:                       140 out of  13,644    1%
  Number of LUT Flip Flop pairs used:        1,959
    Number with an unused Flip Flop:         1,219 out of   1,959   62%
    Number with an unused LUT:                  13 out of   1,959    1%
    Number of fully used LUT-FF pairs:         727 out of   1,959   37%
    Number of slice register sites lost
      to control set restrictions:               0 out of  54,576    0%

  A LUT Flip Flop pair for this architecture represents one LUT paired with
  one Flip Flop within a slice.  A control set is a unique combination of
  clock, reset, set, and enable signals for a registered element.
  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.

IO Utilization:
  Number of bonded IOBs:                        59 out of     316   18%

Specific Feature Utilization:
  Number of RAMB16BWERs:                        16 out of     116   13%
  Number of RAMB8BWERs:                          0 out of     232    0%
  Number of BUFIO2/BUFIO2_2CLKs:                 0 out of      32    0%
  Number of BUFIO2FB/BUFIO2FB_2CLKs:             0 out of      32    0%
  Number of BUFG/BUFGMUXs:                       1 out of      16    6%
    Number used as BUFGs:                        1
    Number used as BUFGMUX:                      0
  Number of DCM/DCM_CLKGENs:                     0 out of       8    0%
  Number of ILOGIC2/ISERDES2s:                   0 out of     376    0%
  Number of IODELAY2/IODRP2/IODRP2_MCBs:         0 out of     376    0%
  Number of OLOGIC2/OSERDES2s:                   0 out of     376    0%
  Number of BSCANs:                              0 out of       4    0%
  Number of BUFHs:                               0 out of     256    0%
  Number of BUFPLLs:                             0 out of       8    0%
  Number of BUFPLL_MCBs:                         0 out of       4    0%
  Number of DSP48A1s:                            0 out of      58    0%
  Number of ICAPs:                               0 out of       1    0%
  Number of MCBs:                                0 out of       2    0%
  Number of PCILOGICSEs:                         0 out of       2    0%
  Number of PLL_ADVs:                            0 out of       4    0%
  Number of PMVs:                                0 out of       1    0%
  Number of STARTUPs:                            0 out of       1    0%
  Number of SUSPEND_SYNCs:                       0 out of       1    0%


Overall effort level (-ol):   High
Router effort level (-rl):    High

Starting initial Timing Analysis.  REAL time: 4 secs
Finished initial Timing Analysis.  REAL time: 5 secs

WARNING:Par:288 - The signal COM1/TF1/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
WARNING:Par:288 - The signal SPI0/TF/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
WARNING:Par:288 - The signal COM1/RF1/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
WARNING:Par:288 - The signal COM0/RF1/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
WARNING:Par:288 - The signal COM0/TF1/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
WARNING:Par:288 - The signal SPI0/RF/Mram_RAM1_RAMD_D1_O has no load.  PAR will not attempt to route this signal.
Starting Router


Phase  1  : 11999 unrouted;      REAL time: 5 secs

Phase  2  : 11109 unrouted;      REAL time: 6 secs

Phase  3  : 6136 unrouted;      REAL time: 10 secs

Phase  4  : 6214 unrouted; (Setup:528, Hold:0, Component Switching Limit:0)     REAL time: 12 secs

Updating file: M65C02A.ncd with current fully routed design.

Phase  5  : 0 unrouted; (Setup:150, Hold:0, Component Switching Limit:0)     REAL time: 30 secs

Phase  6  : 0 unrouted; (Setup:42, Hold:0, Component Switching Limit:0)     REAL time: 31 secs

Updating file: M65C02A.ncd with current fully routed design.

Phase  7  : 0 unrouted; (Setup:0, Hold:0, Component Switching Limit:0)     REAL time: 39 secs

Phase  8  : 0 unrouted; (Setup:0, Hold:0, Component Switching Limit:0)     REAL time: 39 secs

Phase  9  : 0 unrouted; (Setup:0, Hold:0, Component Switching Limit:0)     REAL time: 39 secs

Phase 10  : 0 unrouted; (Setup:0, Hold:0, Component Switching Limit:0)     REAL time: 39 secs
Total REAL time to Router completion: 39 secs
Total CPU time to Router completion: 40 secs

Partition Implementation Status
-------------------------------

  No Partitions were found in this design.

-------------------------------

Generating "PAR" statistics.

**************************
Generating Clock Report
**************************

+---------------------+--------------+------+------+------------+-------------+
|        Clock Net    |   Resource   |Locked|Fanout|Net Skew(ns)|Max Delay(ns)|
+---------------------+--------------+------+------+------------+-------------+
|           Clk_BUFGP | BUFGMUX_X3Y13| No   |  400 |  0.019     |  1.252      |
+---------------------+--------------+------+------+------------+-------------+

* Net Skew is the difference between the minimum and maximum routing
only delays for the net. Note this is different from Clock Skew which
is reported in TRCE timing report. Clock Skew is the difference between
the minimum and maximum path delays which includes logic delays.

* The fanout is the number of component pins not the individual BEL loads,
for example SLICE loads not FF loads.

Timing Score: 0 (Setup: 0, Hold: 0, Component Switching Limit: 0)

Asterisk (*) preceding a constraint indicates it was not met.
   This may be due to a setup or hold violation.

----------------------------------------------------------------------------------------------------------
  Constraint                                |    Check    | Worst Case |  Best Case | Timing |   Timing   
                                            |             |    Slack   | Achievable | Errors |    Score   
----------------------------------------------------------------------------------------------------------
  TS_Clk = PERIOD TIMEGRP "Clk" 25 ns HIGH  | SETUP       |     0.038ns|    24.924ns|       0|           0
  50%                                       | HOLD        |     0.210ns|            |       0|           0
----------------------------------------------------------------------------------------------------------


All constraints were met.


Generating Pad Report.

All signals are completely routed.

WARNING:Par:283 - There are 6 loadless signals in this design. This design will cause Bitgen to issue DRC warnings.

Total REAL time to PAR completion: 41 secs
Total CPU time to PAR completion: 42 secs

Peak Memory Usage:  485 MB

Placer: Placement generated during map.
Routing: Completed - No errors found.
Timing: Completed - No errors found.

Number of error messages: 0
Number of warning messages: 8
Number of info messages: 0

Writing design to file M65C02A.ncd

PAR done!

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Sun Apr 29, 2018 9:54 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
All:

I have been evaluating Python for use in some of my work. Therefore, over the past month or so, I've undertaken to put it to use. Like a good C programmer, which I'm not, I've written a number of programs for work and a number of tools for my M65C02A core and my MiniCPU core. For the M65C02A, I wrote the microprogram assembler, Simple Microprogram ROM Tool (SMRTool), and the beginnings of an assembler. The M65C02A assembler is intended to be table driven. The initial version processes the output of the Mak Pascal compiler I've previously described.

In the process, I wrote another tool to populate the table with the instructions that I have deemed acceptable. In other words, the table generator attempts to generate only instruction sequences that a programmer or compiler should use. There are a number of instructions that may be formed using the prefix instructions that essentially duplicate existing instructions. The core will merrily execute these less-than-optimal instructions. So the instruction table generator omits these constructions in the hope that the programmer/compiler writer will use the new extended syntax and avoid generating instruction sequences for which a native operation exists.

After writing that tool, and before completing the assembler, it became clear that I would need a simulator for the M65C02A that was outside of the ISE iSim environment. This would provide a way to check the behavior of the core in the iSim environment against an environment that may have a wider audience. Toward that objective, I've spent a considerable amount of free time over the last three weeks working on an M65C02A simulator.

My first inclination was to build a model of the M65C02A that faithfully followed the Verilog RTL model. It did not take long to dissuade me from that approach given my need for a simulator in fairly rapid order. Perhaps the next generation of the M65C02A simulator can be constructed along those lines, which would then enable the simulation environment to support development and debugging of the microprogram.

Since the path toward a model derived from the Verilog RTL model was put aside for the time being, I decided to pick up Mike Naberezny's Py65 Monitor and 6502/65C02/65Org16 device models. With a little help from Mike N. and Big Ed, the Py65 Monitor could be configured from the command line to accept the M65C02A MPU selection.

Since that time, I've been busy getting a model of the M65C02A built and running in that environment. I've made some minor modifications to the monitor, assembler, and disassembler modules, and added the M65C02A device module. I've gotten the majority of the model written and I've spot checked a large number of the addressing modes and instructions. In particular, I've checked many of the base instructions which are affected by the eight prefix instructions which are part of the M65C02A's instruction set.

The prefix instructions OSX, IND, SIZ, ISZ, OSZ, OIS, OAX, and OAY have been checked. The PSH #imm, PSH/PUL zp/abs, and the PHR/CSR rel16 have been checked. The register stack instructions DUP, SWP, and ROT have been checked. The PLI/PHI (PHW/PLW) and INI (INW) instructions have been checked. The IP-relative with auto-increment mode has been checked: ORA/AND/EOR/ADD/STA/LDA/CMP/SUB/ASL/ROL/LSR/ROR/TRB/TSB/DEC/INC ip,I. The accumulator extensions for the ASL (arithmetic overflow) and LSR (ASR - arithmetic shift right) instructions in the 16-bit mode have been checked. The 16-bit CMP sets the V flag as expected, which enables the multi-flag signed/unsigned conditional branch (8-bit displacement) and conditional jump (16-bit displacement) instructions. The 8-bit and 16-bit displacement unconditional branch, BRA rel, has been checked.

The OSX instruction has been checked to change the default stack pointer from S to X as desired. The JSR and RTS instructions have been checked to operate with OSX as desired, as have the PHP/PHA/PHX/PHY, PLP/PLA/PLX/PLY, and PSH/PUL instructions.

Using X as a base pointer and S as the stack pointer allows base pointer relative addressing using the native indexed by X addressing modes, and stack pointer relative addressing using the OSX (OSZ and OIS) prefix instructions. The W65C02S instructions WAI and STP have been replaced by OSZ (OSX + SIZ) and OIS (OSX + IND + SIZ). Thus, stack-relative access of 16-bit quantities is no more expensive than BP-relative addressing with X. Direct transfers between S and A are possible using the OAX TSX and OSX TAX instruction sequences.
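
The prefix relationships described above can be pictured as flag combinations. What follows is only a sketch in Python, not code from the Py65 model: the `Prefix` class and the `stack_pointer_name` helper are hypothetical names, the one-line descriptions of IND and SIZ are my reading of the posts, and only the compositions stated above (OSZ = OSX + SIZ, OIS = OSX + IND + SIZ) are encoded.

```python
from enum import Flag


class Prefix(Flag):
    """Hypothetical flag set for the effects a prefix instruction carries."""
    OSX = 1  # override the default stack pointer: X is used instead of S
    IND = 2  # add a level of indirection (assumed meaning)
    SIZ = 4  # select a 16-bit operation (assumed meaning)
    # Composite prefixes, composed as described in the post.
    OSZ = OSX | SIZ
    OIS = OSX | IND | SIZ


def stack_pointer_name(active: Prefix) -> str:
    """Return the register acting as stack pointer for one instruction."""
    return "X" if Prefix.OSX in active else "S"


if __name__ == "__main__":
    for prefix in (Prefix.SIZ, Prefix.OSX, Prefix.OSZ, Prefix.OIS):
        print(prefix.name, "->", stack_pointer_name(prefix))
```

Modelling the composites as unions of the basic flags means a decoder only has to test for the three basic effects, whichever prefix opcode was actually fetched.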

I am not particularly good at using collaboration tools. Therefore, for the time being the Py65 files in my GitHub account will have to do for anyone interested in the M65C02A simulator. I will continue working on the model and continue to verify the implementation. I have a lot to learn about the Python ecosystem, so there's going to be a lag before I am able to contribute this model back to the baseline. In particular, I have not used the unit test environment, so none of the M65C02A features and peculiarities are currently being tested in an automated manner. I've used the interactive assembler and the monitor for testing. The cycle counts that decorate the instruction decoder are woefully inaccurate given the number of ways the prefix instructions can be added to the vast majority of instructions. The resulting extended instruction can have 16-bit operations/operands and one or two levels of indirection.

I am considering one major change. For the JSR abs, RMBx/SMBx zp, BBRx/BBSx zp,rel instructions, and possibly a few others (TRB/TSB) which don't have indexed by X addressing modes, I am considering converting them to (abs,X) or (zp,X) when the IND or OIS prefix instructions are added. I feel that this will make these instructions much more useful for HLLs.

If you pull the M65C02A Python Model and use it, please send me any bug reports you may find, as well as any suggestions for improvements, etc. Please remember, the assembler included is simply the 6502/65C02 8-bit assembler. Therefore, it does not yet deal with 16-bit immediate operands. I did modify the parse regular expression so it can correctly parse the BBRx/BBSx zp,rel syntax. However, I've not gotten the assembler to accept it and to generate the byte codes.
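
For anyone curious what handling the BBRx/BBSx zp,rel syntax involves, here is a minimal sketch. It is not the regular expression or the assembler code from Py65: `BBX_RE` and `assemble_bbx` are hypothetical names, operands are restricted to `$` hex literals, and the opcodes are the standard 65C02 encodings (BBR0..BBR7 = $0F..$7F, BBS0..BBS7 = $8F..$FF), which I am assuming the M65C02A keeps.

```python
import re

# Hypothetical pattern: BBR/BBS, a bit number 0-7, a zero-page address,
# a comma, and an absolute branch target, all as $-prefixed hex.
BBX_RE = re.compile(
    r"^\s*(BB[RS])([0-7])\s+\$([0-9A-F]{1,2})\s*,\s*\$([0-9A-F]{1,4})\s*$",
    re.IGNORECASE,
)


def assemble_bbx(line: str, pc: int) -> bytes:
    """Assemble one 'BBRx zp,rel' or 'BBSx zp,rel' line placed at address pc."""
    m = BBX_RE.match(line)
    if m is None:
        raise ValueError(f"not a BBRx/BBSx instruction: {line!r}")
    mnemonic = m.group(1).upper()
    bit = int(m.group(2))
    zp = int(m.group(3), 16)
    target = int(m.group(4), 16)
    # Standard 65C02 encoding: the bit number sits in the upper nibble.
    opcode = (0x0F if mnemonic == "BBR" else 0x8F) + (bit << 4)
    # The displacement is relative to the address after this 3-byte instruction.
    offset = target - (pc + 3)
    if not -128 <= offset <= 127:
        raise ValueError("branch target out of range")
    return bytes([opcode, zp, offset & 0xFF])


if __name__ == "__main__":
    print(assemble_bbx("BBR3 $10,$0210", 0x0200).hex())
```

Labels and expressions in the operand fields are left out on purpose; a real assembler would resolve them before computing the displacement.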

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Sun Apr 29, 2018 10:03 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I'm sure having a fast and convenient model will be a help! It's always good to be able to fork and tweak (or overhaul) an existing working bit of software.


 Post subject: Re: M65C02A Core
PostPosted: Tue May 01, 2018 1:21 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Last night I was awakened by the girls wanting to go out around 2:00 am: it's hard to sleep through a 100 lb Akita breathing in your ear and scratching on the side of the bed. Good thing Akitas don't often bark. :D Since I was up, I decided to finish correcting some minor errors in my M65C02A instruction set table generator and adding the last pair of instructions to the table of acceptable instructions.

I am working on a table driven assembler in Python for the M65C02A core. The instruction set table generator, OpcodeTblGen.py, produces a file that will be loaded into the assembler and used to parse the assembler files. With my instruction table generator, I've restricted the combinations of prefix instructions that may be used. The instruction table generator uses a text file, OpcodeTblGen.txt, to produce the acceptable instruction combinations.

For those following this project, the following three text files found in my PyAsm65 GitHub account may be of interest: OpcodeTbl.txt, InstrByOpcodeTbl.txt, and InstrByNameTbl.txt. The last two files are generated by SortedOpcodeTblGen.py.

The first file is the complete list of instructions that the assembler will accept. The second file sorts the first file by the base opcode, and the third file sorts the first file by the base name of an instruction. For the third file, the instruction names/mnemonics are further sorted by opcode; they start with the base opcode of the first occurrence of the name. Thus, M65C02A-specific instruction mnemonics will appear at the end of the file, and in the order in which they are appended to the list of instructions in the first file.
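
The two orderings can be sketched in a few lines of Python. This is my reading of the description above, not the code in SortedOpcodeTblGen.py: the `(opcode, mnemonic, mode)` row layout and the function names are assumptions, and the sample rows are ordinary 6502 instructions rather than entries copied from OpcodeTbl.txt.

```python
# Hypothetical table rows: (base opcode, mnemonic, addressing mode).
SAMPLE_TABLE = [
    (0xA9, "LDA", "#imm"),
    (0x0A, "ASL", "A"),
    (0x09, "ORA", "#imm"),
    (0x06, "ASL", "zp"),
    (0x05, "ORA", "zp"),
]


def sort_by_opcode(table):
    """Order the rows by base opcode only."""
    return sorted(table, key=lambda row: row[0])


def sort_by_name(table):
    """Group rows by mnemonic; groups are ordered by their lowest opcode."""
    first_opcode = {}
    for opcode, name, _mode in sort_by_opcode(table):
        first_opcode.setdefault(name, opcode)  # keeps the lowest opcode seen
    # Within a group, rows stay in opcode order.
    return sorted(table, key=lambda row: (first_opcode[row[1]], row[0]))


if __name__ == "__main__":
    for opcode, name, mode in sort_by_name(SAMPLE_TABLE):
        print(f"${opcode:02X} {name} {mode}")
```

With the sample rows, the by-opcode order interleaves ORA and ASL, while the by-name order lists both ORA forms before both ASL forms.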

_________________
Michael A.


 Post subject: Re: M65C02A Core
PostPosted: Wed May 02, 2018 12:19 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Well, I thought I had a good idea on how to make certain instructions which lack an indexed by X addressing mode, like TRB/TSB, RMBx/SMBx, and BBRx/BBSx, more useful. When the prefix IND is applied to these instructions, my thinking was to automatically convert the zp or abs addressing modes of these instructions into (zp,X) and (abs,X) using the microprogram. This would enable these instructions to use local pointers in the stack frame, with the function's base pointer in X.

Unfortunately, that thought had already occurred to me several years ago, and the M65C02A RTL and microcode already had the mechanism built in to convert all zp and abs addressing modes into stack-relative addressing modes, when prefixed by OSX: zp | zp,X => zp,S, (zp) | (zp,X) => (zp,S), (zp),Y => (zp,S),Y, etc.
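
For the simulation model, the conversions listed above amount to a lookup on the addressing mode. A minimal sketch, assuming modes are represented as plain strings; `OSX_MODE_MAP` and `effective_mode` are hypothetical names, and only the cases spelled out above are included (the "etc." cases, such as the abs modes, are deliberately not guessed at):

```python
# Mode conversions applied when an instruction is prefixed by OSX,
# limited to the cases listed in the post.
OSX_MODE_MAP = {
    "zp": "zp,S",
    "zp,X": "zp,S",
    "(zp)": "(zp,S)",
    "(zp,X)": "(zp,S)",
    "(zp),Y": "(zp,S),Y",
}


def effective_mode(mode: str, osx_prefixed: bool) -> str:
    """Return the addressing mode actually used for one instruction."""
    if not osx_prefixed:
        return mode
    try:
        return OSX_MODE_MAP[mode]
    except KeyError:
        # Unlisted modes are rejected rather than guessed at.
        raise NotImplementedError(f"OSX conversion not listed for {mode!r}")


if __name__ == "__main__":
    for mode in OSX_MODE_MAP:
        print(f"OSX + {mode:7} => {effective_mode(mode, True)}")
```

Note that zp and zp,X collapse to the same stack-relative mode, as do (zp) and (zp,X), so only one of each pair needs an entry in an assembler's table of accepted forms.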

I guess I didn't mark the trail with enough breadcrumbs. :D Will have to modify the M65C02A simulation model previously discussed to incorporate this feature.

I like to leave myself long, detailed notes in my code, but the notes in the RTL and microprogram source are not there on this subject. :oops:

_________________
Michael A.

