6502.org Forum

All times are UTC




 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 4:45 am 

Joined: Sat Oct 20, 2012 8:41 pm
Posts: 87
Location: San Diego
BigDumbDinosaur wrote:
7.7.2 The following instructions may be used with the emulation mode even though a Bank Address is not multiplexed on the Data Bus: PHK, PHB and PLB


Very strange, my WDC 65C816's do have the Bank Address multiplexed on the Data Bus in emulation mode.
I bought a whole tube of them about 13 years ago, the date codes are typically 9942 (1999)

I don't have any newer units to test with but is it possible they have changed the design of the processor?
Or are there major errors in the datasheet?


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 10:29 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
I believe BDD has made a mistake. His claim that emulation mode can only access 64k is based on (a reading of) the manual, not on the behaviour of the device. The manual is not clear enough and still contains inconsistencies, so you have to use it with care. What the manual intends to say is that the contents of the Bank register are not multiplexed, not that nothing is multiplexed. The high order byte of the address is multiplexed, and for that reason 3-byte addressing does indeed work.

BDD is right to say that code running in high banks cannot (usefully) be interrupted, but my suspicion is that code can run from high memory if it's reached by a long JMP or JSR. It would be reasonable to do that if it's known that it won't be interrupted, or if the interrupt routine is complex enough to cater for this case - for example, the high code could be aborted, or some separate record of the high byte could be kept by convention. It would be an unusual set of choices, to run in emulation mode but allow high code to run, but, I suspect, it is not impossible. I haven't tested it.
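To make the interrupt problem concrete, here is a minimal Python sketch of interrupt entry and RTI as I understand the WDC documentation to describe them: in emulation mode only PCH, PCL and P are stacked, so the program bank cannot be recovered on return. The `Cpu816` class, its method names and `resume_address` are invented for illustration, and this models the documented behaviour rather than any tested device.

```python
class Cpu816:
    """Toy model of 65C816 interrupt entry/exit; all names are illustrative."""

    def __init__(self, emulation=True):
        self.emulation = emulation
        self.pbr = 0x00      # program bank register
        self.pc = 0x0000     # 16-bit program counter
        self.p = 0x30        # status register
        self.stack = []      # stand-in for stack memory

    def interrupt(self, vector_target):
        if not self.emulation:
            self.stack.append(self.pbr)       # native mode stacks the bank
        self.stack.append(self.pc >> 8)       # PCH
        self.stack.append(self.pc & 0xFF)     # PCL
        self.stack.append(self.p)             # status
        self.pbr = 0x00                       # handlers run in bank 0
        self.pc = vector_target

    def rti(self):
        self.p = self.stack.pop()
        lo = self.stack.pop()
        hi = self.stack.pop()
        self.pc = (hi << 8) | lo
        if not self.emulation:
            self.pbr = self.stack.pop()       # only native mode restores it


def resume_address(emulation):
    """Return the 24-bit address execution resumes at after an interrupt."""
    cpu = Cpu816(emulation)
    cpu.pbr, cpu.pc = 0x02, 0x1234            # pretend we were running in bank $02
    cpu.interrupt(0xE000)
    cpu.rti()
    return (cpu.pbr << 16) | cpu.pc
```

In this model `resume_address(True)` comes back in bank $00 while `resume_address(False)` comes back in bank $02, which is why an emulation-mode handler would need some separate record of the bank, as suggested above.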

The moral is that the '816 is a subtle and complex device, not perfectly documented. It's still a useful device and happily substitutes for a 65C02, provided you know what you're doing. If you stick to emulation mode and don't run random code sequences, you'll be fine.

Cheers
Ed


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 2:27 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
MichaelM wrote:
For those interested, I have posted Release 2.4 of the M65C02 core to its GitHub repository.
Nice one, Michael! Thanks. I notice you've written some substantial notes in your GitHub wiki too. With diagrams!

Cheers
Ed


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 7:31 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8182
Location: Midwestern USA
clockpulse wrote:
BigDumbDinosaur wrote:
7.7.2 The following instructions may be used with the emulation mode even though a Bank Address is not multiplexed on the Data Bus: PHK, PHB and PLB

Very strange, my WDC 65C816's do have the Bank Address multiplexed on the Data Bus in emulation mode.
I bought a whole tube of them about 13 years ago, the date codes are typically 9942 (1999)

I don't have any newer units to test with but is it possible they have changed the design of the processor?
Or are there major errors in the datasheet?

I should clarify that statement by saying that the bank address is $00 regardless of what is in the DB or PB registers. That's implied from the WDC data sheet dated September 13, 2010. I haven't tested any MPUs I have here (date codes 0345, shipped from WDC inventory in 2009) for evidence of a non-zero bank address being emitted, so I can't confirm the data sheet's statements. It could be that the devices you have don't behave as stated in the data sheet, which wouldn't be a first. :D I sent a query to WDC asking for clarification.

Aside from the bank address matter, operation in emulation mode prevents the use of 16-bit registers, prevents stack relocation, forces the IRQ handler to differentiate between IRQ and BRK, etc. Also, while it is possible to use stack addressing instructions, such as PEA, PER, etc., the inability to select a 16-bit accumulator can make such instruction usage awkward, especially when it comes time to clean up the stack. All of this militates against emulation mode operation in new designs.
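The IRQ/BRK point can be shown in a few lines. In emulation mode, as on the 65C02, both share one vector, so the handler has to inspect bit 4 (the B flag) of the status byte that was pushed on the stack. The sketch below is Python rather than assembly purely for illustration, and the function name is made up.

```python
B_FLAG = 0x10  # bit 4 of the status byte as stacked by BRK


def interrupt_source(stacked_p):
    """Classify an emulation-mode IRQ/BRK entry from the stacked status byte."""
    return "BRK" if stacked_p & B_FLAG else "IRQ"
```

In native mode this test is unnecessary because BRK has its own vector.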

When I built POC V1 I initially ran it in emulation mode in order to debug the hardware. Once that was over with and I rewrote the ROM to operate in native mode, code got smaller, operation got faster (very much faster, in some cases) and some algorithms got simpler, thanks to 16 bit registers. I would never think of running the '816 in emulation mode except for initial hardware testing.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 7:39 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Indeed, BDD, you wouldn't. But others might. My point is twofold: that emulation mode is indeed useful, and that implementing a useful subset of emulation mode in an FPGA is worthwhile (and a great deal easier than implementing the whole 816 featureset)

Cheers
Ed


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 8:59 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8182
Location: Midwestern USA
BigEd wrote:
Indeed, BDD, you wouldn't. But others might. My point is twofold: that emulation mode is indeed useful, and that implementing a useful subset of emulation mode in an FPGA is worthwhile (and a great deal easier than implementing the whole 816 featureset)

What you would have would be a 65C02 with extended addressing and some crippled '816-like features. :lol:

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Dec 16, 2012 9:41 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
What you would have is something enhanced beyond a 65C02.
Good grief - anything which doesn't look like one of your projects must be misguided?


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Jan 06, 2013 4:20 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
I followed the link that BigEd included in his post here. The data sheet listed a behavior during reset that is different from the NMOS 6502 and the more recent WDC implementation, the W65C02S. Somewhere on the WWW, while I was defining the instructions that I would implement in my M65C02 core, I read that the original W65C02 did not include the 4 Rockwell instructions, or the 65816 WAI and STP instructions. (I've not been able to find that reference/page again so I don't know if it was real or a dream; I should have bookmarked it.)

I've posed this question before: what behaviour after reset should be implemented? In the present M65C02 core, I implement the behavior that the reset vector is fetched without any writes or reads to the stack, i.e. page 1.
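As a rough illustration of the behaviour described here (vector fetch only, nothing touching page 1), the following Python sketch logs every bus access during a modelled reset. `LoggingMemory`, `reset_fetch` and `touches_stack_page` are hypothetical names for this example and are not taken from the M65C02 sources.

```python
RESET_VECTOR = 0xFFFC  # low byte at $FFFC, high byte at $FFFD


class LoggingMemory:
    """Sparse memory that records each read so the bus activity can be checked."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.log = []

    def read(self, addr):
        self.log.append(("R", addr))
        return self.contents.get(addr, 0xFF)


def reset_fetch(mem):
    """Fetch the reset vector and return the address execution starts at."""
    lo = mem.read(RESET_VECTOR)
    hi = mem.read(RESET_VECTOR + 1)
    return (hi << 8) | lo


def touches_stack_page(log):
    """True if any logged access falls in page 1 ($0100-$01FF)."""
    return any(0x0100 <= addr <= 0x01FF for _, addr in log)
```

A core that reuses its interrupt sequence for reset would show page 1 addresses in the same log, which is a simple way to tell the two behaviours apart in a testbench.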

It also appears that the original implementation of the M65C02 core matched the instruction set of the GSC 65SC02 with the exception of the unimplemented instructions; the M65C02 treats all unimplemented instructions as single byte NOPs. I have read elsewhere a discussion that hypothesized that a problem in executing the kb9 BASIC interpreter might be related to improper behavior of one or more unimplemented instructions. I think several on the forum have documented the behaviour of the unimplemented instructions as shown in the datasheet of the 65SC02. I suppose that raises the question of whether some unimplemented instructions should behave as multi-cycle NOPs instead of the single-cycle NOPs I've implemented.
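One way such a discrepancy could bite is through instruction length rather than cycle count: if an unimplemented opcode consumes a different number of bytes than software expects, every instruction after it is decoded from the wrong offset. The Python sketch below only illustrates that mechanism. The length given to the unimplemented opcode is a parameter of the sketch, not a value taken from any datasheet, and the function name is invented.

```python
KNOWN_LENGTHS = {0xA9: 2, 0xEA: 1, 0x60: 1}  # LDA #imm, NOP, RTS


def instruction_starts(code, undefined_length):
    """Return the offsets at which a decoder would begin each instruction.

    Any opcode not in KNOWN_LENGTHS is treated as a NOP that consumes
    `undefined_length` bytes.
    """
    starts, pc = [], 0
    while pc < len(code):
        starts.append(pc)
        pc += KNOWN_LENGTHS.get(code[pc], undefined_length)
    return starts
```

For the byte stream `02 A9 60 EA`, treating `$02` as a one-byte NOP decodes `LDA #$60` next, whereas treating it as a two-byte NOP skips the `$A9` and decodes `RTS` instead.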

_________________
Michael A.


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Jan 06, 2013 4:40 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
I think when we last discussed the reset behaviour, I fell on the side of not performing any writes (in order not to perturb the state of the machine, in case a post-mortem debugging session was to follow).

As for NOPs, I would think code should not rely on any particular cycle counts (any more than it should rely on undocumented behaviour when they are not NOPs).

Cheers
Ed


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Feb 24, 2013 9:53 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
I have updated the MAM65C02 repository on GitHub. I have wrapped the M65C02_Core.v module with an external memory interface (asynchronous RAM/ROM and I/O devices), an interrupt handler (NMI and IRQ), and a clock generator. Thus, the wrapper, M65C02.v, implements a standalone processor, which only requires the addition of external memory devices and peripherals to implement a fully functional microcomputer based on the M65C02 core in a US$6 FPGA. (In looking through Digi-Key, it appears that for about US$15, a complete system can be configured around the M65C02.)

In addition, I have laid the groundwork for a more complete system-on-chip implementation by providing on-chip address decoding and extended address outputs. Adding a memory management unit should be fairly straightforward. The additional address outputs already included should allow the M65C02 SOC to support about 8MB of memory (RAM/ROM) using a number of mapping techniques.

I have not attempted to maximize performance. Instead, I have attempted to mimic the 6502 bus to the extent possible in the FPGA I chose to target: XC3S50A-4VQ100I. The description and other comments sections of the M65C02.v file provide additional details on the implementation. A summary of the synthesis/PAR results is provided in the README file in the root of the repository.

_________________
Michael A.


 Post subject: Re: Proper 65C02 core
PostPosted: Mon Feb 25, 2013 8:42 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
That's a great step forward! I like Arlet's decision to use serial EEPROM and offer an SPI programming interface: it means that in principle anyone could put together a bit-banging programmer to turn the populated board into a functioning board, or indeed update the FPGA if needed. It's also true that helpful people could provide pre-programmed EEPROMs at a modest price, which opens up FPGA core-based systems to more people - they can use FPGAs without getting involved in the mechanics.

Cheers
Ed


 Post subject: Re: Proper 65C02 core
PostPosted: Tue Feb 26, 2013 11:50 am 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Michael, that's a nice piece of work. I'm surprised your core can fit into such a small FPGA. I remember when I was doing my 6502 core comparisons, I had difficulty fitting any of the cores into anything smaller than an XC3S200, which is a bulky 208-pin device. I'm now wondering whether I made mistakes, or whether there is something unique about your core.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: Proper 65C02 core
PostPosted: Tue Feb 26, 2013 1:28 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Like most designs there are the good parts and the bad parts. Fortunately for me, the good parts outweigh the bad parts. In the case of the M65C02 core, I have not fully optimized the core, but the fact that it uses a microprogrammed architecture does take advantage of some of the best features of a RAM-based FPGA.

Using the block RAMs for the control logic allows for a significant reduction in the number of LUTs required to implement the control logic. Even with control fields which are mostly encoded (and require some number of LUTs to decode), the savings is enough to allow a number of arithmetic functions to be added to speed up certain operations such as address generation. Fewer LUTs also require less interconnect, and therefore result in faster operation because the interconnect delays are reduced.

The microprogrammed implementation of the M65C02 core is also one of its bad parts. That is, it uses two block RAMs for implementing the instruction decoder and the control sequence. As a consequence, the XC3S50A only has a single free block RAM for use as internal program/data memory. (Actually, the instruction decoder only requires 0.5 of a block RAM to implement the fixed part of the microword. The other half, using the second port, is still available, but I haven't tried to incorporate that portion into any logic or into the memory space of the M65C02 processor posted to GitHub.)

I incorporated the last block RAM into the M65C02 last night as the Boot ROM, but I've not yet modified its testbench and conducted any verification of the result. However, the additional decode and multiplexer logic required has resulted in a lower operating speed. It is easy to fit the new design into a system operating with a memory cycle time corresponding to 14.7456 MHz instead of 18.432 MHz. The fit is easy for the tools, and there is some margin toward higher speeds, but the new configuration does not appear capable of reaching the 18.432 MHz target of the first release.

Interestingly, the W65C02S synthesizable core from WDC requires substantially more LUTs, registers, and slices than the M65C02 processor. To be fair, the information provided at the link appears to indicate that a UART (one of my UART implementations requires 277 registers, 366 LUTs, and 299 slices to implement) may be part of the core that's provided by WDC for synthesis, and I am completely unfamiliar with the Lattice FPGAs. Furthermore, the listed operating speed may indicate the memory cycle rate of the resulting IP, and that would mean that the WDC core can run at roughly 3x the memory cycle rate of the M65C02 processor as configured. However, I suspect that it is the speed of the core without an external memory interface. Thus, the stated speed (42 MHz) is more in line with the theoretical clock rate of the M65C02 core: ~100 MHz (XC3S200A-5).

edit: added information regarding reg/LUT/slice count for synthesizable UART

_________________
Michael A.


 Post subject: Re: Proper 65C02 core
PostPosted: Sun Aug 04, 2013 11:11 pm 

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
All:

I have gotten back on track with my M65C02 soft-processor :) after spending a number of weeks working on my M16C5x soft-microcomputer on the same Spartan 3A board I will use for M65C02 testing.

I have spent my time this weekend optimizing the combinatorial path delays of the bus multiplexers in the top level soft-processor module, M65C02, the 65C02-compatible processor core module, M65C02_Core, the address generator module, M65C02_AddrGen, and the ALU modules: M65C02_ALU, M65C02_BIN, and M65C02_BCD.

I've had some success in this endeavor. The synthesizer now reports an improvement in the predicted operating frequency from 55 MHz to 74 MHz, a gain of about 35%, which corresponds to combinatorial path delays roughly 26% shorter. The mapper and the place and route tools report that the implementation meets a period timing constraint for 73.728 MHz. (The previous release of this soft-processor was restricted to operation below 64 MHz in a -5 speed grade.)
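As a quick check on these figures, frequency and clock period convert directly into one another, and the percentage looks different depending on which one is measured. The short Python calculation below uses only the numbers quoted in this post; the helper names are arbitrary.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def percent_change(old, new):
    """Signed percentage change from old to new."""
    return 100.0 * (new - old) / old


freq_gain = percent_change(55.0, 74.0)                          # frequency increase
delay_cut = -percent_change(period_ns(55.0), period_ns(74.0))   # period reduction

print(f"55 MHz -> {period_ns(55.0):.2f} ns, 74 MHz -> {period_ns(74.0):.2f} ns")
print(f"frequency +{freq_gain:.1f}%, critical path -{delay_cut:.1f}%")
print(f"73.728 MHz constraint -> {period_ns(73.728):.3f} ns")
```

Going from 55 MHz to 74 MHz is a frequency gain of roughly 35%, while the critical path itself shrinks by roughly 26%.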

To give credit where credit is due, I got the idea for optimizing the bus multiplexers from something that Arlet and EEyE were doing on this forum. Essentially, all data sources which are not enabled are gated to logic 0, and the data sources are all simply tied into an OR gate at the destination. I had argued for using virtual tri-state busses, but I've been able to get the synthesizer to select a logic 0 as the default value of a virtual tri-state output. (I was going to post a link to their Verilog code in that thread, but I was unable to locate it.)
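For anyone unfamiliar with the technique, here is a small behavioural model of an AND-OR bus multiplexer, written in Python rather than Verilog so it can be run anywhere. It is only a sketch of the idea described above, not the code used in the M65C02 or in Arlet's and EEyE's cores, and the function name and `width` parameter are assumptions of this example.

```python
from functools import reduce


def and_or_mux(sources, width=8):
    """Model an AND-OR bus: each source is gated to 0 unless enabled, then all are ORed.

    sources: iterable of (enable, value) pairs; normally at most one enable is set.
    """
    mask = (1 << width) - 1
    gated = [(value & mask) if enable else 0 for enable, value in sources]
    return reduce(lambda a, b: a | b, gated, 0)
```

With a single source enabled the bus carries that source's value, with none enabled it reads as 0, and if two are enabled by mistake the values are simply ORed together instead of contending as they would on a real tri-state bus.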

Thus, I followed their lead, and the results are satisfying. There are other optimizations, particularly in the ALU, that can be made. However, I am going to proceed with verification on my Spartan 3A test board with the implementation recently uploaded to GitHub.

Some of you may have downloaded a previous release of this soft-processor from GitHub. The current release makes no fixes (and I don't think creates any bugs - the almost self-checking testbench stops at the same time as expected), so a re-download is not necessary unless you wish to take advantage of the reduced combinatorial path delays in the address generator and ALU to increase the operating speed of the soft-processor in your application.

_________________
Michael A.


 Post subject: Re: Proper 65C02 core
PostPosted: Mon Aug 05, 2013 9:32 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Very nice speed improvement!

