Greetings,
I have been working on a general-purpose 65xx simulator program these past few weeks. I want to share it with people, but everywhere I post it, it seems no one cares (except me).
The CPU Core is designed to be a CPU object that is a self-contained blackbox that can be used anywhere. This means that it uses events to communicate with the outside world.
I have implemented a 65c02 core and a 6502 core, and will be implementing a 6502u (a 6502 with undocumented opcodes), the 65816, and the never-released 65832 (a 32-bit 65816 descendant), just for the heck of it, since I have the data sheets.
The simulator will use the core for single-stepping and debugging, and it will have interfaces for plugging in assemblers and disassemblers. Ultimately, given an assembly-language file, I can change the source code while in debug mode and, without restarting or recompiling, just step into the next line to pick up the changes. This is important because I make most of my NES/SNES/Atari games in assembly anyway, as opposed to some other language such as C or Pascal.
The memory object sends events when a memory location has been changed or read. Basically, you can watch all of memory or just a specified range; if anything in the range changes, you can be notified (and update a graphics display or whatever).
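In C++, that watch mechanism could be sketched roughly like this. This is a minimal sketch, not the project's actual API; the class and method names (Memory, watch, WriteHandler) are all illustrative:

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical memory object: a flat 64 KiB array that invokes a
// callback whenever a write lands inside a watched address range.
class Memory {
public:
    using WriteHandler = std::function<void(uint16_t addr, uint8_t value)>;

    // Register an observer for the inclusive range [first, last].
    void watch(uint16_t first, uint16_t last, WriteHandler handler) {
        watches_.push_back({first, last, std::move(handler)});
    }

    void write(uint16_t addr, uint8_t value) {
        bytes_[addr] = value;
        for (const auto& w : watches_)
            if (addr >= w.first && addr <= w.last)
                w.handler(addr, value);   // notify the observer
    }

    uint8_t read(uint16_t addr) const { return bytes_[addr]; }

private:
    struct WatchRange { uint16_t first, last; WriteHandler handler; };
    uint8_t bytes_[0x10000] = {};
    std::vector<WatchRange> watches_;
};
```

A debugger or graphics display would subscribe with `watch(0x2000, 0x2007, ...)` and redraw only when that range is touched.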
In recent days, I've started moving this project to C++ (it was originally in C#) because of the optimizer. Now, if you are using the optimizing module, when a JSR is invoked it scans from the target address to the matching RTS and compiles that range into native code (in this case, Intel x86); each subsequent call just executes the native code instead. If a memory location in that range changes, the range is recompiled. If a branch jumps outside the range, the compiled code executes up to the branch and then the branch target runs emulated; if it returns into the range, it just keeps running emulated (I can't compile everything, and some things are potentially dynamic).
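The compile-and-cache part of that scheme might look like this in outline. This is a deliberately simplified sketch, not the real recompiler: the "compiled" routine is just a callable standing in for emitted x86 code, and all the names (BlockCache, CompiledBlock, enter, invalidate) are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

struct CompiledBlock {
    uint16_t first, last;          // address range the block covers
    std::function<void()> run;     // stand-in for emitted native code
};

class BlockCache {
public:
    // Called on JSR: reuse a cached block, or "compile" the range once.
    void enter(uint16_t target,
               std::function<CompiledBlock(uint16_t)> compile) {
        auto it = cache_.find(target);
        if (it == cache_.end())
            it = cache_.emplace(target, compile(target)).first;
        it->second.run();
    }

    // Called when memory changes: drop any block covering that address,
    // forcing a recompile on the next JSR into it (handles code that
    // modifies itself or its own instruction stream).
    void invalidate(uint16_t addr) {
        for (auto it = cache_.begin(); it != cache_.end();)
            if (addr >= it->second.first && addr <= it->second.last)
                it = cache_.erase(it);
            else
                ++it;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::unordered_map<uint16_t, CompiledBlock> cache_;
};
```

The real optimizer would also have to detect branches leaving the range and fall back to the emulated path, as described above.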
This optimization isn't so important for the 8-bit CPUs, but it becomes more important for the 16-bit version. I'm still trying to work out the timing in the CPU core, and even more so in the compiled code: it still has to know the clock cycles and keep timing for them. Sigh.
Anyway, that's the project. I'm not usually such a low-level programmer; usually I write business applications for a living. But this is fun.
Anyway, is anyone here interested in using it when it's ready? I'll have the first 65c02 CPU core ready within the month, and the simulator program comes after I implement the assembler. I've never written one, so this will be my first assembler.
Thanks,
Leabre
65c02 Emulator
Re: 65c02 Emulator
leabre wrote:
I have been working on a general-purpose 65xx simulator program these past few weeks. I want to share it with people, but everywhere I post it, it seems no one cares (except me).
I've been thinking of a kit project, and may be interested in writing a software simulator for it, so that I can develop the software for it before I actually start building the hardware.
--
Samuel A. Falvo II
Re: 65c02 Emulator
kc5tja wrote:
leabre wrote:
I have been working on a general-purpose 65xx simulator program these past few weeks. I want to share it with people, but everywhere I post it, it seems no one cares (except me).
I've been thinking of a kit project, and may be interested in writing a software simulator for it, so that I can develop the software for it before I actually start building the hardware.
--
Samuel A. Falvo II
The CPU core was originally written in C# with no platform/MS specific code (except 2 APIs to assist me with the high-resolution timing).
The CPU cores are being rewritten in C++ for performance reasons. Anyway, what I can do in C++ in about 500 lines of code, I can't do in C# in fewer than 2,900 lines, no matter how hard I try to do better.
My intention was to write the CPU core in C++ in MS VC++ and then provide a managed wrapper so my debugger can interact with it. My debugger, assembler, profiler, etc. are all written in C# (they aren't as performance-critical as the CPU core).
The only thing platform-specific about the CPU core is the optimizer that compiles the emulated instructions into x86 code to be executed natively under certain conditions. Doing this in mixed-mode managed/unmanaged code allows me to provide the initial CPU design and memory manager, and then drop in the improved memory manager that does the optimizations when it is ready, without having to rebuild or recompile any projects that depend on the CPU core.
I hadn't considered Linux because it's not simple enough for me (yet); I like things extraordinarily simple and straightforward. However, the source code to the CPU core will be provided. It shouldn't be any effort whatsoever to separate the unmanaged CPU core from the managed wrapper, as they are in separate files. However, I don't know if I'll be able to preserve the "completely-isolated-black-box" approach I'm taking for the CPU core if it is separated. I'm not a C++ programmer by trade, so I'm not exactly sure at the moment how to achieve the same concept of events and delegates that exists in C#.
I'm aware that a delegate is much like a glorified function pointer, and that an event is more akin to sending messages (in Windows), but I need it to be clean.
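One clean way to get C#-style events in portable C++ is a small callback-list class built on std::function: the delegate becomes the std::function, and the event becomes a list of them that only the owner may fire. This is a sketch under the assumption that the core only needs multicast notification; the Event name and its methods are not from the actual project:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Minimal C++ analogue of a C# event. subscribe() plays the role of
// C#'s "+=", and raise() is what the owning object calls to fire it.
template <typename... Args>
class Event {
public:
    using Handler = std::function<void(Args...)>;

    void subscribe(Handler h) { handlers_.push_back(std::move(h)); }

    void raise(Args... args) const {
        for (const auto& h : handlers_) h(args...);
    }

private:
    std::vector<Handler> handlers_;
};
```

A CPU core could then expose, say, `Event<uint16_t, uint8_t> on_memory_write;` and stay a self-contained black box, knowing nothing about its subscribers.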
Thanks,
Shawn
Re: 65c02 Emulator
leabre wrote:
The only thing platform-specific about the CPU core is the optimizer that compiles the emulated instructions into x86 code to be executed natively under certain conditions. Doing this in mixed-mode managed/unmanaged code allows me to provide the initial CPU design and memory manager, and then drop in the improved memory manager that does the optimizations when it is ready, without having to rebuild or recompile any projects that depend on the CPU core.
Quote:
I hadn't considered Linux because it's not simple enough for me
You might want to try programming for Qt or for wxWindows. Both are 'cross-platform,' in that the exact same C(++) API applies to both Windows and Unix/X11 platforms without application modification.
Quote:
However, I don't know if I'll be able to preserve the "completely-isolated-black-box" approach I'm taking for the CPU core if it is separated. I'm not a C++ programmer by trade, so I'm not exactly sure at the moment how to achieve the same concept of events and delegates that exists in C#.
Quote:
I'm aware that a delegate is much like a glorified function pointer, and that an event is more akin to sending messages (in Windows), but I need it to be clean.
Just curious, because if your code is portable enough, I can perhaps volunteer some amount of time to get it running under Linux.
--
Samuel A. Falvo II
Quote:
Delegation and aggregation can do anything and everything that inheritance can do. However, for such a limited scope application as this, why are you using delegation? Does C# not provide inheritance?
Just curious, because if your code is portable enough, I can perhaps volunteer some amount of time to get it running under Linux.
Concerning C++: I haven't run any benchmarks on that core yet, so I don't know how much overhead it adds.
Okay. I am intending to write the core in C++, separated from any specific technology; the cpu_65c02.cpp file is raw C++. Because it isn't easy to mix managed and unmanaged code in the way I desire, I have to aggregate. So I have another file, cpu_65c02_managed.cpp, that creates a managed object; since a managed class can't inherit from an unmanaged one, it has to hold a class member of type cpu_65c02 and aggregate the calls. But that adds overhead, because the .NET runtime has to marshal the pointers. I could interop into a DLL, but that adds roughly the same performance overhead (not as significant as the C# overhead).
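In plain C++ terms (leaving out the managed/unmanaged boundary, which standard C++ can't show), the aggregation pattern being described is roughly the following. The tiny cpu_65c02 here is a stand-in, not the real core, and the wrapper name is invented; in C++/CLI the wrapper would be the managed class holding a raw pointer to the native object, with each forwarding call crossing the marshaling boundary:

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for the raw core in cpu_65c02.cpp.
class cpu_65c02 {
public:
    // A real core would fetch the reset vector's contents from
    // $FFFC/$FFFD; here we just record the vector address itself.
    void reset() { pc_ = 0xFFFC; }
    uint16_t pc() const { return pc_; }
private:
    uint16_t pc_ = 0;
};

// The wrapper cannot inherit from the core, so it aggregates: it owns
// a core instance as a member and forwards every call to it.
class cpu_65c02_wrapper {
public:
    cpu_65c02_wrapper() : core_(std::make_unique<cpu_65c02>()) {}
    void reset() { core_->reset(); }             // forwarded call
    uint16_t pc() const { return core_->pc(); }  // forwarded call
private:
    std::unique_ptr<cpu_65c02> core_;            // aggregated, not inherited
};
```

Each forwarded call is where the marshaling overhead mentioned above comes in once the wrapper is managed.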
Personally, I think the CPU cores should be portable to any platform; it's just that, for my purposes, I'm abstracting the managed layer away for that use. I'm not going to tie it into the main CPU core, and I'm not going to use COM. COM is a mess.
As a Win32 assembly programmer, I could probably do the CPU in asm, but I want to keep it easier to maintain; if I need that kind of performance, I'll hand-optimize the code in question in asm as needed. However, the optimizer has to be pluggable into the CPU in some way, so that I or others can write other optimizers. Like I said, I'm not the best C++ programmer, so I'm trying to figure all this stuff out. In C#, no problem, it's easy.
I'm not against requiring the CPU core to be inherited, but in the C# world doing so means I'll need to override some methods, and that adds more overhead than I am comfortable with.
If you are truly willing to help, I can share my design notes with you and we can hammer this out together. If I can figure out how to install Red Hat 9 in my Virtual PC (I keep getting problems), then I'm more than willing to work it out. Just remember that, primarily, I'm a Windows developer with Windows-specific goals in mind, but I'm more than happy to make it portable in what ways I can. The best way is to design it that way and not have too much platform-specific code integrated into the cores or the memory manager. Currently, in my C++ design, the only platform-specific code I have is two APIs for the high-resolution frequency timer.
But I haven't figured out a good or successful way to throttle the CPU to 1/2 MHz yet. I also haven't found anyone willing to help me in that area (or even explain a technique to me), and the code of MAME, Nestopia, and others is too difficult for me to isolate the relevant parts or identify the general technique.
Thanks,
Leabre
Quote:
...But I haven't figured out a good or successful way to throttle the CPU to 1/2 MHz yet. I also haven't found anyone willing to help me in that area (or even explain a technique to me), and the code of MAME, Nestopia, and others is too difficult for me to isolate the relevant parts or identify the general technique.
Thanks,
Leabre
What I did in my sim was to keep track of the cycles used by each opcode and match that against the system timer. I used 1 ms time slices, so I could execute 1000 cycles' worth of opcodes; when I got there, I just waited for the next 1 ms time slice to start and then did another 1000 cycles' worth. This works fine for most applications (except sound). In my latest version I have included code that tries to throttle the instructions with a delay loop, so that the 1000 cycles complete as close as possible to the end of the 1 ms window. It auto-adjusts itself: if it's too slow, it shortens the delay loop, and if it's too fast, it lengthens it. That works better, but the moving average still causes some wavering in the sound generation.
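In portable C++ (rather than the Windows timer APIs discussed elsewhere in the thread, and using a sleep instead of a busy delay loop), the slice-based throttle could be sketched like this. `emulate_one_opcode` is a placeholder for the core's step function, which must return the cycle count of the opcode it just executed, and a 1 MHz target is an assumption:

```cpp
#include <chrono>
#include <thread>

// Run `slices` 1 ms time slices at roughly 1 MHz: execute 1000 cycles'
// worth of opcodes, then sleep until the next 1 ms boundary.
template <typename EmulateFn>
void run_throttled(EmulateFn emulate_one_opcode, int slices) {
    using clock = std::chrono::steady_clock;
    constexpr auto slice = std::chrono::milliseconds(1);
    constexpr long cycles_per_slice = 1000;   // 1000 cycles / ms = 1 MHz

    auto deadline = clock::now() + slice;
    for (int i = 0; i < slices; ++i) {
        long cycles = 0;
        while (cycles < cycles_per_slice)
            cycles += emulate_one_opcode();       // burn through this slice
        std::this_thread::sleep_until(deadline);  // wait out the remainder
        deadline += slice;                        // next absolute boundary
    }
}
```

Because the deadline is advanced by a fixed amount each slice rather than recomputed from "now", the time spent emulating inside a slice does not accumulate as drift.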
Now, honestly, I don't understand Windows programming at all; I read the previous posts and you guys are talking a foreign language. However, I did notice you are accessing a high-frequency timer. If it gives you 1 MHz resolution, you could use that as your time base for cycle counting.
If you want to see my source code, it's on my web site under the download tab.
Daryl
http://65c02.tripod.com/
8BIT,
Thank you very much; that sounds like a good idea. I can't guarantee a 1 MHz frequency on the timer for any machine except my own, and even there it is only accurate to 6 decimal places (which should be enough). But at least the way I'm doing it, the overhead of checking the frequency is something like 0.4 ms, so I would have to do a lot of "adjusting." Still, it's a good idea to frame it like that.
The other technique, which I'm trying to decipher from Nestopia (nestopia.sourceforge.net), is that they divide their time delays by the 60 Hz/50 Hz refresh rate of the display (NTSC/PAL) and throttle accordingly. Or perhaps they only allow so many cycles to transpire per frame and then update the screen once per frame interval. I'm not sure.
I'll look at your code and see what I can make of it.
Thanks,
Leabre
8BIT wrote:
leabre wrote:
I'll look at your code and see what I can make of it.
Daryl
Anyway, I'm thinking of porting this to the PlayStation, and I only have C compilers for the PlayStation, so your CPU core looks like a really good starting point. I'll still do it in C++ for non-PSX targets. I might implement a PPU, if not sound, so I can play my NES ROMs on the PlayStation; that sounds like fun to me. I think I've already realized C# isn't the tool for the job (I was hoping it might work out), in the same way we all know Java isn't the tool for the job, though that doesn't stop people from trying. I will still release my CPU core as C# source code for those who care/dare to see it.
Thanks,
Leabre
leabre wrote:
...This would have to be the easiest CPU core to follow that I've ever seen, and the cleanest, most straightforward implementation. ...Thanks, Leabre
Hope it helps!
Daryl
schidester
Quote:
I haven't figured out a good or successful way to throttle the CPU to 1/2 MHz yet
That sounds obvious, but there are two important things to consider.
First, you should use a blocking function to do the 1 ms wait. I forget which Windows API calls are available, but Visual Studio's online help will tell you the details. On Linux, some blocking wait calls are sleep(), nanosleep(), and pthread_cond_timedwait(). This is important because, during the wait, the computer is free to do other things.
Second, don't run your 1000 cycles and then "sleep" for 1 ms; your simulation will run slow, because it may take your computer 50 ms or so to run the 1000 cycles. Instead, use a call that waits until a specific time has arrived, such as 10:30:23.001 AM. On Linux, the pthread_cond_timedwait() function will wait until a specific time; in Windows, again, check the online help.
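In standard C++, the same absolute-deadline idea (what pthread_cond_timedwait's absolute timeout gives you on Linux) can be expressed with std::this_thread::sleep_until. This helper is purely illustrative:

```cpp
#include <chrono>
#include <thread>

// Sleep until an absolute deadline, then return the next one. The key
// point: the next deadline is computed from the previous deadline, not
// from "now", so however long the emulation work took inside the slice,
// the wakeup times stay locked to a fixed 1 ms grid and never drift.
std::chrono::steady_clock::time_point
next_deadline(std::chrono::steady_clock::time_point deadline,
              std::chrono::milliseconds slice) {
    std::this_thread::sleep_until(deadline);  // blocking: CPU free meanwhile
    return deadline + slice;
}
```

A sleep-for-1-ms call, by contrast, would add the work time to every slice, which is exactly the slowdown described above.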
Scott