Emulating NES CPU and PPU on PIC32, too slow?
Hey everyone, long time!
I've been off programming in PIC32-land for a while now. Very much enjoying it! Here's my GitHub page if you are interested: https://github.com/stevenchadburrow/AcolyteHandPICd32
Anyways, I recently had a crazy idea to run a C-based NES emulator on this PIC32. Why use FPGA's when you got something so much cooler, right?
Wrong. I did manage to actually get it 'running' (I copied code from https://github.com/franzflasch/nes_emu and made the necessary modifications). But it's *magnitudes* slower than what I need. I did some cycle counts and it's literally hundreds if not thousands of PIC32 cycles per single CPU cycle plus its 3 PPU cycles. The NES CPU runs at 1.79 MHz and the PPU runs 3 times faster, so that's roughly 4 * 1.79 MHz = 7.16 MHz equivalent. My PIC32 is running at a nominal 100 MHz, let's say, thus 100 / 7.16 = 13.966, or about 14 PIC32 cycles per emulated NES cycle. [ This is all very rough math. ]
Thus to keep up the hardware emulation, my PIC32 would have to be doing all the work of the CPU and PPU in about 14 clock cycles. That's literally impossible.
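The rough budget above can be written out as a small calculation. This is only a sketch of the arithmetic: the function names are made up for illustration, and the clock values are the rounded figures from the post rather than exact NES timings.

```c
/* Host clock cycles available per emulated cycle, counting each CPU cycle
 * and each PPU cycle as one emulated cycle (the post's rough model). */
static double host_cycles_per_emulated_cycle(double host_hz, double cpu_hz,
                                             unsigned ppu_per_cpu)
{
    return host_hz / (cpu_hz * (1.0 + ppu_per_cpu));
}

/* Host clock cycles available for one CPU cycle together with all of
 * the PPU cycles that belong to it. */
static double host_cycles_per_cpu_cycle(double host_hz, double cpu_hz)
{
    return host_hz / cpu_hz;
}
```

With 100 MHz, 1.79 MHz and 3 PPU cycles per CPU cycle, the first function gives about 14 host cycles per emulated cycle. The second gives about 56 host cycles for one CPU cycle plus its three PPU cycles, which is the same budget seen from the CPU side.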
My question to you guys: Am I thinking about all of this correctly?
When I first saw it working at all, that was neat. But then it was SO SLOW, barely crawling. This morning I optimized some code and got it twice as fast as previously, but I think I need like 30+ times the speed to make it run smoothly. Thus *magnitudes*. Simple code optimization won't cut it. Yes?
Thanks for any insight.
Chad
- GARTHWILSON
Re: Emulating NES CPU and PPU on PIC32, too slow?
In another topic, someone said ten to one is approximately the best you can expect in having a more-powerful microcontroller emulate a 6502. So far, I'm not remembering good-enough search terms to find the topic.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: Emulating NES CPU and PPU on PIC32, too slow?
sburrow wrote:
Thus to keep up the hardware emulation, my PIC32 would have to be doing all the work of the CPU and PPU in about 14 clock cycles. That's literally impossible.
Though the pipeline is only two instructions deep on those two processors; it is effectively working on two instructions at the same time. If I'm not mistaken the 6502 has sort of a one instruction deep prefetch pipeline, but doesn't actually start decoding the instruction until it is done with the first.
(Maybe someone with more in depth knowledge of the 6502 silicon can chime in on that)
There might be some other tricks to be had (see below)
Quote:
I did manage to actually get it 'running' (I copied code from https://github.com/franzflasch/nes_emu and made the necessary modifications).
Many of the original NES/SNES emulators from way back in the day as I recall had to hand craft a lot of the emulation in assembly.
There is also the fact that the ARM is a 32-bit CPU and may have some instructions for doing some SIMD type operations, allowing it to compute several 8-bit numbers in one call. I'm not super familiar with all of the bits of the ARM instructions that would do that though, so I don't know if the M0+ or M23 have those types of instructions.
(much like RISC-V, ARM has sort of a smorgasbord of options where you can pick and choose which sets of instructions you need for an application. Doesn't look like the M0+ or M23 have many categories selected.)
If SIMD is possible, this would probably require some sort of an optimization path that could examine the 6502 instructions it was about to process and do some reordering/reworking so they could be run in parallel.
Doing SIMD type things might go double for the PPU, having the 32-bit registers handling 4x 8-bit pixel data (or whatever format the NES uses, I don't recall)
That being said, 100MHz might still be too slow for something like that; you'd certainly have your work cut out for you.
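Even without dedicated SIMD instructions, a 32-bit register can be treated as four 8-bit lanes in plain C. The sketch below adds four packed bytes at once while keeping carries from spilling into the neighbouring lane. The function name is hypothetical, and whether this helps on a given core (or fits the NES pixel format) is an assumption that would need measuring.

```c
#include <stdint.h>

/* Add four packed 8-bit lanes, each wrapping modulo 256, without letting
 * a carry cross from one lane into the next. */
static uint32_t swar_add_u8x4(uint32_t a, uint32_t b)
{
    /* Add the low 7 bits of every lane; the cleared top bits absorb carries. */
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    /* Fold the top bit of each lane back in with XOR (addition without carry-out). */
    return low7 ^ ((a ^ b) & 0x80808080u);
}
```

For example, adding the lanes 01, FF, 7F, 80 to 01, 01, 01, 80 gives 02, 00, 80, 00 in a single pass.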
Quote:
Why use FPGA's when you got something so much cooler, right?
Another thought, looking at some of the details you have listed on the GitHub for your Acolyte design, if you've left that many ports of the PIC open, could you interface to an actual 6502 and off load the instructions to there?
Or maybe use two boards? One could run the 6502 emulation the other could forward PPU calls to the other? (There is a reason why the NES had the PPU in a different chip after all)
Re: Emulating NES CPU and PPU on PIC32, too slow?
sburrow wrote:
Thus to keep up the hardware emulation, my PIC32 would have to be doing all the work of the CPU and PPU in about 14 clock cycles. That's literally impossible.
It sounds like you need to move to assembler rather than C though. Or another architecture, but I can see the advantage of a PIC32 device here.
I also know that a 1 GHz ARM device can emulate a 6502 at just under 300 MHz, which is pretty good, so there ought to be room for much improvement in the PIC32 (MIPS) environment at 100 MHz if you take the bold step of re-writing it all in assembler.
-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Re: Emulating NES CPU and PPU on PIC32, too slow?
The best results these days are from JIT approaches - PiTubeDirect has a very fast model which uses this approach. (Edit: but it uses a lot of RAM and runs on a Pi-sized ARM, rather more capable than your average microcontroller)
The extra difficulties with a console emulation are two-fold
- cycle accuracy will be very important
- you have at least three parallel processes to keep in sync: CPU, sound, video. Possibly more.
You will need extreme efficiency, which means you need a lot of ingenuity and diligence. It might be more than you can presently manage.
Last edited by BigEd on Wed Dec 04, 2024 11:04 am, edited 1 time in total.
Re: Emulating NES CPU and PPU on PIC32, too slow?
With modern architectures, where the fast-clocked CPU core is attached to fast cache backed by (relatively) slow memory, it's important to keep as much hot code and data in the caches as you can.
In the PIC32MZ, I'd suggest turning on microMIPS support (where many instructions use 16-bit opcodes rather than 32-bit opcodes) in order to make the best use of the instruction cache, and carefully select the sizes and locations of your data objects to try to keep all the emulated instruction fetch/execute loop data together in tightly packed cache lines.
Using 'C' should be fine for this... the compiler should do a great job at ordering instructions to avoid stalls, and for instruction cache reasons I'd suggest using -Os (optimize for size), rather than, say, -O3 (maaxxximmuuuum speeeeeed).
I'd try putting the whole emulator loop in a single function, to maximize the use of registers and try to restrict load/stores to emulated processor operations.
I think there's a lot of potential in the MIPS, but you have to be aware of the memory hierarchy and how performance collapses as you step down it.
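A minimal sketch of the "whole loop in a single function" idea, assuming a flat memory array and a made-up `cpu_state` struct: the hot state is copied into locals once, so the compiler is free to keep it in registers, and it is written back only on exit. Only four opcodes are handled and flag updates are left out to keep it short.

```c
#include <stdint.h>

/* Hypothetical CPU state; only the fields this sketch needs. */
typedef struct {
    uint16_t pc;
    uint8_t  a, x;
} cpu_state;

/* Run until `budget` emulated cycles are spent or an unhandled opcode is
 * reached. Returns the number of emulated cycles consumed. */
static int run_cpu(cpu_state *cpu, const uint8_t *mem, int budget)
{
    /* Local copies of the hot state, so it can live in host registers. */
    uint16_t pc = cpu->pc;
    uint8_t a = cpu->a, x = cpu->x;
    int cycles = 0;

    while (cycles < budget) {
        uint8_t op = mem[pc++];
        switch (op) {
        case 0xA9: a = mem[pc++]; cycles += 2; break;  /* LDA #imm */
        case 0xA2: x = mem[pc++]; cycles += 2; break;  /* LDX #imm */
        case 0xE8: x++;           cycles += 2; break;  /* INX */
        case 0xEA:                cycles += 2; break;  /* NOP */
        default:   pc--; goto done;  /* not handled in this sketch */
        }
    }
done:
    /* Write the state back once, on the way out. */
    cpu->pc = pc;
    cpu->a = a;
    cpu->x = x;
    return cycles;
}
```

Loads and stores then happen mostly for emulated memory accesses, not for the emulator's own bookkeeping.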
Re: Emulating NES CPU and PPU on PIC32, too slow?
How are you building it? I don't see a makefile or any build instructions in there (I also can't find the CPU simulation in your repository, so this doesn't look like something that other people can download and try, unless I'm missing something obvious).
The compiler should do a decent job of making a fast executable, but you will need to tell it to optimise. Try the various options, including -Os to optimise for size (that can sometimes give better performing code than the options that are actually aiming for performance, because caches are so important).
Other things you can try:
Force inlining of the flag-manipulation functions.
Rearrange the code to keep the most commonly used instructions and addressing modes together. It doesn't matter if EOR (zp, X) gives a cache miss, because no one ever uses it.
It has been more than 20 years since I last used gcc on MIPS, and the version we used then was already old. But enabling strict aliasing may or may not help.
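As an illustration of forcing the flag-manipulation helpers inline, here is a sketch using the GCC `always_inline` attribute (the PIC32 toolchain is GCC-based, but check that your version accepts it). The helper name `set_nz` is hypothetical; the bit positions are the standard 6502 Z and N flags.

```c
#include <stdint.h>

#define FLAG_Z 0x02u  /* 6502 zero flag, bit 1 of P */
#define FLAG_N 0x80u  /* 6502 negative flag, bit 7 of P */

/* Update Z and N in the status byte from a result value. The attribute
 * asks GCC to inline this even when it would otherwise decline to. */
static inline __attribute__((always_inline))
uint8_t set_nz(uint8_t p, uint8_t value)
{
    p &= (uint8_t)~(FLAG_Z | FLAG_N);  /* clear both flags */
    if (value == 0)
        p |= FLAG_Z;                   /* result was zero */
    p |= value & FLAG_N;               /* copy bit 7 into N */
    return p;
}
```

Nearly every load, transfer and ALU instruction ends with this update, so avoiding a call and return here removes overhead from the hottest path.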
Re: Emulating NES CPU and PPU on PIC32, too slow?
GARTHWILSON wrote:
someone said ten to one is approximately the best you can expect
Yuri wrote:
allowing it to compute several 8-bit numbers in one call
Yuri wrote:
could you interface to an actual 6502 and off load the instructions to there
drogon wrote:
the bold step of re-writing it all in assembler.
BigEd wrote:
You will need extreme efficiency, which means you need a lot of ingenuity and diligence. It might be more than you presently manage to do.
sark02 wrote:
I'd suggest turning on microMIPS support
John West wrote:
I also can't find the CPU simulation in your repository
****
So, since a lot of y'all were mentioning 6502 optimizations, here's my thoughts on how to do that. We're looking at a lot of hex code, and the next one is an instruction.
$A9 = LDA#
I take that value, bit shift it, and add something like $9D010000 to it, so that it's located in ROM, then jump to that location.
asm("jal $9D010A90");
At that memory location we have instructions specific for that code.
Code:
void __attribute__((address(0x9D010A90))) inst_a9() {
// grab next byte
// store in virtual accumulator
// increment cycle counter
// return by going back to the loop
asm("jal $9D008000");
}
But when I was looking at cycle counts earlier, the 6502 emulation did not use nearly as much time as the PPU did. And, looking at the PPU code (in C), there is indeed a lot more going on. It is not just "find it, do it, move on". There's a lot of background stuff going on as well.
Just had a thought, though; not sure how familiar y'all are with the NES PPU. What if I were to only do work on it when something changes? As in, the CPU accesses the PPU, and *that* is when I actually update registers, change things around, etc. Obviously the game code is not accessing the PPU all the time, so perhaps I could store everything in a 'state' in RAM, and then change it when the CPU asks for a change. Still, if a particular game accesses the PPU more often, it could slow everything down a lot.
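One common way to structure that idea is a "catch-up" scheme: the PPU records how far it has been simulated, and it is only brought up to date when the CPU touches one of its registers. The sketch below is hedged accordingly. The struct, the function names and the placeholder `ppu_run` are all made up for illustration, and the 3:1 ratio is the one quoted earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical PPU state: remembers how far it has been simulated. */
typedef struct {
    uint64_t ppu_cycle;  /* PPU cycles simulated so far */
    uint8_t  regs[8];    /* the eight CPU-visible PPU registers */
} ppu_state;

/* Placeholder for the real PPU work, done in one batch. */
static void ppu_run(ppu_state *ppu, uint64_t cycles)
{
    ppu->ppu_cycle += cycles;
}

/* Bring the PPU up to the CPU's current time (3 PPU cycles per CPU cycle). */
static void ppu_catch_up(ppu_state *ppu, uint64_t cpu_cycle)
{
    uint64_t target = cpu_cycle * 3;
    if (target > ppu->ppu_cycle)
        ppu_run(ppu, target - ppu->ppu_cycle);
}

/* A CPU write to a PPU register: settle everything first, then apply it. */
static void ppu_write(ppu_state *ppu, uint64_t cpu_cycle,
                      uint16_t addr, uint8_t value)
{
    ppu_catch_up(ppu, cpu_cycle);
    ppu->regs[addr & 7] = value;  /* the registers repeat every 8 bytes */
}
```

Things the PPU does on its own schedule, such as raising the vblank NMI or setting the sprite 0 hit flag, still need a catch-up at the right moment, so this reduces the per-cycle work but does not remove the need for accurate timing.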
In the end, like you all said, it would require a ton of optimizations, assembly code, etc. And even then, it might not be enough with that PPU emulation. I'll be thinking it over some more, but thank you all for the advice. I appreciate all of the responses!
Chad
Re: Emulating NES CPU and PPU on PIC32, too slow?
GARTHWILSON wrote:
In another topic, someone said ten to one is approximately the best you can expect in having a more-powerful microcontroller emulate a 6502. So far, I'm not remembering good-enough search terms to find the topic.
- 65C02 Simulator on a bare metal Raspberry PI (2018)
- Choosing a microcontroller (2017)
- Embedded emulation on PIC, AVR/Arduino, ARM etc (2016)
- PICO-56: 65C02/TMS9918A/AY-3-8910 on a Pi Pico (2023)
- Write a NES emulator in 15mins and 1000 lines [Video] (2012)
- Raspberry Pi Pico 6502 emulator (2021)
Last edited by BigEd on Wed Dec 04, 2024 11:45 am, edited 1 time in total.
Re: Emulating NES CPU and PPU on PIC32, too slow?
sburrow wrote:
So, since a lot of y'all were mentioning 6502 optimizations, here's my thoughts on how to do that. We're looking at a lot of hex code, and the next one is an instruction.
$A9 = LDA#
I take that value, bit shift it, and add something like $9D010000 to it, so that its located in ROM, then jump to that location.
asm("jal $9D010A90");
At that memory location we have instructions specific for that code.
My own "pet" bytecode system is the bytecode produced by the BCPL compiler (Cintcode). You can treat 6502 (or 65C02) as a bytecode too as each opcode is one byte long.
Form a table of addresses of the handler for each instruction (you only have 256 instructions, so 256*4 = 1024 bytes of lookup table) then you need a fetch, multiply by 4, lookup and jump sequence for each opcode.
This is where doing it all in assembler can be advantageous - you keep most data in CPU registers.
The ARM (ARM32) has an opcode sequence that could have been purpose designed for executing bytecodes:
Code:
ldrb r0,[regPC],#1 @ Fetch byte at PC, Increment PC
ldr pc,[ptrJ, r0, lsl #2] @ Take byte in r0, << 2, add to ptrJ
                          @ fetch that word, transfer into PC.
So that's just 2 instructions to fetch, increment the PC and jump to the handling code.
I've really no idea about MIPS so don't know how this would translate. The best I came up with in my RISC-V implementation is 6 instructions.
What I also do is in-line this code at the end of each opcode handler. So each handler is 2 more instructions (so the whole is 256 * 2 * 4 = 2048 bytes of 'overhead' in ARM). There is no "JSR", no stacks to handle - the whole thing just looks like one big program. (And in fact the 65816 version of this bytecode interpreter doesn't use any native stack at all)
However it all depends on jump efficiency. My ARM implementation is some 10KB which fits entirely in the instruction cache on the ARM CPU I'm using it on. (Of-course everything else is data)
Modern C compilers can turn switch statements into something quite efficient too, so a big 256 entry switch statement can make it all work - often better than building up a table of functions to call from C as that will (may) involve stack call/return shenanigans. If I had to do it in C and wanted it fast I would not be afraid to experiment with GOTO rather than have the switch in a loop to see if that was any faster.
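The GOTO experiment can be sketched in C with the GCC "labels as values" extension (`&&label` and `goto *ptr`), which is not standard C. This is an assumption-laden illustration rather than anyone's actual emulator: the function name and the three-opcode subset are made up, flags are ignored, and the table is filled on first use only to keep the sketch short. The point is that each handler ends with its own fetch-lookup-jump, like the in-lined dispatch described above.

```c
#include <stdint.h>

/* Fetch the next opcode and jump straight to its handler. */
#define DISPATCH() goto *table[mem[pc++]]

/* Hypothetical mini-interpreter: handles LDA #imm, INX and NOP, and stops
 * on any other opcode. Returns the emulated cycles consumed. */
static int run_threaded(const uint8_t *mem, uint16_t pc,
                        uint8_t *a_out, uint8_t *x_out)
{
    static void *table[256];   /* one handler address per opcode */
    static int table_ready;
    uint8_t a = 0, x = 0;
    int cycles = 0;

    if (!table_ready) {
        for (int i = 0; i < 256; i++)
            table[i] = &&op_stop;      /* default: leave the interpreter */
        table[0xA9] = &&op_lda_imm;
        table[0xE8] = &&op_inx;
        table[0xEA] = &&op_nop;
        table_ready = 1;
    }

    DISPATCH();

op_lda_imm:
    a = mem[pc++]; cycles += 2; DISPATCH();
op_inx:
    x++;           cycles += 2; DISPATCH();
op_nop:
    cycles += 2;   DISPATCH();
op_stop:
    *a_out = a;
    *x_out = x;
    return cycles;
}
```

Whether this beats a plain switch depends on the compiler and the core, so it is worth timing both.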
Of-course that's just the 6502. Alternating between 6502 and PPU instructions is more a challenge. Are there any multi-core PIC32 MCUs?
-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Re: Emulating NES CPU and PPU on PIC32, too slow?
What I'm getting from y'all (overall) is that "yes, it is theoretically possible". Seeing the Pi Pico do that makes me think that, yes, it is possible, but only just possible.
Can I muse with you for a minute? Say I start re-programming my own emulator (as I already am in my brain), 6502 CPU and PPU alike from the ground up. And I absolutely maximize the performance, however I can. What if, what if it still doesn't work? What if it is still too slow? What if I had to make too many shortcuts to where it wasn't feasible past a single game or two? When do I gamble and start a huge project, all with the possibility of failure not by software but by hardware limitations?
At what point do I take that gamble?
I came here sure that y'all would say, "Chad, you're crazy, there's no way it's fast enough for that!" But I got an entirely different response. So, where do I go from here? Do I say, "Eh, the probability of failure is too high, I give up now." or "Surely there is a way!" I didn't come this far to 'give up', but a wise man should also see real limitations.
Right?
Thank you all, for the encouragement and wisdom.
Chad
Re: Emulating NES CPU and PPU on PIC32, too slow?
I think you should persevere, at least in some way, towards
- a working 6502 system emulation
- a performant 6502 system emulation
- a sophisticated 6502 system emulation (adding sound, video, maybe sprites)
- an accurate 6502 system emulation.
Which is to say, NES is very ambitious, even for a desktop application, and if you haven't previously worked on and studied emulator tactics and organisations, a good way to proceed is to start simple.
Re: Emulating NES CPU and PPU on PIC32, too slow?
Many moons ago I wrote an Apple II emulator for the PalmPilot PDA. Original site: https://palmapple.sourceforge.net/ , archived site with source: https://github.com/dschmenk/Appalm/tree/master. Basically a 68000 with an 8 bit data bus. At 16 MHz. So many concessions had to be made to get anywhere near emulating a 1 MHz 6502. I (re)wrote the 6502 emulation in 68000 assembly, keeping most state in registers. Handling video was a break from traditional emulation - instead of generating cycle accurate video in lock step with CPU emulation, I left the video generation up to a timed framerate. This would break certain advanced techniques, but the vast majority of games worked fine, albeit a little jerky on slow PalmPilots. Faster PalmPilots came on the market and the emulator ran quite nicely and the video framerate could be increased to look much smoother. I also didn't start this project from scratch - it had already been written, in C, based on an emulator for the PC. It sort-of worked, but took about an hour to boot to BASIC. I just went in and replaced components piecemeal. By the time I was done, only a couple of routines for floppy disk emulation remained from the original code base.
It looks like you are in a similar position. You have a working emulator, you just need to get in there and start understanding what your strategy should be. I have no idea what the PPU is, but does it really need to be emulated at 7.16 MHz or can you record the state changes and do much of it all at once? Writing a 6502 emulator in MIPS assembly shouldn't be too taxing if you just can't get the C compiler to cooperate (I also wrote a lot of MIPS assembly many, many moons ago). So go for it - if anything you'll learn much about emulating a system and tons about the PIC32.
Re: Emulating NES CPU and PPU on PIC32, too slow?
Well, an update. But not what you expected!
I tried some minor modifications to the NES Emulator code to see if it would help, to check if I could find a starting place to optimize. For example, not rendering the screen with the PPU on every frame, but perhaps every other frame. I'm talking, have the PPU *not run at all* for a single frame (or more) and then draw it all at once whenever I wanted. Big failure: there are so many intricacies with interrupts and Sprite 0 collisions that it just wasn't working except when I would have it run as expected: 1 CPU cycle, 3 PPU cycles, draw pixels as you go.
Frustrated that I couldn't find much of a starting point for the PPU (as the 6502 CPU wasn't really a concern for me anyways), I started thinking of other options. And another option came up. Why not make a Gameboy Emulator? I found a very clean single-file set of code here: https://github.com/deltabeard/Peanut-GB, and then modified the code from SDL into my own procedures. I had done this with the NES Emulator, so it wasn't much of a problem.
After a day or so finding issues more with Elm-Chan's FatFS functions (or perhaps my logic in using them in weird particular ways?), I got it to work. AND, it runs at a very nice speed without any modification. It might even need a small delay, but running it straight in C seems to have it run at basically full speed. Cool!
You might consider this move a 'lazy' one. *shrug* I'm not really dying to get the NES-specifically working here, but I do want a large software library to play with. Having the Gameboy (and soon the Gameboy Color), I would have a very large library and for very little effort. More time for me, my hobbies, my wife and three kids, and even my job (in that order?!).
Anyways, that's where the story ends here, because I'm no longer in 6502-land again. I have many more plans for this, additional hardware updates, and other neat stuff, but again, that's not for this forum. Check my GitHub (link in first post) for updates when I'm happy to post them.
Thank you everyone! I appreciate the wisdom, encouragement, and support.
Chad
Re: Emulating NES CPU and PPU on PIC32, too slow?
Hey everyone!
I'm posting on this topic again because... I've been working on my very own NES emulator! Over the past couple of days I've been coding the 6502 instruction set in C, using a lot of pre-compiler macros. Did you know that STA (d),y uses the indirect addressing to *store* a value, not read it?! Hm! That was 4 hours of debugging today.
[ It was actually a very simple oversight, since all the other uses of that type of addressing read instead of write. I was just carelessly copying code, it seems... ]
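The read-versus-write distinction can be made explicit by separating the address calculation from the access. A minimal sketch, assuming a flat 64 KB memory array (the real NES memory map with mirroring and I/O registers is ignored here) and hypothetical function names:

```c
#include <stdint.h>

/* Effective address for (zp),Y: read a 16-bit pointer from zero page,
 * then add Y to it. */
static uint16_t addr_indirect_y(const uint8_t *mem, uint8_t zp, uint8_t y)
{
    uint16_t lo = mem[zp];
    uint16_t hi = mem[(uint8_t)(zp + 1)];  /* pointer fetch wraps in zero page */
    return (uint16_t)(((hi << 8) | lo) + y);
}

/* LDA (zp),Y reads from the effective address... */
static uint8_t lda_indirect_y(const uint8_t *mem, uint8_t zp, uint8_t y)
{
    return mem[addr_indirect_y(mem, zp, y)];
}

/* ...while STA (zp),Y writes the accumulator to that same address. */
static void sta_indirect_y(uint8_t *mem, uint8_t zp, uint8_t y, uint8_t a)
{
    mem[addr_indirect_y(mem, zp, y)] = a;
}
```

Sharing one address helper between the load and the store means only the final access differs, which makes this kind of copy-and-paste slip harder to make.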
Anyways, see attached pictures. I really don't have any PPU set up right now, what I'm doing here is 'faking it', but it seems to work none-the-less. I haven't yet implemented sprites, palettes, audio, etc. I just barely have nametables, interrupts, and... that's about it.
But here's the thing: It is running at nearly the exact speed it needs to run, but I would expect it to be running much faster without all the fancy stuff I haven't yet added to it. On top of that, it is only drawing this in a tiny corner of screen. Read above why I stopped this last time: This PIC32, even at 200 MHz now, is just running too slow. It seems like I'm going to run into a similar problem again!
I'll be playing with it and will update you when I have something good (or bad) to report. Thanks everyone!
Chad