65f02 wrote:
Probably a software emulation could sync its external bus cycles to Phi0 quite easily. As long as you operate from fast internal RAM, you don't care about Phi0/Phi2 anyway. When you encounter an address which requires an external bus cycle, you have all the time (and processing power) you need to wait for the next Phi0 clock edge and synchronize the bus cycle with that.
Yep, in fact there is enough time to put the CPU to sleep and wake it on the edge interrupt. This reduces power consumption, but the entire board only draws about 60mA max when never sleeping, so power consumption is not a concern.
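A rough sketch of that idea, in C for readability rather than the assembly actually used. Everything here is an assumption for illustration: the hook names, the choice of the rising Phi0 edge, and the busy-wait (a real build could sleep and wake on the edge interrupt instead). The two sim_ functions are desktop stand-ins so the sketch runs without hardware.

```c
#include <stdint.h>

/* Hypothetical platform hooks; on real hardware these would touch GPIO. */
typedef int     (*phi0_level_fn)(void);
typedef uint8_t (*bus_read_fn)(uint16_t addr);

static uint8_t fast_ram[0x10000];   /* internal copy of the 64K space */

/* Busy-wait for the next low-to-high Phi0 transition (assumed edge). */
static void wait_phi0_rising(phi0_level_fn phi0)
{
    while (phi0())  { }   /* if Phi0 is already high, wait for it to drop */
    while (!phi0()) { }   /* then wait for it to go high again */
}

/* Internal accesses return at once; external ones sync to Phi0 first. */
static uint8_t emu_read(uint16_t addr, int external,
                        phi0_level_fn phi0, bus_read_fn bus_read)
{
    if (!external)
        return fast_ram[addr];
    wait_phi0_rising(phi0);
    return bus_read(addr);
}

/* Desktop stand-ins: a fake Phi0 that toggles every 4 polls, fake bus. */
static unsigned sim_polls;
static int sim_phi0(void) { return (int)((sim_polls++ / 4u) & 1u); }
static uint8_t sim_bus_read(uint16_t addr) { return (uint8_t)(addr ^ 0x5A); }
```

Fast accesses never look at the clock at all, which is the point: only the external path pays for the wait.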
65f02 wrote:
Cycle-correct execution is an aspect I had been wondering about. Arlet's FPGA core does use the correct number of instruction cycles, which could be considered a "luxury" when operating in fast, internal mode. (Although the byte-wide memory organization inside the FPGA makes it difficult to go much faster anyway.) But for executing timed code, it is of course important to get the cycle count right. The Apple II, which is close to my heart, is probably one of the worst offenders there, with its low-level disk routines based on cycle counting... So I am glad that I got cycle-exact execution "for free" from Arlet!
I am working on the Apple II right now. I am aware of how critical the cycle timing is for the disk drive. I worked at Central Point Software, writing the last version of Copy II 64/128, but I also worked on the ROM code for the Laser 128 there as well as the Option Board project. I got quite a history lesson from Mike Brown (owner of CPS and author of Copy ][+) on the disk drives.
Mapping $C000-$CFFF as "slow" (cycle exact) seems to work fine for the stuff I have tried. I would like some flexibility here to add write-through caching, but that will require a CPU with more memory. Hence the reluctant desire to change CPUs. As a simple drop-in CPU replacement with some acceleration capability, it works fine right now. You can adjust the memory map for the device you are using it with, and it is what it is. I just want it faster.
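To illustrate the kind of adjustable memory map I mean, here is a minimal sketch in C. It is not the actual firmware (that is assembly), and the names and the one-entry-per-256-byte-page table are assumptions made up for the example. It marks $C000-$CFFF as slow and everything else as fast.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page classes; not taken from the actual firmware. */
enum page_class { PAGE_FAST = 0, PAGE_SLOW = 1 };

static uint8_t page_map[256];   /* one entry per 256-byte page */

/* Mark pages first..last (inclusive) with the given class. */
static void map_pages(uint8_t first, uint8_t last, enum page_class cls)
{
    /* unsigned counter so the loop terminates when last == 0xFF */
    for (unsigned p = first; p <= last; p++)
        page_map[p] = (uint8_t)cls;
}

/* Example Apple II style map: everything fast except $C000-$CFFF. */
static void init_apple2_map(void)
{
    map_pages(0x00, 0xFF, PAGE_FAST);
    map_pages(0xC0, 0xCF, PAGE_SLOW);
}

/* True if the address must be run as a cycle-exact external access. */
static bool is_slow(uint16_t addr)
{
    return page_map[addr >> 8] == PAGE_SLOW;
}
```

For a different host machine you would only change the map_pages calls, not the lookup.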
65f02 wrote:
From a "bang for the buck" perspective, the software emulators easily beat an FPGA implementation these days, given the amazing cost/performance ratio of the ARM Cortex cores. And they are probably also ahead regarding the achievable emulated clock rate, if you use a sufficiently powerful core (at the expense of somewhat higher supply current). If one wants/needs to stay very close to the original bus and instruction timing, FPGA implementations probably have an advantage.
I am not sure what ARM processors cost (I should probably look). I use PIC micros. I am currently using a $4.56 (@100pcs) dual-core 90/100 MIPS 16-bit PIC micro. One core is not used, but I plan to use it for diagnostics (stand-up arcade machines). I want to switch to the 200MHz PIC32, which has 512K of RAM available. This would allow switching ROMs on the fly, as well as having a large cache. I am not thrilled with MIPS assembly, and the CPU itself has some caching quirks you have to work around. For me, assembly is the only way to go, as I have total control of everything. There is no mystery about delays and states.