Quote:
Does the NES allow you to wedge into the BRK vector?
Of course. As on all original 6502 systems, BRK goes through the IRQ vector (and the IRQ is not often used on the NES anyway; NMIs are typically used instead).
Quote:
I am a bit skeptical about the claims of stack machine density
You are probably correct to be sceptical. I guess it's like (normal) compression: you can't tell whether LZS or Huffman will work better before trying both and drawing your conclusions. That's why I wrote a tool that compresses using all the (simple) algorithms that came to my mind, and writes the results for each one of them ^^
So if I really wanted to know I'd have to implement both a stack virtual machine and register virtual machine, convert the (non-critical) 6502 code to both of them, and count bytes. That's the only way to have a definitive answer as to which one is best.
Also, I think most of the claims about register machines being less dense come from the fact that they use 2- or 3-operand instructions, as in
ADD R1, R2, R3 (ARM or MIPS style)
or
ADD R1, R2 (AVR or Thumb style)
But what SWEET16 does (and what I could very well copy) is use an accumulator-register design with only 1 operand, so that instructions can be coded on a single byte, like:
ADD R1 (add R1 to R0 which happens to be A).
This also means a lot of additional register copying, so even if most opcodes can be coded on 1 byte instead of 2, it takes more instructions. However, I still believe it's a win.
A big problem with the register machine is that, after a call, you don't know what to expect in your regs. Even if it was a call to 6502 code, that code could itself have called other VM code and overwritten the regs. Using a sliding register window like AcheronVM solves this, but it is a source of bloat in RAM usage and in the VM itself.
Quote:
It doesn't sound like Sweet16 is powerful enough for what you want to do, and it is hard to modify, so creating custom operations seems to make the most sense.
It looks Turing-complete to me, but the lack of shifts and logical operations is a huge miss. I think chains of ASLs and LSRs make up an enormous amount of my source code ^^
Quote:
You can of course pass parameters on the page-1 hardware stack (as has been done a lot) but there are certain problems with it that are solved by having a data stack in ZP (indexed by X) that's separate from the return stack in page 1. Take for example the matter of using the page-1 hardware stack for passing parameters to a subroutine, but then that routine passes parameters to another, and now there's a subroutine-return address on top putting the target data farther down the stack than expected, so you get incorrect results.
You don't get incorrect results. The arguments are always pushed before the return address, and when returning a value a different return instruction is used: the value on top of the stack is popped and remembered, then the return address is popped and stored, and the remembered value is pushed again. A bit of overhead, sure, but a secondary ZP stack is overhead too.
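Here is a minimal sketch of that "return with value" sequence, modelled in Python with a list as the stack. The names (`call`, `ret_value`) are hypothetical, the return address is treated as one stack item although it takes 2 bytes on a real 6502, and who removes the arguments afterwards is left out.

```python
def call(stack, return_addr):
    """Model a call: the return address lands on top of the arguments."""
    stack.append(return_addr)

def ret_value(stack):
    """Model the value-returning return instruction described above."""
    value = stack.pop()        # result left on top by the callee
    return_addr = stack.pop()  # return address sitting underneath it
    stack.append(value)        # result goes back on top for the caller
    return return_addr

stack = [10, 20]               # caller pushes two arguments
call(stack, 0x8123)            # arbitrary example return address

# Callee: the arguments sit just below the return address.
result = stack[-2] + stack[-3]
stack.append(result)

addr = ret_value(stack)
print(hex(addr), stack)
```

After the return the caller finds the result on top of the stack and execution resumes at the saved address, so the return address never gets mistaken for data.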
Addressing the page 1 stack is no worse than page 0 in terms of bytes (remember, wherever I care about speed I won't even use the VM in the first place; what I want is to make the VM as tiny as possible). Addressing elements deeper in the stack is worse (you need an initial TSX, and then 3 bytes each time), but accessing the top element is best, as it takes a single byte (PHA/PLA), and that's what is done most of the time. Only a few instructions access non-top-of-stack data. This also has the advantage of leaving both X and Y free for misc. use within the virtual machine when data deep in the stack doesn't have to be accessed.
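As a quick sanity check of those byte counts, here is a small tally in Python. The instruction sizes are the standard 6502 ones; the helper names are made up, and the ZP case assumes X already holds the data stack pointer.

```python
# Standard 6502 instruction sizes in bytes.
SIZE = {"PHA": 1, "PLA": 1, "TSX": 1, "LDA abs,X": 3, "LDA zp,X": 2}

def page1_deep_reads(n):
    """Bytes to read n non-top items from the page 1 stack: one TSX, then LDA $01xx,X each."""
    return SIZE["TSX"] + n * SIZE["LDA abs,X"] if n else 0

def zp_deep_reads(n):
    """Bytes to read n items from a ZP stack indexed by X: LDA zp,X each."""
    return n * SIZE["LDA zp,X"]

for n in (1, 2, 3):
    print(n, page1_deep_reads(n), zp_deep_reads(n))
print("top of page 1 stack:", SIZE["PLA"], "byte")
```

So deep accesses cost more on page 1, while the common top-of-stack case is a single byte.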
However, yes, multiple call/return versions have to be handled, which is a possible source of bloat for the VM itself. Perhaps I could just alternate values for the S register, and have the best of both worlds?
I think "stack machine" is a horrible name for a machine with 2 stacks. Shouldn't it be called a "stacks machine", then? One more reason not to like them: you can get cheated on the number of stacks the machine uses.