Quote:
GCC backends can now be written to use a variable number of zero-page locations as 'soft-registers'
I'm not the person to ask about GCC, but I would comment that when someone says the 6502 has no registers, the answer is that basically all of zero page is processor registers. Every ZP location is accessible in 3 clocks (most of the 6502's contemporaries couldn't do anything at all in 3 clocks), and ZP is useful for all pointers and indirects. Forth makes very efficient use of ZP for the data stack, and of course page 1 for the return stack, almost as if the 6502 were specifically designed for it. Even multitasking is fairly simple in Forth on the 6502, although you would have to be pretty careful with stack space if you want to run more than about 4 tasks at once, say 6 or 8 or 10.
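To make the "pointers and indirects" point concrete, here is a minimal Python sketch of how a pair of zero-page bytes acts as a 16-bit pointer register for the (zp),Y addressing mode. It is only a model of the address arithmetic, not an emulator; the names mem, set_zp_pointer, lda_indirect_y and the location PTR = $20 are made up for illustration.

```python
# Toy model of the 6502's (zp),Y addressing. Not an emulator: it only
# shows how two zero-page bytes behave like a 16-bit pointer register.
mem = bytearray(0x10000)  # the full 64 KiB address space


def set_zp_pointer(zp, addr):
    """Store a 16-bit address in zero page, low byte first."""
    mem[zp] = addr & 0xFF
    mem[(zp + 1) & 0xFF] = addr >> 8  # the pointer's second byte stays in page 0


def lda_indirect_y(zp, y):
    """Return the byte that LDA (zp),Y would load into A."""
    lo = mem[zp]
    hi = mem[(zp + 1) & 0xFF]
    effective = (((hi << 8) | lo) + y) & 0xFFFF
    return mem[effective]


PTR = 0x20  # hypothetical zero-page location chosen for this example
set_zp_pointer(PTR, 0x3000)
mem[0x3000:0x3005] = b"HELLO"

# Walk the buffer by stepping Y, as a typical 6502 copy loop would.
text = bytes(lda_indirect_y(PTR, y) for y in range(5))
print(text)
```

Any of the 128 possible byte pairs in zero page can play this role, which is the sense in which ZP stands in for a large set of pointer registers.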
Related to its ZP performance, fetching an operand's low byte first was not just some irritating quirk, but part of the scheme to make the processor more efficient with bus cycles. It can correctly assume that it has to get a low byte anyway, so it fetches that byte while it is still figuring out whether it is supposed to get a high byte as well.
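The overlap can be sketched as a cycle-by-cycle list for LDA in zero-page and absolute modes. This is a simplified Python model under my own assumptions: the function name lda_bus_cycles and the cycle labels are invented for illustration, and only the totals (3 clocks for zero page, 4 for absolute) are the documented timings.

```python
def lda_bus_cycles(mode):
    """List the bus cycles of LDA in a simplified model (labels are mine)."""
    cycles = [
        "fetch opcode",
        # The byte after the opcode is read while the opcode is still being
        # decoded. Because the low byte comes first, this read is useful
        # whether the operand turns out to be one byte or two.
        "fetch operand low byte",
    ]
    if mode == "absolute":
        cycles.append("fetch operand high byte")
    elif mode != "zero page":
        raise ValueError("this sketch only models 'zero page' and 'absolute'")
    cycles.append("read data into A")
    return cycles


for mode in ("zero page", "absolute"):
    steps = lda_bus_cycles(mode)
    print(f"LDA {mode}: {len(steps)} clocks")
    for number, step in enumerate(steps, start=1):
        print(f"  {number}: {step}")
```

The first two cycles are identical in both modes, so nothing is wasted when the operand turns out to be a single zero-page byte.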
Since you're new to the forum, we have no idea how familiar you are with the 6502, but I had used it for a few years before someone pointed out to me what should have been obvious. It was a new way of thinking for me.
Welcome to the forum.