My original 65Org32 proposal (it would be good to review the topic) was that it would be like the '816 in that it has the (32-bit) offset registers and the extra instructions, but it would have no emulation mode, no page or bank boundaries, no address-bus multiplexing, and no 3- or 4-byte instructions. It would have the barrel shifter, and it ought to have a MULtiply instruction if not also a DIVide. I can't get excited about a 32-bit NMOS 6502, although I know that would be the easiest to do, just by extending the widths in the Verilog 6502 models. The other extreme is to go to a ton of registers, deep pipelining, branch prediction, onboard cache, etc., and end up with something that has little to no resemblance to the 6502 or '816, which is the way the 65GZ032 project went; and after a lot of progress and even some working hardware, it still fizzled out before it was done.
I can still see having direct-page, absolute, and "long" addressing though, all three, even though they all cover the same 4-gigaword address range; because sometimes you want the DP offset so you use DP addressing, sometimes the DBR or PBR offset so you use absolute addressing, and sometimes no offset so you use "long" addressing (although it's no longer than the others; it just ignores the offsets).
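To make the difference between the three modes concrete, here is a rough sketch in Python (used only for illustration). The function name, the mode strings, and the wraparound at 2^32 are my assumptions, not part of the proposal; only the data side (DP and DBR) is modeled, and PBR would play the same role for program addresses.

```python
MASK32 = 0xFFFFFFFF  # 32-bit address space (4 gigawords)

def effective_address(mode, operand, dp=0, dbr=0):
    """Return the 32-bit effective address for a data access.

    Hypothetical model: every mode takes a full 32-bit operand and
    the modes differ only in which offset register (if any) is added.
    """
    if mode == "dp":        # direct page: add the DP offset
        base = dp
    elif mode == "abs":     # absolute: add the data bank register offset
        base = dbr
    elif mode == "long":    # "long": ignore the offsets entirely
        base = 0
    else:
        raise ValueError(f"unknown addressing mode: {mode}")
    # Assumption: the sum wraps around at 2**32.
    return (base + operand) & MASK32
```

With DP = $00020000 and DBR = $00400000, the same operand $10 would land at $00020010 in DP addressing, $00400010 in absolute addressing, and $00000010 in "long" addressing.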
I personally don't expect to ever address more than 16 megawords of address space; but being able to handle 32 bits at a time is important in higher-level languages; and in Forth, any cell might be an address, data, an index value, etc. Although merging 8-bit op codes with 24-bit addresses would save some memory and bus cycles, I suspect it will increase internal complexity, perhaps reduce maximum clock speed, and sometimes even make for more-difficult programming.
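For reference, the merged format being weighed here might look something like the sketch below (Python again, only for illustration). The layout, with the op code in the top 8 bits and the address in the low 24, is purely my assumption, as are the function names; the point is just that the address tops out at 24 bits (16 megawords) and has to be split back out of the word before it can be used.

```python
def pack(opcode, addr24):
    """Merge an 8-bit op code and a 24-bit address into one 32-bit word.

    Assumed layout: op code in bits 31-24, address in bits 23-0.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("op code must fit in 8 bits")
    if not 0 <= addr24 <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits (16 megawords)")
    return (opcode << 24) | addr24

def unpack(word):
    """Split a 32-bit word back into (op code, 24-bit address)."""
    return (word >> 24) & 0xFF, word & 0xFFFFFF
```

For example, pack(0xAD, 0x123456) gives $AD123456 in a single word, where the unmerged format would use one word for the op code and a second for a full 32-bit address.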