Sheep64 on Mon 22 Feb 2021 wrote:
You might want to provide operand sizes which are binary and/or Fibonacci length. Specifically, 0, 1, 2, 3, 4, 5 and 8 byte.
Between jeffythedragonslayer's question about ADD/ADC, jeffythedragonslayer's flag mnemonic (which led to another discussion about 65816 REP/SEP) and this topic, I have been considering 24 bit addition and similar.
I've considered 8/16/32/64 prefix instructions which also allow access to additional registers. This could be implemented with one multiplexer selecting between a legacy ALU and a RISCy ALU. The legacy ALU allows arbitrary precision binary/decimal addition/subtraction using carry input. The legacy ALU also allows arbitrary precision binary/decimal increment. The RISCy ALU allows larger operations: binary only, no carry in. Hopefully, 8 bit decimal adjust and 64 bit addition have roughly equal latency, so that the major added latency comes from the one multiplexer. I hadn't previously considered that operations can be combined to handle unusual sizes. So, for example, prefixed ADC performs a 16/32/64 bit ADD and the carry out may be used with legacy 8 bit ADC. This is sufficient for 1, 2, 3, 4, 5 and 8 byte operands using only one or two addition operations. The rare cases of 6 or 7 byte operations are possible but require more instructions.
randyhyde may correctly think that I'm an idiot because 65000 already handles this case.
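To make the combination of a prefixed add and a legacy 8 bit ADC concrete, here is a rough Python model of the arithmetic only. The function names (add_no_carry_in, adc8, add_split) are made up for illustration and do not correspond to any real instruction set or to a finished design. It assumes the prefixed operation is a binary add with no carry in that still produces a carry out, as described above.

```python
def add_no_carry_in(a, b, bits):
    """Hypothetical model of the prefixed add: binary only, no carry input."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total >> bits  # (result, carry out)


def adc8(a, b, carry_in):
    """Hypothetical model of the legacy 8 bit ADC in binary mode."""
    total = (a & 0xFF) + (b & 0xFF) + (carry_in & 1)
    return total & 0xFF, total >> 8  # (result, carry out)


def add_split(a, b, wide_bits):
    """Add two (wide_bits + 8) bit values using one wide add and one 8 bit ADC."""
    mask = (1 << wide_bits) - 1
    low, carry = add_no_carry_in(a & mask, b & mask, wide_bits)
    high, carry = adc8(a >> wide_bits, b >> wide_bits, carry)
    return (high << wide_bits) | low, carry


# 3 byte addition: one 16 bit add followed by one 8 bit ADC.
result24, carry24 = add_split(0x00FFFF, 0x000001, 16)

# 5 byte addition: one 32 bit add followed by one 8 bit ADC.
result40, carry40 = add_split(0xFFFFFFFFFF, 0x0000000001, 32)
```

With wide_bits set to 16 this covers the 3 byte case and with 32 it covers the 5 byte case, each in two addition operations. A 6 or 7 byte operand would need a longer chain of adds, which matches the "more instructions" remark.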
Proxy on Tue 17 May 2022 wrote:
Usually RISC CPUs have 3 operands: 2 source and 1 destination. ... But honestly I don't fully know the consequences of choosing one over the other.
3-address register-to-register operations are not very useful if there are fewer than six symmetric registers. 2-address destructive operations can always be preceded with a register transfer, although this requires a faster system (and more energy) to achieve the same amount of useful work. If money and energy are not the limitation, 2-address instructions have the greatest density - and this magnifies the effect of instruction caching. That's why we see x86 servers with 768MB of third level cache and battery powered RISC with one tier of cache. Likewise, smaller 2-address instructions are preferable when issuing multiple instructions per clock cycle - although that's a quagmire of security problems.
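As a rough illustration of the register transfer point, the sketch below models one 3-address add and the equivalent sequence using only a 2-address destructive add. The register names and the run3/run2 helpers are hypothetical and only count instructions. They say nothing about encoding size or timing.

```python
def run3(regs, dst, a, b):
    """One 3-address instruction: dst = a + b. Returns (registers, instruction count)."""
    regs = dict(regs)
    regs[dst] = regs[a] + regs[b]
    return regs, 1


def run2(regs, dst, a, b):
    """Same result using a register transfer plus a 2-address destructive add."""
    regs = dict(regs)
    count = 0
    if dst == b:
        a, b = b, a  # addition commutes, so swap sources rather than clobber b
    if dst != a:
        regs[dst] = regs[a]  # MOV dst, a
        count += 1
    regs[dst] = regs[dst] + regs[b]  # ADD dst, b
    count += 1
    return regs, count


start = {"r1": 5, "r2": 7, "r3": 0}
three_regs, three_count = run3(start, "r3", "r1", "r2")
two_regs, two_count = run2(start, "r3", "r1", "r2")
inplace_regs, inplace_count = run2(start, "r1", "r1", "r2")
```

In this model the extra transfer is only needed when the destination differs from both sources. When the destination is already one of the sources, the 2-address form costs the same single instruction.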
It is possible to retrofit a 2-address register architecture with a 3-address prefix and this is especially useful if the implementation is already 3-address internally. However, if you're doing this, your architecture has probably "jumped the shark".
Overall, 2-address instructions are preferable for running legacy, single threaded binaries at maximum speed. 3-address instructions are best for MIPS per Watt.