Instruction set Design - Reasoning?

BigEd · Post by **BigEd** » Wed Apr 06, 2016 10:27 am

Would there be any harm in always setting the V flag for a ROL?
(An earlier query asked if C should be affected by INX, DEX and so on. The answer, I think, is no, because X or Y could be used to control a loop, and preserving C from one iteration to the next could be important. I'm not so sure that preserving V is important.)

barrym95838 · Post by **barrym95838** » Wed Apr 06, 2016 3:11 pm

If by "setting" you mean "updating", then I'm on board with you. My 65m32 allows at least four different ways to do an INX (each a single machine instruction), and each of them has different flag effects, allowing the careful programmer some flexibility in that area.

Mike B.

BigEd · Post by **BigEd** » Wed Apr 06, 2016 3:40 pm

[Updating is a much better choice of word!]

kc5tja · Post by **kc5tja** » Thu Jun 09, 2016 1:01 am

This is a non sequitur, I know, but I thought I'd contribute for informational purposes.

BigEd wrote:

MichaelM wrote:

Similarly, the RISC-like 16-bit Inmos Transputer did not have a carry flag (or processor status register), and doing higher precision (32-bit or more) arithmetic was a bit difficult

Well, well - you are quite right. I hadn't realised that. The same is true of the modern RISC-V architecture, which comes in 32-bit and 64-bit flavours - multi-word arithmetic being less crucial on a wider machine. On an 8-bit machine it's essential.

(It turns out that the high level language Occam for the Transputer provided functions for wide arithmetic.)

Multi-precision arithmetic is still possible without an explicit carry bit, and the overhead is acceptable (remember, these instructions are all single-cycle on an in-order, single-pipeline CPU, and faster on wider-dispatch architectures):

    add  x1, x2, x3      # Add low half
    add  x4, x5, x6      # Add high half
    sltu x7, x1, x2      # X7 = 1 iff X1 < X2 (unsigned), else 0 (IOW, the carry bit)
    add  x4, x4, x7      # Accumulate the carry.
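For readers who prefer C, here is a minimal sketch of the same idea as the assembly sequence above: the carry out of the low half is recovered with an unsigned comparison rather than read from a flag. The `u64_pair` type and the `add_pair` function are hypothetical names made up for this illustration, and the 32-bit word size is an assumption.

```c
#include <stdint.h>

/* Hypothetical two-word unsigned value: low and high 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Add two two-word values without any carry flag. */
static u64_pair add_pair(u64_pair a, u64_pair b)
{
    u64_pair sum;
    uint32_t carry;

    sum.lo = a.lo + b.lo;          /* unsigned add wraps modulo 2^32 */
    carry  = (sum.lo < a.lo);      /* 1 iff the low add wrapped, else 0 */
    sum.hi = a.hi + b.hi + carry;  /* fold the carry into the high half */
    return sum;
}
```

For example, adding {lo = 0xFFFFFFFF, hi = 1} and {lo = 1, hi = 2} gives {lo = 0, hi = 4}. The comparison has to be unsigned; a signed compare would report the wrong carry whenever the top bit of either operand is set.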

The reason RISC-V lacks any CPU flags whatsoever is that, frankly, flags are complicated in anything that isn't a single-issue, in-order CPU. They require every instruction to have an implicit register dependency (your flags), which takes a fair amount of logic to work around to get high performance.

I think one of the reasons Intel CPUs beat the Motorola 68K series once superscalar architectures became a thing is that, if you look at x86 instruction semantics, not all instructions touch flags like Motorola's did. The POWER architecture works around this by having a "flags queue", where each instruction that is allowed to update flags will set CC0, thus bumping its former contents to CC1, CC2, etc. I think there are 8 CCs on a 32-bit architecture. If you need to branch on a CC, you must explicitly specify which CC you're interested in testing. (If you've read about the Mill CPU, this should sound familiar; the Mill does the same sort of thing to all intermediate computations, not just the flags.)

BigEd · Post by **BigEd** » Thu Jun 09, 2016 9:18 am

Yikes, I didn't know POWER had that explicit queue of conditions!

John West · Post by **John West** » Thu Jun 09, 2016 5:15 pm

Don't be alarmed! POWER's CR register was not a queue. Integer arithmetic operations that modified flags (most didn't) wrote to CR0. Floating point wrote to CR1. Comparison instructions had a field saying which set of flags to write to, and the branch unit instructions (which included logical operations between flags) could work on any of them. Once written, flags stayed put.

Higher CR fields didn't get used much in practice, at least not in the code I saw. They would have been useful for complicated boolean expressions, but when everyone writes in C or C++ and uses the short-circuiting && and || operators, you never get the opportunity.

What stopped it being a bottleneck is that you never wrote just a single flag. All bits in your destination field got written at once. Treat CR as 8 independent 4-bit registers, and that makes it possible to use register renaming. The 6502's style, with every instruction writing to a different subset of bits, would still be possible but more of a headache.
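The "8 independent 4-bit registers" view can be sketched in C as a toy model. This is an illustration only, not how any real implementation is built: `cond_reg`, `cmp_to_field` and `test_field` are hypothetical names, and the SO bit is simply left clear here rather than copied from anywhere.

```c
#include <stdint.h>

/* The four bits of one condition field: less-than, greater-than,
   equal, summary overflow. The numeric values are arbitrary for this sketch. */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* Hypothetical model: the condition register as 8 separate 4-bit fields. */
typedef struct {
    uint8_t field[8];
} cond_reg;

/* Signed compare that overwrites all four bits of field n at once. */
static void cmp_to_field(cond_reg *cr, unsigned n, int32_t a, int32_t b)
{
    uint8_t bits;

    if (a < b)
        bits = CR_LT;
    else if (a > b)
        bits = CR_GT;
    else
        bits = CR_EQ;

    cr->field[n & 7u] = bits;  /* whole-field write; SO left clear (simplification) */
}

/* Branch-style test: is the given bit set in field n? */
static int test_field(const cond_reg *cr, unsigned n, uint8_t mask)
{
    return (cr->field[n & 7u] & mask) != 0;
}
```

Because every write replaces a whole field and never merges with its old contents, a compare into field 5 has no dependency on an earlier compare into field 0, which is what lets each field be renamed like an ordinary register.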

kc5tja · Post by **kc5tja** » Thu Jun 09, 2016 6:40 pm

John West wrote:

Don't be alarmed! POWER's CR register was not a queue. Integer arithmetic operations that modified flags (most didn't) wrote to CR0. Floating point wrote to CR1. Comparison instructions had a field saying which set of flags to write to, and the branch unit instructions (which included logical operations between flags) could work on any of them. Once written, flags stayed put.

Thanks for the clarification! I remember that the PowerPC 601 definitely had 8 CRs available for use, but I could have sworn that it bumped CRs when told to do so.
