Would there be any harm in always setting the V flag for a ROL?
(An earlier query asked if C should be affected by INX, DEX and so on. The answer, I think, is no, because X or Y could be used to control a loop, and preserving C from one iteration to the next could be important. I'm not so sure that preserving V is important.)
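To illustrate why it matters that stepping X or Y leaves C alone, here is a minimal Python sketch of a multi-byte addition loop. It is a model rather than 6502 code: `adc8` and `add_multibyte` are hypothetical names, and the Python loop index stands in for X.

```python
def adc8(a, b, carry):
    """Model of an 8-bit add-with-carry: returns (result, carry_out)."""
    total = a + b + carry
    return total & 0xFF, total >> 8


def add_multibyte(lhs, rhs):
    """Add two little-endian byte lists of equal length.

    The loop index plays the role of X. Stepping it does not touch
    `carry`, which is what lets the carry ripple from one byte to the next.
    """
    carry = 0                      # corresponds to CLC before the loop
    out = []
    for x in range(len(lhs)):      # index stepped by INX or DEX on a 6502
        byte, carry = adc8(lhs[x], rhs[x], carry)
        out.append(byte)
    return out, carry


# 0x01FF + 0x0001 = 0x0200, stored low byte first
result, carry = add_multibyte([0xFF, 0x01], [0x01, 0x00])
```

If the index update clobbered the carry, each iteration would need to save and restore it around the step.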
Instruction set Design - Reasoning?
- barrym95838
Re: Instruction set Design - Reasoning?
If by "setting" you mean "updating", then I'm on board with you. My 65m32 allows at least four different ways to do an INX (each a single machine instruction), and each of them has different flag effects, allowing the careful programmer some flexibility in that area.
Mike B.
Re: Instruction set Design - Reasoning?
[Updating is a much better choice of word!]
Re: Instruction set Design - Reasoning?
This is a non sequitur, I know, but I thought I'd contribute for informational purposes.
BigEd wrote:
MichaelM wrote:
Similarly, the RISC-like 16-bit Inmos Transputer did not have a carry flag (or processor status register), and doing higher precision (32-bit or more) arithmetic was a bit difficult.
Well, well - you are quite right. I hadn't realised that. The same is true of the modern RISC-V architecture, which comes in 32-bit and 64-bit flavours - multi-word arithmetic being less crucial on a wider machine. On an 8-bit machine it's essential.
(It turns out that the high level language Occam for the Transputer provided functions for wide arithmetic.)
Multi-precision arithmetic is still possible, with acceptable overhead, much as if you had an explicit carry bit (remember, these instructions are all single-cycle if you use an in-order, single-pipeline CPU, and faster on wider dispatching architectures):
Code:
add x1, x2, x3 ; Add low half
add x4, x5, x6 ; Add high half
sltu x7, x1, x2 ; X7 = 1 iff X1 < X2 (unsigned), else 0 (IOW, the carry bit)
add x4, x4, x7 ; Accumulate the carry.
The reason the RISC-V lacks any CPU flags whatsoever is that, frankly, flags are complicated in anything that isn't a single-issue, in-order CPU. They require every instruction to have an implicit register dependency (your flags), which takes a fair amount of logic to work around to get high performance.
I think one of the reasons why Intel CPUs beat the Motorola 68K series once superscalar architectures became a thing is that, if you look at x86 instruction semantics, not all instructions touch flags like Motorola's did. The POWER architecture works around this by having a "flags queue", where each instruction that is allowed to update flags will set CC0, thus bumping its former contents to CC1, CC2, etc. I think there are 8 CCs on a 32-bit architecture. If you need to branch on a CC, you must explicitly specify which CC you're interested in testing. (If you've read about the Mill CPU, this should sound familiar; Mill CPU does the same sort of thing to all intermediate computations, not just the flags.)
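The carry-recovery step in the RISC-V listing above (comparing the low-half sum against one of its operands) can be sanity-checked with a small Python model. This is only a sketch: it assumes 32-bit registers, and `add64_flagless`, `carry_by_signed_compare` and `as_signed32` are hypothetical helper names. The carry test is modelled as an unsigned comparison (`sltu` on RISC-V), because a signed comparison gives the wrong answer when the low-half sum crosses 0x80000000.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers (RV32)


def as_signed32(value):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def add64_flagless(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from 32-bit adds, with no carry flag."""
    lo = (a_lo + b_lo) & MASK32        # add low halves
    hi = (a_hi + b_hi) & MASK32        # add high halves
    carry = 1 if lo < a_lo else 0      # unsigned compare recovers the carry
    hi = (hi + carry) & MASK32         # accumulate the carry
    return lo, hi


def carry_by_signed_compare(a_lo, b_lo):
    """What a signed compare would report as the 'carry' (can be wrong)."""
    lo = (a_lo + b_lo) & MASK32
    return 1 if as_signed32(lo) < as_signed32(a_lo) else 0


# 0x00000000_FFFFFFFF + 1 carries into the high half.
print(add64_flagless(0xFFFFFFFF, 0, 1, 0))
# 0x7FFFFFFF + 1 does not carry, yet a signed compare claims it does.
print(carry_by_signed_compare(0x7FFFFFFF, 1))
```

The unsigned test works because the truncated sum is smaller than an operand exactly when the addition wrapped past 2^32.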
Re: Instruction set Design - Reasoning?
Yikes, I didn't know POWER had that explicit queue of conditions!
Re: Instruction set Design - Reasoning?
Don't be alarmed! POWER's CR register was not a queue. Integer arithmetic operations that modified flags (most didn't) wrote to CR0. Floating point wrote to CR1. Comparison instructions had a field saying which set of flags to write to, and the branch unit instructions (which included logical operations between flags) could work on any of them. Once written, flags stayed put.
Higher CR fields didn't get used much in practice, at least not in the code I saw. They would have been useful for complicated boolean expressions, but when everyone writes in C or C++ and uses the short-circuiting && and || operators, you never get the opportunity.
What stopped it being a bottleneck is that you never wrote just a single flag. All bits in your destination field got written at once. Treat CR as 8 independent 4-bit registers, and that makes it possible to use register renaming. The 6502's style, with every instruction writing to a different subset of bits, would still be possible but more of a headache.
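The "8 independent 4-bit registers" view can be sketched in Python. This is a loose model for illustration, not a description of any real implementation: the class and method names are hypothetical, and the SO bit (a copy of the summary-overflow bit on real hardware) is simply left at zero here.

```python
# Bit positions within one 4-bit field, following the LT/GT/EQ/SO layout.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


class ConditionRegister:
    """Eight independent 4-bit condition fields."""

    def __init__(self):
        self.fields = [0] * 8

    def compare(self, field, a, b):
        """Overwrite one whole field with the result of comparing a and b.

        All four bits are replaced at once; no other field is touched,
        which is the property that makes per-field renaming workable.
        """
        if a < b:
            bits = LT
        elif a > b:
            bits = GT
        else:
            bits = EQ
        self.fields[field] = bits      # SO left clear in this sketch

    def test(self, field, bit):
        """Branch-style test: the field to examine is named explicitly."""
        return bool(self.fields[field] & bit)


cr = ConditionRegister()
cr.compare(0, 1, 2)    # field 0 becomes LT
cr.compare(5, 7, 7)    # field 5 becomes EQ
cr.compare(0, 9, 3)    # field 0 is rewritten to GT; field 5 stays put
```

A 6502-style model would instead need each instruction to merge its own subset of bits into a single shared status byte, so every flag write would also be a read of the previous value.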
Re: Instruction set Design - Reasoning?
John West wrote:
Don't be alarmed! POWER's CR register was not a queue. Integer arithmetic operations that modified flags (most didn't) wrote to CR0. Floating point wrote to CR1. Comparison instructions had a field saying which set of flags to write to, and the branch unit instructions (which included logical operations between flags) could work on any of them. Once written, flags stayed put.