I am working on simulating the instructions of an 8080 processor.
The following code fragments simulate the dcr A and dcr M instructions. Unlike on some other processors, the 8080's increment and decrement instructions affect the aux carry flag.
;
; dcr A ; Decrement register A
;
AFCA              dcrA_:
AFCA A6 15   [3]      ldx Reg_A          ; Decrement the register
AFCC CA      [2]      dex

AFCD 8A      [2]      txa                ; Make a copy of the result
AFCE 45 15   [3]      eor Reg_A          ; Borrowed from upper nybble if its low bit changed

AFD0 86 15   [3]      stx Reg_A          ; Store the new value

AFD2              dcrCommon:
AFD2 29 10   [2]      and #$10           ; Isolate and save aux carry flag
AFD4 85 2B   [3]      sta Scratch

AFD6              dcrMCommon:
AFD6 A5 14   [3]      lda Reg_CC         ; Clear the flags we are updating
AFD8 29 2A   [2]      and #$2A

AFDA 05 2B   [3]      ora Scratch        ; Fold in the aux carry flag

AFDC 1D AA90 [4/5]    ora ValueFlags,X   ; Set the S, Z and P flags accordingly
AFDF 85 14   [3]      sta Reg_CC

AFE1 4C ADE4 [3]      jmp PCPlus1
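The eor trick in the listing can be checked with a small model. This is only a sketch in Python, not part of the simulator; the function name is made up for illustration. It mirrors the ldx/dex/txa/eor/and #$10 sequence and confirms that bit 4 of (old EOR new) is set exactly when the low nybble of the old value was zero, which is when a decrement has to borrow from the upper nybble.

```python
def dcr_aux_borrow(old):
    """Model of ldx Reg_A / dex / txa / eor Reg_A / and #$10 (hypothetical helper)."""
    new = (old - 1) & 0xFF      # dex wraps around at 8 bits
    aux = (old ^ new) & 0x10    # bit 4 changes only if the low nybble borrowed
    return new, aux


# Exhaustive check over all 256 register values.
for old in range(256):
    new, aux = dcr_aux_borrow(old)
    assert (aux == 0x10) == ((old & 0x0F) == 0x00)
```

This only shows what the listing computes. Which polarity the real 8080 uses for the flag after dcr is best confirmed against Intel's documentation.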
Are you committed to the NMOS instruction set? I don't know the 8080's instruction set well enough to help you there, but I see things that could be shortened if you have the CMOS 65c02 instruction set available. For example, you can do LDA (ZP) without the ,Y (meaning you don't have to do LDY #0 first), and DEA to decrement A without transferring it to X and back.
I plan to stick with the NMOS instruction set for now to not preclude any systems until it becomes clear just how slow the result is. Typical 8080 code consists of many mov instructions which are simple and fast to simulate.
Using CMOS instructions definitely results in slightly smaller and faster code plus potentially a much faster CPU clock rate.
One thing I missed before is page aligning the flag lookup table.
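The [4/5] cycle count on the ora ValueFlags,X line reflects the extra cycle the 6502 spends when an absolute,X read crosses a page boundary. A rough Python sketch of how often that happens for a table at $AA90 (the address in the listing) compared with a page-aligned one; $AA00 is just a hypothetical aligned address picked for comparison:

```python
def page_cross_count(base):
    """Count the X values (0..255) for which base+X lies in a different page than base."""
    return sum(1 for x in range(256) if ((base + x) & 0xFF00) != (base & 0xFF00))


print(page_cross_count(0xAA90))  # table address taken from the listing
print(page_cross_count(0xAA00))  # hypothetical page-aligned address
```

With the table at $AA90, every index from $70 upward pays the extra cycle; with a page-aligned table, none do.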
I don't think you need the scratch variable. The AND #$10 at AFD2 will make the Accumulator the same as either LDA #$10 or LDA #0. You can then just follow through with:
ORA Reg_CC
AND #$3C ; preserve the #$10 bit ( bit #4?)
ORA ValueFlags,X
STA Reg_CC
Possibly of interest, a previous implementation is mentioned here.
Thanks. That was interesting to read.
My goal for the initial version is as much speed as is practical, rather than trying to fit everything into a single-digit number of KBytes.
What that means is that more code will be inlined than not, with shared tails (jumping into a common ending) exploited to control bloat somewhat. You can see that in the form of the ???Common labels in the provided fragments.
Also, the code to convert a virtual to physical address:
IamRob wrote:
The AND #$10 at AFD2 will make the Accumulator the same as either LDA #$10 or LDA #0.
That is correct.
IamRob wrote:
You can then just follow through with:
ORA Reg_CC
AND #$3C ; preserve the #$10 bit ( bit #4?)
ORA ValueFlags,X
STA Reg_CC
The Reg_CC variable contains the processor flag values as they were when the instruction started.
Consider the case in which the aux carry flag was originally set but the decrement clears it...
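That case can be checked with a small model. The following is a sketch in Python, not simulator code: the function names are invented for illustration, value_flags stands for whatever ValueFlags,X contributes, and the mask is the #$3A given in the later correction. It compares the Scratch-based sequence with the proposed ORA Reg_CC / AND sequence when aux carry was set on entry and the decrement does not borrow.

```python
MASK_KEEP = 0x2A   # and #$2A in the listing: flags that dcr leaves alone
AUX = 0x10         # and #$10 in the listing: the aux carry bit


def with_scratch(old_cc, aux, value_flags):
    # lda Reg_CC / and #$2A / ora Scratch / ora ValueFlags,X
    return (old_cc & MASK_KEEP) | aux | value_flags


def without_scratch(old_cc, aux, value_flags):
    # (A = aux) / ora Reg_CC / and #$3A / ora ValueFlags,X
    return ((aux | old_cc) & 0x3A) | value_flags


old_cc = AUX   # aux carry was set before the instruction
aux = 0        # this decrement did not borrow from the upper nybble
print(hex(with_scratch(old_cc, aux, 0)))     # aux carry cleared
print(hex(without_scratch(old_cc, aux, 0)))  # stale aux carry survives
```

The two sequences agree whenever aux carry was clear on entry; they differ only when the old flag was set and the new one should be clear, because the OR with Reg_CC happens before the old bit has been masked out.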
What is in the ValueFlags table? The flags that are to be set according to a generic math operation, or the flags as they should result from DCR?
In DCR the P flag is set whenever the result is 7FH ($7f), but for other operations this would happen for different values, so I take it that ValueFlags is a table just for the DCR operation.
Now, in DCR the aux-carry flag is set whenever the resulting lower nibble is "1111" (binary), so you could add the correct aux-carry value to the table.
(or, if you prefer, whenever the original value had "0000" in the lower nibble)
ValueFlags is a 256-byte table containing the values of the Sign, Zero and Parity flags as the 8080 would set or reset after an instruction like "ora A" is done. A lookup table is the most efficient way to determine the state of the (even) parity flag; the sign and zero flags came free.
I do not know what you mean by DCR. The ValueFlags table has nothing to do with the aux carry.
The aux carry flag is set when there is a carry from the low nybble into the upper one.
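To make the description of ValueFlags concrete, here is a sketch in Python of how such a 256-entry table could be generated. The bit positions are an assumption (the 8080 PSW layout: S = $80, Z = $40, P = $04); the thread does not state how Reg_CC is laid out, and the names below are made up for illustration.

```python
# Assumed bit positions (8080 PSW layout); not confirmed by the thread.
FLAG_S = 0x80
FLAG_Z = 0x40
FLAG_P = 0x04


def value_flags_entry(value):
    """Sign, Zero and (even) Parity flags for one 8-bit result."""
    flags = 0
    if value & 0x80:
        flags |= FLAG_S
    if value == 0:
        flags |= FLAG_Z
    if bin(value).count("1") % 2 == 0:   # an even number of 1 bits sets P
        flags |= FLAG_P
    return flags


VALUE_FLAGS = [value_flags_entry(v) for v in range(256)]
```

Under these assumed positions no entry touches the bits kept by and #$2A or the aux carry bit, so the table can simply be ORed in after the mask.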
lda regCC
and #maskC ; $2A ?
sta regCC
ldx regA
txa ; A= old value
dex
stx regA
eor regA ; oldval Eor newval
and #maskAC ; $10 ?
ora ValueFlags,X
ora regCC
sta regCC
Then of course, if memory is not an issue, you could build a DcrFlags table and do something faster:
ldx regA
dex
stx regA
lda reg_CC
and #maskC ; $2A ?
ora DcrFlag,X
sta reg_CC
BillG wrote:
The Reg_CC variable contains the processor flag values as they were when the instruction started.
Consider the case in which the aux carry flag was originally set but the decrement clears it...
I did. The outcome is still the same as your code above, but without the Scratch variable.
The AND #$10 preserves the Aux Carry flag, which comes from the DEX TXA EOR above.
The AND #$3A also preserves the Aux Carry flag. If you punch the code in you will see you get exactly the same result as when using the scratch variable.
I had a typo in my previous post. It should have been #$3A, not #$3C.
BB8 wrote:
lda regCC
and #maskC ; $2A ?
sta regCC
ldx regA
txa ; A= old value
dex
stx regA
eor regA ; oldval Eor newval
and #maskAC ; $10 ?
ora ValueFlags,X
ora regCC
sta regCC
Yes, the code for the add and subtract instructions scrubs the flags at the top, because the need to capture the carry or borrow after the operation does not allow doing it afterward without storing an intermediate value in memory. My original dcr and inr code did the clearing afterward because I did not discover until very recently that those two instructions affect aux carry.
You are confirming that I'll have to accept trading six additional bytes (times about fifteen for all of the different incarnations of dcr and inr) to avoid using the scratch variable.
BB8 wrote:
Then of course, if memory is not an issue, you could build a DcrFlags table and do something faster:
ldx regA
dex
stx regA
lda reg_CC
and #maskC ; $2A ?
ora DcrFlag,X
sta reg_CC
And another similar table for inr. I'll have to keep this idea for later in case I want the speed.
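A sketch of what such a DcrFlag table might contain, modelled in Python. Assumptions: S, Z and P use the 8080 PSW bit positions ($80, $40, $04), which the thread does not confirm; the aux carry bit is $10 as in the listing; and the aux carry polarity follows the eor-based code in this thread (set when the decrement borrows, that is, when the result's low nybble is $F). The names are invented for illustration.

```python
FLAG_S, FLAG_Z, FLAG_P = 0x80, 0x40, 0x04   # assumed bit positions
FLAG_AC = 0x10                              # and #$10 in the listing


def dcr_flags_entry(result):
    """Flags for one decrement result, with aux carry folded into the table."""
    flags = 0
    if result & 0x80:
        flags |= FLAG_S
    if result == 0:
        flags |= FLAG_Z
    if bin(result).count("1") % 2 == 0:
        flags |= FLAG_P
    if (result & 0x0F) == 0x0F:             # low nybble wrapped: borrow occurred
        flags |= FLAG_AC
    return flags


DCR_FLAGS = [dcr_flags_entry(r) for r in range(256)]


def dcr(reg_a, reg_cc):
    """Model of ldx/dex/stx, then lda reg_CC / and #$2A / ora DcrFlag,X."""
    x = (reg_a - 1) & 0xFF
    return x, (reg_cc & 0x2A) | DCR_FLAGS[x]
```

Because the table is indexed by the result, the aux carry bit it supplies matches (old EOR new) AND #$10 for every input, so neither the eor nor the scratch variable is needed.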
BB8 wrote:
As for "dcr M", I'd keep the 8080's memory aligned to a page boundary, so you don't need to add the displacement for the low byte.
You are right. The memory block for the virtual 8080 is already page aligned. I foresee no need to do otherwise and will remove the adjustment of the lower byte later.
Last edited by BillG on Fri Mar 26, 2021 9:29 pm, edited 1 time in total.