Simulating an 8080. Is there a better way to do this?

BillG · Post by **BillG** » Thu Mar 25, 2021 8:39 am

I am working on simulating the instructions of an 8080 processor.

The following code fragments simulate the dcr A and dcr M instructions. Unlike other processors, the 8080 increment and decrement instructions affect the aux carry flag.

Is there a better way to do this?

Code:

                          02157 ;
                          02158 ; dcr A ; Decrement register A
                          02159 ;
 AFCA                     02160 dcrA_:
 AFCA A6 15           [3] 02161          ldx    Reg_A     ; Decrement the register
 AFCC CA              [2] 02162          dex
                          02163
 AFCD 8A              [2] 02164          txa              ; Make a copy of the result
 AFCE 45 15           [3] 02165          eor    Reg_A     ; Borrowed from upper nybble if its low bit changed
                          02166
 AFD0 86 15           [3] 02167          stx    Reg_A     ; Store the new value
                          02168
 AFD2                     02169 dcrCommon:
 AFD2 29 10           [2] 02170          and    #$10      ; Isolate and save aux carry flag
 AFD4 85 2B           [3] 02171          sta    Scratch
                          02172
 AFD6                     02173 dcrMCommon:
 AFD6 A5 14           [3] 02174          lda    Reg_CC    ; Clear the flags we are updating
 AFD8 29 2A           [2] 02175          and    #$2A
                          02176
 AFDA 05 2B           [3] 02177          ora    Scratch   ; Fold in the aux carry flag
                          02178
 AFDC 1D AA90       [4/5] 02179          ora    ValueFlags,X  ; Set the S, Z and P flags accordingly
 AFDF 85 14           [3] 02180          sta    Reg_CC
                          02181
 AFE1 4C ADE4         [3] 02182          jmp    PCPlus1

Code:

                          02240 ;
                          02241 ; dcr M ; Decrement memory register
                          02242 ;
 B010                     02243 dcrM_:
 B010 18              [2] 02244          clc
 B011 A5 1A           [3] 02245          lda    Reg_L     ; Get target address
 B013 69 00           [2] 02246          adc    #<RAM80
 B015 85 12           [3] 02247          sta    Ptr
 B017 A5 1B           [3] 02248          lda    Reg_H
 B019 69 0B           [2] 02249          adc    #>RAM80
 B01B 85 13           [3] 02250          sta    Ptr+1
                          02251
 B01D A0 00           [2] 02252          ldy    #0
 B01F B1 12         [5/6] 02253          lda    (Ptr),Y
 B021 AA              [2] 02254          tax              ; Decrement the register
 B022 CA              [2] 02255          dex
                          02256
 B023 8A              [2] 02257          txa              ; Make a copy of the result
 B024 51 12         [5/6] 02258          eor    (Ptr),Y   ; Borrowed from upper nybble if its low bit changed
 B026 29 10           [2] 02259          and    #$10      ; Isolate and save aux carry flag
 B028 85 2B           [3] 02260          sta    Scratch
                          02261
 B02A 8A              [2] 02262          txa              ; Store the new value
 B02B 91 12           [6] 02263          sta    (Ptr),Y
                          02264
 B02D 4C AFD6         [3] 02265          jmp    dcrMCommon

The only thing I can think of is storing the values and not calculating the condition codes until/unless they are needed.
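The deferred approach mentioned above might look roughly like the following Python model. It is only a sketch: the class and function names are hypothetical, the flag bit positions assume the standard 8080 PSW layout (S=$80, Z=$40, AC=$10, P=$04), and the aux carry rule simply mirrors the listing above (bit 4 of old EOR new).

```python
# Assumed 8080 PSW bit positions (the thread does not spell them out).
FLAG_S, FLAG_Z, FLAG_AC, FLAG_P = 0x80, 0x40, 0x10, 0x04
KEEP_MASK = 0x2A  # same mask the listing uses to clear the updated flags


class LazyFlags:
    """Hypothetical helper: remember operands, compute flags on demand."""

    def __init__(self, reg_cc=0x02):
        self.reg_cc = reg_cc   # last fully computed flag byte
        self.pending = None    # (old, new) of the most recent dcr

    def record(self, old, new):
        # Cheap path taken on every dcr: just remember the operands.
        self.pending = (old & 0xFF, new & 0xFF)

    def materialize(self):
        # Expensive path, taken only when something reads the flags.
        if self.pending is not None:
            old, new = self.pending
            flags = self.reg_cc & KEEP_MASK
            flags |= (old ^ new) & FLAG_AC       # aux carry as in the listing
            flags |= FLAG_S if new & 0x80 else 0
            flags |= FLAG_Z if new == 0 else 0
            flags |= FLAG_P if bin(new).count("1") % 2 == 0 else 0
            self.reg_cc = flags
            self.pending = None
        return self.reg_cc


def dcr(value, lazy):
    """Decrement an 8-bit value and defer the flag computation."""
    new = (value - 1) & 0xFF
    lazy.record(value, new)
    return new
```

Whether this pays off depends on how often 8080 code actually reads the flags after dcr and inr, and instructions that update only some of the flags would need their own bookkeeping.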

GARTHWILSON · Post by **GARTHWILSON** » Thu Mar 25, 2021 9:20 am

Are you committed to the NMOS instruction set? I don't know the 8080's instruction set to help you there, but I see things that could be shortened up if you have the CMOS 65c02 instruction set available. For example, here you can do LDA (ZP) without the ,Y (meaning you don't have to do LDY #0 first), and DEA to decrement A without transferring it to X and back.

BillG · Post by **BillG** » Thu Mar 25, 2021 10:50 am

I plan to stick with the NMOS instruction set for now to not preclude any systems until it becomes clear just how slow the result is. Typical 8080 code consists of many mov instructions which are simple and fast to simulate.

Using CMOS instructions definitely results in slightly smaller and faster code plus potentially a much faster CPU clock rate.

One thing I missed before is page aligning the flag lookup table.
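The benefit of page aligning the table can be estimated from the listing: ValueFlags sits at $AA90, and the [4/5] cycle count on the `ora ValueFlags,X` line reflects the extra cycle an absolute,X read takes when adding X crosses a page boundary. A small Python check, assuming only that address and that rule, counts how many index values pay the penalty. The aligned address $AB00 is a made-up example.

```python
TABLE_BASE = 0xAA90  # address of ValueFlags, taken from the listing


def crosses_page(base, index):
    """True when base+index lands in a different 256-byte page than base."""
    return ((base + index) & 0xFF00) != (base & 0xFF00)


def penalized_indexes(base):
    """All X values for which an absolute,X read pays the extra cycle."""
    return [x for x in range(256) if crosses_page(base, x)]


unaligned = penalized_indexes(TABLE_BASE)
aligned = penalized_indexes(0xAB00)  # hypothetical page-aligned location

print(len(unaligned), "of 256 indexes cross a page at $AA90")
print(len(aligned), "of 256 indexes cross a page when aligned")
```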

BigEd · Post by **BigEd** » Thu Mar 25, 2021 11:48 am

Possibly of interest, a previous implementation is mentioned here. (I fully support a new implementation!)

IamRob · Post by **IamRob** » Thu Mar 25, 2021 10:23 pm

BillG wrote:

I am working on simulating the instructions of an 8080 processor.

The following code fragments simulate the dcr A and dcr M instructions. Unlike other processors, the 8080 increment and decrement instructions affect the aux carry flag.

Is there a better way to do this?

Code:

                          02157 ;
                          02158 ; dcr A ; Decrement register A
                          02159 ;
 AFCA                     02160 dcrA_:
 AFCA A6 15           [3] 02161          ldx    Reg_A     ; Decrement the register
 AFCC CA              [2] 02162          dex
                          02163
 AFCD 8A              [2] 02164          txa              ; Make a copy of the result
 AFCE 45 15           [3] 02165          eor    Reg_A     ; Borrowed from upper nybble if its low bit changed
                          02166
 AFD0 86 15           [3] 02167          stx    Reg_A     ; Store the new value
                          02168
 AFD2                     02169 dcrCommon:
 AFD2 29 10           [2] 02170          and    #$10      ; Isolate and save aux carry flag
 AFD4 85 2B           [3] 02171          sta    Scratch
                          02172
 AFD6                     02173 dcrMCommon:
 AFD6 A5 14           [3] 02174          lda    Reg_CC    ; Clear the flags we are updating
 AFD8 29 2A           [2] 02175          and    #$2A
                          02176
 AFDA 05 2B           [3] 02177          ora    Scratch   ; Fold in the aux carry flag
                          02178
 AFDC 1D AA90       [4/5] 02179          ora    ValueFlags,X  ; Set the S, Z and P flags accordingly
 AFDF 85 14           [3] 02180          sta    Reg_CC
                          02181
 AFE1 4C ADE4         [3] 02182          jmp    PCPlus1

I don't think you need the scratch variable. The AND #$10 at AFD2 will make the Accumulator the same as either LDA #$10 or LDA #0. You can then just follow through with:

ORA Reg_CC
AND #$3C ; preserve the #$10 bit ( bit #4?)
ORA ValueFlags,X
STA Reg_CC

BillG · Post by **BillG** » Fri Mar 26, 2021 1:14 am

BigEd wrote:

Possibly of interest, a previous implementation is mentioned here.

Thanks. That was interesting to read.

My goal for the initial version is maximum speed as practical instead of trying to do something with single digit values of KBytes.

What that means is that there will be more inlining of code than not, but tail recursion exploited to control bloat somewhat. You can see that in the form of the ???Common labels in the provided fragments.

Also, the code to convert a virtual to physical address:

Code:

 B010 18              [2] 02244          clc
 B011 A5 1A           [3] 02245          lda    Reg_L     ; Get target address
 B013 69 00           [2] 02246          adc    #<RAM80
 B015 85 12           [3] 02247          sta    Ptr
 B017 A5 1B           [3] 02248          lda    Reg_H
 B019 69 0B           [2] 02249          adc    #>RAM80
 B01B 85 13           [3] 02250          sta    Ptr+1

is inlined instead of a separate subroutine, saving the expense of a call/return pair at the cost of ten bytes per use.

BillG · Post by **BillG** » Fri Mar 26, 2021 1:23 am

IamRob wrote:

Code:

 AFD2                     02169 dcrCommon:
 AFD2 29 10           [2] 02170          and    #$10      ; Isolate and save aux carry flag
 AFD4 85 2B           [3] 02171          sta    Scratch
                          02172
 AFD6                     02173 dcrMCommon:
 AFD6 A5 14           [3] 02174          lda    Reg_CC    ; Clear the flags we are updating
 AFD8 29 2A           [2] 02175          and    #$2A
                          02176
 AFDA 05 2B           [3] 02177          ora    Scratch   ; Fold in the aux carry flag
                          02178
 AFDC 1D AA90       [4/5] 02179          ora    ValueFlags,X  ; Set the S, Z and P flags accordingly
 AFDF 85 14           [3] 02180          sta    Reg_CC
                          02181
 AFE1 4C ADE4         [3] 02182          jmp    PCPlus1

The AND #$10 at AFD2 will make the Accumulator the same as either LDA #$10 or LDA #0.

That is correct.

IamRob wrote:

You can then just follow through with:

ORA Reg_CC
AND #$3C ; preserve the #$10 bit ( bit #4?)
ORA ValueFlags,X
STA Reg_CC

The Reg_CC variable contains the processor flag values as they were when the instruction started.

Consider the case in which the aux carry flag was originally set but the decrement clears it...

BB8 · Post by **BB8** » Fri Mar 26, 2021 11:33 am

What is in the ValueFlags table? Is it the flags that are set according to a generic math operation, or the flags as they should result from DCR?
In DCR the P flag is set whenever the result is 7FH ($7f), but for other operations this would happen for different values, so I take it that ValueFlags is a table just for the DCR operation.

Now, in DCR the aux-carry flag is set whenever the resulting lower nibble is "1111" (binary), so you could add the correct aux-carry value to the table.
(or - if you prefer - whenever the original value had "0000" in the lower nibble)

BillG · Post by **BillG** » Fri Mar 26, 2021 12:39 pm

BB8 wrote:

What is in the ValueFlags table? Is it the flags that are set according to a generic math operation, or the flags as they should result from DCR?
In DCR the P flag is set whenever the result is 7FH ($7f), but for other operations this would happen for different values, so I take it that ValueFlags is a table just for the DCR operation.

Now, in DCR the aux-carry flag is set whenever the resulting lower nibble is "1111" (binary), so you could add the correct aux-carry value to the table.
(or - if you prefer - whenever the original value had "0000" in the lower nibble)

ValueFlags is a 256-byte table containing the values of the Sign, Zero and Parity flags as the 8080 would set or reset after an instruction like "ora A" is done. A lookup table is the most efficient way to determine the state of the (even) parity flag; the sign and zero flags came free.

I do not know what you mean by DCR. The ValueFlags table has nothing to do with the aux carry.

The aux carry flag is set when there is a carry from the low nybble into the upper one.
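A table like the one described could be generated along these lines. This is a Python sketch, not the author's actual table: the bit positions assume the standard 8080 PSW layout (S=$80, Z=$40, P=$04), and the names are invented for illustration.

```python
# Assumed standard 8080 PSW bit positions; the thread does not state them.
FLAG_S, FLAG_Z, FLAG_P = 0x80, 0x40, 0x04


def value_flags_entry(value):
    """Sign, Zero and (even) Parity flags for one 8-bit result."""
    flags = 0
    if value & 0x80:
        flags |= FLAG_S
    if value == 0:
        flags |= FLAG_Z
    if bin(value).count("1") % 2 == 0:  # P is set when the bit count is even
        flags |= FLAG_P
    return flags


# One entry per possible result, indexed the way "ora ValueFlags,X" would.
VALUE_FLAGS = [value_flags_entry(v) for v in range(256)]
```

No entry has bit 4 set, which is why the aux carry has to be folded in separately.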

BB8 · Post by **BB8** » Fri Mar 26, 2021 12:46 pm

[snipped my previous comment]

I'm possibly wrong, because I was basing some considerations on the Z80 which behaves differently on the P flag... Sorry

BillG · Post by **BillG** » Fri Mar 26, 2021 1:02 pm

Oh, DCR, the decrement register instruction. I cannot believe I missed that.

The P flag doubles as an overflow flag on some Z80 instructions.

BigEd · Post by **BigEd** » Fri Mar 26, 2021 1:15 pm

> My goal for the initial version is maximum speed as practical instead of trying to do something with single digit values of KBytes.

Sounds good to me!

BB8 · Post by **BB8** » Fri Mar 26, 2021 2:15 pm

this one lets you spare the Scratch location; no speed gain

Code:

	lda regCC
	and #maskC	; $2A ?
	sta regCC
	ldx regA
	txa			; A= old value
	dex
	stx regA
	eor regA		; oldval Eor newval
	and #maskAC	; $10 ?
	ora ValueFlags,X
	ora regCC
	sta regCC

Then of course, if memory is not an issue, you could build a DcrFlags table and do something faster:

Code:

	ldx regA
	dex
	stx regA
	lda regCC
	and #maskC	; $2A ?
	ora DcrFlags,X
	sta regCC
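A combined table of this kind can be sketched in Python and checked against the computed approach. The assumptions: the standard 8080 PSW bit positions, a table indexed by the decremented result, and the aux carry convention used by the code in this thread (set when bit 4 of old EOR new is set, which is the same as the result's low nibble being 1111). The sketch does not check whether that convention matches real 8080 hardware, and all names are illustrative.

```python
FLAG_S, FLAG_Z, FLAG_AC, FLAG_P = 0x80, 0x40, 0x10, 0x04  # assumed layout


def szp(value):
    """Sign, Zero and even-Parity flags for an 8-bit value."""
    flags = FLAG_S if value & 0x80 else 0
    flags |= FLAG_Z if value == 0 else 0
    flags |= FLAG_P if bin(value).count("1") % 2 == 0 else 0
    return flags


def dcr_flags_entry(new):
    """Table entry indexed by the result of the decrement."""
    flags = szp(new)
    if new & 0x0F == 0x0F:  # low nibble wrapped from 0000 to 1111
        flags |= FLAG_AC
    return flags


DCR_FLAGS = [dcr_flags_entry(n) for n in range(256)]


def dcr_flags_computed(old):
    """Same flags derived the way the original listing does it."""
    new = (old - 1) & 0xFF
    return szp(new) | ((old ^ new) & FLAG_AC)


# Compare the table lookup against the computed flags for every input.
mismatches = [old for old in range(256)
              if DCR_FLAGS[(old - 1) & 0xFF] != dcr_flags_computed(old)]
print("mismatches:", mismatches)
```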

As for "dcr M", I'd keep the 8080's memory aligned to a page boundary, so you don't need to add the displacement for the low byte.

IamRob · Post by **IamRob** » Fri Mar 26, 2021 7:43 pm

BillG wrote:

IamRob wrote:

Code:

 AFD2                     02169 dcrCommon:
 AFD2 29 10           [2] 02170          and    #$10      ; Isolate and save aux carry flag
 AFD4 85 2B           [3] 02171          sta    Scratch
                          02172
 AFD6                     02173 dcrMCommon:
 AFD6 A5 14           [3] 02174          lda    Reg_CC    ; Clear the flags we are updating
 AFD8 29 2A           [2] 02175          and    #$2A
                          02176
 AFDA 05 2B           [3] 02177          ora    Scratch   ; Fold in the aux carry flag
                          02178
 AFDC 1D AA90       [4/5] 02179          ora    ValueFlags,X  ; Set the S, Z and P flags accordingly
 AFDF 85 14           [3] 02180          sta    Reg_CC
                          02181
 AFE1 4C ADE4         [3] 02182          jmp    PCPlus1

The AND #$10 at AFD2 will make the Accumulator the same as either LDA #$10 or LDA #0.

That is correct.

IamRob wrote:

You can then just follow through with:

ORA Reg_CC
AND #$3A ; preserve the #$10 bit ( bit #4?)
ORA ValueFlags,X
STA Reg_CC

The Reg_CC variable contains the processor flag values as they were when the instruction started.

Consider the case in which the aux carry flag was originally set but the decrement clears it...

I did. The outcome is still the same as your code above, but without the Scratch variable.

The AND #$10 preserves the Aux Carry flag, which comes from the DEX TXA EOR above.
The AND #$3A also preserves the Aux Carry flag. If you punch the code in you will see you get exactly the same result as when using the scratch variable.

I had a typo in my previous post. It should have been #$3A, not #$3C.
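The two sequences can be compared exhaustively with a small Python model. It assumes that ValueFlags holds only S, Z and P in the standard 8080 bit positions and that Reg_CC holds the flags from before the instruction; the function names are made up for this sketch. Under those assumptions the results differ exactly when the aux carry bit was already set in Reg_CC and the decrement does not borrow from the upper nybble.

```python
KEEP_MASK = 0x2A  # mask from the original listing
ROB_MASK = 0x3A   # corrected mask from the proposal above
FLAG_S, FLAG_Z, FLAG_AC, FLAG_P = 0x80, 0x40, 0x10, 0x04  # assumed layout


def szp(value):
    """Stand-in for ValueFlags: Sign, Zero and even-Parity flags."""
    flags = FLAG_S if value & 0x80 else 0
    flags |= FLAG_Z if value == 0 else 0
    flags |= FLAG_P if bin(value).count("1") % 2 == 0 else 0
    return flags


def with_scratch(reg_cc, old):
    new = (old - 1) & 0xFF
    scratch = (new ^ old) & FLAG_AC      # and #$10 / sta Scratch
    a = reg_cc & KEEP_MASK               # lda Reg_CC / and #$2A
    a |= scratch                         # ora Scratch
    return a | szp(new)                  # ora ValueFlags,X


def without_scratch(reg_cc, old):
    new = (old - 1) & 0xFF
    a = (new ^ old) & FLAG_AC            # and #$10
    a |= reg_cc                          # ora Reg_CC (old AC bit leaks in)
    a &= ROB_MASK                        # and #$3A
    return a | szp(new)                  # ora ValueFlags,X


differences = [(cc, old) for cc in range(256) for old in range(256)
               if with_scratch(cc, old) != without_scratch(cc, old)]
print(len(differences), "differing (Reg_CC, old value) pairs")
```

For example, with Reg_CC = $12 and an old value of $05, the first sequence yields $02 while the second yields $12.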

BillG · Post by **BillG** » Fri Mar 26, 2021 8:54 pm

BB8 wrote:

this one lets you spare the Scratch location; no speed gain

Code:

	lda regCC
	and #maskC	; $2A ?
	sta regCC
	ldx regA
	txa			; A= old value
	dex
	stx regA
	eor regA		; oldval Eor newval
	and #maskAC	; $10 ?
	ora ValueFlags,X
	ora regCC
	sta regCC

Yes, the code for the add and subtract instructions scrubs the flags at the top, because the need to capture the carry or borrow after the operation did not allow doing it below without storing an intermediate value in memory. My original dcr and inr code did the clearing afterward because I did not discover until very recently that those two instructions affect aux carry.

You are confirming that I'll have to accept trading six additional bytes (times about fifteen for all of the different incarnations of dcr and inr) to avoid using the scratch variable.

BB8 wrote:

Then of course, if memory is not an issue, you could build a DcrFlags table and do something faster:

Code:

	ldx regA
	dex
	stx regA
	lda regCC
	and #maskC	; $2A ?
	ora DcrFlags,X
	sta regCC

And another similar table for inr. I'll have to keep this idea for later in case I want the speed.

BB8 wrote:

As for "dcr M", I'd keep the 8080's memory aligned to a page boundary, so you don't need to add the displacement for the low byte.

You are right. The memory block for the virtual 8080 is already page aligned. I currently foresee no need to do otherwise and will address removing the adjustment of the lower byte later.
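The low-byte adjustment discussed here can be modeled in Python. The base address $0B00 is read off the operand bytes of the dcr M listing (`adc #$00` for the low byte, `adc #$0B` for the high byte); the function names are invented for this sketch. It shows that with a page-aligned base the low byte passes through unchanged, so only the high-byte add is needed.

```python
RAM80 = 0x0B00  # base of the 8080 memory image, from the listing's operand bytes


def translate(reg_h, reg_l, base=RAM80):
    """Mirror the clc/adc sequence: 16-bit add of HL and the base address."""
    low = reg_l + (base & 0xFF)
    carry = low >> 8
    high = (reg_h + (base >> 8) + carry) & 0xFF
    return (high << 8) | (low & 0xFF)


def translate_aligned(reg_h, reg_l, base=RAM80):
    """Shortcut that is only valid when the base is page aligned."""
    if base & 0xFF != 0:
        raise ValueError("base must be page aligned")
    return (((reg_h + (base >> 8)) & 0xFF) << 8) | reg_l


print(hex(translate(0x12, 0x34)))          # full 16-bit add
print(hex(translate_aligned(0x12, 0x34)))  # high-byte add only
```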
