6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Post new topic Reply to topic  [ 171 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6, 7 ... 12  Next
Author Message
PostPosted: Sun Jun 23, 2019 7:52 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
The normal bitwise operators also work if you define 1 as true. The principal advantage of using -1 is that you can use (A & F)|(B & !F) as a selection operator, versus (A * F)|(B * !F) - given that the 6502 isn't good at multiplication.
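For 8-bit values, that selection trick can be modelled in Python (true = $FF, false = $00; `select` is a hypothetical name, just to illustrate the idea):

```python
def select(a, b, f):
    """Branchless select: returns a if f is true (0xFF), b if f is false (0x00).

    Models BBC-Basic-style booleans, where true is -1 (all bits set,
    i.e. $FF in one byte) and false is 0.
    """
    not_f = f ^ 0xFF              # bitwise NOT within 8 bits
    return (a & f) | (b & not_f)

TRUE, FALSE = 0xFF, 0x00
print(select(3, 7, TRUE))    # 3
print(select(3, 7, FALSE))   # 7
```

With 0/1 booleans the same selection needs multiplication (or a branch), which is exactly the point being made.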


Top
 Profile  
Reply with quote  
PostPosted: Mon Jun 24, 2019 3:11 am 
Offline

Joined: Mon Sep 17, 2018 2:39 am
Posts: 132
Hi!

BigEd wrote:
I think the way BBC Basic does booleans is nice: booleans are integers, 0 is false and anything else is true. But -1 is a kind of official true. With these definitions, boolean logic can use the same operators as bitwise logic. (I think this is similar to old-fashioned C, although C has different operators for boolean and bitwise operations.)

-1 in your case would indeed be 255.


In a compiler you can have different representations than the visible values.

For example, in FastBasic all booleans are internally stored as "0" or "1", represented in a single byte. The compiler takes care of transforming any non-zero integer value to 1 before passing it to operators that expect a boolean. So " A = B OR C " is compiled as " A = (B<>0) OR (C<>0) ".
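A sketch of that normalisation (a Python model; `to_bool` and `compile_or` are hypothetical names, not FastBasic internals):

```python
def to_bool(v):
    # Normalize any integer to the internal one-byte boolean: 0 or 1.
    return 1 if v != 0 else 0

def compile_or(b, c):
    # "A = B OR C" compiled as "A = (B<>0) OR (C<>0)":
    return to_bool(b) | to_bool(c)

print(compile_or(5, 0))   # 1
print(compile_or(0, 0))   # 0
```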


PostPosted: Mon Jun 24, 2019 9:35 am 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
It works differently here: since Kotlin is strongly typed, you can't put anything other than a boolean where a boolean is expected, so C's favorite error:

Code:
if(a=1) { ... }


will fail at compile time, since the value of an assignment operation is of type Unit (= void in C) and if requires a boolean.

I have two types of operations:

- ones that preserve the expression type and require just two stack registers (one from the previous op, one newly created); e.g. addition takes its operands from the two top registers, adds them, stores the result in the register below the top and frees the top register, so the sum now sits on top and can be used by the "surrounding" expression

- ones that change the result type and require a new register, like comparison, which evaluates to bool:

Code:
evaleq SP(?dest)[bool] = SP(?left)[ubyte], SP(?right)[ubyte] -> """
    lda #1 // put true in dest reg
    sta {dest},x
    lda {left},x
    cmp {right},x
    beq +
    lda #0 // aaaaw... so they're false after all...
    sta {dest},x
:"""


These require three registers - one from the previous op and two newly created, which are freed after evaluation, with the result again in the top register. Exactly like in the stack treatise :D
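The two kinds of operation can be modelled with a small stack of byte registers, top at index 0 like SP(0) (a Python sketch under those assumptions, not Wolin's actual implementation):

```python
class RegStack:
    """Software stack of one-byte registers; the top is index 0, like SP(0)."""
    def __init__(self):
        self.regs = []

    def alloc(self, value=0):
        self.regs.insert(0, value)      # push a new top register

    def free(self):
        return self.regs.pop(0)         # release the top register

    def add(self):
        # Type-preserving op: result replaces SP(1), top register is freed.
        right = self.free()
        self.regs[0] = (self.regs[0] + right) & 0xFF

    def evaleq(self):
        # Type-changing op: bool result goes to the previously allocated
        # SP(2); both operand registers SP(1) and SP(0) are freed after.
        result = 1 if self.regs[1] == self.regs[0] else 0
        self.free()
        self.free()
        self.regs[0] = result

s = RegStack()
s.alloc()      # destination for the bool
s.alloc(5)     # left operand
s.alloc(5)     # right operand
s.evaleq()
print(s.regs[0])   # 1
```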


PostPosted: Tue Jul 16, 2019 5:23 am 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
Being frustrated by trying to generate nice code for assigning to array elements, I decided to implement the "when" statement. "When" is like if...elseif...elseif or switch...case on steroids. It has two kinds of syntax:

Code:
    when {
        b == 1 -> c++
        b == 2 -> c--
    }


meaning: "when b equals 1, increment c, when b equals 2, decrement c"

But the Kotlin IDE would tell you to rewrite this case simply as:

Code:
    when (b) {
        1 -> c++
        2 -> c--
    }


And that's the second syntax. In Wolin these get compiled to:

Code:
alloc SP<__wolin_reg0>, #2 // for block level expression when{\nb==1->c++\nb==2->c--\n}
alloc SP<__wolin_reg1>, #1 // for condition result
alloc SP<__wolin_reg2>, #1 // for evaluating when condition
label __wolin_lab_whenLabel_0
alloc SP<__wolin_reg3>, #1 // LEFT equality check: evaleq
let SP(0)<__wolin_reg3>[ubyte] = __wolin_pl_qus_wolin_test_b<pl.qus.wolin.test.b>[ubyte] // simple id from var
alloc SP<__wolin_reg4>, #1 // RIGHT equality check: evaleq
let SP(0)<__wolin_reg4>[ubyte] = #1[ubyte] // atomic ex
evaleq SP(2)<__wolin_reg2>[bool] = SP(1)<__wolin_reg3>[ubyte], SP(0)<__wolin_reg4>[ubyte] // two sides
free SP<__wolin_reg4>, #1 // RIGHT equality check: evaleq
free SP<__wolin_reg3>, #1 // LEFT equality check: evaleq
bne SP(2)<__wolin_reg0>[uword] = #1[bool], __wolin_lab_whenLabel_1[adr]
add __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword] = __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword], #1[byte] // simple id
goto __wolin_lab_whenEndLabel_0[adr]
label __wolin_lab_whenLabel_1
alloc SP<__wolin_reg5>, #1 // LEFT equality check: evaleq
let SP(0)<__wolin_reg5>[ubyte] = __wolin_pl_qus_wolin_test_b<pl.qus.wolin.test.b>[ubyte] // simple id from var
alloc SP<__wolin_reg6>, #1 // RIGHT equality check: evaleq
let SP(0)<__wolin_reg6>[ubyte] = #2[ubyte] // atomic ex
evaleq SP(2)<__wolin_reg2>[bool] = SP(1)<__wolin_reg5>[ubyte], SP(0)<__wolin_reg6>[ubyte] // two sides
free SP<__wolin_reg6>, #1 // RIGHT equality check: evaleq
free SP<__wolin_reg5>, #1 // LEFT equality check: evaleq
bne SP(2)<__wolin_reg0>[uword] = #1[bool], __wolin_lab_whenEndLabel_0[adr]
sub __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword] = __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword], #1[byte] // simple id
label __wolin_lab_whenEndLabel_0
free SP<__wolin_reg2>, #1 // for evaluating when condition
free SP<__wolin_reg1>, #1 // for condition result
free SP<__wolin_reg0>, #2 // for block level expression when{\nb==1->c++\nb==2->c--\n}, type = uword


for the first syntax, and into this for the second:

Code:
alloc SP<__wolin_reg0>, #2 // for block level expression when(b){\n1->c++\n2->c--\n}
let SP(0)<__wolin_reg0>[uword] = __wolin_pl_qus_wolin_test_b<pl.qus.wolin.test.b>[ubyte] // simple id from var
alloc SP<__wolin_reg1>, #1 // for condition result
alloc SP<__wolin_reg2>, #2 // for evaluating when condition
label __wolin_lab_whenLabel_0
let SP(0)<__wolin_reg2>[uword] = #1[ubyte] // atomic ex
evaleq SP(1)<__wolin_reg1>[bool] = SP(2)<__wolin_reg0>[uword], SP(0)<__wolin_reg2>[ubyte]
bne SP(1)<__wolin_reg1>[bool] = #1[bool], __wolin_lab_whenLabel_1[adr]
add __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword] = __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword], #1[byte] // simple id
goto __wolin_lab_whenEndLabel_0[adr]
label __wolin_lab_whenLabel_1
let SP(0)<__wolin_reg2>[ubyte] = #2[ubyte] // atomic ex
evaleq SP(2)<__wolin_reg1>[bool] = SP(3)<__wolin_reg0>[uword], SP(0)<__wolin_reg2>[uword]
bne SP(2)<__wolin_reg1>[bool] = #1[bool], __wolin_lab_whenEndLabel_0[adr]
sub __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword] = __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword], #1[byte] // simple id
label __wolin_lab_whenEndLabel_0
free SP<__wolin_reg2>, #2 // for evaluating when condition
free SP<__wolin_reg1>, #1 // for condition result
free SP<__wolin_reg0>, #2 // for block level expression when(b){\n1->c++\n2->c--\n}, type = uword


I still need to work on type flow, as I assume there should be no implicit/default type conversions, so each two-operand pseudo-asm mnemonic should process only values of the same type, unlike here (a byte is added to a uword):

Code:
add __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword] = __wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword], #1[byte] // simple id
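Semantically, both "when" forms boil down to the same chain of sequential equality tests that the pseudo-asm above spells out; a Python model of the example:

```python
def run_when(b, c):
    # when (b) { 1 -> c++ ; 2 -> c-- } desugared into sequential
    # equality tests, each falling through to the next on mismatch.
    if b == 1:
        c = (c + 1) & 0xFFFF   # uword increment
    elif b == 2:
        c = (c - 1) & 0xFFFF   # uword decrement
    return c

print(run_when(1, 10))   # 11
print(run_when(2, 10))   # 9
print(run_when(5, 10))   # 10 (no branch taken)
```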


PostPosted: Tue Jul 16, 2019 7:57 am 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
And now resulting code for:

Code:
var b: ubyte
var c: uword
var d: uword

fun main() {
    when(b) {
        1 -> c++
        2 -> c--
        else -> c =0
    }
}


Result:

Code:
__wolin_pl_qus_wolin_test_main:

; allocSP<__wolin_reg0>,#2


    dex
    dex

; allocSP<__wolin_reg1>,#1

    dex

; letSP(0)<__wolin_reg1>[ubyte]=__wolin_pl_qus_wolin_test_b<pl.qus.wolin.test.b>[ubyte]


    lda __wolin_pl_qus_wolin_test_b
    sta 0,x

; allocSP<__wolin_reg2>,#1

    dex

; allocSP<__wolin_reg3>,#1

    dex

; label__wolin_lab_whenLabel_0

__wolin_lab_whenLabel_0:

; letSP(0)<__wolin_reg3>[ubyte]=#1[ubyte]


    lda #1
    sta 0,x

; evaleqSP(1)<__wolin_reg2>[bool]=SP(2)<__wolin_reg1>[ubyte],SP(0)<__wolin_reg3>[ubyte]


    lda #1 // equal
    sta 1,x
    lda 2,x
    cmp 0,x
    beq +
    lda #0 // not equal after all
    sta 1,x
:

; bneSP(1)<__wolin_reg2>[bool]=#1[bool],__wolin_lab_whenLabel_1[adr]


    lda 1,x
    cmp #1
    bne __wolin_lab_whenLabel_1

; add__wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword]=__wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword],#1[uword]


    clc
    lda __wolin_pl_qus_wolin_test_c
    adc #<1
    sta __wolin_pl_qus_wolin_test_c
    lda __wolin_pl_qus_wolin_test_c+1
    adc #>1
    sta __wolin_pl_qus_wolin_test_c+1


; goto__wolin_lab_whenEndLabel_0[adr]

    jmp __wolin_lab_whenEndLabel_0

; label__wolin_lab_whenLabel_1

__wolin_lab_whenLabel_1:

; letSP(0)<__wolin_reg3>[ubyte]=#2[ubyte]


    lda #2
    sta 0,x

; evaleqSP(1)<__wolin_reg2>[bool]=SP(2)<__wolin_reg1>[ubyte],SP(0)<__wolin_reg3>[ubyte]


    lda #1 // equal
    sta 1,x
    lda 2,x
    cmp 0,x
    beq +
    lda #0 // not equal after all
    sta 1,x
:

; bneSP(1)<__wolin_reg2>[bool]=#1[bool],__wolin_lab_whenLabel_2[adr]


    lda 1,x
    cmp #1
    bne __wolin_lab_whenLabel_2

; sub__wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword]=__wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword],#1[uword]


    sec
    lda __wolin_pl_qus_wolin_test_c
    sbc #<1
    sta __wolin_pl_qus_wolin_test_c
    lda __wolin_pl_qus_wolin_test_c+1
    sbc #>1
    sta __wolin_pl_qus_wolin_test_c+1


; goto__wolin_lab_whenEndLabel_0[adr]

    jmp __wolin_lab_whenEndLabel_0

; label__wolin_lab_whenLabel_2

__wolin_lab_whenLabel_2:

; allocSP<__wolin_reg4>,#2


    dex
    dex

; allocSP<__wolin_reg5>,#2


    dex
    dex

; letSP(0)<__wolin_reg5>[uword]=#0[ubyte]


    lda #0
    sta 0,x
    lda #0
    sta 0+1,x

; let__wolin_pl_qus_wolin_test_c<pl.qus.wolin.test.c>[uword]=SP(0)<__wolin_reg5>[uword]


    lda 0,x
    sta __wolin_pl_qus_wolin_test_c
    lda 0+1,x
    sta __wolin_pl_qus_wolin_test_c+1


; freeSP<__wolin_reg5>,#2


    inx
    inx

; freeSP<__wolin_reg4>,#2


    inx
    inx

; label__wolin_lab_whenEndLabel_0

__wolin_lab_whenEndLabel_0:

; freeSP<__wolin_reg3>,#1

    inx

; freeSP<__wolin_reg2>,#1

    inx

; freeSP<__wolin_reg1>,#1

    inx

; freeSP<__wolin_reg0>,#2


    inx
    inx

; ret

    rts


PostPosted: Wed Jul 17, 2019 9:20 am 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
Today I've finally fixed variable initialization on declaration, so both:

Code:
var b = 2


and

Code:
var b: ubyte = 2


compile properly:

Code:
    dex
    lda #2
    sta 0,x
    lda 0,x
    sta __wolin_pl_qus_wolin_test_b
    inx


(of course it will get optimized in the pseudo-asm to a simple lda/sta, without needlessly passing #2 through an additional SP register)


PostPosted: Wed Jul 17, 2019 1:46 pm 
Offline
User avatar

Joined: Mon May 12, 2014 6:18 pm
Posts: 365
This is a neat project. I've been following along as you post updates.
Quote:
(of course it will get optimized in pseudo asm to simple lda/sta without stupid passing of #2 by additional SP reg)
How do you plan to do the optimizing? I saw that you were planning some peephole stuff on the first page. Will it be pseudo asm like LLVM?


PostPosted: Wed Jul 17, 2019 5:00 pm 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
Well, from what I see the best way would be to do two optimization passes - one on the pseudo-asm, and another on the 6502 output. Regarding the first - I find interesting cases as I go; for example, consider this simple lambda function:

Code:
(a: ubyte, b: ubyte) → a+b


that gets pseudo-compiled to this bloated pseudo-code:

Code:
label lambda_function_0

alloc SP<r.temp12>, #1 // for statement a+b
let SP(0)<r.temp12>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
alloc SP<r.temp13>, #1 // for right side
let SP(0)<r.temp13>[ubyte] = SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // simple id from var
add SP(1)<r.temp12>[ubyte] = SP(1)<r.temp12>[ubyte], SP(0)<r.temp13>[ubyte] // two sides
free SP<r.temp13>, #1 // for right side, type =ubyte
let SPF(2)<lambdaReturn>[ubyte] = SP(0)<r.temp12>[ubyte] // LAMBDA return assignment
free SP<r.temp12>, #1 // for statement a+b, type = ubyte
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


1) pl.qus.wolin.test.lambda_function_0.b is put into temp13, which is never written again, so all occurrences of temp13 can be replaced with b and the assignment removed:

Code:
label lambda_function_0

alloc SP<r.temp12>, #1 // for statement a+b
let SP(0)<r.temp12>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
// STEP1 temp13 allocation
// STEP1 temp13 substitution
add SP(1)<r.temp12>[ubyte] = SP(1)<r.temp12>[ubyte], **SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte]** // two sides
// STEP1 temp13 deallocation
let SPF(2)<lambdaReturn>[ubyte] = SP(0)<r.temp12>[ubyte] // LAMBDA return assignment
free SP<r.temp12>, #1 // for statement a+b, type = ubyte
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


2) temp12 is put into lambdaReturn and freed immediately, so I can replace ALL occurrences of temp12 with lambdaReturn:

Code:
label lambda_function_0

// STEP2 temp12 allocation
STEP2 let SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
// STEP1 temp13 allocation
// STEP1 temp13 substitution
STEP2 add SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte], **SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte]** // two sides
// STEP1 temp13 deallocation
STEP2 let SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte] // LAMBDA return assignment
// STEP2 temp12 deallocation
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


3) eliminate identities:

Code:
label lambda_function_0

// STEP2 temp12 allocation
STEP2 let SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
// STEP1 temp13 allocation
// STEP1 temp13 substitution
STEP2 add SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte], **SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte]** // two sides
// STEP1 temp13 deallocation
// STEP2 STEP3 identity
// STEP2 temp12 deallocation
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


4) a is put into lambdaReturn; starting from the first write (inclusive), I can use a instead of lambdaReturn on the RHS and remove the assignment:

Code:
label lambda_function_0

// STEP2 temp12 allocation
STEP2 STEP4 // a substituted into lambdaReturn
// STEP1 temp13 allocation
// STEP1 temp13 substitution
STEP2 add SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte], **SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte]** // two sides
// STEP1 temp13 deallocation
// STEP2 STEP3 identity
// STEP2 temp12 deallocation
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


5) finally arriving at what this function really does :D - namely, taking the two function-stack registers a and b and storing the result of their addition in the return value on the function stack:

Code:
label lambda_function_0
add SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte], **SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte]** // two sides
free SPF, #2 // free fn arguments and locals for lambda_function_0

ret


(there are more interesting cases in Main.kt, although with Polish comments)
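Steps 1) and 2) above are essentially copy propagation plus dead-store elimination on the register IR. A toy Python version of the step-1 rule (the IR shape here is a made-up simplification, not Wolin's actual representation):

```python
def propagate_copies(ir):
    """Toy pass over (op, dest, srcs) tuples: for 'let t = src' where t is
    never written again, replace later uses of t with src and drop the let."""
    out = []
    for i, (op, dest, srcs) in enumerate(ir):
        if op == "let" and len(srcs) == 1:
            t, src = dest, srcs[0]
            written_again = any(d == t for _, d, _ in ir[i + 1:])
            if not written_again:
                # rewrite all later uses of t to src, drop this let
                ir[i + 1:] = [(o, d, [src if s == t else s for s in ss])
                              for o, d, ss in ir[i + 1:]]
                continue
        out.append((op, dest, srcs))
    return out

ir = [("let", "temp13", ["b"]),
      ("add", "temp12", ["temp12", "temp13"])]
print(propagate_copies(ir))   # [('add', 'temp12', ['temp12', 'b'])]
```

A real pass would also have to respect register lifetimes (the alloc/free pairs), which this sketch ignores.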

But I feel it will still require some obvious and well-known optimizations directly on the 6502 code. Does such an assembler exist?


PostPosted: Wed Jul 17, 2019 5:57 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
An "optimising assembler" is a bit of a contradiction in terms. But the 6502 is so simple a machine that optimisations are either really obvious to compilers (like the ones above) or fairly obtuse.

An earlier example resulted in a lot of consecutive INX instructions. You could replace those with TXA : CLC : ADC #n : TAX, modulo prior knowledge of the state of the C flag and any need to preserve the contents of A.

For a more general approach, I would try to generate code that moves data around "lazily". Only flush a result out of A when something else needs to come into A (or when you need all values consistent in memory). Likewise, only issue a load to A when you need to actually operate on that value, and then load it by the shortest reliable path. Try to perform operations that use a value that is already loaded, rather than those that need to flush something out first. Applying those principles at the code generation stage will reduce the amount of optimisation you need to do later, and may result in a faster and less resource hungry compiler overall.

Some operations can be done without first loading things into a register; INC, DEC, ASL, LSR, ROL, ROR. Chains of these, sometimes with conditional branches, are idiomatic for dealing with types wider than one byte, or for operating on operand pairs of dissimilar width. An example of the latter: adding an 8-bit value to a 32-bit accumulator.
Code:
 LDA a+0
 CLC
 ADC b
 STA a+0
 BCC :+
 INC a+1 ; INC doesn't affect C
 BNE :+
 INC a+2
 BNE :+
 INC a+3
: ; sum complete
Make sure these idioms are built into your code generator.
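The byte-level chain can be cross-checked against ordinary arithmetic; here is a Python model of the idiom (32-bit accumulator as a little-endian byte list):

```python
def add8_to_32(a_bytes, b):
    """Model of the LDA/ADC/INC chain above: add the 8-bit value b to the
    32-bit little-endian accumulator a_bytes, propagating carry via INC."""
    a = list(a_bytes)
    total = a[0] + b                 # LDA a+0 : CLC : ADC b
    a[0] = total & 0xFF              # STA a+0
    carry = total > 0xFF
    i = 1
    while carry and i < 4:           # BCC skip / INC a+i : BNE skip
        a[i] = (a[i] + 1) & 0xFF
        carry = a[i] == 0            # INC wrapped to 0 -> keep carrying
        i += 1
    return a

print(add8_to_32([0xFF, 0xFF, 0x00, 0x00], 1))   # [0, 0, 1, 0]
```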


PostPosted: Thu Jul 18, 2019 6:34 am 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
Chromatix wrote:
An "optimising assembler" is a bit of a contradiction in terms.


Isn't my example in pseudo-asm indeed an "optimising pseudo-assembler"? :D Well - if you assume that it's a human who writes the code, it might be a contradiction. But if you create a compiler that writes directly to 6502 asm, you can get a lot of places with

Code:
inx
dex


or

Code:
jsr xxxx
rts


Which are obvious cases for in-assembler optimizations!
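Both cases are classic peephole rules; a minimal Python sketch over a list of mnemonic lines (illustrative only - the inx/dex rule is only safe when the NZ flags aren't consumed afterwards, and jsr/rts -> jmp assumes a plain tail call):

```python
def peephole(lines):
    """One pass of two toy rules: drop adjacent inx/dex pairs,
    and turn a jsr immediately followed by rts into a jmp (tail call)."""
    out = []
    for line in lines:
        if out and {out[-1].split()[0], line.split()[0]} == {"inx", "dex"}:
            out.pop()                          # inx/dex (either order) cancel
            continue
        if line == "rts" and out and out[-1].startswith("jsr "):
            out[-1] = "jmp " + out[-1][4:]     # jsr X : rts -> jmp X
            continue
        out.append(line)
    return out

print(peephole(["inx", "dex", "jsr xxxx", "rts"]))   # ['jmp xxxx']
```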

Chromatix wrote:
An earlier example resulted in a lot of consecutive INX instructions. You could replace those with TXA : CLC : ADC #n : TAX, modulo prior knowledge of the state of the C flag and any need to preserve the contents of A.


Nice one! Thanks! Added to templates:

Code:
alloc SP, #?count -> """
    txa
    sec
    sbc #{count}
    tax"""


Although, how many dex would it take for the above code to be faster? Note also that I will be summing consecutive "alloc SP / free SP" into one, so unless there's a very complex expression there should not be a chain of inx/dex like in my examples.
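For reference, a rough count with standard NMOS 6502 timings (each DEX: 2 cycles, 1 byte; TXA : SEC : SBC # : TAX: 2+2+2+2 = 8 cycles, 5 bytes) puts the break-even here:

```python
DEX = {"cycles": 2, "bytes": 1}      # one DEX
REPL = {"cycles": 8, "bytes": 5}     # TXA : SEC : SBC #n : TAX

first_faster = min(n for n in range(1, 20)
                   if REPL["cycles"] < n * DEX["cycles"])
first_smaller = min(n for n in range(1, 20)
                    if REPL["bytes"] < n * DEX["bytes"])
print(first_faster, first_smaller)   # 5 6
```

So the SBC form only wins on speed from five consecutive dex up (four is a tie), and on size from six up.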

Chromatix wrote:
For a more general approach, I would try to generate code that moves data around "lazily". Only flush a result out of A when something else needs to come into A (or when you need all values consistent in memory). Likewise, only issue a load to A when you need to actually operate on that value, and then load it by the shortest reliable path. Try to perform operations that use a value that is already loaded, rather than those that need to flush something out first. Applying those principles at the code generation stage will reduce the amount of optimisation you need to do later, and may result in a faster and less resource hungry compiler overall.


In Kotlin it's rather obvious, as you explicitly declare variables as mutable (var) or immutable (val).

Chromatix wrote:
Some operations can be done without first loading things into a register; INC, DEC, ASL, LSR, ROL, ROR. Chains of these, sometimes with conditional branches, are idiomatic for dealing with types wider than one byte, or for operating on operand pairs of dissimilar width. An example of the latter: adding an 8-bit value to a 32-bit accumulator.
Code:
 LDA a+0
 CLC
 ADC b
 STA a+0
 BCC :+
 INC a+1 ; INC doesn't affect C
 BNE :+
 INC a+2
 BNE :+
 INC a+3
: ; sum complete
Make sure these idioms are built into your code generator.


Yep, but that would be done by my pseudo-asm templates.

What I suspect is that even after pseudo-asm optimizations there will be pretty obvious optimization candidates in the 6502 code. Besides, an assembler that could properly assemble something like "bne 130" without complaining about range would be a nice thing for compiler-generated code!


Last edited by qus on Thu Jul 18, 2019 6:51 am, edited 1 time in total.

PostPosted: Thu Jul 18, 2019 6:48 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
One of the hallmarks of an assembler is that it does not second-guess you; there is a one-to-one correspondence between mnemonics specified and instructions issued. If you write INX : DEX, then it assumes that you have a good reason for it - a timing loop, perhaps, or the side-effect of setting flags, or maybe you are even writing a test suite of some kind. Although "optimising assemblers" do exist for some weird platforms, these do things like organising the code in memory, not changing the code itself. (A famous example was an ancient drum-memory computer, where the location of the next instruction was explicitly specified for each one executed; optimisation in this context consisted of placing each instruction under the read heads just when the previous one completed.)

So eliding those cases falls to the responsibility of a peephole optimiser, which can use information from higher levels of the code generation process than an assembler can. It would know, for example, that you are not writing timing loops or CPU test suites, and whether or not the NZ flags resulting from the INX : DEX sequence are subsequently used. Only when those prerequisites are met is deleting the pair a safe optimisation.

Or, preferably, you could use the "lazy evaluation" technique to track the stack pointer evolution as it goes, and only insert instructions to update it just before the stack pointer is actually used. That would relieve that particular burden from the peephole optimiser.
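That lazy tracking might look like this (a hypothetical Python sketch, names invented: accumulate a pending offset and only materialise it when the stack pointer is actually used):

```python
class LazySP:
    """Defer X-register adjustments: accumulate a pending delta and only
    emit inx/dex just before an instruction that touches the stack."""
    def __init__(self):
        self.pending = 0
        self.code = []

    def alloc(self, n): self.pending -= n    # would otherwise be n * dex
    def free(self, n):  self.pending += n    # would otherwise be n * inx

    def flush(self):
        op = "inx" if self.pending > 0 else "dex"
        self.code += [op] * abs(self.pending)
        self.pending = 0

    def use_stack(self, insn):
        self.flush()                         # SP must be real here
        self.code.append(insn)

sp = LazySP()
sp.alloc(2); sp.free(2)       # cancels out: nothing ever emitted
sp.alloc(1)
sp.use_stack("sta 0,x")
print(sp.code)                # ['dex', 'sta 0,x']
```

The inx/dex pairs from adjacent alloc/free never reach the output at all, so the peephole pass never has to see them.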

Most 6502 assemblers do support macros, and you can use those to support long branches with clean syntax. You may find these useful.


Last edited by Chromatix on Thu Jul 18, 2019 6:51 am, edited 1 time in total.

PostPosted: Thu Jul 18, 2019 6:51 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
(Crossed in post.)

I did a quick search and there are a few pages of results when you search this site for 'peephole' - nothing jumped out as a fully-functional example, but it's a well known term for this kind of local pattern-recognition. Compilers these days spoil us with their many and varied tactics for optimising - we shouldn't forget that there's plenty of scope for optimisation which falls well short of that but is still very worthwhile.

Indeed a search for "6502 peephole" looks fruitful too.

So this comment really was too strong:
Chromatix wrote:
An "optimising assembler" is a bit of a contradiction in terms.


PostPosted: Thu Jul 18, 2019 9:25 am 
Online

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 293
Almost every 6502 assembler is an optimising assembler.
Chromatix wrote:
there is a one-to-one correspondence between mnemonics specified and instructions issued.

It isn't quite one-to-one. Consider
Code:
    LDA variable

LDA could assemble to AD or A5, depending on the value of variable. The programmer doesn't have to explicitly say which to use in each case. That's a minor optimisation, but it's an important one. We're just so used to it happening that we don't think about it.

I realise that's probably not what you meant: you're saying that every instruction in the source code will produce exactly one instruction in the output. That's true (although macros and conditionals blur the lines a bit). But that's not the only kind of optimisation.
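That operand-size selection can be sketched as a toy assembler fragment (hedged: a real assembler also has to cope with forward references, where the operand's value isn't known yet, which complicates this choice):

```python
def assemble_lda(addr):
    """Pick the shorter encoding when the address fits in the zero page:
    A5 nn (zero page, 2 bytes) vs AD ll hh (absolute, 3 bytes)."""
    if addr < 0x100:
        return bytes([0xA5, addr])                 # zero page: 3 cycles
    return bytes([0xAD, addr & 0xFF, addr >> 8])   # absolute: 4 cycles

print(assemble_lda(0x42).hex())     # a542
print(assemble_lda(0x1234).hex())   # ad3412
```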


PostPosted: Thu Jul 18, 2019 9:33 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I actually consider that case a misfeature of 6502 assembly - it makes for an ambiguity between ZP and absolute addressing. In most cases that doesn't matter - *if* you are assembling for a 6502, and *if* you are not planning to use self-modifying code.

As soon as you start writing code for a 65816, however, distinguishing between Direct Page (replacing ZP), Absolute, and Long addresses is essential, since the same address can map to three different physical addresses for those three modes (DP is qualified by DPR but not DBR, Absolute is qualified by DBR, Long uses neither). The standard 6502-derived syntax for '816 assembly doesn't give you any help with that, precisely because of this ambiguity introduced by an "optimisation".


PostPosted: Thu Jul 25, 2019 5:18 pm 
Offline

Joined: Sat Apr 20, 2019 5:31 pm
Posts: 104
Small update. Since in Kotlin EVERYTHING is an expression, you can do neat tricks. So instead of

Code:
fun main() {
    if(d == 0)
        {b=6}
        else
        {b=9}
}


You'd do:

Code:
fun main() {
    b = if(d == 0)
        6
        else
        9
}


(because expression value == value of last expression in a block)

The same applies to the when expression, and this is now supported in Wolin.
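Compiling such an expression-if amounts to evaluating the whole conditional into a result register and then assigning that register; in Python terms (a hypothetical model, mirroring the Wolin pseudo-asm pattern of an SP result register):

```python
def compile_if_expr(d):
    # b = if (d == 0) 6 else 9, modelled as: evaluate the whole if into
    # a result register, then assign that register to b.
    result_reg = 6 if d == 0 else 9
    b = result_reg
    return b

print(compile_if_expr(0))   # 6
print(compile_if_expr(3))   # 9
```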

