Well, from what I see, the best way would be to optimize in two stages: once on the pseudo-asm and once on the generated 6502 code. As for the first, I keep finding interesting cases as I go. For example, consider this simple lambda function:
Code:
(a: ubyte, b: ubyte) → a+b
that gets pseudo-compiled to this bloated pseudo-code:
Code:
label lambda_function_0
alloc SP<r.temp12>, #1 // for statement a+b
let SP(0)<r.temp12>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
alloc SP<r.temp13>, #1 // for right side
let SP(0)<r.temp13>[ubyte] = SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // simple id from var
add SP(1)<r.temp12>[ubyte] = SP(1)<r.temp12>[ubyte], SP(0)<r.temp13>[ubyte] // two sides
free SP<r.temp13>, #1 // for right side, type = ubyte
let SPF(2)<lambdaReturn>[ubyte] = SP(0)<r.temp12>[ubyte] // LAMBDA return assignment
free SP<r.temp12>, #1 // for statement a+b, type = ubyte
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
1) pl.qus.wolin.test.lambda_function_0.b is copied into temp13, and temp13 is never written again, so all occurrences of temp13 can be replaced with b and the assignment removed:
Code:
label lambda_function_0
alloc SP<r.temp12>, #1 // for statement a+b
let SP(0)<r.temp12>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // simple id from var
// STEP1 allocation of temp13
// STEP1 substitution of temp13
add SP(0)<r.temp12>[ubyte] = SP(0)<r.temp12>[ubyte], SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // two sides
// STEP1 deallocation of temp13
let SPF(2)<lambdaReturn>[ubyte] = SP(0)<r.temp12>[ubyte] // LAMBDA return assignment
free SP<r.temp12>, #1 // for statement a+b, type = ubyte
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
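Step 1 is essentially copy propagation for a temp that is written exactly once. A rough sketch of the idea in Python (only for illustration, not the actual Wolin code): the `Instr` tuple, the symbolic register names and `propagate_copies` are all made up here, the code is assumed to be straight-line (no jumps), and SP(n) offsets are ignored, so a real pass would still have to renumber them.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    op: str            # "alloc", "free", "let", "add", "ret", ...
    dst: str = ""      # register written (or allocated/freed)
    srcs: tuple = ()   # registers read

def propagate_copies(code, temps):
    """Replace each temp that only ever holds a copy of another register."""
    for t in temps:
        writes = [i for i, ins in enumerate(code)
                  if ins.dst == t and ins.op not in ("alloc", "free")]
        if len(writes) != 1 or code[writes[0]].op != "let":
            continue                  # written more than once, or not a plain copy
        src = code[writes[0]].srcs[0]
        if any(ins.dst == src for ins in code[writes[0] + 1:]):
            continue                  # source changes later, so the copy is needed
        # Drop alloc/let/free of t and read the source register directly.
        code = [ins._replace(srcs=tuple(src if s == t else s for s in ins.srcs))
                for ins in code if ins.dst != t]
    return code

# The original lambda_function_0, with symbolic names instead of SP/SPF slots.
lambda_0 = [
    Instr("alloc", "temp12"),
    Instr("let", "temp12", ("a",)),
    Instr("alloc", "temp13"),
    Instr("let", "temp13", ("b",)),
    Instr("add", "temp12", ("temp12", "temp13")),
    Instr("free", "temp13"),
    Instr("let", "lambdaReturn", ("temp12",)),
    Instr("free", "temp12"),
    Instr("ret"),
]
step1 = propagate_copies(lambda_0, ["temp12", "temp13"])
```

On this input temp13 disappears and the add reads b directly, while temp12 is left alone because it is written twice (by the let and by the add).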
2) temp12 is copied into lambdaReturn and freed immediately afterwards, so I can replace ALL occurrences of temp12 with lambdaReturn:
Code:
label lambda_function_0
// STEP2 allocation of temp12
let SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // STEP2: simple id from var
// STEP1 allocation of temp13
// STEP1 substitution of temp13
add SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte], SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // STEP2: two sides
// STEP1 deallocation of temp13
let SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte] // STEP2: LAMBDA return assignment
// STEP2 deallocation of temp12
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
3) Eliminate identities (assignments of a register to itself):
Code:
label lambda_function_0
// STEP2 allocation of temp12
let SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte] // STEP2: simple id from var
// STEP1 allocation of temp13
// STEP1 substitution of temp13
add SPF(2)<lambdaReturn>[ubyte] = SPF(2)<lambdaReturn>[ubyte], SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // STEP2: two sides
// STEP1 deallocation of temp13
// STEP2 STEP3 identity
// STEP2 deallocation of temp12
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
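Steps 2 and 3 could be sketched together like this, again in illustrative Python rather than the real implementation. `Instr`, `retarget_temp` and `drop_identities` are hypothetical names, the code is assumed to be straight-line, and the rename is only done when the destination is untouched before the final copy.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    op: str            # "alloc", "free", "let", "add", "ret", ...
    dst: str = ""      # register written (or allocated/freed)
    srcs: tuple = ()   # registers read

def retarget_temp(code, t):
    """If temp t ends its life as `let d = t; free t`, compute in d directly."""
    for i in range(len(code) - 1):
        cur, nxt = code[i], code[i + 1]
        if not (cur.op == "let" and cur.srcs == (t,)
                and nxt.op == "free" and nxt.dst == t):
            continue
        d = cur.dst
        if any(ins.dst == d or d in ins.srcs for ins in code[:i]):
            return code               # d is already in use earlier, renaming is unsafe

        def rename(reg):
            return d if reg == t else reg

        return [ins._replace(dst=rename(ins.dst),
                             srcs=tuple(rename(s) for s in ins.srcs))
                for ins in code
                if not (ins.op in ("alloc", "free") and ins.dst == t)]
    return code

def drop_identities(code):
    """Remove `let x = x`, which does nothing."""
    return [ins for ins in code
            if not (ins.op == "let" and ins.srcs == (ins.dst,))]

# lambda_function_0 as it looks after step 1 (symbolic register names).
after_step1 = [
    Instr("alloc", "temp12"),
    Instr("let", "temp12", ("a",)),
    Instr("add", "temp12", ("temp12", "b")),
    Instr("let", "lambdaReturn", ("temp12",)),
    Instr("free", "temp12"),
    Instr("ret"),
]
after_step3 = drop_identities(retarget_temp(after_step1, "temp12"))
```

Keeping the identity removal as a separate pass means the rename does not need a special case for the copy it just turned into `let lambdaReturn = lambdaReturn`.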
4) a is copied into lambdaReturn, so from that write until lambdaReturn is next overwritten (that instruction included) I can read a instead of lambdaReturn on the RHS and remove the assignment:
Code:
label lambda_function_0
// STEP2 allocation of temp12
// STEP2 STEP4 substitution of a for lambdaReturn
// STEP1 allocation of temp13
// STEP1 substitution of temp13
add SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte], SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // STEP2: two sides
// STEP1 deallocation of temp13
// STEP2 STEP3 identity
// STEP2 deallocation of temp12
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
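Step 4 can be read as forwarding a copy into the instruction that overwrites its destination. A minimal sketch under the same assumptions as before (straight-line code, symbolic register names, hypothetical `Instr` and `forward_copy`; this is not how Wolin itself is written):

```python
from typing import NamedTuple

class Instr(NamedTuple):
    op: str            # "alloc", "free", "let", "add", "ret", ...
    dst: str = ""      # register written (or allocated/freed)
    srcs: tuple = ()   # registers read

def forward_copy(code):
    """Drop `let d = x` when d is overwritten before x changes,
    reading x instead of d in between (overwriting instruction included)."""
    for i, ins in enumerate(code):
        if ins.op != "let" or len(ins.srcs) != 1 or ins.srcs[0] == ins.dst:
            continue
        d, x = ins.dst, ins.srcs[0]
        out = list(code[:i])                       # everything before the copy
        for j in range(i + 1, len(code)):
            nxt = code[j]
            out.append(nxt._replace(
                srcs=tuple(x if s == d else s for s in nxt.srcs)))
            if nxt.dst == d and nxt.op not in ("alloc", "free"):
                return out + list(code[j + 1:])    # d overwritten: the copy was dead
            if nxt.dst in (d, x):
                break                              # x changed or d freed: keep the copy
    return code

after_step3 = [
    Instr("let", "lambdaReturn", ("a",)),
    Instr("add", "lambdaReturn", ("lambdaReturn", "b")),
    Instr("ret"),
]
after_step4 = forward_copy(after_step3)
```

If the destination is never overwritten afterwards (for example a plain `let lambdaReturn = a` followed by `ret`), the copy is the final value and the function returns the code unchanged.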
5) Finally we get what this function really does, namely taking two function-stack registers a and b and storing the result of their addition in the return value on the function stack:
Code:
label lambda_function_0
add SPF(2)<lambdaReturn>[ubyte] = SPF(1)<pl.qus.wolin.test.lambda_function_0.a>[ubyte], SPF(0)<pl.qus.wolin.test.lambda_function_0.b>[ubyte] // two sides
free SPF, #2 // free fn arguments and locals for lambda_function_0
ret
(there are more interesting cases in Main.kt, although with Polish comments)
But I feel it will still require some obvious, well-known optimizations done directly on the 6502 code. Is there an assembler that already does this?
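One example of the kind of well-known 6502-level optimization meant here is replacing a JSR that is immediately followed by RTS with a single JMP. A minimal peephole sketch in Python, assuming the assembly is held as a list of text lines with `;` comments; the name `tail_calls` is made up, and the rewrite is only valid when the called routine does not inspect its return address on the stack.

```python
def tokens(line):
    """Split an assembly line into tokens, ignoring any `;` comment."""
    return line.split(";")[0].split()

def tail_calls(lines):
    """Rewrite `JSR target` directly followed by `RTS` into `JMP target`."""
    out = []
    for line in lines:
        cur = tokens(line)
        prev = tokens(out[-1]) if out else []
        # Only match an unlabelled RTS right after an unlabelled JSR; a label
        # in between means something may jump straight to the RTS.
        if ([t.upper() for t in cur] == ["RTS"]
                and len(prev) == 2 and prev[0].upper() == "JSR"):
            out[-1] = "    JMP " + prev[1]
        else:
            out.append(line)
    return out

optimized = tail_calls([
    "    LDA #65",
    "    JSR print_char   ; last call in the routine",
    "    RTS",
])
```

This saves one byte and the cost of the extra return, and other rules of the same shape (match a short window of instructions, emit a shorter one) can be added as separate functions.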