I've recently been playing with a library called Lark (https://github.com/lark-parser/lark). It lets you define grammars in EBNF (and a few other notations) and use them to parse arbitrary input.
I'm calling my new thing "nice65". I spent around an hour building a quick PoC with an extremely basic grammar for the CA65 assembly language.
All my code is hosted here: https://github.com/and3rson/nice65
Currently it can reformat some very simple code, labels, and comments. It's only around 100 lines of Python so far, most of which is the formal definition of the CA65 grammar.
Code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Before
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.data
foo:.byte 1
.code
; Fill zeropage with zeroes
fill:
PHa
Phx
lDa #0
LdX#0
@again: sta $00 ,x ;Yeah, we can use stz, but I just need some code to test nice65!
inx
bne fill ; Repeat
@ridiculously_long_label_just_for_the_sake_of_it:PLX
pla
rts
Code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; After
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.data
foo: .byte 1
.code
; Fill zeropage with zeroes
fill:
PHA
PHX
LDA #0
LDX #0
@again: STA $00, x ; Yeah, we can use stz, but I just need some code to test nice65!
INX
BNE fill ; Repeat
@ridiculously_long_label_just_for_the_sake_of_it:
PLX
PLA
RTS
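To make the before/after transformation concrete, here is a minimal sketch of the same normalization done with plain regular expressions. It is not how nice65 works (nice65 uses a Lark grammar). The names `format_line`, `format_code`, and `INDENT`, the 8-column instruction indent, and the layout rules are all assumptions made for illustration. It also treats every `;` as the start of a comment, so a `;` inside a string literal would break it.

```python
import re

INDENT = 8  # assumed column at which instructions start

# Optional label, optional instruction/directive, optional trailing comment.
LINE_RE = re.compile(
    r"""^\s*
        (?:(?P<label>@?[A-Za-z_]\w*):)?   # optional label ending in ':'
        \s*
        (?P<code>[^;]*?)                  # instruction or directive
        \s*
        (?P<comment>;.*)?$                # optional comment
    """,
    re.VERBOSE,
)
# A three-letter mnemonic followed by an optional operand.
MNEMONIC_RE = re.compile(r"^([A-Za-z]{3})(?!\w)\s*(.*)$")


def format_code(code: str) -> str:
    """Uppercase the mnemonic and normalize spacing around commas."""
    m = MNEMONIC_RE.match(code)
    if not m:
        return code  # directives such as .byte are left untouched
    mnemonic, operand = m.group(1).upper(), m.group(2)
    operand = re.sub(r"\s*,\s*", ", ", operand)
    return f"{mnemonic} {operand}".rstrip()


def format_line(line: str) -> list[str]:
    """Return the formatted line(s) for one source line."""
    m = LINE_RE.match(line)
    label, code, comment = m["label"], format_code(m["code"]), m["comment"] or ""
    # Put a space after a single ';' but leave ';;;;' banners alone.
    comment = re.sub(r"^;(?=[^;\s])", "; ", comment)
    body = " ".join(part for part in (code, comment) if part)
    if not label:
        # Standalone comments stay at column 0; code is indented.
        return [" " * INDENT + body if code else body]
    head = label + ":"
    if not body:
        return [head]
    if len(head) < INDENT:
        return [head.ljust(INDENT) + body]
    # Long labels get their own line.
    return [head, " " * INDENT + body]


if __name__ == "__main__":
    for src in ["LdX#0", "@again: sta $00 ,x ;Yeah", "foo:.byte 1"]:
        print("\n".join(format_line(src)))
```

A regex approach like this gets fragile quickly once macros and expressions are involved, which is a large part of why a real grammar is attractive.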
I'll keep improving it. The features on my to-do list are:
- Reliable processing of assembly code without occasionally messing things up
- Support for CA65 macros & control commands
- Differentiating & respecting top-level comments (i.e. function docstrings such as "; Add two numbers and return result") versus context comments (e.g. "; return if buffer is empty"). This is important because it will allow nice65 to "understand" function boundaries in the code and add empty lines between them to make the code more readable.
- Linter functionality, including warnings for potential mistakes/improvements (e.g. "missing RTS before global label", "JSR/RTS can be replaced with JMP", "Incorrect addressing mode", etc).
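As a rough idea of what the "missing RTS before global label" warning could look like, here is a minimal sketch. It is a hypothetical stand-alone check, not code from nice65: the function name `find_fallthroughs`, the set of instructions treated as ending a routine, and the rule that any line starting with `.` resets the state are all assumptions. It also splits comments on the first `;`, so string literals containing `;` are not handled.

```python
import re

LABEL_RE = re.compile(r"^\s*(@?[A-Za-z_]\w*):")
MNEMONIC_RE = re.compile(r"^\s*([A-Za-z]{3})(?!\w)")
# Assumed set of instructions after which execution cannot fall through.
TERMINATORS = {"RTS", "RTI", "JMP", "BRA"}


def find_fallthroughs(source: str) -> list[str]:
    """Warn when a global label follows an instruction that falls through."""
    warnings = []
    last = None  # last CPU instruction seen, uppercased
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.split(";", 1)[0]  # drop the comment part
        label = LABEL_RE.match(line)
        rest = line[label.end():] if label else line
        # Local ("cheap") labels start with '@' and are ignored here.
        if label and not label.group(1).startswith("@"):
            if last is not None and last not in TERMINATORS:
                warnings.append(
                    f"line {number}: missing RTS before global label "
                    f"'{label.group(1)}' (last instruction was {last})"
                )
        op = MNEMONIC_RE.match(rest)
        if op:
            last = op.group(1).upper()
        elif rest.strip().startswith("."):
            last = None  # directives such as .code or .byte reset the state
    return warnings


if __name__ == "__main__":
    sample = "first:\n lda #0\nsecond:\n rts\nthird:\n rts\n"
    for warning in find_fallthroughs(sample):
        print(warning)
```

Falling through into the next routine is sometimes intentional, so this would need to stay a warning (ideally one that can be silenced) rather than an error.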
I want to make this similar to Python's black (https://github.com/psf/black) - one set of sane rules with very little configuration.
Of course, it won't do static analysis as well as CA65, but it might still find lots of common errors that CA65 would silently ignore.
Question to the community: I'd like to conduct a small questionnaire to hear your thoughts on what a "perfect" 6502 assembly code style looks like in your mind, for example:
- What indentation rules do you follow?
- Do you use empty lines, and if so, how?
- Tabs/spaces
- Uppercase/lowercase identifiers
- Labels & instructions - one line or multiple, different indentation for global & local ("cheap") labels, etc.
- Maximum line length
- Your personal code style preferences for which you would like to have a configuration option
I know there probably won't be a "one size fits all" answer, but I think most of the above can be compiled into what I'd call "sane code style rules for 6502 ASM", while controversial preferences can be made into configurable options (e.g. maximum line length, comment wrapping, etc). Looking forward to your thoughts!