Hi folks, I just wanted to give you an update on the status of my MOS backend for the LLVM compiler suite, as of 2020.08.10. The code is very raw at this point, but the LLVM assembler has successfully assembled a hello-world type program for three 6502 targets, which serves as a proof of concept.
The LLVM assembler, llvm-mc, understands and assembles all NMOS 6502 opcodes. It correctly handles symbols, so you can use them as branch targets, do pointer math on them, and the like. Fixups work as expected at link time. The assembler also deals correctly with 6502 relative branches: BEQ, BCC, etc., all calculate PC-relative offsets in the unusual 6502 convention, where the signed 8-bit offset is applied to the address of the instruction following the branch, giving a reach of [-126,+129] bytes measured from the branch itself. Since llvm-mc is GNU assembler compatible, you can use all GNU assembler features while writing 65xx code, including macros, ifdefs, and similar.
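To make the branch arithmetic concrete, here is a minimal Python sketch of how a relative branch operand can be computed. It is not taken from the llvm-mc sources; the function name `branch_offset` is hypothetical, and the only facts it relies on are that a 6502 branch is two bytes long and that its operand is a signed 8-bit value.

```python
def branch_offset(branch_addr: int, target_addr: int) -> int:
    """Return the operand byte for a 6502 relative branch (illustrative helper)."""
    # The CPU adds the signed offset to the address of the NEXT instruction,
    # and a branch occupies two bytes (opcode + offset).
    delta = target_addr - (branch_addr + 2)
    if not -128 <= delta <= 127:
        # Measured from the branch itself, the legal reach is [-126, +129].
        raise ValueError(f"branch target out of range: {delta + 2:+d} bytes from the branch")
    # Encode the signed value as an unsigned two's-complement byte.
    return delta & 0xFF


# A branch to itself (an endless loop) encodes as $FE, i.e. -2.
print(hex(branch_offset(0x1000, 0x1000)))
```

Measuring from the branch instruction rather than from the following one is what shifts the familiar [-128,+127] range to [-126,+129].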
The assembler is capable of intelligently figuring out, at assembly time, whether a symbol should refer to a zero page or a 16-bit location. If you place a symbol in a section named ".zeropage", ".directpage", or ".zp", then that symbol will be assumed to be located in zero page; otherwise, it will be assumed to refer to a 16-bit address.
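The rule above can be sketched in a few lines of Python. This is only an illustration of the decision, not the backend's actual code: the helper names `address_bits` and `encode_lda` are hypothetical, and exact matching on the section name is an assumption. The opcodes used are the standard 6502 encodings for LDA zero page ($A5) and LDA absolute ($AD).

```python
# Section names that, per the description above, mark a symbol as zero page.
ZERO_PAGE_SECTIONS = {".zeropage", ".directpage", ".zp"}


def address_bits(section_name: str) -> int:
    """Assumed address width, in bits, for a symbol placed in this section."""
    return 8 if section_name in ZERO_PAGE_SECTIONS else 16


def encode_lda(section_name: str, address: int) -> bytes:
    """Encode `LDA symbol`, choosing the addressing mode from the symbol's section."""
    if address_bits(section_name) == 8:
        if not 0 <= address <= 0xFF:
            raise ValueError("zero page symbol must fit in 8 bits")
        return bytes([0xA5, address])  # LDA zero page: 2 bytes
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    # LDA absolute: opcode, then the address in little-endian order.
    return bytes([0xAD, address & 0xFF, address >> 8])


print(encode_lda(".zp", 0x10).hex())      # two-byte zero page form
print(encode_lda(".data", 0x0010).hex())  # three-byte absolute form
```

The payoff of the zero page form is a shorter and faster instruction, which is why it matters that the assembler can pick it without extra annotation on each use.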
The assembler and linker both understand that $ is a legal prefix for hexadecimal constants. Much existing 6502 assembly code depends on this older convention. Everything that depends on the lexer (which is almost everything in LLVM) can now recognize 6502 format hexadecimal constants. The modern 0x prefix works fine as well.
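As a rough sketch of what accepting both prefixes means, the following Python function parses an integer literal written either way. It is a stand-in for illustration only; `parse_int_literal` is a hypothetical name and does not correspond to any function in the LLVM lexer.

```python
def parse_int_literal(text: str) -> int:
    """Parse a decimal, $-prefixed hex, or 0x-prefixed hex literal (illustrative)."""
    if text.startswith("$"):
        return int(text[1:], 16)   # classic 6502 style, e.g. $D020
    if text.lower().startswith("0x"):
        return int(text[2:], 16)   # modern style, e.g. 0xD020
    return int(text, 10)


# Both spellings name the same address.
print(parse_int_literal("$D020") == parse_int_literal("0xD020"))
```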
Both the assembler and the linker support the ELF format, for both object files and executables. The ELF format has been extended with a machine type of 6502 (naturally) to permit storing code for 65xx compatible processors in ELF files, and it includes support for 65xx specific relocations and fixups.
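For readers who want to check the machine type of a file by hand, here is a small Python sketch that reads the e_machine field from an ELF header. The field layout (16 bytes of e_ident, then a 2-byte e_type, then a 2-byte e_machine at offset 18) comes from the ELF specification. The constant `EM_MOS_ASSUMED` is a hypothetical name, its decimal value 6502 is taken from the wording above, and the sample header is hand-built rather than produced by the real tools.

```python
import struct

# Assumption: the machine number is decimal 6502, as the text above suggests.
EM_MOS_ASSUMED = 6502


def elf_machine(header: bytes) -> int:
    """Return the e_machine field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 means little-endian, 2 means big-endian.
    endian = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


# Hand-built 20-byte header prefix: magic, 32-bit class, little-endian,
# version 1, padding, then e_type = 2 (executable) and e_machine.
sample = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, EM_MOS_ASSUMED)
print(elf_machine(sample) == EM_MOS_ASSUMED)
```

To try it on a real file, pass in the first 20 bytes read in binary mode.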
Because the 6502 assembler and linker both work with ELF files, you can use any of your favorite ELF tools to inspect or understand ELF files generated by the LLVM tools. The llvm-readobj, llvm-objdump, llvm-objcopy, and llvm-strip tools, and likely the other command line tools as well, work as expected. This also means that generic tools that work on ELF files can read and dump basic information about MOS executables.
Hello-world type programs have been proven to compile, and work as expected, on emulated Commodore 64, VIC-20, and Apple II machines.
C support is unimplemented as of this writing. Don't try to compile C code. You will be sad.
I'm interested in finding people with LLVM experience who want to work on this project with me, either by developing new features or beating out bugs. LLVM is a huge code base, and the barrier to being productive in it is quite high; but LLVM should be usable now as a gas-compatible assembler and linker for MOS and MOS clones.
Please do not post this information to social media. The code is absolutely not ready for mass consumption yet.
UPDATE 2021.06.21: This work has been merged with Daniel Thornburgh's work to create a functional C compiler. Please see https://www.llvm-mos.org for an overview.