Compiler writing
Compiler writing
What resources exist to learn how to write a compiler that folks might recommend? C specifically.
I should mention, I'm not doing this because I need a compiler, so I don't need to be directed towards existing ones. This is for fun and education.
A quality book, or YouTube series would be awesome.
I should mention, I'm not doing this because I need a compiler, so I don't need to be directed towards existing ones. This is for fun and education.
A quality book, or YouTube series would be awesome.
Re: Compiler writing
Dan Moos wrote:
What resources exist to learn how to write a compiler that folks might recommend? C specifically.
I should mention, I'm not doing this because I need a compiler, so I don't need to be directed towards existing ones. This is for fun and education.
A quality book, or YouTube series would be awesome.
I should mention, I'm not doing this because I need a compiler, so I don't need to be directed towards existing ones. This is for fun and education.
A quality book, or YouTube series would be awesome.
Start here: https://compilers.iecc.com/crenshaw/
Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
- BitWise
- In Memoriam
- Posts: 996
- Joined: 02 Mar 2004
- Location: Berkshire, UK
- Contact:
Re: Compiler writing
Aho’s Dragon book is an often referenced guide to compiler writing
https://docs.google.com/open?id=0B1Mogs ... WR5NWVTSVE
https://docs.google.com/open?id=0B1Mogs ... WR5NWVTSVE
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
Re: Compiler writing
If you want something which leans toward practice rather than theory, I liked this book back in the day:
https://holub.com/compiler/
https://holub.com/compiler/
Re: Compiler writing
This one is billed "A Compiler Writing Journey" and covers the step by step implementation of a C compiler with buildable examples at the end of each chapter:
https://github.com/DoctorWkt/acwj
https://github.com/DoctorWkt/acwj
Re: Compiler writing
Not a book, but this is a classic:
Let's Build a Compiler, by Jack Crenshaw
via this discussion which has many other pointers, and which headlines this series by Nora Sandler:
Writing a C Compiler:
Let's Build a Compiler, by Jack Crenshaw
via this discussion which has many other pointers, and which headlines this series by Nora Sandler:
Writing a C Compiler:
Quote:
I’ve been working on my own C compiler, nqcc for the past several weeks, using Abdulaziz Ghuloum’s An Incremental Approach to Compiler Construction as a roadmap. I really like Ghuloum’s approach: you start by compiling a tiny, trivial subset of your source language all the way down to assembly. Then you add new language features, one step at a time.
Re: Compiler writing
Thanks all. Lots to look at here!
Re: Compiler writing
I saw that one of my profs, Hank Dietz, from college has posted the notes to his compiler class. A pretty quick intro into compilers: http://aggregate.org/EE380/notes.pdf
It's interesting to note that Terrence Parr worked with Prof. Dietz about the time Terrence started ANTLR.
It's interesting to note that Terrence Parr worked with Prof. Dietz about the time Terrence started ANTLR.
Re: Compiler writing
The Dragon book, mentioned above, seems to be well respected by older programmers, but it may be a little long in the tooth. I suggest "Engineering: A Compiler" by Cooper and Torczon.
Re: Compiler writing
For now I've got an old copy of the Dragon Book on the way. Everything in my research seems to imply that, although it is very dated in the languages it refers to, the concepts are described in a manner that makes it still very useful.
My hope is to create "enough" of a C compiler such that I can create games on my homebrew.
I have no need for standards compliance (I would do my best to be so though)
I have no need for total completeness (I imagine it would slowly gain features as I needed them)
I have no illusions about having even a fraction of the standard library.
Basically, I just want to have most of the flow control statements, the types and operators that I need, and the ability to have local variables. And the ability to inline assembly when the limits of my feeble compiler are hit.
I'll probably end up with a fairly limited "C like" language, and that is fine! Not caring about portability probably will be freeing in some ways!
My hope is to create "enough" of a C compiler such that I can create games on my homebrew.
I have no need for standards compliance (I would do my best to be so though)
I have no need for total completeness (I imagine it would slowly gain features as I needed them)
I have no illusions about having even a fraction of the standard library.
Basically, I just want to have most of the flow control statements, the types and operators that I need, and the ability to have local variables. And the ability to inline assembly when the limits of my feeble compiler are hit.
I'll probably end up with a fairly limited "C like" language, and that is fine! Not caring about portability probably will be freeing in some ways!
- BitWise
- In Memoriam
- Posts: 996
- Joined: 02 Mar 2004
- Location: Berkshire, UK
- Contact:
Re: Compiler writing
C is a tricky language to compile properly. Its grammar has some ambiguities that mean you must look ahead several tokens to determine what syntax rule should be followed.
I can't remember the specifics but I think type casts (e.g. '(t) x' -- the (t) looks like a sub expression until you find work out that t is a type) and array definitions in types (e.g. char *x[] -- x is an array of char pointers) are examples. A lot of compilers use LALR(n) based code generators (e.g. yacc, bison, antlr) to handle this.
I've been searching for a good language to write a native 6502 compiler for. I've yet to find one. Early languages like BCPL and B are inherently word based and untyped so you end up doing a lot of address computations to access memory.
PASCAL and MODULA-2 have complicated stack requirements for nested procedures. BASIC and LISP are normally interpreted on 8-bit machines. FORTH can be either compiled or interpreted and works quite well but its like 'Marmite' and you either love it or hate it.
I wonder if some of these languages would be better compiled to pseudo-code and run on a virtual machine rather than native code. The 6502 works well for intepreters.
I can't remember the specifics but I think type casts (e.g. '(t) x' -- the (t) looks like a sub expression until you find work out that t is a type) and array definitions in types (e.g. char *x[] -- x is an array of char pointers) are examples. A lot of compilers use LALR(n) based code generators (e.g. yacc, bison, antlr) to handle this.
I've been searching for a good language to write a native 6502 compiler for. I've yet to find one. Early languages like BCPL and B are inherently word based and untyped so you end up doing a lot of address computations to access memory.
PASCAL and MODULA-2 have complicated stack requirements for nested procedures. BASIC and LISP are normally interpreted on 8-bit machines. FORTH can be either compiled or interpreted and works quite well but its like 'Marmite' and you either love it or hate it.
I wonder if some of these languages would be better compiled to pseudo-code and run on a virtual machine rather than native code. The 6502 works well for intepreters.
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
Re: Compiler writing
BitWise wrote:
BASIC and LISP are normally interpreted on 8-bit machines.
BASIC - often 'tokenised' then "detokenised" for the LIST or editing command (With note that some Forths do this too) However some BASICS only tokenise the keywords, so you end up with
Code: Select all
PRINT 3.14Some years back I wrote what I considered my "ideal" BASIC and I took it a stage further - I turned everything into a token. So the above was turned into 2 tokens - one for keyword and one for a constant - stored in floating point binary representation a symbol table. The token was an index into the symbol table. That made run-time fast as there were never any numbers to parse or variables to create storage for. It also made LIST hard because you now had a binary representation of a number to print out (So I cheated and also stored the textual representation for nothing other than the LIST command. I even tokenised comments.
Back to compilers - what I did was essentially a one-pass compiler with the symbol table and some run-time fixups (scan for goto/gosub/proc/func targets) there is also a "just in time" compiler for expression evaluation - I cache the RPN generated by the shunting yard expression evaluator so turning that into a 2-pass compiler would not be that hard - and it's on my "to-do" list once I'm happy with my BCPL system on my '816 board. (Although plan A there is to translate the existing tokeniser/sym. table builder into BCPL then have the run-time in ASM, however after doing a lot more with BCPL I'm now looking at compiling directly into code for the BCPL Cintcode VM)
Cheers,
-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: Compiler writing
drogon wrote:
Some years back I wrote what I considered my "ideal" BASIC and I took it a stage further - I turned everything into a token. So the above was turned into 2 tokens - one for keyword and one for a constant - stored in floating point binary representation a symbol table. The token was an index into the symbol table. That made run-time fast as there were never any numbers to parse or variables to create storage for. It also made LIST hard because you now had a binary representation of a number to print out (So I cheated and also stored the textual representation for nothing other than the LIST command. I even tokenised comments.
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Mike B. (about me) (learning how to github)
Mike B. (about me) (learning how to github)
Re: Compiler writing
barrym95838 wrote:
drogon wrote:
Some years back I wrote what I considered my "ideal" BASIC and I took it a stage further - I turned everything into a token. So the above was turned into 2 tokens - one for keyword and one for a constant - stored in floating point binary representation a symbol table. The token was an index into the symbol table. That made run-time fast as there were never any numbers to parse or variables to create storage for. It also made LIST hard because you now had a binary representation of a number to print out (So I cheated and also stored the textual representation for nothing other than the LIST command. I even tokenised comments.
Sounds like a worthwhile project though.
Cheers,
-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/