6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sat Nov 23, 2024 2:05 am

All times are UTC




Post new topic Reply to topic  [ 37 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
 Post subject: Re: Compiler writing
PostPosted: Mon May 25, 2020 8:03 pm 
Offline

Joined: Thu Apr 23, 2020 5:04 pm
Posts: 53
BitWise wrote:
C is a tricky language to compile properly. Its grammar has some ambiguities that mean you must look ahead several tokens to determine what syntax rule should be followed.

I can't remember the specifics but I think type casts (e.g. '(t) x' -- the (t) looks like a sub expression until you find work out that t is a type) and array definitions in types (e.g. char *x[] -- x is an array of char pointers) are examples. A lot of compilers use LALR(n) based code generators (e.g. yacc, bison, antlr) to handle this.

Most of C is pretty easy to parse (probably LL(1)), but for those one or two cases you basically need a small lookahead and symbol information. Formal grammars usually separate lexing, parsing and symbol-table-handling. That results in C not quite fitting too well to the LL, LR, LALR etc. grammar theory.

IMHO if the original poster is looking into compiler writing with a focus on C, I would stay away from books that spend too much space on lexing and parsing. The biggest part of parsing C is simple and the slightly tricky parts are usually not addressed.

BitWise wrote:
I've been searching for a good language to write a native 6502 compiler for. I've yet to find one.

Of course there is always: https://en.wikipedia.org/wiki/Brainfuck :-)


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Mon May 25, 2020 8:30 pm 
Offline

Joined: Mon Sep 17, 2018 2:39 am
Posts: 138
Hi!

BitWise wrote:
I've been searching for a good language to write a native 6502 compiler for. I've yet to find one. Early languages like BCPL and B are inherently word based and untyped so you end up doing a lot of address computations to access memory.
PASCAL and MODULA-2 have complicated stack requirements for nested procedures. BASIC and LISP are normally interpreted on 8-bit machines. FORTH can be either compiled or interpreted and works quite well but its like 'Marmite' and you either love it or hate it.

I wonder if some of these languages would be better compiled to pseudo-code and run on a virtual machine rather than native code. The 6502 works well for intepreters.


BASIC is simple for writing a compiler. See my https://github.com/dmsc/fastbasic , this compiles to a very simple VM. The language and compiler are designed to be fast and small on the 6502 - the full IDE with FP support is only 9.5KB, this includes graphics and sound commands and a full screen editor.

The problem with BASIC is the lack of data types, this means that the compiler should be a lot more complex if you want speed. In fastbasic all variables default to integer (16 bit signed), and you need to add a "%" to the variable names to specify floating point variables - this is inverse of other BASIC as my target is writing games that don't normally need floating point. The compiler tries to parse expressions as integer first and floating point if not possible, this means that "PRINT A/3 " gives a different result than "PRINT A/3.0", as the first one is evaluated as integer.

The screenshot shows a typical session in the integrated full-screen editor, you can see that the compiler also support abbreviations ("DR. instead of DRAWTO, GR. instead of GRAPHICS, etc.).

Another languages you could consider:
- Action!: https://en.wikipedia.org/wiki/Action!_( ... g_language) , this was somewhat popular for the Atari 8-bit computers, it is a simple algol base language.
- PL/M: https://en.wikipedia.org/wiki/PL/M , another similar language, used by Intel for the 8008, 8080 and 8086 processors.


Attachments:
Screenshot from 2020-05-25 16-12-02.png
Screenshot from 2020-05-25 16-12-02.png [ 14.87 KiB | Viewed 1963 times ]
Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Tue May 26, 2020 12:52 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
I'm still a long, long way from adding floats to WozBASIC, but I have been briefly pondering the best way to do it without breaking backward compatibility. M$BASIC on the Level 2 TRS-80 had DEFINT, DEFSTR, DEFSNG and DEFDBL which could be used to "pre-declare" the types of the variables without needing to adorn them with %, $, ! or # in later references, and that's one possibility that could work with some old programs if the default type was INT.

A great many WozBASIC programs were heavily dependent on 40x24 text, lo-res split-screen color graphics, one-bit sound and/or paddle inputs though, so backward compatibility is probably not particularly useful unless the host hardware provided these idiosyncratic resources as well. With this in mind, I believe my focus will be on getting it running well on "plain-jane" hardware, and leaving hooks to allow extending it to fancier capabilities on a case-by-case basis.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Fri May 29, 2020 2:35 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
There's a write-up of the design of a Basic compiler for the Gigatron here:
https://forum.gigatron.io/viewtopic.php?p=1452#p1452
with interesting notes on the process of implementing such a project.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Fri May 29, 2020 5:22 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
In an interpreter, it's relatively easy to give variables "dynamic" types, and resolve the relationship between a given operator and the routine needed to execute it as and when it occurs. There are many interpreted languages which do that. But it's a step that takes extra time and contributes to the speed penalty that an interpreter incurs. It also makes it more difficult to implement a compiler for that language later, when a speed improvement is desired.

In a compiler, it's much more advantageous to resolve all the variable types up front. That can be done with explicit declarations of type by the programmer, or implicitly through type inference. The latter requires observing the type of value that is first assigned to a variable, and enforcing that type throughout the variable's lifetime - though more sophisticated approaches are also possible. The biggest remaining difficulty comes with the arguments of functions; unless additional copies of functions are produced for each combination of argument types that occur in calls, it may be easiest to specify types explicitly when declaring a function, unless some additional information within the function (such as a nested function call or an operator specific to a particular type) can be used.

I mentioned Lua earlier because it has a very simple and powerful type system. Any variable or value will be a Nil, Boolean, Numeric, String, or Table - the latter being the aggregate type used for arrays, maps, and structures. In fact a structure in Lua is just a map with syntactic sugar - a.b is identical to a["b"]. Numerics are just double-precision floats, mostly because Lua is designed to run on modern hardware. Now, as an interpreted language with dynamically-typed variables, Lua falls afoul of the above distinction, but I think it is possible to derive an inferred-type language from very similar principles, and then to write a compiler for that language that will sensibly fit into a 6502-based SBC.

I think it is also sensible to generate code for a VM rather than directly for the 6502, as the latter is a decidedly tricky problem. AcheronVM is the obvious choice here, since it is relatively efficient on the 6502 and can easily be customised to support required language constructs.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Mon Jul 20, 2020 10:23 pm 
Offline

Joined: Mon May 01, 2017 7:13 am
Posts: 83
To your question: https://qr.ae/pNsYhf


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Tue Jul 21, 2020 9:26 pm 
Offline

Joined: Mon May 01, 2017 7:13 am
Posts: 83
My opinion on this topic might be unpopular, but I'm going to espouse it anyway, with the clear caveat that it is my opinion only.

For whatever reason, all 8-bit compiler and interpreter projects start out with the dictum that the project needs to be "simple." This is usually because the person in question has no exposure to training in compilers, or interpreters, or the like, all of which are standard college curricula by this point.

So, the person comes back some years later with an "amazing" compiler for Squish, or Frobnutz, or C--, or some other damned language that no one has ever heard of before.

Or maybe, MAYBE, the person comes back with a C compiler, which is full of bugs and has no test suite and crashes when squinted at.

Yes, yes, it's all a hobby and people should be able to do what they want.

But no one EVER takes the approach I'd recommend, which is instead of developing a compiler or interpreter by trial and error, you start by reading the myriad books on compiler construction.

And start with EXISTING compiler frameworks.

All of which has two benefits. First, you learn how REAL compilers are made, and you don't simply repeat the rookie mistakes that have been made by a thousand hobbyist authors before you.

Second, when you're done, you have something that actually might be useful to others.

My approach is not an egocentric approach, but it is absolutely a professional one. Real compiler authors don't sit around all day writing toy languages. The vast majority of modern compiler work is in extending existing functionality. Some of it is in pure R&D, but not that much.

Building a compiler for some arbitrary goofball language, because it's somehow "simpler," is an https://en.wikipedia.org/wiki/Anti-pattern. Decades of programmers have done it before you. If you want to know how it's done, don't write a compiler. READ ABOUT HOW IT'S DONE. Start with the Cooper/Torczon book. Then read the 2006 Dragon book.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Wed Jul 22, 2020 4:42 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I looked up those books on Amazon. They're discouragingly expensive - actually the first time I've seen books with a rental option, which incidentally costs as much as most books of similar size do to buy outright. I suppose that's what the university textbook market is like in the US. I know when I needed textbooks at university in the UK, I paid less than that in total for three years' worth of books.

Alas, I took a degree course which covered many useful things, but not compiler writing.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Wed Jul 22, 2020 7:37 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
johnwbyrd wrote:
My opinion on this topic ... start by reading the myriad books on compiler construction.

And start with EXISTING compiler frameworks.

All of which has two benefits. First, you learn how REAL compilers are made, and you don't simply repeat the rookie mistakes that have been made by a thousand hobbyist authors before you.

Second, when you're done, you have something that actually might be useful to others.

For a target other than a 6502, a more widely-seen target and one more amenable to having a C compiler (or running one), I'd probably agree. And certainly if the aim is to make something that might be widely used. And, given the time (and money, and energy) the idea of studying the state of play is generally good advice too, rather than starting by inventing flint-knapping.

But for 6502, and for a hobby-scoped project, and for someone who has no expectation of working in the field of compiler development or maintenance, I'd say going ahead and finding your own way might be fun, and fulfilling. It's a reasonable choice. But you're quite right to point out that it's by no means the only way to proceed.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Wed Jul 22, 2020 11:44 am 
Offline

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 704
Location: North Tejas
johnwbyrd wrote:
Yes, yes, it's all a hobby and people should be able to do what they want.


I'll just throw it out there that PHP, Python, Ruby and yes, even C all started out as somebody's hobby project.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Wed Jul 22, 2020 11:48 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Indeed, but how well versed were those people in constructing those sorts of things?

The original question of this thread remains interesting (in my opinion!)
> What resources exist to learn how to write a compiler that folks might recommend? C specifically.

Now, if some resources are very expensive textbooks, it would be good to hear of near-equivalents which are free or cheap. (Some people may have access to a university library; others won't.)


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Thu Jul 23, 2020 2:40 am 
Offline

Joined: Mon May 01, 2017 7:13 am
Posts: 83
BigEd wrote:
For a target other than a 6502, a more widely-seen target and one more amenable to having a C compiler (or running one), I'd probably agree.


I cannot concur. I think there have been valiant attempts to make C compilers for the 65xx, but most of these attempts have decades-long roots, and thus have technical limitations baked into such products based on the state of compiler technology in the late 1980s.

There's way too much talk about the inviability of C for the 6502. I claim that the 65xx can be a first-class and reasonably performant target for C18; however, people won't believe such a thing until they see it.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Thu Jul 23, 2020 9:12 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
It would be good to see it done!


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Thu Jul 23, 2020 10:26 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I'm reading through the Crenshaw tutorial, which is free to read. It does offer a relatively straightforward approach to writing a compiler, and the result is apparently capable of fitting into a small machine. The examples are given in Pascal and generate code for a 68K, but that really just invites you to translate them into your source language and target ISA of choice. For the 6502 specifically, using AcheronVM as the target ISA is entirely reasonable and would probably result in much better performance than interpreted BASIC, even if it falls short of hand-crafted assembly.

There may be some limitations on the type of syntax and the amount of code optimisation that can easily be performed with this approach, but the results don't look too bad if you get to design the language yourself (or it happens to be one that is convenient for this approach to parsing). Having not got all the way to the end, there may yet be solutions to these limitations that I haven't reached yet.


Top
 Profile  
Reply with quote  
 Post subject: Re: Compiler writing
PostPosted: Thu Jul 23, 2020 1:13 pm 
Offline

Joined: Thu Apr 23, 2020 5:04 pm
Posts: 53
BigEd wrote:
It would be good to see it done!

http://www.compilers.de/vbcc.html


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 37 posts ]  Go to page Previous  1, 2, 3  Next

All times are UTC


Who is online

Users browsing this forum: Google [Bot] and 40 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: