Some people say they either don’t have enough time to work on side projects or actually do not have time to work on them. But this is not the case for me. Turns out I have a lot of time just laying around so I decided to work on something big that I can learn a bunch from. It didn’t take me long until I decided to learn how to create my own programming language. How did I arrive at that? I mean, I would definitely learn a lot from working on a compiler and having my own programming language sounds exciting doesn’t it? A few weeks back I started learning Golang. So I decided to write my language in Golang.
First, I researched how I should structure my compiler. Here is a good compiler diagram that I found online:
Following this diagram, you pass in a piece of code to a compiler, and the input goes through a Lexer, where the input string split into keyword strings, that are then turn into tokens that will contain information about the keywords. For example, if you have a piece of code like:
print(1 + 1)
One example output that a Lexer might produce is “print”, “(“, “1”, “+”, “1”, “)”. These strings can be then turned into tokens. An example of list of tokens you could make from this list of strings might be:
[{Value: "print", Type: "Function"}, {Value: "(", Type: "Operator"}, {Value: "1", Type: "Integer"}, {Value: "+", Type: "Operator"}, {Value: "1", Type: "Integer"}, {Value: ")", Type: "Operator"}]
The way you design your token is entirely upto how you want to design your compiler.
Then these tokens will get sent to the parser, where AST (Abstract Syntax Tree) will be generated with the given tokens. Depending on
the grammar that your compiler follows, the parser will produce a corresponding AST. An example of AST of above example might look like:
Finally, this AST will be interpreted and produce a corresponding assembly, which can be compiled down to an executable that you can run on your machine. I am planning to follow this structure closely. Let’s see what we get :).