This compiler project uses a vertical slice approach inspired by the paper titled “An Incremental Approach to Compiler Construction” by Abdulaziz Ghuloum (2006).
Step 0 builds the basic core for the Delta language compiler.
Download the initial source code: project.zip.
delta/delta.peg
File
This is a PEG (Parsing Expression Grammar) file as described in the Arpeggio documentation in the “Grammars written in PEG notations” section. All the syntax rules for the Delta language will be described in this file.
delta/__init__.py
File
This file is executed when the delta package is imported by any client code. It contains the definition of three classes: SyntaxMistake, Phase, and Compiler.
delta.SyntaxMistake
Class
An instance of this class is raised when the compiler detects a syntax error while parsing an input program.
delta.Phase
Class
This class defines an enum (a set of symbolic names bound to unique values) that represents the four phases supported by our compiler:
SYNTACTIC_ANALYSIS = 1
SEMANTIC_ANALYSIS = 2
CODE_GENERATION = 3
EVALUATION = 4
These are used when debugging in order to specify up to what phase we want the compiler to run before it stops.
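A minimal sketch of how such an enum could be written with the standard library's enum module, and how a driver might compare members to decide where to stop. The helper should_run is hypothetical and is not part of the project's code.

```python
from enum import Enum


class Phase(Enum):
    """The four compiler phases, in execution order."""
    SYNTACTIC_ANALYSIS = 1
    SEMANTIC_ANALYSIS = 2
    CODE_GENERATION = 3
    EVALUATION = 4


def should_run(phase, until):
    """Hypothetical helper: True if `phase` must run when stopping at `until`."""
    return phase.value <= until.value


# Stopping at SEMANTIC_ANALYSIS runs the first two phases only.
phases_run = [p.name for p in Phase if should_run(p, Phase.SEMANTIC_ANALYSIS)]
print(phases_run)
```

Because the members carry increasing integer values, "run up to phase X" reduces to a simple comparison of values.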
delta.Compiler
Class
This class is the main driver for the whole compilation process. When instantiating it (that is, when the __init__ method is implicitly called), you must provide as an argument a string with the name of the PEG root rule.
You can then call the realize method on the newly created Compiler object, providing as arguments a string with the input source code to compile and an optional Phase enum member that indicates up to what phase the compiler should run. If this last argument is not provided, the very last phase, Phase.EVALUATION, is assumed. When the realize method ends, the values of one or more of the following properties are available, depending on the provided phase:
Phase.SYNTACTIC_ANALYSIS: Property available: parse_tree_str
Phase.SEMANTIC_ANALYSIS: Properties available: parse_tree_str, symbol_table
Phase.CODE_GENERATION: Properties available: parse_tree_str, symbol_table, wat_code
Phase.EVALUATION: Properties available: parse_tree_str, symbol_table, wat_code, result
A ValueError is raised if you try to access a property that is not currently available.
If realize is called explicitly or implicitly with the Phase.EVALUATION option, the method returns the same value contained in the result property. This behaviour comes in handy when implementing unit tests.
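The behaviour described above can be sketched without Arpeggio. In the sketch below, SketchCompiler is a hypothetical stand-in for delta.Compiler, the parameter name until is an assumption, and every phase is replaced by a placeholder value. It only illustrates how properties become available phase by phase and how ValueError guards the rest.

```python
from enum import Enum


class Phase(Enum):
    SYNTACTIC_ANALYSIS = 1
    SEMANTIC_ANALYSIS = 2
    CODE_GENERATION = 3
    EVALUATION = 4


class SketchCompiler:
    """Hypothetical stand-in for delta.Compiler; every phase is a stub."""

    def __init__(self, root_rule_name):
        self.root_rule_name = root_rule_name
        self._values = {}  # property name -> value set by the last run

    def realize(self, source, until=Phase.EVALUATION):
        self._values = {}  # forget results from any previous run
        # Placeholder outputs; a real compiler would parse, analyse, etc.
        stubs = [
            (Phase.SYNTACTIC_ANALYSIS, 'parse_tree_str', f'<tree of {source!r}>'),
            (Phase.SEMANTIC_ANALYSIS, 'symbol_table', []),
            (Phase.CODE_GENERATION, 'wat_code', '(module)'),
            (Phase.EVALUATION, 'result', '<result>'),
        ]
        for phase, name, value in stubs:
            if phase.value > until.value:
                break  # stop before any phase past the requested one
            self._values[name] = value
        # Only a full run returns a value: the same one stored in `result`.
        if until is Phase.EVALUATION:
            return self._values['result']
        return None

    def _get(self, name):
        if name not in self._values:
            raise ValueError(f'{name} is not available')
        return self._values[name]

    parse_tree_str = property(lambda self: self._get('parse_tree_str'))
    symbol_table = property(lambda self: self._get('symbol_table'))
    wat_code = property(lambda self: self._get('wat_code'))
    result = property(lambda self: self._get('result'))


c = SketchCompiler('start')
c.realize('42', Phase.CODE_GENERATION)
print(c.wat_code)  # available: code generation ran
try:
    c.result       # not available: evaluation did not run
except ValueError as err:
    print('ValueError:', err)
```

With the real class, a unit test could then compare the return value of realize directly against the expected result of the program.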
The implementation of the Compiler class uses the facilities provided by third-party packages. Be aware of the following details:
The elements required for syntax analysis are explained in these sections from Arpeggio’s documentation: Grammars, Parse tree, Handling errors, and Parser configuration.
Semantic analysis and code generation are performed using Arpeggio’s visitors. These are a variation of the GoF’s visitor design pattern. To use them properly, check the section “Semantic analysis - Visitors” in Arpeggio’s documentation.
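To illustrate the idea behind this style of visitor, here is a pure Python toy that mimics the visit_<rule_name>(self, node, children) naming convention used by arpeggio.PTNodeVisitor. It is not Arpeggio's implementation: the Node and Visitor classes, and the rule names number and addition, are hypothetical.

```python
class Node:
    """Hypothetical parse tree node: a rule name, a text value, and children."""

    def __init__(self, rule_name, value='', children=()):
        self.rule_name = rule_name
        self.value = value
        self.children = list(children)


class Visitor:
    """Dispatches each node to a method named visit_<rule_name>."""

    def visit(self, node):
        # Visit children first so each method receives their results.
        results = [self.visit(child) for child in node.children]
        method = getattr(self, f'visit_{node.rule_name}', self.visit_default)
        return method(node, results)

    def visit_default(self, node, children):
        # Fallback: a leaf yields its text, an inner node yields its results.
        return children if node.children else node.value


class SumVisitor(Visitor):
    def visit_number(self, node, children):
        return int(node.value)

    def visit_addition(self, node, children):
        return sum(children)


tree = Node('addition', children=[Node('number', '2'), Node('number', '40')])
print(SumVisitor().visit(tree))
```

Unlike the classic GoF pattern, the nodes need no accept method: the traversal picks the visitor method from the name of the grammar rule that produced each node.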
delta/semantics.py
File
This file contains the code for the semantics module. The module defines two classes: SemanticMistake and SemanticVisitor.
delta.SemanticMistake
Class
An instance of this class is raised when the compiler detects a semantic error.
delta.SemanticVisitor
Class
This is a subclass of arpeggio.PTNodeVisitor that performs the semantic analysis of the input program being compiled. This is its initial definition:
delta/codegen.py
File
This file contains the code for the codegen module. The module defines one class only: CodeGenerationVisitor.
delta.CodeGenerationVisitor
Class
This is a subclass of arpeggio.PTNodeVisitor that performs code generation, emitting WebAssembly text format.