This compiler project uses a vertical slice approach inspired by the paper titled “An Incremental Approach to Compiler Construction” by Abdulaziz Ghuloum (2006).
Step 0 builds the basic core for the Delta language compiler.
Download the initial source code: project.zip.
delta/delta.peg File
This is a PEG (Parsing Expression Grammar) file as described in the Arpeggio documentation in the “Grammars written in PEG notations” section. All the syntax rules for the Delta language will be described in this file.
delta/__init__.py File
This file is executed when the delta package is imported by any client code. It contains the definition of three classes: SyntaxMistake, Phase, and Compiler.
delta.SyntaxMistake Class
An instance of this class is raised when the compiler detects a syntax error while parsing an input program.
delta.Phase Class
This class defines an enum (a set of symbolic names bound to unique values) that represents the four phases supported by our compiler:
SYNTACTIC_ANALYSIS = 1
SEMANTIC_ANALYSIS = 2
CODE_GENERATION = 3
EVALUATION = 4
These are used when debugging to specify the last phase the compiler should run before it stops.
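As a minimal sketch, the enum described above could be written with Python's standard enum module. The member names and values come from the list above; the base class (Enum rather than IntEnum) and the phases_up_to helper are assumptions made for illustration and may not match the project's source.

```python
from enum import Enum


class Phase(Enum):
    """Compilation phases, numbered in the order they run."""
    SYNTACTIC_ANALYSIS = 1
    SEMANTIC_ANALYSIS = 2
    CODE_GENERATION = 3
    EVALUATION = 4


def phases_up_to(last):
    """Hypothetical helper: list the phases that run when stopping after `last`."""
    return [phase for phase in Phase if phase.value <= last.value]


print([phase.name for phase in phases_up_to(Phase.SEMANTIC_ANALYSIS)])
```

Comparing the numeric values is what lets "run up to phase X" be expressed as a simple ordering test.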
delta.Compiler Class
This class is the main driver of the whole compilation process. When instantiating it (which implicitly calls the __init__ method), you must provide as an argument a string with the name of the PEG root rule.
With the newly created Compiler object you can call the realize method, passing a string with the input source code to compile and, optionally, a Phase enum member indicating the last phase the compiler should run. If this second argument is omitted, the very last phase, Phase.EVALUATION, is assumed. When realize returns, one or more of the following properties are available, depending on the phase provided:
Phase.SYNTACTIC_ANALYSIS: Property available:
parse_tree_str
Phase.SEMANTIC_ANALYSIS: Properties available:
parse_tree_str
symbol_table
Phase.CODE_GENERATION: Properties available:
parse_tree_str
symbol_table
wat_code
Phase.EVALUATION: Properties available:
parse_tree_str
symbol_table
wat_code
result
A ValueError is raised if you try to access a property that is not currently available.
If realize is called explicitly or implicitly with the Phase.EVALUATION option, the method returns the same value contained in the result property. This behaviour comes in handy when implementing unit tests.
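The phase-dependent availability of properties can be illustrated with a small stand-in class. This is only a sketch: SketchCompiler is hypothetical and not the project's Compiler, every phase is stubbed with a placeholder string, the root rule name 'program' is invented, and returning None when evaluation does not run is an assumption, since the text only specifies the return value for Phase.EVALUATION.

```python
from enum import Enum


class Phase(Enum):
    SYNTACTIC_ANALYSIS = 1
    SEMANTIC_ANALYSIS = 2
    CODE_GENERATION = 3
    EVALUATION = 4


class SketchCompiler:
    """Hypothetical stand-in for delta.Compiler; every phase is a stub."""

    # Property made available by each phase, in the order the phases run.
    _OUTPUTS = {
        Phase.SYNTACTIC_ANALYSIS: 'parse_tree_str',
        Phase.SEMANTIC_ANALYSIS: 'symbol_table',
        Phase.CODE_GENERATION: 'wat_code',
        Phase.EVALUATION: 'result',
    }

    def __init__(self, root_rule_name):
        self._root_rule_name = root_rule_name
        self._values = {}  # property name -> value from the last run

    def realize(self, source, phase=Phase.EVALUATION):
        self._values = {}
        for ran, name in self._OUTPUTS.items():
            if ran.value > phase.value:
                break  # stop after the requested phase
            # Placeholder; a real phase would compute this from `source`.
            self._values[name] = f'<{name} for {source!r}>'
        # Assumption: None is returned when evaluation did not run.
        return self._values.get('result')

    def _get(self, name):
        if name not in self._values:
            raise ValueError(f'{name} is not available')
        return self._values[name]

    @property
    def parse_tree_str(self):
        return self._get('parse_tree_str')

    @property
    def symbol_table(self):
        return self._get('symbol_table')

    @property
    def wat_code(self):
        return self._get('wat_code')

    @property
    def result(self):
        return self._get('result')


compiler = SketchCompiler('program')  # 'program' is a made-up root rule name
compiler.realize('42', Phase.SYNTACTIC_ANALYSIS)
print(compiler.parse_tree_str)        # available after syntactic analysis
try:
    compiler.wat_code                 # code generation never ran
except ValueError as error:
    print('ValueError:', error)
```

In a unit test, the same idea reduces to comparing the return value of realize against the expected result, since a full run returns what the result property holds.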
The implementation of the Compiler class uses the facilities provided by third-party packages. Be aware of the following details:
The elements required for syntax analysis are explained in these sections from Arpeggio’s documentation: Grammars, Parse tree, Handling errors, and Parser configuration.
Semantic analysis and code generation are performed using Arpeggio’s visitors. These are a variation of the GoF visitor design pattern. To use them properly, check the section “Semantic analysis - Visitors” in Arpeggio’s documentation.
delta/semantics.py File
This file contains the code for the semantics module. The module defines two classes: SemanticMistake and SemanticVisitor.
delta.SemanticMistake Class
An instance of this class is raised when the compiler detects a semantic error.
delta.SemanticVisitor Class
This is a subclass of arpeggio.PTNodeVisitor that performs the semantic analysis of the input program being compiled. This is its initial definition:
delta/codegen.py File
This file contains the code for the codegen module. The module defines one class only: CodeGenerationVisitor.
delta.CodeGenerationVisitor Class
This is a subclass of arpeggio.PTNodeVisitor that generates code in the WebAssembly text format.