Tutorial @ EuroPython 2026. July 13, 2026. Kraków, Poland.
|
The most up to date version of this document is available at: The source code repository for this tutorial is available on GitHub at: |
1. Tutorial Overview
1.1. General Description
In this tutorial, participants will learn how to build, in Python, a compiler that translates a scaled-down programming language called chiqui_forth into executable WebAssembly (Wasm) code.
Wasm is a relatively new technology that can be used to create client-side and standalone applications that are efficient, secure, and portable. It’s an intermediate language for a stack-based virtual machine that uses a just-in-time (JIT) compiler to produce native machine code. Wasm provides a portable compilation target for languages such as C/C++, Rust, C#, and many others.
Although writing a compiler for a full-blown programming language is a daunting task, it’s possible to quickly write a Wasm-targeting compiler for a fairly small language, chiqui_forth in this case. This tiny language is a subset of the Forth programming language, which first appeared in 1970. It’s a stack-oriented language that is programmed using Reverse Polish Notation (RPN), which happens to be quite easy to convert into Wasm.
1.2. Outcomes
By the end of this session, you won’t just have a Wasm file; you’ll have a new mental model for software:
-
The Compiler Pipeline: Implement the full flow from raw text to executable binary.
-
Wasm Mastery: Gain a practical understanding of WebAssembly, the technology powering modern browser-based video editors, games, and serverless clouds.
-
Custom Tooling: Leave with a working Python-based compiler that you can extend to your own custom syntax.
1.3. Prerequisites
-
Python Proficiency: Comfort with variables, loops, lists, dictionaries, and file I/O.
-
Terminal Basics: Ability to navigate folders and run scripts from a command line.
No prior knowledge of compiler design, Wasm, or web development is required.
2. Introduction to Compiler Design
2.1. Compilers vs. Interpreters
Because computers cannot natively understand human-written code, you must use either a compiler or an interpreter to translate and execute any program written in a high-level language.
A compiler is a program that reads the entire source code all at once and converts it into a separate, standalone file — such as machine code — that is ready to run at maximum speed. In contrast, an interpreter works on the fly. It reads, translates, and executes your code line by line, in real time, right while the program is running.
To understand the difference, imagine you need to read a book written in Polish, but you only speak English:
-
The Compiler Approach: You hand the entire Polish book to a professional translator. She translates every single page and hands you back a completely new, standalone book written entirely in English. You can now read it smoothly from start to finish — and if you want to read it again, you can just pick up that English book and read it instantly.
-
The Interpreter Approach: You sit next to a live interpreter. As you flip through the Polish pages, she translates each sentence aloud to you in English, one by one. It requires less prep work up front, but the reading process itself is much slower. Furthermore, if you want to read the book a second time, the interpreter has to sit down and translate the exact same sentences aloud to you all over again from scratch.
| For this tutorial, our primary focus will be on designing and writing a compiler. |
2.2. Why Build a Compiler?
Building even a tiny compiler for a simple language is incredibly rewarding because it completely strips away the “magic” behind how programming languages work. By writing one yourself, you get to see exactly how your lines of text are broken down, organized, and turned into something a computer can run. This behind-the-scenes perspective makes you a much sharper problem solver and permanently changes the way you look at code. Ultimately, it turns you from someone who just writes software into someone who has a better understanding of how things work underneath.
2.3. Compilation Phases
A typical compiler operates as a pipeline, breaking down the source code into abstract representations before translating it into the final target language.
The compilation process is commonly divided into the following distinct phases:
- Lexical Analysis (Scanning)
-
This phase reads the raw stream of characters from the source code and groups them into meaningful, atomic sequences called tokens (such as keywords, identifiers, and operators).
- Syntactic Analysis (Parsing)
-
This phase organizes the linear sequence of tokens into a hierarchical syntax tree, verifying that the code adheres to the formal grammatical rules of the programming language.
- Semantic Analysis
-
This phase inspects the syntax tree to enforce language-specific rules that grammar alone cannot catch, such as validating type safety, checking variable scope, and ensuring proper function signatures.
- Intermediate Code Generation
-
This phase translates the verified syntax tree into a clean, platform-independent low-level representation (often referred to as IR) to ease the subsequent steps of optimization and translation.
- Code Optimization
-
This phase analyzes and transforms the intermediate code to improve its efficiency, reducing execution time, memory footprint, or energy consumption without altering its intended behavior.
- Final Code Generation
-
This phase maps the optimized intermediate representation into the specific target machine language—such as native CPU assembly text, machine code, or WebAssembly binary format.
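To make the first phase more concrete, here is a minimal sketch of a scanner for a tiny arithmetic notation. The token categories (NUMBER, IDENT, OP) and the tokenize function are assumptions made for this illustration only; they are not part of the chiqui_forth compiler, which, as we will see, can get away with simply splitting the source on whitespace.

```python
import re

# Hypothetical token categories for a tiny expression language.
TOKEN_SPEC = [
    ('NUMBER', re.compile(r'\d+')),
    ('IDENT', re.compile(r'[A-Za-z_]\w*')),
    ('OP', re.compile(r'[-+*/=()]')),
]


def tokenize(source):
    """Group raw characters into (kind, text) tokens, skipping whitespace."""
    tokens = []
    position = 0
    while position != len(source):
        if source[position].isspace():
            position += 1
            continue
        # Try each token category in order at the current position.
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(source, position)
            if match:
                tokens.append((kind, match.group()))
                position = match.end()
                break
        else:
            raise SyntaxError(f'unexpected character {source[position]!r}')
    return tokens


if __name__ == '__main__':
    print(tokenize('total = 7 + 5'))
```

A parser would then consume this flat list of tokens and build the syntax tree described in the next phase.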
3. Introduction to WebAssembly
3.1. What is WebAssembly?
WebAssembly is a binary format designed for a stack machine, serving as a virtual Instruction Set Architecture (ISA) that defines a standardized, abstract machine language for a sandboxed CPU. Engineered from its inception as a compiler target rather than a handwritten language, it allows developers to compile software from various high-level source languages into a highly portable format.
In client-side web environments, WebAssembly is not intended to replace JavaScript, but rather to complement it by offloading computationally heavy tasks while leaving standard scripting and DOM orchestration to the browser’s native engine. This collaborative design allows a host runtime to seamlessly translate the portable bytecode directly into a physical processor’s native instructions, powering high-performance applications across both client-side browsers and secure, server-side infrastructures.
3.2. Main Features
- Secure
-
Wasm executes code inside a safe, sandboxed environment that isolates it from the underlying host system.
- Portable
-
The compiled binary format runs smoothly across different operating systems, devices, and architectures without modifications.
- Performant
-
It delivers near-native execution speeds by utilizing a compact binary layout that loads and parses almost instantly.
- Open Standard
-
It is developed by a W3C community group to ensure it remains a royalty-free, transparent technology for everyone.
3.3. Limitations
- I/O Isolation
-
It cannot directly access the host file system or network sockets without a standardized interface layer like WASI.
- Data Marshalling Overhead
-
Passing complex data across the linear memory boundary requires manual serialization, creating performance bottlenecks with host languages.
- Absence of Hardware Memory Mitigations
-
Standard OS-level memory protections are missing inside Wasm modules, leaving internal data vulnerable to buffer overflows.
- Immature Tooling and Ecosystem
-
High-level debugging, profiling, and cross-language dependency management remain fragmented compared to mature native environments.
- Binary Size and Cold Start Latency
-
Bundling runtimes or large standard libraries increases file sizes, slowing network transmission and server initialization times.
3.4. Ecosystem
3.4.1. Languages Targeting Wasm
Various compilers and languages compile down to Wasm bytecode:
-
Emscripten for C/C++
-
Blazor for C#
-
and many others
3.4.2. WASI
The WebAssembly System Interface (WASI) provides a standardized API framework for Wasm modules.
-
It enables WebAssembly binaries to run entirely outside of a web browser sandbox.
-
Grants sandboxed modules secure, direct access to host system resources, including:
-
File Systems
-
Networking layers
-
System Time clocks
-
| While we will not be using WASI in this tutorial, it is a crucial piece of the WebAssembly ecosystem worth keeping on your radar for standalone and server-side runtimes. |
3.5. Key Milestones
| Date | Milestone |
|---|---|
| June 2015 | WebAssembly is initially announced to the public. |
| March 2017 | The design for the Minimum Viable Product (MVP) is officially concluded. |
| November 2017 | Native support is achieved across all 4 major desktop browsers (Google’s Chrome, Apple’s Safari, Microsoft’s Edge, and Mozilla’s Firefox), as well as mobile platforms for Android and iOS. |
| March 2019 | Mozilla introduces WASI (WebAssembly System Interface), extending Wasm’s sandboxed execution model beyond the browser to stand-alone server and CLI runtimes. |
| December 2019 | The W3C officially announces WebAssembly as the fourth language of the open Web, joining HTML, CSS, and JavaScript. |
| February 2024 | WASI Preview 2 launches, introducing the stable WebAssembly Component Model to enable high-performance, polyglot software interoperability. |
| December 2024 | WebAssembly 2.0 is officially ratified as a W3C standard, introducing native SIMD (Vector types), Bulk Memory operations, Reference Types, and Multiple Memories. |
| September 2025 | WebAssembly 3.0 is ratified by the W3C, bringing native Garbage Collection (WasmGC) for high-level languages, 64-bit linear memory addressing (Memory64), and first-class Exception Handling. |
| February 2026 | Global browser adoption reaches an estimated 96% of all installed browsers worldwide. |
4. Wasmtime: A Standalone Runtime for WebAssembly
4.1. What is Wasmtime?
Wasmtime is a specialized tool designed to run WebAssembly programs outside of a web browser. While WebAssembly was originally created to let developers run high-speed code inside web pages, tools like Wasmtime bring that same power to your regular desktop or server. Think of Wasmtime as a tiny, ultra-fast virtual machine or “sandbox”. It takes a compiled Wasm file (whose source could have been written in languages like C++, Rust, or Go) and safely runs it at near-native speed on your operating system, keeping the code completely isolated so it cannot harm your computer.
The wasmtime PyPI package is the bridge that connects this WebAssembly runner directly to Python. Normally, Python is known for being easy to write but a bit slow for heavy math or intense processing. By installing this package, you can use Python to load a fast WebAssembly module and run its functions seamlessly right inside your Python script. This gives you the best of both worlds: you can write your main application using Python’s simple syntax, while instantly outsourcing heavy-duty tasks to lightning-fast WebAssembly modules without ever leaving your Python environment.
4.2. Installing wasmtime-py
For most users, the standard package manager pip is the way to go. Open your terminal or command prompt and run:
pip install wasmtime
Sometimes the pip command doesn’t work. Use this specific command if the one above fails:
python -m pip install wasmtime
| If the standard commands fail, try using pip3 or python3. Some environments, particularly on macOS and Linux, use the 3 suffix to distinguish the modern interpreter from legacy system versions and prevent version conflicts. |
Run the following command at the terminal to verify that Wasmtime was successfully installed:
python -c "import wasmtime; print('Wasmtime installed successfully')"
If everything is OK, you should see the message Wasmtime installed successfully printed in your terminal.
|
Wasmtime Platform Support
Wasmtime is designed for 64-bit operating systems and supports the following environments:
Note that Wasmtime does not support legacy 32-bit platforms (such as 32-bit ARM or x86). |
5. Hand coding WebAssembly
WebAssembly Text Format, or WAT, is the textual assembly language corresponding to Wasm’s binary format. There is a direct, one-to-one correspondence between the two formats, meaning WAT simply acts as a human-readable representation of a Wasm module. Source files written in this format typically use the .wat file extension.
As previously noted, WebAssembly is designed to be generated by compilers rather than written by hand. However, learning to code manually in WAT is an invaluable step for anyone wanting to grasp this technology: it provides the necessary clarity to understand exactly how the low-level stack engine operates, making it the ultimate tool for mastering Wasm’s inner workings.
5.1. S-expressions
In both the binary and textual formats, the fundamental unit of code in WebAssembly is a module. In the text format, a module is represented as one big S-expression. S-expressions are a long-established, simple textual format for representing trees, allowing us to view a module as a tree of nodes that describe the module’s structure and its code.
| S-expression stands for “symbolic expression”, and it is commonly encountered in the different Lisp language dialects such as Common Lisp, Scheme, Racket, and Clojure. These programming languages use s-expressions to represent the computer program and also the program’s data. |
Let’s see what an S-expression looks like. Each node in the tree goes inside a pair of parentheses: ( … ). The first label inside the parentheses tells you what type of node it is, and after that there is a space-separated list of either attributes or child nodes. For example:
(module
  (func
    (export "myFunction")
    return))
In this example the node types are module, func and export. The root of the s-expression is the module node. This node has func as its only child. The func node has two children: the export node and the return attribute. Finally, the export node has only one child, which is the "myFunction" attribute. Indentation is used here to clarify the tree-structure of the s-expression.
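The tree view of an S-expression can be made tangible with a few lines of Python. The following sketch parses the text into nested lists, where each list is a node and each string is a node type or an attribute. The parse_sexpr function is a hypothetical helper written for this illustration; it ignores comments and does no error reporting, so it is not a real WAT parser.

```python
import re


def parse_sexpr(text):
    """Parse one S-expression into nested Python lists of strings."""
    # Tokens: parentheses, double-quoted strings, or runs of other characters.
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    position = 0

    def parse_node():
        nonlocal position
        token = tokens[position]
        position += 1
        if token != '(':
            return token          # A leaf: node type label or attribute.
        children = []
        while tokens[position] != ')':
            children.append(parse_node())
        position += 1             # Skip the closing parenthesis.
        return children

    return parse_node()


if __name__ == '__main__':
    tree = parse_sexpr('(module (func (export "myFunction") return))')
    print(tree)
    # Prints: ['module', ['func', ['export', '"myFunction"'], 'return']]
```

The nesting of the resulting lists mirrors the indentation of the WAT source: module contains func, which in turn contains export and return.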
5.2. Comments
WAT supports two types of comments:
-
Line comments: Start with ;; and continue to the end of the line.
-
Block comments: Delimited by (; and ;).
Examples:
(;
Block comment:
It can span
multiple lines.
;)
(module) ;; Line comment: This is the smallest module we can have.
Comments are ignored by the Wasm assembler and are used strictly for code documentation.
5.3. Data Types
WebAssembly programs natively support multiple categories of data types. The core numeric types are:
-
i32: 32-bit integer
-
i64: 64-bit integer
-
f32: 32-bit float
-
f64: 64-bit float
| Advanced vector and reference types also exist under Wasm 3.0 but are outside the scope of this tutorial. |
These four core types are used as prefixes in most WebAssembly instructions. They are also used when declaring variables, parameters, and function return types.
5.4. Working with the Stack
A stack is a data structure with two operations: push and pop. Items are pushed (inserted) onto the stack and subsequently popped (removed) from the stack in last in, first out (LIFO) order.
|
A real-life example is a stack of pancakes: you can only take a pancake (at least easily) from the top of the stack, and you can only add a pancake to the top of the stack. Source: www.freevector.com |
WebAssembly execution is defined in terms of a stack machine where the basic idea is that every instruction pushes and/or pops a certain number of values to/from a stack. For example, to add the numbers 7 and 5, the following three steps need to be carried out on the stack:
-
Push the 7.
-
Push the 5.
-
Pop the two top values (5 and 7), perform the addition (7 + 5), and finally push the result (12).
The following figure illustrates the three steps described above:
The actual WAT code to do this arithmetic operation would look like this:
i32.const 7 (1)
i32.const 5 (2)
i32.add (3)
| 1 | Push the 32-bit integer constant 7 onto the stack. |
| 2 | Push the 32-bit integer constant 5 onto the stack. |
| 3 | Pop the top two values, add them together, and push the result (12) back onto the stack. |
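If you prefer to see the stack evolve step by step, the three instructions above can be simulated with a plain Python list. This is only a sketch: the run function is a hypothetical helper that understands just i32.const and i32.add, and it ignores the 32-bit wraparound that a real Wasm engine applies to i32 arithmetic.

```python
def run(program):
    """Simulate a list of WAT instructions using a Python list as the stack."""
    stack = []
    for instruction in program:
        opcode, *operands = instruction.split()
        if opcode == 'i32.const':
            stack.append(int(operands[0]))   # Push the constant.
        elif opcode == 'i32.add':
            right = stack.pop()              # The top value is popped first.
            left = stack.pop()
            stack.append(left + right)       # Push the result.
        else:
            raise ValueError(f'unsupported instruction: {instruction}')
        print(instruction.ljust(12), 'stack =', stack)
    return stack


if __name__ == '__main__':
    run(['i32.const 7', 'i32.const 5', 'i32.add'])
    # Prints:
    # i32.const 7  stack = [7]
    # i32.const 5  stack = [7, 5]
    # i32.add      stack = [12]
```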
5.5. Functions
In WebAssembly, functions serve as the primary execution blocks. A function executes its logic by pulling arguments from the stack, performing its operations, and leaving the results on the stack for subsequent instructions.
5.5.1. A Floating-Point Average Example
The following example demonstrates how to declare a function, define its parameters, and execute the arithmetic steps required to compute and return the average of three floating-point values:
;; =============================================================================
;; File: example.wat
;;
;; WebAssembly text format (WAT) source code example.
;; =============================================================================
(module
  ;; Compute the average of three values
  (func  (1)
    (export "average")  (2)
    (param $x f64)  (3)
    (param $y f64)
    (param $z f64)
    (result f64)  (4)
    local.get $x  (5)
    local.get $y
    f64.add  (6)
    local.get $z  (7)
    f64.add  (8)
    f64.const 3.0  (9)
    f64.div  (10)
  )
)
| 1 | Define a function. |
| 2 | Export the function as "average". |
| 3 | Declare parameters $x, $y, and $z as 64-bit float. Note that, in WAT, variable names must begin with a dollar sign. |
| 4 | Declare that the function returns a 64-bit float. |
| 5 | Push $x and $y onto the stack. |
| 6 | Pop the top two values, add them together, and push the result back onto the stack. |
| 7 | Push $z onto the stack. |
| 8 | Pop the top two values, add them together, and push the result back onto the stack. |
| 9 | Push 3.0 onto the stack. |
| 10 | Pop the top two values, divide them, and push the result back onto the stack; this sole remaining value is the function’s result. |
5.5.2. Function Declarations and Parameters
Functions are defined using the func keyword inside an S-expression. To reference a parameter by name, you assign it an identifier that begins with a dollar sign (e.g., $x).
Parameters are declared using the param keyword followed by one of the core data types.
5.5.3. Working with Variables
WebAssembly treats function parameters as mutable local variables. You interact with these variables using two key instructions:
-
local.get: Copies the value of a specific parameter or local variable and pushes it onto the top of the stack. This instruction does not alter the variable itself.
-
local.set: Pops the top value off the stack and writes it directly into the specified variable, completely replacing its previous contents.
5.5.4. Returning Values
You declare the expected data type of a function’s return value using the result keyword in the function’s signature. To successfully return data, the stack must contain a value matching that type when the function exits.
WebAssembly handles the exit in two ways:
-
Implicit Return: When execution naturally reaches the function’s closing parenthesis, it exits automatically, consuming whatever final value is left on top of the stack as the return value.
-
Explicit Return: You can use the return instruction to immediately halt execution and exit the function early. Just like an implicit return, whatever value is currently on top of the stack will be popped and used as the return value.
5.6. Invoking Wasm from Python
Executing a Wasm module within Python using the wasmtime runtime requires a structured lifecycle: establishing an isolated execution environment, compiling the module, and instantiating it. Serving as a continuation of the previous example, the following code demonstrates this complete workflow by loading the resulting WAT file, instantiating it without external host imports, extracting its exported function, and safely marshaling Python arguments and results across the runtime boundary.
| This dense low-level configuration is standard Wasmtime boilerplate. Feel free to copy and paste this wrapper function directly into your code to start executing Wasm files without getting bogged down in the runtime setup. |
# ==============================================================================
# File: wat_launcher.py
#
# A generic host script for loading, compiling, and executing arbitrary
# WebAssembly modules from within Python using the Wasmtime runtime.
# ==============================================================================
from wasmtime import Store, Module, Instance


def call_wat_fun(file_name, fn_name, *args):  (1)
    """Loads, compiles, and executes a WebAssembly function.

    Args:
        file_name (str): The path to the .wat or .wasm file.
        fn_name (str): The name of the exported Wasm function to invoke.
        *args: Variable length argument list to pass to the Wasm function.

    Returns:
        Any: The marshaled result returned from the WebAssembly function execution.
    """
    store = Store()  (2)
    module = Module.from_file(store.engine, file_name)  (3)
    instance = Instance(store, module, [])  (4)
    function = instance.exports(store)[fn_name]  (5)
    return function(store, *args)  (6)


if __name__ == '__main__':  (7)
    print(call_wat_fun('example.wat', 'average', 1.0, 2.0, 6.0))  (8)
| 1 | Handles loading, compiling, and executing a WebAssembly function. |
| 2 | An isolated sandbox environment holding runtime state (globals, memory, instances). |
| 3 | Reads the .wat or .wasm file and compiles it into machine code via Wasmtime. |
| 4 | Binds the compiled module to the store runtime state. The empty list [] indicates that the module receives no imports. |
| 5 | Extracts the explicitly exported function by its string name from the instance. |
| 6 | Invokes the function. Wasmtime requires passing the store context as the first argument. |
| 7 | Ensures the following code only runs if the script is executed directly. |
| 8 | Passes three Python floats across the Wasm boundary to calculate an average and prints the result. |
5.7. Exercise A ★
Add to the example.wat file a new function called fah_to_cel that takes as argument a 64-bit floating point value that corresponds to a temperature \(F\) in degrees Fahrenheit and converts it to degrees Celsius.
|
If \(F\) is a temperature in degrees Fahrenheit, to convert it to \(C\) degrees Celsius you should use the following formula:
\[C = \frac{ 5.0 \times (F - 32.0) }{ 9.0 }\]
|
Add the necessary code to the wat_launcher.py file to check that the computed results are correct using the following values:
| °F | °C |
|---|---|
| \(212.0\) | \(100.0\) |
| \(32.0\) | \(0.0\) |
| \(-40.0\) | \(-40.0\) |
5.8. Exercise B ★
Add to the example.wat file a new function called quadratic_root that takes three 64-bit floating point values corresponding to the coefficients \(a\), \(b\), and \(c\) of the quadratic equation \(ax^2+bx+c = 0\), and computes its first root.
|
To find the first root \(x\) given the coefficients \(a\), \(b\), and \(c\), use the following formula:
\[x = \frac{ -b + \sqrt{b^2 - 4 a c} }{ 2 a }\]
|
Add the necessary code to the wat_launcher.py file to check that the computed results are correct using the following test cases (assume all inputs yield real roots):
| \(a\) | \(b\) | \(c\) | Expected \(x\) |
|---|---|---|---|
| \(2.0\) | \(4.0\) | \(2.0\) | \(-1.0\) |
| \(1.0\) | \(0.0\) | \(0.0\) | \(0.0\) |
| \(4.0\) | \(5.0\) | \(1.0\) | \(-0.25\) |
6. The chiqui_forth Compiler
WebAssembly provides a portable compilation target for languages such as C/C++, Rust, C#, AssemblyScript, and many others. Although writing a compiler for a full-blown programming language can be a daunting task, it’s possible to write a WebAssembly-targeting compiler for a very small language fairly quickly. This tiny language is called chiqui_forth, a subset of the Forth programming language.
| Chiqui (pronounced like “cheeky”) is an informal word in Spanish that refers to something small or tiny. In this case a tiny version of the Forth language. |
6.1. The chiqui_forth Language
6.1.1. Overview
Forth is a language unlike most others. It’s not functional or object oriented, it doesn’t have type-checking, and it has almost no syntax. It was created in the 1970s, but it is still used today, mainly in embedded systems and system controllers.
Forth is a stack-oriented language that is programmed using Reverse Polish Notation (RPN), which is fairly easy to convert into WebAssembly. The syntax of Forth is extremely straightforward: a program is a series of space-delimited words. Forth implements a stack machine, just like WebAssembly. The words are processed from left to right. When a number word is found (numbers are represented as 32-bit integers), it’s pushed onto the stack. When an operation word is found, some values are popped from the top of the stack, the operation is performed with those values, and the result (if any) is pushed back onto the stack.
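Using a formal notation, the \(\texttt{+}\) operation word can be expressed like this:
\[\begin{matrix}
y \gets \textit{pop} \\
x \gets \textit{pop} \\
\textit{push}(x + y)
\end{matrix}\]
And the \(\texttt{*}\) operation word can be expressed like this:
\[\begin{matrix}
y \gets \textit{pop} \\
x \gets \textit{pop} \\
\textit{push}(x \times y)
\end{matrix}\]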
A more elaborate example:
1 2 + 3 4 + * .
This is what is happening:
- Push 1.
- Push 2.
- Pop two values, add them, push result (3).
- Push 3.
- Push 4.
- Pop two values, add them, push result (7).
- Pop two values (7 and 3), multiply them, push result (21).
- The dot (.) word pops a value and prints it.
| At the end of the program the stack must be empty, otherwise you’ll get a validation error from the WebAssembly runtime. |
The above Forth program is equivalent to the following Python code:
print((1 + 2) * (3 + 4))
6.1.2. Comments
Anything contained between a pair of opening and closing parentheses is a comment and it is therefore ignored by the compiler.
( This is a comment. )
6.1.3. Variables
A variable name must start with a letter and can be followed by any additional letters or digits. When using a variable in a program, the variable name by itself pushes the variable’s value onto the stack. A variable name that ends with an exclamation mark (!) pops a value from the top of the stack and assigns it to that variable. For example:
1 2 + x! (1)
x x + . (2)
| 1 | Push 1. Push 2. Pop two values (2 and 1), add them, push result (3). Pop value (3) and assign it to variable x. |
| 2 | Push value of x twice. Pop two values (3 and 3), add them, push result (6). Pop value (6) and print it. |
Variables have a default value of 0 in case they’re used before being assigned for the first time.
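To check your understanding of these rules, it can help to run them in plain Python before any WebAssembly is involved. The following sketch is a tiny interpreter (not the compiler we are about to study) that handles only numbers, \(\texttt{+}\), \(\texttt{*}\), the dot word, and variables. The interpret and is_var_name functions are hypothetical names chosen for this example, comments are not handled, and 32-bit wraparound is ignored.

```python
def is_var_name(word):
    """A variable name is a letter followed by any letters or digits."""
    return word.isascii() and word.isalnum() and word[0].isalpha()


def interpret(source):
    """Run a chiqui_forth snippet; return the list of values printed by '.'."""
    stack = []
    variables = {}
    printed = []
    for word in source.split():
        if word.isascii() and word.isdigit():
            stack.append(int(word))                   # Number word: push it.
        elif word == '+':
            y = stack.pop()
            x = stack.pop()
            stack.append(x + y)
        elif word == '*':
            y = stack.pop()
            x = stack.pop()
            stack.append(x * y)
        elif word == '.':
            printed.append(stack.pop())               # Pop and "print".
        elif word in ('emit', 'nl', 'input'):
            raise NotImplementedError(f"'{word}' is not covered by this sketch")
        elif is_var_name(word):
            stack.append(variables.get(word, 0))      # Unassigned variables are 0.
        elif word.endswith('!') and is_var_name(word[:-1]):
            variables[word[:-1]] = stack.pop()        # Pop and assign.
        else:
            raise ValueError(f"'{word}' is not a valid word")
    if stack:
        raise ValueError('the stack must be empty at the end of the program')
    return printed


if __name__ == '__main__':
    print(interpret('1 2 + 3 4 + * .'))   # Prints: [21]
    print(interpret('1 2 + x! x x + .'))  # Prints: [6]
```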
6.1.4. Input/Output
These are the I/O related words provided by chiqui_forth:
| Word | Description |
|---|---|
| \(\texttt{.}\) | Pops the integer value at the top of the stack and prints it on the standard output followed by a single space. |
| \(\texttt{emit}\) | Similar to the dot word (\(\texttt{.}\)), but prints the character whose code is the popped value instead of printing the number itself. |
| \(\texttt{nl}\) | Prints a newline character on the standard output. This is exactly the same as: \(\texttt{10 emit}\) |
| \(\texttt{input}\) | Reads an integer value from the standard input and pushes it onto the stack. Pushes a zero if the value read is not a valid integer. |
6.2. Compiler Implementation
An initial implementation of the chiqui_forth compiler is contained in the forth/chiqui_forth.py file.
6.2.1. The main function
The main function is a good place to check out the general steps carried out by the compiler.
main function
def main():
    """Control all the steps carried out by the compiler."""
    check_args()  (1)
    full_source_name = argv[1]  (2)
    words = read_words(full_source_name)  (3)
    remove_comments(words)  (4)
    result = []  (5)
    result.append(WAT_SOURCE_BEGIN)  (6)
    declare_vars(result, find_vars_used(words))  (7)
    code_generation(result, words)  (8)
    result.append(WAT_SOURCE_END)  (9)
    file_content = '\n'.join(result)  (10)
    file_name, _ = splitext(full_source_name)  (11)
    create_wat_file(file_name, file_content)  (12)
    create_wasm_file(file_name, file_content)  (13)
| 1 | Verify that the Python program received a command line argument at the terminal with the name of the input Forth source file. If not, display an error message and exit. |
| 2 | Get the name of the input file from the second command line argument (argv[1]). In case you were wondering, the first command line argument (argv[0]) contains the name of the Python program file being executed. |
| 3 | Read the content from the input file and split it into space-delimited words. |
| 4 | Remove all the words that constitute part of a comment. |
| 5 | The result variable starts as an empty list. All WAT instructions will be strings that get appended to this list. |
| 6 | Append to result all the code that goes at the start of a WAT source code. |
| 7 | Find all the variables used in the program and declare them at the beginning of the exported _start function. |
| 8 | Translate every word from the source code into its equivalent WAT instructions and append them to result. This is the core of the compiler. |
| 9 | Append to result all the code that goes at the end of a WAT source code. |
| 10 | Join all the strings of WAT instructions in result, delimiting individual instructions with newlines. |
| 11 | Remove the extension from the user provided input file name in order to create, in the next couple of steps, two new files using the same name but with different extensions. |
| 12 | Create a text file with the WAT code. |
| 13 | Create a binary file with the WASM code. To do this, we use the wat2wasm function from the wasmer Python package, which basically does the same thing as the WABT wat2wasm tool we saw before. |
| Wasmer-Python is a complete and mature WebAssembly runtime for Python based on Wasmer. |
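The helper functions called from main are not listed in this section. To give a rough idea of what two of them might look like, here is a sketch of remove_comments and find_vars_used. These are not the actual implementations from forth/chiqui_forth.py: the bodies, the RESERVED_WORDS set, and the decision to return a sorted list are assumptions, and the sketch also assumes that the comment parentheses appear as standalone, space-delimited words.

```python
# Assumption: these operation words look like variable names but are not.
RESERVED_WORDS = {'emit', 'input', 'nl'}


def is_var_name(word):
    """A variable name is a letter followed by any letters or digits."""
    return (word.isascii() and word.isalnum() and word[0].isalpha()
            and word not in RESERVED_WORDS)


def remove_comments(words):
    """Remove, in place, every word from a '(' word up to the next ')' word."""
    kept = []
    inside_comment = False
    for word in words:
        if word == '(':
            inside_comment = True
        elif inside_comment:
            if word == ')':
                inside_comment = False
        else:
            kept.append(word)
    words[:] = kept  # Mutate the caller's list: main() ignores the return value.


def find_vars_used(words):
    """Return the sorted names of all variables that are read or assigned."""
    names = set()
    for word in words:
        name = word[:-1] if word.endswith('!') else word
        if is_var_name(name):
            names.add(name)
    return sorted(names)


if __name__ == '__main__':
    words = '( double three ) 1 2 + x! x x + .'.split()
    remove_comments(words)
    print(words)                  # ['1', '2', '+', 'x!', 'x', 'x', '+', '.']
    print(find_vars_used(words))  # ['x']
```

Mutating the list in place matches how main uses remove_comments, since it calls the function without using any return value.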
6.2.2. The OPERATION dictionary
Most of the chiqui_forth words that represent some kind of operation that needs to be translated into WAT are stored in a dictionary called OPERATION. This is how this dictionary is defined:
OPERATION dictionary
OPERATION = {
    '*': ['i32.mul'],
    '+': ['i32.add'],
    '.': ['call $print'],
    'emit': ['call $emit'],
    'input': ['call $input'],
    'nl': [
        'i32.const 10',
        'call $emit'
    ],
}
As can be observed, every dictionary key is a string that represents a chiqui_forth word, and its associated value is a list of one or more strings of WAT instructions. The functions $print, $emit, and $input are imported functions and their implementation will be provided by the runtime environment as explained later.
6.2.3. The code_generation function
As mentioned before, the core of the compiler is the code_generation function.
code_generation function
def code_generation(result, words):  (1)
    for word in words:  (2)
        if is_number(word):  (3)
            result.append(INDENTATION + f'i32.const {word}')
        elif word in OPERATION:  (4)
            for statement in OPERATION[word]:
                result.append(INDENTATION + statement)
        elif is_var_name(word):  (5)
            result.append(INDENTATION + f'local.get ${word}')
        elif word[-1] == '!' and is_var_name(word[:-1]):  (6)
            result.append(INDENTATION + f'local.set ${word[:-1]}')
        else:  (7)
            raise ValueError(f"'{word}' is not a valid word")
| 1 | All the strings that represent WAT instructions will be placed in the result list. The words list contains all the program words as strings. |
| 2 | Iterate over all the program words. |
| 3 | If the current word is an integer number, add to result the WAT instruction that pushes said number onto the stack. |
| 4 | If the current word is a key in the OPERATION dictionary, add to result all the associated WAT instructions. |
| 5 | If the current word is a variable (not ending in !), add to result the WAT instruction that pushes the variable’s value onto the stack. |
| 6 | If the current word is a variable ending in !, add to result the WAT instruction that pops the stack and assigns the resulting value to the variable. |
| 7 | Any word not recognized will produce an error. |
INDENTATION is a string with four spaces. It’s concatenated before each WAT instruction in order to make the resulting code more legible.
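To watch code_generation at work without running the whole compiler, you can try it on a short list of words. The sketch below repeats the function and the dictionary so that it runs on its own. The is_number and is_var_name helpers are not shown in this section, so the versions here are assumptions that follow the rules described earlier (integer number words, and names made of a letter followed by letters or digits).

```python
INDENTATION = '    '

OPERATION = {
    '*': ['i32.mul'],
    '+': ['i32.add'],
    '.': ['call $print'],
    'emit': ['call $emit'],
    'input': ['call $input'],
    'nl': ['i32.const 10', 'call $emit'],
}


def is_number(word):
    """Assumed helper: true for a run of ASCII decimal digits."""
    return word.isascii() and word.isdigit()


def is_var_name(word):
    """Assumed helper: a letter followed by any letters or digits."""
    return word.isascii() and word.isalnum() and word[0].isalpha()


def code_generation(result, words):
    """Append to result the WAT instructions for every word in words."""
    for word in words:
        if is_number(word):
            result.append(INDENTATION + f'i32.const {word}')
        elif word in OPERATION:
            for statement in OPERATION[word]:
                result.append(INDENTATION + statement)
        elif is_var_name(word):
            result.append(INDENTATION + f'local.get ${word}')
        elif word[-1] == '!' and is_var_name(word[:-1]):
            result.append(INDENTATION + f'local.set ${word[:-1]}')
        else:
            raise ValueError(f"'{word}' is not a valid word")


if __name__ == '__main__':
    result = []
    code_generation(result, '1 2 + x! x x + .'.split())
    print('\n'.join(result))
```

Running the script prints eight indented WAT instructions, one per word: two i32.const pushes, i32.add, local.set $x, two local.get $x, another i32.add, and finally call $print. Note that the order of the tests matters: because the OPERATION lookup comes before the variable check, words such as emit and nl are never mistaken for variable names.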
6.3. The Execution Script
The file forth/execute.py is a Python script that must be used when running the WASM code produced by our compiler. This is so because WASM doesn’t directly support any I/O facilities. These have to be provided by the runtime system according to our needs. We use the wasmer Python package mentioned earlier to handle the required steps to instantiate our WASM module, call its _start function, and also import the functions $print, $emit, and $input which our generated code depends on. These functions, which are actually quite small and simple, are written in Python and can be found in this script.
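As a rough idea of how small these three host functions can be, here is a sketch written in plain Python. It is not the code from forth/execute.py: the names host_print, host_emit, and host_input, as well as the optional stream parameters, are assumptions made so that the example can run on its own. The step that registers such functions as imports of the Wasm module depends on the runtime package and is deliberately left out.

```python
import io
import sys


def host_print(value, out=None):
    """Behaves like the dot word: print the integer followed by one space."""
    out = sys.stdout if out is None else out
    out.write(f'{value} ')


def host_emit(code, out=None):
    """Behaves like emit: print the character that has the given code."""
    out = sys.stdout if out is None else out
    out.write(chr(code))


def host_input(inp=None):
    """Behaves like input: read one line, return it as an int, or 0 if invalid."""
    inp = sys.stdin if inp is None else inp
    try:
        return int(inp.readline())
    except ValueError:
        return 0


if __name__ == '__main__':
    # Use in-memory streams so that the demo does not wait for the keyboard.
    fake_output = io.StringIO()
    host_print(21, fake_output)
    host_emit(10, fake_output)               # 10 is the newline character.
    print(repr(fake_output.getvalue()))      # Prints: '21 \n'
    print(host_input(io.StringIO('42\n')))   # Prints: 42
    print(host_input(io.StringIO('abc\n')))  # Prints: 0
```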
6.4. Putting Everything Together
Let’s see how to compile and run a chiqui_forth program.
First, make forth our current working directory. At the terminal type:
cd /workspace/pycon2022-wasm/forth
Now, to run the compiler, type ./chiqui_forth.py at the terminal followed by the name of a chiqui_forth source code file. The forth/examples directory contains several chiqui_forth programs; a couple of them should work with the current version of our compiler. At the terminal type:
./chiqui_forth.py examples/numbers.4th
This creates two new files in the forth/examples directory: numbers.wat and numbers.wasm. You can open the forth/examples/numbers.wat file in the editor to inspect the generated WAT code. Use the execution script to run the Wasm binary code. At the terminal type:
./execute.py examples/numbers.wasm
The numbers program expects the user to type in two numbers and then prints the result of adding and multiplying them together.
Follow these same steps to compile and execute the forth/examples/hello_world.4th program.
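The naming of the two generated files can be sketched as follows. The helper output_paths is hypothetical and only mirrors the behavior described above (same directory, same base name, new extensions); it is not taken from the compiler itself.

```python
from pathlib import PurePosixPath


def output_paths(source):
    """Return the assumed .wat and .wasm paths for a chiqui_forth source file."""
    source = PurePosixPath(source)
    return source.with_suffix('.wat'), source.with_suffix('.wasm')


wat_path, wasm_path = output_paths('examples/hello_world.4th')
print(wat_path)
print(wasm_path)
```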
6.5. Exercise D ★
Modify the OPERATION dictionary in forth/chiqui_forth.py so that the compiler supports all the operation words presented in the following table.
| chiqui_forth Word | Description |
|---|---|
| \(\texttt{-}\) | Subtraction \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x - y) \end{matrix}\) WAT instruction: |
| \(\texttt{/}\) | Division \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x \div y) \end{matrix}\) WAT instruction: |
| \(\texttt{=}\) | Equal \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x = y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
| \(\texttt{<>}\) | Not Equal \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \ne y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
| \(\texttt{<}\) | Less Than \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x < y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
| \(\texttt{<=}\) | Less or Equal \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \le y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
| \(\texttt{>}\) | Greater Than \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x > y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
| \(\texttt{>=}\) | Greater or Equal \(\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \ge y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\) WAT instruction: |
Test your code by compiling and executing the forth/examples/operators.4th program. The expected output should be:
Everything is working fine!
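When checking your solution, it can help to have a reference model of what each word should compute. The sketch below follows the pop/pop/push rules from the table in plain Python. It is not part of the tutorial code: the names SEMANTICS, trunc_div, and evaluate are made up for this example, division is assumed to truncate toward zero (as signed Wasm integer division does), and 32-bit overflow is ignored.

```python
def trunc_div(x, y):
    """Integer division truncating toward zero (assumed division semantics)."""
    quotient = abs(x) // abs(y)
    return quotient if (x >= 0) == (y >= 0) else -quotient


# Each entry receives x (pushed first) and y (pushed last, popped first).
SEMANTICS = {
    '-': lambda x, y: x - y,
    '/': trunc_div,
    '=': lambda x, y: int(x == y),
    '<>': lambda x, y: int(x != y),
    '<': lambda x, y: int(x < y),
    '<=': lambda x, y: int(x <= y),
    '>': lambda x, y: int(x > y),
    '>=': lambda x, y: int(x >= y),
}


def evaluate(words):
    """Evaluate integer literals and the operation words above; return the stack."""
    stack = []
    for word in words:
        if word in SEMANTICS:
            y = stack.pop()  # the top of the stack is the right operand
            x = stack.pop()
            stack.append(SEMANTICS[word](x, y))
        else:
            stack.append(int(word))
    return stack


print(evaluate('7 2 -'.split()))
print(evaluate('3 4 <'.split()))
```

The operand order matters for the non-commutative words: 7 2 - leaves 5 on the stack, not -5, because y is popped before x.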
6.6. Exercise E ★
The do and loop words will allow our chiqui_forth programs to have a repetition construct similar to Python’s while statement. Its syntax is as follows:
do \(\textit{condition}\) ? \(\textit{body}\) loop
The \(\textit{condition}\) part is evaluated first. If the result is zero (false), the looping construct ends and program execution continues after the loop word. Otherwise, the \(\textit{body}\) is executed, the \(\textit{condition}\) is evaluated once again, and the whole process is repeated. Note that the question mark (?) word is required to establish where \(\textit{condition}\) ends and \(\textit{body}\) begins.
The following example shows how to display the numbers 1 to 10.
( File: 1_to_10.4th )
( Display numbers 1 to 10, each number its own line. )
1 x! ( Initialize x with 1. )
do
x 10 <= ? ( Continue in loop while x is less than or equal to 10. )
x . nl ( Print current value of x on its own line. )
x 1 + x! ( Increment the value of x by one. )
loop
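For comparison, here is the same program written as a Python while loop, with the corresponding chiqui_forth words shown in the comments. The function name one_to_ten is only for illustration; it collects the lines instead of printing them directly so that the result is easy to check.

```python
def one_to_ten():
    """Python counterpart of 1_to_10.4th; returns the lines it would print."""
    lines = []
    x = 1                     # 1 x!
    while x <= 10:            # do   x 10 <= ?
        lines.append(str(x))  # x . nl
        x = x + 1             # x 1 + x!
    return lines              # execution continues here, after the loop word


print('\n'.join(one_to_ten()))
```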
Modify the OPERATION dictionary in forth/chiqui_forth.py so that the compiler supports the do, ?, and loop words as detailed in the next table:
| chiqui_forth Word | Corresponding WAT Code |
|---|---|
| \(\texttt{do}\) | |
| \(\texttt{?}\) | |
| \(\texttt{loop}\) | |
Check the documentation for the block, loop, br, and i32.eqz instructions to understand how they work.
The chiqui_forth do/loop construct creates and uses a new independent stack. This new stack is initially empty and must be empty when the construct ends. Otherwise you’ll get a validation error from the WebAssembly runtime.
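The empty-stack requirement can be checked by hand by counting how many values each word pushes or pops. The sketch below automates that count for a single, non-nested do/loop construct. It is a simplified model, not the validation algorithm a WebAssembly runtime uses, and the names BINARY, CONSUMERS, stack_effect, and loop_is_balanced, as well as the per-word stack effects, are assumptions made for this example.

```python
import re

# Words assumed to pop two values and push one.
BINARY = {'+', '-', '*', '/', '=', '<>', '<', '<=', '>', '>='}
# Words assumed to pop one value and push nothing.
CONSUMERS = {'.', 'emit', '?'}


def stack_effect(word):
    """Return the assumed net change in stack depth caused by one word."""
    if re.fullmatch(r'-?\d+', word):
        return 1
    if word in BINARY or word in CONSUMERS or word.endswith('!'):
        return -1
    if word == 'nl':
        return 0
    return 1  # anything else is treated as a variable read or an input


def loop_is_balanced(words):
    """Check that the stack is empty at do and loop and never underflows."""
    depth = 0
    for word in words:
        if word in ('do', 'loop'):
            if depth != 0:
                return False
        else:
            depth += stack_effect(word)
            if depth < 0:
                return False
    return depth == 0


good = 'do x 10 <= ? x . nl x 1 + x! loop'.split()
bad = 'do x 10 <= ? x loop'.split()  # leaves one value behind
print(loop_is_balanced(good), loop_is_balanced(bad))
```

In the second program the body pushes x without consuming it, which is the kind of leftover value that would trigger a validation error.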
After solving Exercise D and Exercise E, compile and execute the following three programs from the forth/examples directory to make sure everything works as expected:
-
1_to_10.4th
-
triangle.4th (type at the prompt a value from 5 to 20)
-
pow2.4th (type at the prompt a value from 5 to 20)
7. Additional Resources
7.1. Books
-
What Is WebAssembly?
By Colin Eberhardt
O’Reilly Media, Inc., 2019.
-
WebAssembly: The Definitive Guide
By Brian Sletten
O’Reilly Media, Inc., 2021.
ISBN: 978-1492089841
7.2. Online Resources
8. Acknowledgements
Special thanks to the Tecnológico de Monterrey students from the Development and Implementation of Software Systems course, sections 501 and 502 of the 2022 spring semester, for reviewing these notes and providing valuable feedback.
9. License and Credits
-
Copyright © 2026 by Ariel Ortiz.
-
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
-
Free use of the source code presented here is granted under the terms of the GPL version 3 License.
-
This document was prepared using the AsciiDoctor text processor.
-
Icons by Flaticon by Magnific.
-
The author utilized Gemini, a large language model by Google, for drafting assistance and technical review of these tutorial notes.