Crafting Your Own Compiler: From Python Logic to High-Speed WebAssembly

Tutorial @ EuroPython 2026. July 13, 2026. Kraków, Poland.

The most up-to-date version of this document is available at:

https://arielortiz.info/europython2026/

The source code repository for this tutorial is available on GitHub at:

https://github.com/ariel-ortiz/europython2026

1. Tutorial Overview

1.1. General Description

In this tutorial, participants will learn how to build a compiler in Python that translates a scaled-down programming language called chiqui_forth into executable WebAssembly (Wasm) code.

Wasm is a relatively new technology that can be used to create client-side and standalone applications that are efficient, secure, and portable. It’s an intermediate language for a stack-based virtual machine that uses a just-in-time (JIT) compiler to produce native machine code. Wasm provides a portable compilation target for languages such as C/C++, Rust, C#, and many others.

Although writing a compiler for a full-blown programming language is a daunting task, it’s possible to quickly write a Wasm-targeting compiler for a fairly small language, chiqui_forth in this case. This tiny language is a subset of the Forth programming language, which first appeared in 1970. It’s a stack-oriented language that is programmed using Reverse Polish Notation (RPN), which happens to be quite easy to convert into Wasm code.

1.2. Outcomes

By the end of this session you’ll have a new mental model for software:

The Compiler Pipeline: Implement the full flow from raw text to executable binary.
Wasm Fundamentals: Gain a practical understanding of WebAssembly, the technology powering modern browser-based video editors, games, and serverless clouds.
Custom Tooling: Leave with a working Python-based compiler ready for your own extensions and experiments.

1.3. Prerequisites

Python Proficiency: Comfort with variables, loops, lists, dictionaries, and file I/O.
Terminal Basics: Ability to navigate folders and run scripts from a command line.

No prior knowledge of compiler design, Wasm, or web development is required.

2. Introduction to WebAssembly

2.1. What is WebAssembly?

WebAssembly is a binary format designed for a stack machine, serving as a virtual Instruction Set Architecture (ISA) that defines a standardized, abstract machine language for a sandboxed CPU. Engineered from its inception as a compiler target rather than a handwritten language, it allows developers to compile software from various high-level source languages into a highly portable format.

In client-side web environments, WebAssembly is not intended to replace JavaScript, but rather to complement it by offloading computationally heavy tasks while leaving standard scripting and DOM orchestration to the browser’s native engine. This collaborative design allows a host runtime to seamlessly translate the portable bytecode directly into a physical processor’s native instructions, powering high-performance applications across both client-side browsers and secure, server-side infrastructures.

2.2. Main Features

Secure: Wasm executes code inside a safe, sandboxed environment that isolates it from the underlying host system.
Portable: The compiled binary format runs smoothly across different operating systems, devices, and architectures without modifications.
Performant: It delivers near-native execution speeds by utilizing a compact binary layout that loads and parses almost instantly.
Open Standard: It is developed by a W3C community group to ensure it remains a royalty-free, transparent technology for everyone.

2.3. Limitations

I/O Isolation: It cannot directly access the host file system or network sockets without a standardized interface layer like WASI.
Data Marshalling Overhead: Passing complex data across the linear memory boundary requires manual serialization, creating performance bottlenecks with host languages.
Absence of Hardware Memory Mitigations: Standard OS-level memory protections are missing inside Wasm modules, leaving internal data vulnerable to buffer overflows.
Immature Tooling and Ecosystem: High-level debugging, profiling, and cross-language dependency management remain fragmented compared to mature native environments.
Binary Size and Cold Start Latency: Bundling runtimes or large standard libraries increases file sizes, slowing network transmission and server initialization times.

2.4. Ecosystem

2.4.1. Languages Targeting Wasm

Various compilers and languages compile down to Wasm bytecode:

2.4.2. WASI

The WebAssembly System Interface (WASI) provides a standardized API framework for Wasm modules.

It enables WebAssembly binaries to run entirely outside of a web browser sandbox.
Grants sandboxed modules secure, direct access to host system resources, including:
- File Systems
- Networking layers
- System Time clocks

While we will not be using WASI in this tutorial, it is a crucial piece of the WebAssembly ecosystem worth keeping on your radar for standalone and server-side runtimes.

2.5. Key Milestones

Date	Milestone
June 2015	WebAssembly is initially announced to the public.
March 2017	The design for the Minimum Viable Product (MVP) is officially concluded.
November 2017	Native support is achieved across all 4 major desktop browsers (Google’s Chrome, Apple’s Safari, Microsoft’s Edge, and Mozilla’s Firefox) as well as mobile platforms for Android and iOS.
March 2019	Mozilla introduces WASI (WebAssembly System Interface), extending Wasm’s sandboxed execution model beyond the browser to stand-alone server and CLI runtimes.
December 2019	The W3C officially announces WebAssembly as the fourth language of the open Web, joining HTML, CSS, and JavaScript.
February 2024	WASI Preview 2 launches, introducing the stable WebAssembly Component Model to enable high-performance, polyglot software interoperability.
December 2024	WebAssembly 2.0 is officially ratified as a W3C standard, introducing native SIMD (Vector types), Bulk Memory operations, Reference Types, and Multi-Value results.
September 2025	WebAssembly 3.0 is ratified by the W3C, bringing native Garbage Collection (WasmGC) for high-level languages, 64-bit linear memory addressing (Memory64), and first-class Exception Handling.
February 2026	Global browser adoption reaches an estimated 96% of all installed browsers worldwide.

2.6. Wasmtime

2.6.1. What is Wasmtime?

Wasmtime is a specialized tool designed to run WebAssembly programs outside of a web browser. While WebAssembly was originally created to let developers run high-speed code inside web pages, tools like Wasmtime bring that same power to your regular desktop or server. Think of Wasmtime as a tiny, fast virtual machine or “sandbox”. It takes a compiled Wasm file — which could be written in languages like C++, Rust, Go, or the chiqui_forth language we will build in this tutorial — and safely runs it at near-native speed on your operating system, keeping the code completely isolated so it cannot harm your computer.

Wasmtime can be integrated into Python via the wasmtime-py package. Normally, Python is known for being easy to write but a bit slow for heavy math or intense processing. By installing this package, you can use Python to load a fast WebAssembly module and run its functions seamlessly right inside your Python script. This gives you the best of both worlds: you can write your main application using Python’s simple syntax, while instantly outsourcing heavy-duty tasks to lightning-fast WebAssembly modules without ever leaving your Python environment.

2.6.2. Installing wasmtime-py

For most users, the standard package manager pip is the way to go. Open your terminal or command prompt and run:

pip install wasmtime

Sometimes the pip command doesn’t work. Use this specific command if the one above fails:

python -m pip install wasmtime

If the standard commands fail, try using pip3 or python3. Some environments, particularly on macOS and Linux, use the 3 suffix to distinguish the modern interpreter from legacy system versions and prevent version conflicts.

Run the following command at the terminal to verify that Wasmtime was successfully installed:

python -c "import wasmtime; print('Wasmtime installed successfully')"

If everything is OK, you should see the message Wasmtime installed successfully printed in your terminal.

Wasmtime Platform Support

Wasmtime is designed for 64-bit operating systems and supports the following environments:

macOS: Fully supported on x86_64 (Intel) and aarch64 (Apple Silicon).
Linux: Fully supported on x86_64 (Intel/AMD) and aarch64 (ARM64).
Windows: Fully supported natively on x86_64 (Intel/AMD).
Native (bare-metal) Windows ARM64 (for example, Qualcomm Snapdragon) is not currently supported. However, Wasmtime does work perfectly on Windows ARM64 devices when run inside WSL (Windows Subsystem for Linux), as it leverages the fully supported Linux aarch64 architecture.

Note that Wasmtime does not support legacy 32-bit platforms (such as 32-bit ARM or x86).

3. Hand coding WebAssembly

WebAssembly Text Format, or WAT, is the textual assembly language corresponding to Wasm’s binary format. There is a direct, one-to-one correspondence between the two formats, meaning WAT simply acts as a human-readable representation of a Wasm module. Source files written in this format typically use the .wat file extension.

As previously noted, WebAssembly is designed to be generated by compilers rather than written by hand. However, learning to code manually in WAT is an invaluable step for anyone wanting to effectively grasp this technology: it provides the necessary clarity to understand exactly how the low-level stack engine operates, making it the ultimate tool for mastering Wasm’s inner workings.

3.1. S-expressions

In both the binary and textual formats, the fundamental unit of code in WebAssembly is a module. In the text format, a module is represented as one big S-expression. S-expressions are a long-established, simple textual format for representing trees, allowing us to view a module as a tree of nodes that describe the module’s structure and its code.

S-expression stands for “symbolic expression”, and it is commonly encountered in the different Lisp language dialects such as Common Lisp, Scheme, Racket, and Clojure. These programming languages use S-expressions to represent both the program itself and the program’s data.

Let’s see what an S-expression looks like. Each node in the tree goes inside a pair of parentheses: ( … ). The first label inside the parentheses tells you what type of node it is, and after that there is a space-separated list of either attributes or child nodes. For example:

(module
  (func
    (export "myFunction")
    return))

In this example the node types are module, func and export. The root of the S-expression is the module node. This node has func as its only child. The func node has two children: the export node and the return attribute. Finally, the export node has only one child, which is the "myFunction" attribute. Indentation is used here to clarify the tree structure of the S-expression.
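To make the tree structure concrete, the following Python sketch parses the S-expression above into nested lists. The `parse_sexpr` helper is hypothetical and written only for illustration; it assumes quoted strings contain no spaces or parentheses and does not cover the full WAT grammar.

```python
def parse_sexpr(text):
    """Parse one S-expression into nested Python lists (illustrative only)."""
    # Pad parentheses with spaces so that split() isolates them as tokens.
    tokens = text.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = 0

    def read_node():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != '(':
            return token              # An attribute (a leaf of the tree).
        children = []
        while tokens[pos] != ')':
            children.append(read_node())
        pos += 1                      # Skip the closing parenthesis.
        return children

    return read_node()


source = '(module (func (export "myFunction") return))'
tree = parse_sexpr(source)
print(tree)
```

Each nested list corresponds to one node: its first element is the node type and the remaining elements are its attributes or child nodes.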

3.2. Comments

WAT supports two types of comments:

Line comments: Start with ;; and continue to the end of the line.
Block comments: Delimited by (; and ;).

Examples:

(;
    Block comment:
    It can span
    multiple lines.
;)
(module) ;; Line comment: This is the smallest module we can have.

Comments are ignored by the Wasm assembler and are used strictly for code documentation.

3.3. Data Types

WebAssembly programs natively support multiple categories of data types. The core numeric types are:

i32: 32-bit integer
i64: 64-bit integer
f32: 32-bit float
f64: 64-bit float

These four core types are used as prefixes in most WebAssembly instructions. They are also used when declaring variables, parameters, and function return types.

Advanced vector and reference types also exist under Wasm 3.0 but are outside the scope of this tutorial.

3.4. The Stack

A stack is a data structure with two operations: push and pop. Items are pushed (inserted) onto the stack and subsequently popped (removed) from the stack in last in, first out (LIFO) order.

A real-life example is a stack of pancakes: you can only take a pancake (at least easily) from the top of the stack, and you can only add a pancake to the top of the stack.

Figure 1. A stack of pancakes. Source: www.freevector.com

WebAssembly execution is defined in terms of a stack machine where the basic idea is that every instruction pushes and/or pops a certain number of values to/from a stack. For example, to add the numbers 7 and 5, the following three steps need to be carried out on the stack:

Push the 7.
Push the 5.
Pop the two top values (5 and 7), perform the addition (7 + 5), and finally push the result (12).

An illustration of the three steps described above can be seen in Figure 2.

Figure 2. An example of using a stack to add two numbers.

The actual WAT code to do this arithmetic operation would look like this:

i32.const 7    ;; (1)
i32.const 5    ;; (2)
i32.add        ;; (3)

1	Push the 32-bit integer constant 7 onto the stack.
2	Push the 32-bit integer constant 5 onto the stack.
3	Pop the top two values, add them together, and push the result (12) back onto the stack.
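The same three steps can be mimicked in Python, using a plain list as the stack. This is only an analogy to build intuition, not how Wasm is actually executed.

```python
stack = []            # A Python list can play the role of the stack.

stack.append(7)       # i32.const 7: push 7.
stack.append(5)       # i32.const 5: push 5.

y = stack.pop()       # i32.add: pop the top value (5) ...
x = stack.pop()       # ... pop the next value (7) ...
stack.append(x + y)   # ... and push their sum.

print(stack)
```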

3.5. Functions

In WebAssembly, functions serve as the primary execution blocks. A function receives as arguments the values its caller pushed onto the stack, performs its operations, and leaves the results on the stack for subsequent instructions.

3.5.1. Function Declarations and Parameters

Functions are defined using the func keyword inside an S-expression. To reference a parameter by name, you assign it an identifier that begins with a dollar sign (e.g., $x).

Parameters are declared using the param keyword followed by one of the core data types.

3.5.2. Working with Variables

WebAssembly treats function parameters as mutable local variables. You interact with these variables using two key instructions:

local.get: Copies the value of a specific parameter or local variable and pushes it onto the top of the stack. This instruction does not alter the variable itself.
local.set: Pops the top value off the stack and writes it directly into the specified variable, completely replacing its previous contents.

3.5.3. Returning Values

You declare the expected data type of a function’s return value using the result keyword in the function’s signature. To successfully return data, the stack must contain a value matching that type when the function exits.

WebAssembly handles the exit in two ways:

Implicit Return: When execution naturally reaches the function’s closing parenthesis, it exits automatically, consuming whatever final value is left on top of the stack as the return value.
Explicit Return: You can use the return instruction to immediately halt execution and exit the function early. Just like an implicit return, whatever value is currently on top of the stack will be popped and used as the return value.

3.5.4. A Floating-Point Average Example

The following example demonstrates how to declare a function, define its parameters, and execute the arithmetic steps required to compute and return the average of three floating-point values:

File: example.wat

(;===================================================================
    File: example.wat

    WebAssembly text format (WAT) source code example.
=====================================================================;)
(module
  ;; Compute the average of three values
  (func                  ;; (1)
    (export "average")   ;; (2)

    (param $x f64)       ;; (3)
    (param $y f64)
    (param $z f64)

    (result f64)         ;; (4)

    local.get $x         ;; (5)
    local.get $y
    f64.add              ;; (6)

    local.get $z         ;; (7)
    f64.add              ;; (8)

    f64.const 3.0        ;; (9)
    f64.div              ;; (10)
  )
)

1	Define a function.
2	Export the function as `"average"`.
3	Declare parameters `$x`, `$y`, and `$z` as 64-bit floats. Note that, in WAT, variable names must begin with a dollar sign.
4	Declare that the function returns a 64-bit float.
5	Push `$x` and `$y` onto the stack.
6	Pop the top two values, add them together, and push the result back onto the stack.
7	Push `$z` onto the stack.
8	Pop the top two values, add them together, and push the result back onto the stack.
9	Push 3.0 onto the stack.
10	Pop the top two values, divide them, and push the result back onto the stack; this sole remaining value is the function’s result.
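To follow how the stack evolves while the body of the average function runs, here is a small Python simulation. The `run` function and the tuple-based instruction encoding are hypothetical teaching aids; they cover only the four instructions used in this example and are not part of Wasmtime.

```python
def run(body, local_vars):
    """Execute (instruction, operand) pairs on a fresh stack and trace it."""
    stack = []
    for op, arg in body:
        if op == 'local.get':
            stack.append(local_vars[arg])    # Copy the variable's value.
        elif op == 'f64.const':
            stack.append(float(arg))
        elif op == 'f64.add':
            y = stack.pop()
            x = stack.pop()
            stack.append(x + y)
        elif op == 'f64.div':
            y = stack.pop()
            x = stack.pop()
            stack.append(x / y)
        else:
            raise ValueError(f'unsupported instruction: {op}')
        print(op, '' if arg is None else arg, '->', stack)
    return stack.pop()                       # The sole remaining value.


average_body = [
    ('local.get', '$x'), ('local.get', '$y'), ('f64.add', None),
    ('local.get', '$z'), ('f64.add', None),
    ('f64.const', 3.0), ('f64.div', None),
]
result = run(average_body, {'$x': 1.0, '$y': 2.0, '$z': 6.0})
print(result)
```

Note that the divisor is pushed last, so it is the first value popped by the division step.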

3.6. Invoking Wasm from Python

Executing a Wasm module within Python using the wasmtime runtime requires a structured lifecycle: establishing an isolated execution environment, compiling the module, and instantiating it. Serving as a continuation of the previous example, the following code demonstrates this complete workflow by loading the resulting WAT file, instantiating it without external host imports, extracting its exported function, and safely marshaling Python arguments and results across the runtime boundary.

This dense low-level configuration is standard Wasmtime boilerplate. Feel free to copy and paste this wrapper function directly into your code to start executing Wasm files without getting bogged down in the runtime setup.

File: wat_launcher.py

# ==============================================================================
# File: wat_launcher.py
#
# A generic host script for loading, compiling, and executing arbitrary
# WebAssembly modules from within Python using the Wasmtime runtime.
# ==============================================================================

from wasmtime import Store, Module, Instance


def call_wat_fun(file_name, fn_name, *args):  # (1)
    """Loads, compiles, and executes a WebAssembly function.

    Args:
        file_name (str): The path to the .wat or .wasm file.
        fn_name (str): The name of the exported Wasm function to invoke.
        *args: Variable length argument list to pass to the Wasm function.

    Returns:
        Any: The marshaled result returned from the WebAssembly function execution.
    """
    store = Store()  # (2)
    module = Module.from_file(store.engine, file_name)  # (3)
    instance = Instance(store, module, [])  # (4)
    function = instance.exports(store).get(fn_name)  # (5)
    return function(store, *args)  # (6)


if __name__ == '__main__':  # (7)
    print(call_wat_fun('example.wat', 'average', 1.0, 2.0, 6.0))  # (8)

1	Handles loading, compiling, and executing a WebAssembly function.
2	An isolated sandbox environment holding runtime state (globals, memory, instances).
3	Reads the `.wat` or `.wasm` file and compiles it into machine code via Wasmtime.
4	Binds the compiled module to the store runtime state. The empty list `[]` means the module receives no imports.
5	Extracts the explicitly exported function by its string name from the instance.
6	Invokes the function. Wasmtime requires passing the `store` context as the first argument.
7	Ensures the following code only runs if the script is executed directly.
8	Passes three Python floats across the Wasm boundary to calculate an average and prints the result.

3.7. ★ Exercise A: Temperature Conversion

Add to the example.wat file a new function called fah_to_cel that takes as argument a 64-bit floating point value that corresponds to a temperature $F$ in degrees Fahrenheit and converts it to degrees Celsius.

If $F$ is a temperature in degrees Fahrenheit, to convert it to $C$ degrees Celsius you should use the following formula:

\[C = \frac{(F - 32.0) \times 5.0}{9.0}\]

Add the necessary code to the wat_launcher.py file to check that the computed results are correct using the following values:

°F	°C
\(212.0\)	\(100.0\)
\(32.0\)	\(0.0\)
\(-40.0\)	\(-40.0\)

To solve this exercise use these Wasm instructions:

3.8. ★ Exercise B: Quadratic Equation

Add to the example.wat file a new function called quadratic_root that takes three 64-bit floating point values corresponding to the coefficients $a$, $b$, and $c$ of the quadratic equation $ax^2+bx+c = 0$, and computes its first root.

To find the first root $x$ given the coefficients $a$, $b$, and $c$, use the following formula:

\[x = \frac{ -b + \sqrt{b^2 - 4 a c} }{ 2 a }\]

Add the necessary code to the wat_launcher.py file to check that the computed results are correct using the following test cases (assume all inputs yield real roots):

\(a\)	\(b\)	\(c\)	Expected \(x\)
\(2.0\)	\(4.0\)	\(2.0\)	\(-1.0\)
\(1.0\)	\(0.0\)	\(0.0\)	\(0.0\)
\(4.0\)	\(5.0\)	\(1.0\)	\(-0.25\)
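When checking floating-point results, comparing with a small tolerance is usually safer than exact equality. The sketch below computes reference values in plain Python and checks them against the table; the names `quadratic_root_reference` and `test_cases` are hypothetical. In wat_launcher.py, the value returned by call_wat_fun could be compared in the same way.

```python
import math


def quadratic_root_reference(a, b, c):
    """Plain-Python reference for the first root (assumes a real root)."""
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


# Each entry holds (a, b, c, expected x), taken from the table above.
test_cases = [
    (2.0, 4.0, 2.0, -1.0),
    (1.0, 0.0, 0.0, 0.0),
    (4.0, 5.0, 1.0, -0.25),
]

for a, b, c, expected in test_cases:
    actual = quadratic_root_reference(a, b, c)
    # abs_tol is needed because an expected value of 0.0 defeats rel_tol.
    assert math.isclose(actual, expected, abs_tol=1e-12), (a, b, c, actual)

print('All reference values match the table.')
```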

To solve this exercise, you will need the previous math instructions along with these specific Wasm operations:

4. Building a Compiler

4.1. Compilers vs. Interpreters

Because computers cannot natively understand human-written code, you must use either a compiler or an interpreter to translate and execute any program written in a high-level language.

A compiler is a program that reads the entire source code all at once and converts it into a separate, standalone file — such as machine code — that is ready to run at maximum speed. In contrast, an interpreter works on the fly. It reads, translates, and executes your code one instruction at a time, in real time, right while the program is running.

To understand the difference, imagine you need to read a book written in Polish, but you only speak English:

The Compiler Approach: You hand the entire Polish book to a professional translator. She translates every single page and hands you back a completely new, standalone book written entirely in English. You can now read it smoothly from start to finish — and if you want to read it again, you can just pick up that English book and read it instantly.
The Interpreter Approach: You sit next to a live interpreter. As you flip through the Polish pages, she translates each sentence aloud to you in English one by one. It requires less prep work up front, but the reading process itself is much slower. Furthermore, if you want to read the book a second time, the interpreter has to sit down and translate the exact same sentences aloud to you all over again from scratch.

For this tutorial, our sole focus will be on designing and writing a compiler.

4.2. Why Build a Compiler?

Building a compiler is incredibly rewarding because it completely strips away the “magic” behind how programming languages work. By writing one yourself, you get to see exactly how your lines of text are broken down, organized, and turned into something a computer can run. This behind-the-scenes perspective provides a clearer understanding of language behavior and deepens your insight into software execution. Ultimately, it shifts your perspective from simply writing software to fundamentally understanding how things work underneath.

4.3. The chiqui_forth Language

Although writing a compiler for a full-blown programming language like C++, Java, or Rust can be a daunting task that could take months or even years, it is entirely feasible to build a WebAssembly-targeting compiler for a very small language in the time allocated for this tutorial. This tiny language is called chiqui_forth, a subset of the Forth programming language invented by Chuck Moore in 1970.

Chiqui (pronounced like “cheeky”) is an informal word in Spanish that refers to something small or tiny; in this case, a tiny version of the Forth language.

4.3.1. Overview

Forth is a language unlike most others. It’s not functional or object-oriented, it doesn’t have type checking, and it basically has zero syntax. It dates from the 1970s, but is still used today, mainly in embedded systems and system controllers.

Forth is a stack-oriented language that is programmed using the Reverse Polish Notation (RPN) — also known as Postfix Notation — which is fairly easy to convert into WebAssembly.

Since we are here in Poland, it is worth noting that Reverse Polish Notation gets its name from the brilliant Polish logician and philosopher Jan Łukasiewicz, who invented its precursor, Prefix Notation, in the 1920s. It was originally called “Łukasiewicz Notation” but because English speakers had a hard time pronouncing his surname, the world settled on “Polish Notation” instead. Decades later, computer scientists inverted his system to create RPN — proving that Polish logic was the perfect fit for stack-based computing!

Image Source: mathshistory.st-andrews.ac.uk

The syntax of Forth is extremely straightforward, consisting of a series of space-delimited tokens where everything that isn’t a number is called a word (a named operation or command). Forth implements a stack machine, just like WebAssembly. As these tokens are processed from left to right, any number found is pushed onto the stack. When a word is found, some values from the top of the stack are popped, the operation is performed with those values, and the result (if any) is pushed back onto the stack.

While standard Forth handles data as untyped values, the compiler we will be building in this tutorial targets WebAssembly’s native i32 (32-bit integer) type to represent all numbers. This keeps our type system simple and straightforward.
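One consequence of this choice is worth keeping in mind: an i32 value occupies exactly 32 bits, so integer arithmetic wraps around modulo 2^32 instead of growing without bound as Python integers do. The hypothetical helper `to_i32` below emulates that wraparound in Python, interpreting the result as a signed value.

```python
def to_i32(n):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    n &= 0xFFFFFFFF                       # Keep only the low 32 bits.
    return n - 0x100000000 if n >= 0x80000000 else n


largest = 2**31 - 1                       # Largest signed 32-bit integer.
print(to_i32(largest))                    # Still fits.
print(to_i32(largest + 1))                # Wraps around to the smallest value.
```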

Using a formal notation, the behavior of the $\texttt{+}$ word can be expressed like this:

\[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x + y) \end{matrix}\]

And the $\texttt{*}$ word can be expressed like this:

\[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x \times y) \end{matrix}\]

A more elaborate example:

1 2 + 4 5 + * .

This is what is happening:

Push 1.
Push 2.
Pop two values (2 and 1), add them, push result (3).
Push 4.
Push 5.
Pop two values (5 and 4), add them, push result (9).
Pop two values (9 and 3), multiply them, push result (27).
The dot (.) word pops a value and prints it.

Because the dot word pops the final value to print it, the stack is left completely empty. This is the hallmark of a well-behaved Forth program, and as you will see during the compiler’s code generation phase, keeping the stack balanced is a strict requirement for passing WebAssembly’s validation, which takes place before any code is executed.

The above Forth program is equivalent to the following Python code:

print((1 + 2) * (4 + 5))
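The evaluation steps listed above can also be replayed with a few lines of Python. Note that this sketch is an interpreter, not the compiler built in this tutorial, and the `run_forth` name is hypothetical. It supports only integers and the three words used in the example.

```python
def run_forth(source):
    """Evaluate a tiny Forth subset: integers, +, * and the dot word."""
    stack = []
    output = []
    for token in source.split():
        if token == '+':
            y = stack.pop()
            x = stack.pop()
            stack.append(x + y)
        elif token == '*':
            y = stack.pop()
            x = stack.pop()
            stack.append(x * y)
        elif token == '.':
            output.append(str(stack.pop()))   # Pop the value to "print" it.
        else:
            stack.append(int(token))          # Any other token is a number.
    return stack, output


stack, output = run_forth('1 2 + 4 5 + * .')
print('printed:', output, 'stack left:', stack)
```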

Because Forth relies on this stack-based RPN, no parentheses are required for grouping subexpressions. In traditional Infix Notation, operators are placed between operands (for example: $1 + 2$), a design that requires extra parsing rules: operator precedence decides which operation happens first (such as evaluating multiplication before addition), and associativity breaks ties when operators have the same priority (such as evaluating left-to-right). Forth needs neither rule.

To see how RPN completely eliminates the need for both rules and parentheses, consider how those same infix expressions translate directly into Forth:

Concept	Traditional Infix	Forth (RPN) Equivalent	Result
Basic Expression	\(1 + 2\)	`1 2 +`	\(3\)
Operator Precedence	\(1 + 2 \times 3\)	`1 2 3 * +`	\(7\)
Explicit Grouping	\((1 + 2) \times 3\)	`1 2 + 3 *`	\(9\)
Left-Associativity Tie-Breaker	\(1 - 2 - 3\)	`1 2 - 3 -`	\(-4\)
Right-Associativity Tie-Breaker	\(1 - (2 - 3)\)	`1 2 3 - -`	\(2\)

By forcing the programmer to explicitly arrange the order of operations via the stack, RPN bypasses these parsing headaches entirely. As you can see in the table, changing the evaluation order from multiplying first to adding first doesn’t require introducing parentheses; it simply requires changing the sequence of the tokens. The notation itself encodes the exact sequence of execution, meaning a Forth compiler doesn’t need a single rule for precedence or associativity to accurately evaluate an expression.
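To see where the precedence and associativity work goes in an infix language, the sketch below lets Python’s own parser (the standard `ast` module) resolve those rules and then walks the resulting tree in post-order to produce RPN. The `to_rpn` function is hypothetical and not part of the chiqui_forth compiler; it handles only integer constants and the `+`, `-`, and `*` operators.

```python
import ast

OPERATORS = {ast.Add: '+', ast.Sub: '-', ast.Mult: '*'}


def to_rpn(expression):
    """Convert a simple infix expression into its RPN equivalent."""
    def visit(node):
        if isinstance(node, ast.Constant):
            return [str(node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            # Post-order: left operand, right operand, then the operator.
            return (visit(node.left) + visit(node.right)
                    + [OPERATORS[type(node.op)]])
        raise ValueError('unsupported expression')

    return ' '.join(visit(ast.parse(expression, mode='eval').body))


for infix in ['1 + 2', '1 + 2 * 3', '(1 + 2) * 3', '1 - 2 - 3', '1 - (2 - 3)']:
    print(f'{infix:<12} -> {to_rpn(infix)}')
```

The parentheses and precedence rules disappear in the output because the token order alone carries that information.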

4.3.2. Comments

Anything contained between a pair of opening and closing parentheses is a comment and it is therefore ignored by the compiler.

( This is a comment. )

4.3.3. Variables

A variable name must start with a letter and can be followed by any number of additional letters or digits. When used in a program, a variable name by itself pushes the variable’s value onto the stack. A variable name that ends with an exclamation mark (!) pops a value from the top of the stack and assigns it to that variable. For example:

1 2 + x! (1)
x x + . (2)

1	Push 1. Push 2. Pop two values (2 and 1), add them, push result (3). Pop value (3) and assign it to variable `x`.
2	Push the value of `x` twice. Pop two values (3 and 3), add them, push result (6). Pop value (6) and print it.

Variables have a default value of 0 in case they’re used before being assigned for the first time.
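The naming rule above maps naturally onto a regular expression. The following sketch classifies tokens accordingly; the `classify` function and its category labels are hypothetical, and it assumes that letters means ASCII letters and that numbers are written as unsigned decimal digits.

```python
import re

# A letter followed by any number of letters or digits (ASCII assumed).
VARIABLE = re.compile(r'[A-Za-z][A-Za-z0-9]*')


def classify(token):
    """Return a label describing what kind of token this is."""
    if re.fullmatch(r'[0-9]+', token):
        return 'number'
    if token.endswith('!') and VARIABLE.fullmatch(token[:-1]):
        return 'variable write'
    if VARIABLE.fullmatch(token):
        # A real compiler would first rule out built-in words such as nl.
        return 'variable read'
    return 'other'


for token in '1 2 + x! x x + .'.split():
    print(f'{token:<3} {classify(token)}')
```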

4.3.4. Input/Output

These are the I/O related words provided by chiqui_forth:

Word	Description
\(\texttt{.}\)	Pops the integer value at the top of the stack and prints it on the standard output followed by a single space.
\(\texttt{emit}\)	Similar to the dot word (`.`), but instead of printing a numeric value it prints a character with a Unicode code point equal to the value popped from the stack. No space is added afterwards.
\(\texttt{nl}\)	Prints a newline character on the standard output. This is exactly the same as: `10 emit`
\(\texttt{input}\)	Reads an integer value from the standard input and pushes it onto the stack. Pushes a zero if the value read is not a valid integer.
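The behavior of these four words can be modeled in Python as follows. This is only a sketch of the semantics described in the table: the function names are hypothetical, output goes to an in-memory stream instead of the real standard output, and the text read for the input word is passed in as an argument.

```python
import io


def dot(stack, out):
    out.write(f'{stack.pop()} ')     # The number followed by a single space.


def emit(stack, out):
    out.write(chr(stack.pop()))      # The character itself, with no space.


def nl(stack, out):
    stack.append(10)                 # Exactly the same as: 10 emit
    emit(stack, out)


def read_input(stack, text):
    try:
        stack.append(int(text))
    except ValueError:
        stack.append(0)              # Not a valid integer: push zero.


out = io.StringIO()
stack = []
stack.append(72)
emit(stack, out)                     # Writes 'H'.
stack.append(105)
emit(stack, out)                     # Writes 'i'.
nl(stack, out)
stack.append(42)
dot(stack, out)                      # Writes '42 '.
read_input(stack, 'abc')             # Invalid, so 0 is pushed.
read_input(stack, '7')
print(repr(out.getvalue()), stack)
```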

4.3.5. Keeping the Stack Clean

The chiqui_forth language enforces a strict, clean-stack contract for your entire program. When your program finishes executing, it must leave absolutely zero values on the stack.

If a program calculates a number but fails to completely consume it — for example, leaving a trailing value on the stack instead of printing it with . or saving it to a variable with ! — the execution runtime will throw a validation error and refuse to run it. Always ensure your code cleanly clears the stack before reaching the end of the file.

For instance, consider the following faulty program:

File: faulty.4th

( File: faulty.4th )
5 10 +

While this program correctly computes 15, it leaves that result sitting on the stack when the file ends. Because the 15 is never consumed, compiling and attempting to run this code will trigger a validation error.

To fix it, you must explicitly handle the value, such as printing it:

File: corrected.4th

( File: corrected.4th )
5 10 + .

In a similar vein, attempting an operation when the stack does not contain as many items as that operation expects will also trigger a validation error.
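
The clean-stack contract can be previewed in plain Python before any Wasm is produced. The following is a minimal sketch and not part of the tutorial's compiler: the `STACK_EFFECT` table and the `final_stack_depth()` helper are hypothetical names introduced here, and they cover only the words presented so far. The real check is carried out when the WebAssembly module is validated.

```python
# Hypothetical helper, not part of chiqui_forth.py: it only mimics the
# depth bookkeeping that Wasm validation performs on the real module.

# For each predefined word: (values popped, values pushed).
STACK_EFFECT = {
    '+': (2, 1),
    '*': (2, 1),
    '.': (1, 0),
    'emit': (1, 0),
    'nl': (0, 0),
    'input': (0, 1),
}


def final_stack_depth(tokens):
    """Return the stack depth left after running tokens; raise
    ValueError if a word needs more values than are available."""
    depth = 0
    for token in tokens:
        if token in STACK_EFFECT:
            pops, pushes = STACK_EFFECT[token]
        elif token.endswith('!'):
            pops, pushes = 1, 0  # store: pops the value into a variable
        else:
            pops, pushes = 0, 1  # number or variable fetch: pushes a value
        if depth < pops:
            raise ValueError(f"'{token}' needs {pops} value(s), found {depth}")
        depth += pushes - pops
    return depth


print(final_stack_depth(['5', '10', '+']))       # faulty.4th leaves 1 value
print(final_stack_depth(['5', '10', '+', '.']))  # corrected.4th leaves 0
```

A program satisfies the contract only when the final depth is 0 and no word ever finds fewer values than it needs.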

4.4. Compiler Architecture

The following blueprint outlines the execution flow of what will become the chiqui_forth compiler’s main() function. Acting as the entry point, this function will manage the lifecycle of the compilation process by coordinating the transition from the analytical front end to the generative back end. By following this sequential framework, it will handle everything from initial environmental checks and token sanitization to memory mapping, code synthesis, and the final generation of executable target artifacts.

Front End (Analysis)

Validate and Retrieve Input Path: Verify that the correct command-line arguments were provided and extract the file path to the source program.
Tokenize Source Text: Read the raw source file and split it into a sequence of individual words or tokens based on whitespace.
Filter Out Non-Executable Text: Process the raw tokens to strip away comments, leaving behind only the active, executable tokens.

Back End (Synthesis)

Map Memory and Variables: Scan the active tokens to identify every unique variable used by the program, and convert them into low-level local variable declarations.
Generate Target Instructions: Translate the active tokens sequentially into the target WAT instructions.
Emit Target Artifacts: Combine the isolated variable declarations and the instruction body into a predefined module template and write the resulting code to both a WAT text source file and a compiled Wasm binary executable.

The previous tasks are summarized in Figure 3, where you can see the pipeline for the custom chiqui_forth compiler, mapping out how raw source code passes through a Front End validation and analysis stage before a Back End synthesis stage generates the final WebAssembly target outputs.

Figure 3. Architecture for the chiqui_forth Compiler

The diagram in Figure 3 differs significantly from a typical textbook compiler architecture. Standard compiler literature separates the pipeline into numerous distinct, highly specialized phases, such as lexical analysis, syntactic analysis, abstract syntax tree (AST) generation, semantic analysis, intermediate code generation, machine-independent optimization, code generation, and machine-dependent optimization. In contrast, our minimalist chiqui_forth compiler skips most of these phases for the sake of simplicity: it splits the source text on whitespace, filters out comments, and translates each remaining token directly into WAT instructions.

4.5. ★ Exercise C: Validate and Retrieve Input Path

Now that we have mapped out the compiler’s high-level architecture, it is time to start building the Front End. Your first task is to implement the command-line interface gatekeeper.

This component is responsible for verifying that the user provided the correct number of arguments when invoking the compiler from the terminal, and extracting the target file path.

Create a Python script called chiqui_forth.py and implement a main() function alongside a helper function named get_source_filepath() that complies with the following:

Argument Count Validation: The get_source_filepath() function must inspect the incoming command-line arguments. The compiler expects exactly one argument: the path to the chiqui_forth source file. Including the script name itself, the total number of arguments available to the program must be exactly 2.
Error Handling: If the user fails to provide an argument, or provides too many arguments, the function must immediately terminate execution with an exit code of 1 after printing the following error message to the standard error:
```
Please specify the name of the chiqui_forth source file.
```
Success Path: If the validation passes, the function must return the file path string to the caller.
Main Execution: The main() function must serve as the entry point of your script. It should call get_source_filepath(), capture its return value, and print that file path string to standard output. Ensure your script uses the standard if __name__ == '__main__': block to invoke main().

By importing the built-in sys module, you can access command-line arguments via sys.argv, print error messages directly to standard error using print(…, file=sys.stderr), and cleanly terminate program execution with a non-zero exit status (such as an error code of 1) using sys.exit(1).

Once you have implemented the functions and wired them into your main script execution block, you can verify your work directly from the terminal. Test your script with the following three scenarios:

Missing Argument: Run the script without passing a file name. It should print the error message and exit immediately.
```
python chiqui_forth.py
```
Expected output:
```
Please specify the name of the chiqui_forth source file.
```
Too Many Arguments: Run the script with multiple arguments. It should print the same error message and exit immediately.
```
python chiqui_forth.py prog1.4th prog2.4th
```
Expected output:
```
Please specify the name of the chiqui_forth source file.
```
Correct Argument: Run the script with exactly one file name argument. The program should run successfully, printing the retrieved file path back to the terminal.
```
python chiqui_forth.py prog1.4th
```
Expected output:
```
prog1.4th
```
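
If you get stuck, the following is one possible sketch of the gatekeeper, not the tutorial's reference solution. The optional `argv` parameter is an assumption added here so the function can be tried without touching the real `sys.argv`; the exercise itself only asks for a parameterless `get_source_filepath()`.

```python
import sys


def get_source_filepath(argv=None):
    """Return the source file path given on the command line.

    The optional argv parameter is an assumption added for easy
    testing; when omitted, the real sys.argv is used.
    """
    if argv is None:
        argv = sys.argv
    # Script name plus exactly one argument.
    if len(argv) != 2:
        print('Please specify the name of the chiqui_forth source file.',
              file=sys.stderr)
        sys.exit(1)
    return argv[1]


def main():
    """Print the path retrieved from the command line."""
    print(get_source_filepath())

# The usual "if __name__ == '__main__': main()" guard is left out so the
# snippet can be pasted into a REPL; your script still needs it.
```

For example, `get_source_filepath(['chiqui_forth.py', 'prog1.4th'])` returns `'prog1.4th'`, while a list of any other length prints the error message and raises `SystemExit` with code 1.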

4.6. ★ Exercise D: Tokenize Source Text

With our command-line gatekeeper in place, we can move to the next phase of the Front End: reading the source file and breaking its contents down into a stream of processable tokens. Since Forth is a space-delimited language, our tokenization process is wonderfully straightforward.

In your chiqui_forth.py script, implement a helper function named read_words(input_file_name) that complies with the following specification:

File Ingestion: The function must attempt to open and read the entire contents of the file specified by the input_file_name parameter.
Tokenization: It must split the raw text into individual, space-delimited words and return them as a list of strings to the caller.
Error Handling: If the file cannot be found, the function must catch the FileNotFoundError and immediately terminate execution with an exit code of 1 after printing the following error message to the standard error:
```
Oops! File not found: <input_file_name>
```
Main Integration: Update your main() function so that instead of printing the raw file path, it passes that path to read_words(), captures the resulting list of words, and prints the list to standard output.

Python’s built-in string method str.split(), when called without any arguments, splits on any run of whitespace (spaces, tabs, and newlines) and ignores leading and trailing whitespace.
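
To see why no argument should be passed, here is a small illustration. The `text` value is made up for this example and simply stands in for whatever your function reads from the file.

```python
# Raw text as it might be read from a source file: mixed spaces, a tab,
# a blank line, and leading/trailing whitespace.
text = '  ( a very simple example )\n\n1 2\t+   .\n'

words = text.split()
print(words)
# ['(', 'a', 'very', 'simple', 'example', ')', '1', '2', '+', '.']

# Splitting on a single space instead produces empty strings and leaves
# tabs and newlines embedded in the pieces.
print('' in text.split(' '))  # True
```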

Once you have implemented the function and updated main(), create a sample chiqui_forth file called simple.4th with the following contents:

File: simple.4th

( a very simple example )
1 2 + .

Now verify your work directly from the terminal:

Missing File: Run the script passing a non-existent file name. It should print the error message and exit immediately.
```
python chiqui_forth.py missing.4th
```
Expected output:
```
Oops! File not found: missing.4th
```
Existing File: Run the script passing your valid sample chiqui_forth file. The program should run successfully, printing the list of extracted tokens.
```
python chiqui_forth.py simple.4th
```
Expected output:
```
['(', 'a', 'very', 'simple', 'example', ')', '1', '2', '+', '.']
```

4.7. ★★ Exercise E: Filter Out Non-Executable Text

Now that we have a clean list of words from our source code, we need to address comments. In Forth, comments are enclosed within parentheses and separated by whitespace. For example, ( this is a comment ) tells the compiler to ignore everything inside.

To clean our token stream before synthesis, your next task is to write a parser component that strips out these comments completely while ensuring the delimiters are perfectly balanced.

In your chiqui_forth.py script, implement a helper function named remove_comments(tokens) that complies with the following specification:

State Tracking: The function must iterate through the incoming list of strings (tokens) and keep track of whether it is currently reading text inside a comment block or outside of it.
Comment Activation: If the function is outside a comment and encounters the string '(', it must switch to a state indicating it is now inside a comment. The token '(' itself must not be added to the final output.
Comment Deactivation: If the function is inside a comment and encounters the string ')', it must switch back to its normal state. The token ')' itself must not be added to the final output.
Content Filtering: While inside a comment state, all encountered tokens must be ignored. While outside a comment state, tokens must be appended to a new results list.
Unmatched Closing Error: If the function is outside a comment state and encounters a closing parenthesis ')', it means a comment was closed without ever being opened. It must print the following error message to the standard error and immediately terminate execution with an exit code of 1:
```
Error: Unmatched closing parenthesis ')' found outside of a comment block.
```
Unclosed Comment Error: If the function finishes processing all tokens but remains stuck in a comment state (meaning a closing delimiter was never found), it must print the following error message to the standard error and immediately terminate execution with an exit code of 1:
```
Error: End of input reached while searching for closing ')' delimiter.
```
Return Value: If no errors are found, the function must return a new list containing only the executable, non-comment tokens.
Main Integration: Update your main() function so that the token list returned by read_words() is passed directly into remove_comments(). The resulting filtered list should then be printed to standard output instead of the unfiltered list.

Once you have implemented the function and updated main(), create two additional sample files called bad_comment1.4th and bad_comment2.4th alongside your valid simple.4th file to verify all three execution paths from the terminal:

File: bad_comment1.4th

( an unclosed comment
1 2 + .

File: bad_comment2.4th

1 2 + . )

Now verify your work directly from the terminal:

Valid File: Run the script passing your valid sample chiqui_forth file. The program should run successfully, showing that all text inside the parentheses (and the parentheses themselves) has been filtered out.
```
python chiqui_forth.py simple.4th
```
Expected output:
```
['1', '2', '+', '.']
```
Unclosed Comment File: Run the script passing your first broken sample file. It should catch the unclosed state at the end of the file, print the error, and exit.
```
python chiqui_forth.py bad_comment1.4th
```
Expected output:
```
Error: End of input reached while searching for closing ')' delimiter.
```
Unmatched Closing File: Run the script passing your second broken sample file. It should catch the stray closing delimiter immediately, print the error, and exit.
```
python chiqui_forth.py bad_comment2.4th
```
Expected output:
```
Error: Unmatched closing parenthesis ')' found outside of a comment block.
```
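
The specification above amounts to a two-state machine. The following is one possible sketch of it, offered as an illustration rather than the official solution; the local names `result` and `in_comment` are choices made here.

```python
import sys


def remove_comments(tokens):
    """Return a new list with comment tokens and delimiters removed."""
    result = []
    in_comment = False
    for token in tokens:
        if in_comment:
            # Everything is skipped until the closing delimiter.
            if token == ')':
                in_comment = False
        elif token == '(':
            in_comment = True
        elif token == ')':
            print("Error: Unmatched closing parenthesis ')' found "
                  "outside of a comment block.", file=sys.stderr)
            sys.exit(1)
        else:
            result.append(token)
    if in_comment:
        print("Error: End of input reached while searching for "
              "closing ')' delimiter.", file=sys.stderr)
        sys.exit(1)
    return result
```

Note that in this sketch a '(' found while already inside a comment is treated as ordinary comment text, so comments do not nest. This follows from the specification, which only reacts to '(' when outside a comment.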

4.8. ★ Exercise F: Identify Variable Names

With our token stream completely sanitized of comments, the compiler needs to distinguish between the different types of words (named operations or commands) present in the source code. In chiqui_forth, a word can either be a predefined operation or a user-defined variable name. To lay the groundwork for our symbol table and memory allocation, we must implement a way to validate whether a given word is a valid identifier.

To map these named operations to their actual target code, the compiler utilizes a global lookup dictionary named OPERATION. This dictionary acts as our predefined instruction table: the keys represent the Forth words the compiler looks for, while the values are lists containing the exact WebAssembly instructions that must be emitted to execute that operation.

Define this OPERATION mapping at the global level of your chiqui_forth.py script:

OPERATION = {
    '*': ['i32.mul'],
    '+': ['i32.add'],
    '.': ['call $print'],
    'emit': ['call $emit'],
    'input': ['call $input'],
    'nl': [
        'i32.const 10',
        'call $emit'
    ],
}

Next, implement a helper function named is_var_name(token) that complies with the following specification:

Initial Character Check: The function must verify that the first character of the string (token) is a valid alphabetical letter.
Alphanumeric Validation: It must ensure that the entire string consists solely of alphanumeric characters (letters and digits).
Keyword Exclusion: It must confirm that the string does not conflict with any predefined language operations by verifying that it is not present as a key in the OPERATION dictionary.
Return Value: The function must return True if the word passes all three conditions, indicating it is a valid variable name, and False otherwise.

Python strings provide built-in methods like str.isalpha() and str.isalnum() that make character-level validation straightforward.

Once you have implemented the definition and the function, you can verify its behavior by importing your script into an interactive Python REPL session and running a few test cases:

>>> from chiqui_forth import is_var_name
>>> is_var_name('abc')
True
>>> is_var_name('x123')
True
>>> is_var_name('1abc')
False
>>> is_var_name('a_b')
False
>>> is_var_name('+')
False
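
One way the three checks could be combined is sketched below. It is a possible solution rather than the reference one, and the OPERATION table is repeated only so the snippet runs on its own.

```python
# Same table as defined earlier in this exercise.
OPERATION = {
    '*': ['i32.mul'],
    '+': ['i32.add'],
    '.': ['call $print'],
    'emit': ['call $emit'],
    'input': ['call $input'],
    'nl': ['i32.const 10', 'call $emit'],
}


def is_var_name(token):
    """Return True if token is a valid variable name."""
    return (token[:1].isalpha()      # slice avoids IndexError on ''
            and token.isalnum()
            and token not in OPERATION)
```

Be aware that str.isalpha() and str.isalnum() also accept non-ASCII letters and digits. If you want to restrict names to ASCII, adding a `token.isascii()` check would do it; that extra guard is a suggestion made here, not a requirement of the exercise.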

4.9. ★ Exercise G: Identify Numbers

Now that the compiler can identify valid variable identifiers, it needs to handle the other fundamental category of tokens: numeric literals. According to our syntax definition, any token that represents a valid number is automatically pushed onto the evaluation stack. Before we can generate code for these values, we must write a helper function to reliably detect them.

In your chiqui_forth.py script, implement a helper function named is_number(token) that complies with the following specification:

Integer Validation: The function must return True if the string (token) represents a valid integer, that is, an optional minus sign followed by one or more digits, and False otherwise.
Sign Handling: It must correctly handle an optional leading minus sign (-) without falsely validating a lone hyphen as a number.

Instead of manually slicing the string or checking signs with conditional logic, you can use Python’s built-in str.removeprefix('-') method. This strips a leading minus sign if it exists (leaving an empty string if the word was just '-'), allowing you to cleanly call the str.isdigit() method to check the result.

Once you have implemented the function, you can verify its behavior using an interactive Python REPL session with the following test cases:

>>> from chiqui_forth import is_number
>>> is_number('123')
True
>>> is_number('-45')
True
>>> is_number('abc')
False
>>> is_number('-')
False
>>> is_number('12a3')
False
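
The hint above translates almost directly into code. This is a minimal sketch under the assumption that you are running Python 3.9 or later, which is when str.removeprefix() was introduced.

```python
def is_number(token):
    """Return True if token is an integer literal such as '42' or '-7'."""
    digits = token.removeprefix('-')  # requires Python 3.9 or later
    # An empty string (from a lone '-') is not made of digits.
    return digits.isdigit()
```

Only one leading minus sign is removed, so a token such as '--5' is rejected. As with the variable-name check, str.isdigit() also accepts some non-ASCII digit characters; combining it with `digits.isascii()` would be a stricter variant, offered here only as a suggestion.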

4.10. ★ Exercise H: Discover Variables

Now that your compiler can identify both words and numbers, it needs to analyze the clean token stream to discover which variable names are actually used by the programmer. In chiqui_forth, variables are modified using the store operator (!), which is attached directly to the end of a variable’s identifier name (for example, x!). To prepare the local variable declarations in a later step, we must first scan the tokens and collect the unique set of all variable names the program uses, whether they are fetched or stored.

In your chiqui_forth.py script, implement a helper function named find_vars_used(tokens) that complies with the following specification:

Set Initialization: The function must initialize an empty collection that guarantees all gathered variable names remain distinct and unique.
Store Operator Stripping: As it processes each token, it must check if the token ends with the store operator (!). If it does, the function must strip that trailing character off to isolate the raw identifier name.
Identifier Validation: It must pass the isolated name through your is_var_name() helper function to verify that it is a valid variable identifier.
Collection Assembly: If the name is valid, it must be added to your set.
Return Value: After processing all tokens, the function must return the final set of variable names.

Python’s set object is ideal here because it automatically prevents duplicate values. You can insert elements into it using the set.add() method. You can check the last character directly using a negative index: token[-1] == '!', and to isolate the identifier name, you can easily strip the final character of a string using a slice like token[:-1].

Once you have implemented the function, you can verify its behavior by opening an interactive Python REPL session and running a few test cases:

>>> from chiqui_forth import find_vars_used
>>> find_vars_used(['a', 'b', '+', 'c!'])
{'a', 'c', 'b'}
>>> find_vars_used(['10', 'x!', 'x', 'y', '+', 'y!', '5', 'z!'])
{'y', 'x', 'z'}
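
Since a set has no defined display order, the elements in your REPL output may appear in a different order than shown above. The following sketch shows one way to write the function; the compact `OPERATION` and `is_var_name()` definitions are stand-ins for your own versions from Exercise F, repeated so the snippet runs on its own.

```python
# Stand-ins for the definitions from Exercise F. Only the keys of
# OPERATION matter here, so the values are left as None.
OPERATION = dict.fromkeys(['*', '+', '.', 'emit', 'input', 'nl'])


def is_var_name(token):
    return token[:1].isalpha() and token.isalnum() and token not in OPERATION


def find_vars_used(tokens):
    """Return the set of variable names fetched or stored in tokens."""
    names = set()
    for token in tokens:
        # 'x!' and 'x' both refer to the variable x.
        name = token[:-1] if token.endswith('!') else token
        if is_var_name(name):
            names.add(name)
    return names
```

Here str.endswith('!') is used in place of the `token[-1] == '!'` test suggested above; both behave the same for the non-empty tokens produced by str.split().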

4.11. ★ Exercise I: Map Memory and Variables

With our compiler now able to extract the complete set of unique variable names from the token stream, we can begin the code-generation phase for memory allocation. In WebAssembly, variables are treated as function-local storage. Before they can be used within our main program logic, they must be explicitly declared at the top of the function structure using the (local $name type) syntax.

In your chiqui_forth.py script, implement a helper function named declare_vars(vars) that complies with the following specification:

Result List Initialization: The function must initialize an empty list to accumulate the generated WebAssembly Text (WAT) instruction strings.
Declaration Assembly: It must iterate through the set of variable names (vars) in alphabetical order and generate a string declaration for each one. Each generated string must be indented with four leading spaces for clean code alignment and must declare the variable as a 32-bit integer (i32) using the exact format: (local $name i32).
Return Value: The function must return the final list containing all the generated WAT declaration strings.

Once you have implemented the function, you can verify its behavior by opening an interactive Python REPL session and running a few test cases:

>>> from chiqui_forth import declare_vars
>>> declare_vars({'y', 'x'})
['    (local $x i32)', '    (local $y i32)']
>>> declare_vars({'b', 'c', 'a'})
['    (local $a i32)', '    (local $b i32)', '    (local $c i32)']
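
A list comprehension over the sorted names is enough here. This is one possible sketch; it keeps the parameter name `vars` requested by the exercise, even though that name shadows a Python built-in inside the function.

```python
def declare_vars(vars):
    """Return one WAT local declaration per variable, sorted by name."""
    # sorted() accepts any iterable and returns a new list, so the
    # unordered set yields a stable, alphabetical result.
    return [f'    (local ${name} i32)' for name in sorted(vars)]
```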

4.12. ★★ Exercise J: Generate Target Instructions

Now that our compiler can declare variables and recognize numbers, predefined operations, and identifiers, we have arrived at the heart of the translation pipeline: the code generator. This component iterates through the clean token stream and synthesizes the corresponding executable WebAssembly Text (WAT) instructions for the stack machine.

In your chiqui_forth.py script, implement a function named code_generation(tokens) that complies with the following specification:

List Initialization: The function must initialize an empty list to accumulate the generated WAT instruction strings.
Translation Loop: It must iterate through the incoming list of tokens and translate each word based on its syntactic category:
- Numbers: If the token is a numeric literal, it must generate a WAT command to push that constant onto the stack: i32.const value.
- Predefined Operations: If the token exists in the OPERATION table, it must loop through all the WAT instructions associated with that key and append each one, ensuring they are indented with four leading spaces.
- Variable Fetching: If the token is a valid variable name, it must generate a WAT command to read the value from local storage and push it onto the stack: local.get $name.
- Variable Storing: If the token ends with the store operator (!) and the prefix is a valid variable name, it must generate a WAT command to pop the top value off the stack and store it in that variable: local.set $name.
Error Handling: If the function encounters any token that does not fit into any of the above categories, it must print the following error message to the standard error and immediately terminate execution with an exit code of 1:
```
Error: '<token>' is not a valid word.
```
Return Value: The function must return the final list containing all the synthesized WAT instruction strings.

Pay close attention to indentation. Every generated instruction string must be explicitly prefixed with four leading spaces to maintain clean, human-readable structure inside the final WebAssembly module output.

Once you have implemented the function, you can verify its behavior by opening an interactive Python REPL session and running a few test cases:

>>> from chiqui_forth import code_generation
>>> code_generation(['2', '3', '*', '.'])
['    i32.const 2',
 '    i32.const 3',
 '    i32.mul',
 '    call $print']
>>> code_generation(['10', 'x!', 'x', '5', '+', '.', 'nl'])
['    i32.const 10',
 '    local.set $x',
 '    local.get $x',
 '    i32.const 5',
 '    i32.add',
 '    call $print',
 '    i32.const 10',
 '    call $emit']
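
The translation loop can be written as a chain of category checks. The sketch below is one possible shape for it, not the reference solution; `OPERATION`, `is_var_name()`, and `is_number()` are compact stand-ins for your own definitions from the earlier exercises, included so the snippet runs on its own.

```python
import sys

# Stand-ins for the definitions from Exercises F and G.
OPERATION = {
    '*': ['i32.mul'],
    '+': ['i32.add'],
    '.': ['call $print'],
    'emit': ['call $emit'],
    'input': ['call $input'],
    'nl': ['i32.const 10', 'call $emit'],
}


def is_var_name(token):
    return token[:1].isalpha() and token.isalnum() and token not in OPERATION


def is_number(token):
    return token.removeprefix('-').isdigit()


def code_generation(tokens):
    """Translate tokens into a list of indented WAT instructions."""
    result = []
    for token in tokens:
        if is_number(token):
            result.append(f'    i32.const {token}')
        elif token in OPERATION:
            # One word may expand into several instructions (e.g. nl).
            result.extend(f'    {instr}' for instr in OPERATION[token])
        elif is_var_name(token):
            result.append(f'    local.get ${token}')
        elif token.endswith('!') and is_var_name(token[:-1]):
            result.append(f'    local.set ${token[:-1]}')
        else:
            print(f"Error: '{token}' is not a valid word.", file=sys.stderr)
            sys.exit(1)
    return result
```

The four categories are mutually exclusive (a number never starts with a letter, and a name ending in '!' is never alphanumeric), so the order of the checks does not change the result.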

4.13. ★ Exercise K: Emit Target Artifacts

We have arrived at the final stage of our compiler pipeline: emitting the target files. Up until now, our synthesized WebAssembly Text (WAT) instructions have only existed as strings inside a Python list. To make our code runnable, we must inject these instructions into a complete WebAssembly module template, save it as a .wat text file, and compile it into a raw binary .wasm file using the wat2wasm() function from the wasmtime module.

At the global level of your script, define the following string template that represents our bare-bones WebAssembly module structure:

WAT_TEMPLATE = ''';; chiqui_forth compiler WAT output

(module
  (import "forth" "emit" (func $emit (param i32)))
  (import "forth" "input" (func $input (result i32)))
  (import "forth" "print" (func $print (param i32)))
  (func (export "_start")
{}
  )
)'''

This template defines a WebAssembly module that hooks into three host functions (emit, input, and print) — which we will define later in our runtime environment — to grant our compiled code native access to basic terminal input and output operations. The empty curly braces {} act as a structural format placeholder where your merged variable declarations and instruction body lines will be dynamically injected.

Next, implement a helper function named create_target_files(source_path, compiled_lines) that complies with the following specification:

Base Name Extraction: The function must take the incoming source_path and use Python’s os.path.splitext() utility to strip away the original file extension, isolating the clean base path name.
Template Merging: Utilize the str.join() method to concatenate the strings in compiled_lines (which contains both variable declarations and instructions) using newlines as delimiters. Then, use str.format() to inject the resulting string into the {} placeholder of WAT_TEMPLATE.
WAT Generation: It must create a text file using the isolated base name appended with the .wat extension, writing the full WebAssembly Text source code into it.
Wasm Generation: It must compile that same WAT string into binary format using the wasmtime.wat2wasm() function, and save the resulting bytes into a binary file using the base name appended with the .wasm extension.

Remember that when writing the text file, you should use standard 'w' mode, but when writing the compiled .wasm file, you must open it in binary write mode ('wb') since wat2wasm() returns a raw sequence of bytes rather than plain text.

Once you have implemented the function, you can verify its behavior by opening an interactive Python REPL session and running a test case:

>>> from chiqui_forth import create_target_files
>>> lines = ['    (local $x i32)', '    i32.const 42', '    local.set $x']
>>> create_target_files('test_program.4th', lines)

After executing the function, check your current directory from the terminal. You should see two brand-new files generated: test_program.wat (which you can open in any text editor) and test_program.wasm (the binary module ready for execution).
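
If the generated files do not look right, it can help to inspect the merged text before anything is written to disk. The sketch below covers only the base-name extraction and template-merging steps; `build_wat()` is a hypothetical helper introduced for this illustration, and the file writing and the wat2wasm() call are deliberately left out.

```python
import os

WAT_TEMPLATE = ''';; chiqui_forth compiler WAT output

(module
  (import "forth" "emit" (func $emit (param i32)))
  (import "forth" "input" (func $input (result i32)))
  (import "forth" "print" (func $print (param i32)))
  (func (export "_start")
{}
  )
)'''


def build_wat(source_path, compiled_lines):
    """Return (base_name, wat_source) without touching the file system.

    Hypothetical helper: create_target_files() would go on to write
    wat_source to base_name + '.wat' and the compiled bytes to
    base_name + '.wasm'.
    """
    base_name, _extension = os.path.splitext(source_path)
    wat_source = WAT_TEMPLATE.format('\n'.join(compiled_lines))
    return base_name, wat_source


lines = ['    (local $x i32)', '    i32.const 42', '    local.set $x']
base_name, wat_source = build_wat('test_program.4th', lines)
print(base_name + '.wat')  # test_program.wat
print(wat_source)
```

The str.format() call works here because the template contains exactly one pair of curly braces; any additional literal braces in the template would have to be doubled.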

4.14. Putting It All Together

If you have successfully completed Exercises C through K, then you have built every individual component required for our translation pipeline. Congratulations! Now, it is time to assemble these isolated pieces into a cohesive, working compiler.

To achieve this, we will provide a new version of the main() function that will serve as the central orchestrator, defining a clear boundary between the Front End (which analyzes and cleans the source code) and the Back End (which synthesizes and emits the executable WebAssembly code).

In your chiqui_forth.py script, overwrite the main() function with the following code:

def main():
    """Control all the steps carried out by the compiler."""

    # === FRONT END ===
    source_path = get_source_filepath()  # (1)
    raw_tokens = read_words(source_path)  # (2)
    tokens = remove_comments(raw_tokens)  # (3)

    # === BACK END ===
    variable_declarations = declare_vars(find_vars_used(tokens))  # (4)
    instruction_body = code_generation(tokens)  # (5)
    compiled_lines = variable_declarations + instruction_body  # (6)
    create_target_files(source_path, compiled_lines)  # (7)

if __name__ == '__main__':
    main()

1	Retrieves the input file path supplied via the command-line arguments.
2	Reads the contents of the source file and splits them into an initial stream of raw, whitespace-separated text tokens.
3	Sweeps through the token stream to filter out all comments, leaving behind only actionable code words.
4	Scans the clean tokens to identify all unique variables and maps them to formatted WAT `(local …)` allocation statements.
5	Translates the operational tokens into their corresponding stack-machine WAT instructions.
6	Concatenates the structural variable definitions and the executable instructions into a unified list of output lines.
7	Injects the unified lines into our module template, saving the human-readable `.wat` file and generating the final executable `.wasm` binary.

4.15. The chiqui_forth Runtime

Now that our compiler can generate valid binary .wasm files, we need a runtime environment to execute them and provide the implementation for our host I/O functions. The following complete Python script, chiqui_forth_rt.py, leverages the wasmtime library to instantiate our compiled WebAssembly modules, link the required external host functions (emit, input, and print) directly to native Python operations, and trigger the exported _start function to run the program from the terminal.

File: chiqui_forth_rt.py

# File: chiqui_forth_rt.py
# Copyright (C) 2026 Ariel Ortiz
# SPDX-License-Identifier: GPL-3.0-or-later

""" The chiqui_forth runtime

To run, at the terminal type:

    python chiqui_forth_rt.py some_program.wasm
"""

from sys import argv, stderr, exit
from wasmtime import Engine, Store, Module, Linker, FuncType, ValType


def create_linker_and_store(engine):
    """Create a Linker and Store, and register the Python functions
    that will be callable from the WASM module.
    """
    store = Store(engine)
    linker = Linker(engine)

    #----------------------------------------------------------------
    # Functions to be imported by the WASM module.

    def _emit(x):
        print(chr(x), end='')

    def _input():
        try:
            return int(input())
        except ValueError:
            return 0

    def _print(x):
        print(x, end=' ')

    #----------------------------------------------------------------

    # Define the signatures and define them in the "forth" module
    linker.define_func("forth", "emit", FuncType([ValType.i32()], []), _emit)
    linker.define_func("forth", "input", FuncType([], [ValType.i32()]), _input)
    linker.define_func("forth", "print", FuncType([ValType.i32()], []), _print)

    return linker, store


def create_instance(file_name, engine):
    """Use wasmtime API to take care of all the details required to
    instantiate a module contained in a WASM file.
    """
    module = Module.from_file(engine, file_name)
    linker, store = create_linker_and_store(engine)

    # Instantiate the module using the linker and the store
    instance = linker.instantiate(store, module)
    return instance, store


def check_args():
    """Verify that there is one command line argument; if not, display
    an error message and exit.
    """
    if len(argv) != 2:
        print('Please specify the name of a Wasm binary file.',
              file=stderr)
        exit(1)


def main():
    """Control the steps to execute a Wasm module."""
    check_args()

    # Initialize the global Wasmtime Engine
    engine = Engine()

    instance, store = create_instance(argv[1], engine)

    # Look up and run the exported _start function
    start_func = instance.exports(store).get("_start")
    if start_func:
        start_func(store)
    else:
        print("Error: '_start' function not found in WASM module.", file=stderr)
        exit(1)


if __name__ == '__main__':
    main()

4.16. Complete chiqui_forth Examples

With both your compiler orchestrator (chiqui_forth.py) and your host runtime environment (chiqui_forth_rt.py) fully completed, you are ready to test the entire compilation and execution loop. Below are two sample program listings designed to test character output, structural comment stripping, dynamic variable allocation, and interactive terminal input.

Save the following code blocks into individual files within your active development directory, then follow the terminal steps to compile and execute them.

4.16.1. Example: The Ubiquitous Greeting

This program tests your compiler’s ability to strip several comments from the token stream and verifies that your code generator passes numeric character codes directly to the imported host emit function.

File: hello_world.4th

( File: hello_world.4th )
( This program displays "Hello, world!" on the screen )

72 emit 101 emit 108 emit 108 emit 111 emit 44 emit 32
emit 119 emit 111 emit 114 emit 108 emit 100 emit 33 emit
nl

To compile this program, pass the source file path to your compiler via the terminal:

python chiqui_forth.py hello_world.4th

Verify that your current working directory now contains two brand-new compilation artifacts: hello_world.wat and hello_world.wasm. Now, pass the compiled binary file into your wasmtime runtime script to execute it:

python chiqui_forth_rt.py hello_world.wasm

Expected output:

Hello, world!
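
Looking up character codes by hand gets tedious quickly. As a convenience, the following sketch generates the emit sequence for any string; `emit_words()` is a hypothetical helper written for this illustration and is not part of the compiler.

```python
def emit_words(text):
    """Return chiqui_forth source that prints text one character at a time."""
    # ord() gives the Unicode code point that emit expects on the stack.
    return ' '.join(f'{ord(char)} emit' for char in text)


print(emit_words('Hello, world!'))
# 72 emit 101 emit 108 emit 108 emit 111 emit 44 emit 32 emit 119 emit
# 111 emit 114 emit 108 emit 100 emit 33 emit   (printed on a single line)
```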

4.16.2. Example: Variables and Mathematical Operations

This program tests your backend’s capability to parse custom variable names (x and y), declare them automatically as WebAssembly locals, safely handle interactive numeric inputs, and perform basic arithmetic stack operations (+ and *).

File: numbers.4th

( File: numbers.4th )
( Adds and multiplies two user provided numbers. )

62 emit 32 emit ( Print first prompt. )
input x!

62 emit 32 emit ( Print second prompt. )
input y!

( Print: x + y = result )
x . 43 emit 32 emit y . 61 emit 32 emit x y + . nl

( Print: x * y = result )
x . 42 emit 32 emit y . 61 emit 32 emit x y * . nl

Run the chiqui_forth compiler for this interactive script:

python chiqui_forth.py numbers.4th

Open the newly generated numbers.wat file in your text editor. You should observe that your declare_vars logic successfully synthesized two lines reading (local $x i32) and (local $y i32) directly at the beginning of the _start function block.

Finally, run the interactive program. It prints a prompt character (>) and then waits for your console input:

python chiqui_forth_rt.py numbers.wasm

Expected output:

> 7
> 5
7 + 5 = 12
7 * 5 = 35
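
Both examples follow the same compile-then-run cycle, so it can be convenient to script it. The helper below is a sketch under a few assumptions: build_commands and compile_and_run are hypothetical names, the source file lives in the current working directory, and the compiler writes the .wasm file next to it as described above. It uses sys.executable instead of the literal python command so that the same interpreter is reused.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(source: str) -> list[list[str]]:
    """Return the compile and run commands for one .4th source file."""
    wasm = Path(source).with_suffix(".wasm")
    return [
        [sys.executable, "chiqui_forth.py", source],
        [sys.executable, "chiqui_forth_rt.py", str(wasm)],
    ]


def compile_and_run(source: str) -> None:
    """Run both steps in order, stopping at the first one that fails."""
    for command in build_commands(source):
        subprocess.run(command, check=True)


commands = build_commands("hello_world.4th")
print(commands[0][1:])
print(commands[1][1:])
# To actually execute both steps: compile_and_run("hello_world.4th")
```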

4.17. ★ Exercise L: Expanding the Operator Vocabulary

Now that your compiler can cleanly translate and execute basic chiqui_forth programs, it is time to expand its computational capabilities. Right now, our code generator only recognizes a tiny handful of operations. To build a truly practical stack machine, we need to support a complete set of arithmetic and relational operations.

Modify the OPERATIONS mapping dictionary in your chiqui_forth.py script so that the compiler supports all the operation words detailed in the following table. Pay close attention to the _s suffix on some WebAssembly instructions: it selects the signed variant, which is required for signed integer division and ordering comparisons.

chiqui_forth Word	Description
\(\texttt{-}\)	Subtraction \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x - y) \end{matrix}\] WAT instruction: `i32.sub`
\(\texttt{/}\)	Division \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textit{push}(x \div y) \end{matrix}\] WAT instruction: `i32.div_s`
\(\texttt{=}\)	Equal \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x = y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.eq`
\(\texttt{<>}\)	Not Equal \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \ne y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.ne`
\(\texttt{<}\)	Less Than \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x < y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.lt_s`
\(\texttt{<=}\)	Less or Equal \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \le y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.le_s`
\(\texttt{>}\)	Greater Than \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x > y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.gt_s`
\(\texttt{>=}\)	Greater or Equal \[\begin{matrix} y \gets \textit{pop} \\ x \gets \textit{pop} \\ \textrm{if} \; x \ge y \; \textrm{then} \; \textit{push}(1) \; \textrm{else} \; \textit{push}(0) \end{matrix}\] WAT instruction: `i32.ge_s`
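
The pop order in the table matters for the non-commutative words, and i32.div_s truncates toward zero, which differs from Python's floor division for negative operands. The following Python model of the table is a sketch for checking expectations only. It is not compiler code, and the names to_i32, div_s, SEMANTICS, and apply_word are hypothetical.

```python
INT32_MIN = -2**31


def to_i32(value: int) -> int:
    """Wrap a Python int into the signed 32-bit range, as i32 arithmetic does."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def div_s(x: int, y: int) -> int:
    """Signed division truncating toward zero, modelling i32.div_s."""
    if y == 0:
        raise ZeroDivisionError("i32.div_s traps on division by zero")
    if x == INT32_MIN and y == -1:
        raise OverflowError("i32.div_s traps when the quotient overflows")
    quotient = abs(x) // abs(y)
    return quotient if (x < 0) == (y < 0) else -quotient


# Value pushed for operands x (pushed first) and y (on top of the stack).
SEMANTICS = {
    "-": lambda x, y: to_i32(x - y),
    "/": div_s,
    "=": lambda x, y: int(x == y),
    "<>": lambda x, y: int(x != y),
    "<": lambda x, y: int(x < y),
    "<=": lambda x, y: int(x <= y),
    ">": lambda x, y: int(x > y),
    ">=": lambda x, y: int(x >= y),
}


def apply_word(stack: list[int], word: str) -> None:
    """Pop y, then x, and push the result of applying `word` to (x, y)."""
    y = stack.pop()
    x = stack.pop()
    stack.append(SEMANTICS[word](x, y))


stack = [5, 12]
apply_word(stack, "-")   # 5 - 12 leaves -7
stack.append(2)
apply_word(stack, "/")   # -7 / 2 truncates toward zero
print(stack)             # [-3]
```

Python's own -7 // 2 evaluates to -4, so the model deliberately avoids the // operator on signed operands.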

Once you have updated your operational dictionary, verify your compiler’s extended vocabulary by compiling and executing the provided validation program:

File: operators.4th

( File: operators.4th )
( Tests all the operator words from exercise L. )
( Prints "Everything is working fine!" if everything is working fine. )

5  x!
12 y!

x y -
y x -
x x -

x y /
y x /
x x /

x y =
y x =
x x =

x y <>
y x <>
x x <>

x y <
y x <
x x <

x y <=
y x <=
x x <=

x y >
y x >
x x >

x y >=
y x >=
x x >=

68 + emit 117 + emit 101 + emit 114 + emit 120 + emit 116 + emit
103 + emit 105 + emit 109 + emit 103 + emit 32 + emit 104 + emit
115 + emit 31 + emit 118 + emit 110 + emit 114 + emit 107 + emit
104 + emit 108 + emit 103 + emit 32 + emit 95 + emit 112 + emit
110 emit 101 emit 33 emit
nl

Compile the source program:

python chiqui_forth.py operators.4th

Run the program:

python chiqui_forth_rt.py operators.wasm

If every word in your dictionary maps to the correct WAT instruction, the program prints the following confirmation message to your console:

Everything is working fine!
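
The test program works by leaving 24 results on the stack and then adding each one, from the top down, to a base character code before emitting it. A wrong mapping therefore shows up as a garbled letter. The sketch below replays that logic in Python, assuming the operator semantics from the table above. The names div_s, OPS, BASES, and TAIL are hypothetical.

```python
import operator


def div_s(a: int, b: int) -> int:
    """Signed division truncating toward zero (model of i32.div_s)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


x, y = 5, 12

# Same order as operators.4th: - / = <> < <= > >=
OPS = [operator.sub, div_s, operator.eq, operator.ne,
       operator.lt, operator.le, operator.gt, operator.ge]

stack = []
for op in OPS:
    # Each word is applied to (x, y), then (y, x), then (x, x).
    for a, b in ((x, y), (y, x), (x, x)):
        stack.append(int(op(a, b)))

# Base codes from the "<base> + emit" sequence, in source order.
BASES = [68, 117, 101, 114, 120, 116, 103, 105, 109, 103, 32, 104,
         115, 31, 118, 110, 114, 107, 104, 108, 103, 32, 95, 112]
TAIL = [110, 101, 33]  # emitted without an addition

message = "".join(chr(base + stack.pop()) for base in BASES)
message += "".join(chr(code) for code in TAIL)
print(message)
```

If your compiled program prints something different, the position of the first wrong character indicates which result is off, counting from the last operation in the file backwards.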

4.18. ★ Exercise M: Introducing Repetition

Our stack machine needs a mechanism to repeat execution paths. To achieve this, we will implement a do/loop repetition construct. Its syntax and behavior mirror Python’s structured while loops, providing a clean way to perform conditional iteration:

\[\texttt{do} \; \textit{condition} \; \texttt{?} \; \textit{body} \; \texttt{loop}\]

When this construct executes, the $\textit{condition}$ expression is evaluated first. If the resulting value on top of the stack is zero (representing false), execution immediately breaks out of the construct and jumps to the code following the loop keyword. If the condition is non-zero (true), the $\textit{body}$ code runs, the $\textit{condition}$ is evaluated once again, and the process repeats.

Note that a question mark (?) word is mandatory in this syntax to explicitly mark the boundary where the $\textit{condition}$ block ends and the executable $\textit{body}$ begins.

The following program example demonstrates how to use this construct to count from 1 to 10:

File: 1_to_10.4th

( File: 1_to_10.4th )
( Display numbers 1 to 10, each number on its own line. )

1 x!          ( Initialize x with 1. )
do
    x 10 <= ? ( Continue in loop while x is less than or equal to 10. )
    x . nl    ( Print the current value of x. )
    x 1 + x!  ( Increment x by 1. )
loop
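
For comparison, the same program can be written as a Python while loop. This is only an illustration of the control flow: the list named lines is an assumption introduced so the result can be inspected.

```python
x = 1                      # 1 x!
lines = []
while x <= 10:             # do  x 10 <= ?
    lines.append(str(x))   # x . nl
    x += 1                 # x 1 + x!
print("\n".join(lines))    # the code after "loop" runs once the test fails
```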

Modify the OPERATIONS mapping dictionary in your chiqui_forth.py script so that the compiler supports the do, ?, and loop words by translating them to their corresponding WebAssembly structured control flow expressions as detailed below:

chiqui_forth Word	Corresponding WAT Code Implementation
\(\texttt{do}\)	`block loop`
\(\texttt{?}\)	`i32.eqz br_if 1`
\(\texttt{loop}\)	`br 0 end end`

To gain a deeper understanding of how these target instructions manipulate structural execution frames, consult the MDN WebAssembly reference documentation for the block, loop, br, and i32.eqz primitives.
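
The branch depths are easier to read once the expansion is indented. Inside the loop, label 0 refers to the loop itself, so br 0 jumps back to its start, while label 1 refers to the enclosing block, so br_if 1 leaves the whole construct. Because i32.eqz turns a zero condition into 1, that exit is taken exactly when the condition is false. The sketch below prints the expansion for the 1_to_10 loop. LOOP_WORDS and expand are hypothetical names, the structure is not necessarily that of the tutorial's OPERATIONS dictionary, and all other words are shown only as WAT comments.

```python
LOOP_WORDS = {
    "do": ["block", "loop"],
    "?": ["i32.eqz", "br_if 1"],
    "loop": ["br 0", "end", "end"],
}


def expand(words: list[str]) -> list[str]:
    """Expand loop words into indented WAT; other words become comments."""
    lines = []
    depth = 0
    for word in words:
        if word not in LOOP_WORDS:
            lines.append("  " * depth + f";; code for: {word}")
            continue
        for instruction in LOOP_WORDS[word]:
            if instruction == "end":
                depth -= 1          # an end closes the innermost frame
            lines.append("  " * depth + instruction)
            if instruction in ("block", "loop"):
                depth += 1          # block and loop open a new frame
    return lines


wat = expand("do x 10 <= ? x . nl x 1 + x! loop".split())
print("\n".join(wat))
```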

WebAssembly control structures enforce strict stack rules. Any temporary values pushed to the stack inside your do/loop construct (during the condition or the body) must be completely consumed — either stored in a variable or printed — before the loop repeats or ends. If you leave trailing, unconsumed values on the stack, the WebAssembly runtime will reject your program with a validation error.
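
One way to check this rule before compiling is to add up the stack effect of each word. The sketch below handles a single, non-nested loop. The per-word effects are assumptions inferred from the examples in this tutorial (for instance, that input pushes one value and that name! pops one), and the names stack_effect and check_loop are hypothetical.

```python
BINARY = {"+", "-", "*", "/", "=", "<>", "<", "<=", ">", ">="}
POP_ONE = {".", "emit"}
NEUTRAL = {"nl"}


def stack_effect(word: str) -> int:
    """Net change in stack depth caused by one word (assumed effects)."""
    if word in BINARY or word in POP_ONE:
        return -1                  # two in, one out; or one in, none out
    if word in NEUTRAL:
        return 0
    if len(word) > 1 and word.endswith("!"):
        return -1                  # store into a variable
    return 1                       # literal, variable read, or input


def check_loop(words: list[str]) -> tuple[int, int]:
    """Return (condition_net, body_net) for the words between do and loop."""
    split = words.index("?")
    condition = sum(stack_effect(w) for w in words[:split])
    body = sum(stack_effect(w) for w in words[split + 1:])
    return condition, body


balanced = check_loop("x 10 <= ? x . nl x 1 + x!".split())
leaky = check_loop("x 10 <= ? x x . nl".split())
print(balanced)  # (1, 0): one flag for "?", nothing left by the body
print(leaky)     # (1, 1): the body leaves one value behind
```

A loop follows the rule when the condition nets exactly one value (the flag consumed by ?) and the body nets zero.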

Once your dictionary mappings are in place, verify the pipeline by compiling and running the test case:

python chiqui_forth.py 1_to_10.4th

python chiqui_forth_rt.py 1_to_10.wasm

Expected output:

1
2
3
4
5
6
7
8
9
10

4.19. Advanced chiqui_forth Examples

To thoroughly evaluate your compiler’s brand-new looping capabilities, try compiling and running the following two sophisticated test programs. Both scripts rely on nested loop structures to compute and render patterns dynamically based on interactive terminal input.

Save these listings to your local directory and test them using any input value from 5 to 20.

4.19.1. Example: Text-Based Geometric Rendering

This program reads an integer from the user and leverages a nested do/loop control structure to output a right-angled triangle composed of asterisks.

File: triangle.4th

( File: triangle.4th )
( Draws a triangle of stars of a size provided by the user. )

62 emit 32 emit ( Print prompt. )
input n!
1 i!
do
    i n <= ?
    i j!
    do
        j 0 > ?
        42 emit ( Print * )
        j 1 - j!
    loop
    nl
    i 1 + i!
loop

Once you compile it and run it, the expected output is:

> 5
*
**
***
****
*****

4.19.2. Example: Tabulating Powers of Two

This program tabulates the powers of two from exponent 0 up to a user-provided exponent n. It computes each power by repeated multiplication in an inner loop and prints one formatted line per exponent.

File: pow2.4th

( File: pow2.4th )
( Prints all the powers of two from 0 to a user provided value. )

62 emit 32 emit ( Print prompt. )
input n!
0 i!
do
    i n <= ?
    1 r!
    1 j!
    do
        j i <= ?
        r 2 * r!
        j 1 + j!
    loop
    2 . 94 emit 32 emit i . 61 emit 32 emit r . ( Print: 2 ^ i = r )
    nl
    i 1 + i!
loop
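
As a cross-check, the same nested loops can be written in Python and compared against a bit shift. This is a sketch with a hypothetical function name, pow2_table. Note that Python integers do not wrap around, whereas i32 values do, so the comparison only holds while the result fits in a signed 32-bit integer, that is, for n up to 30.

```python
def pow2_table(n: int) -> list[str]:
    """Return the lines pow2.4th is expected to print for exponents 0..n."""
    rows = []
    i = 0
    while i <= n:                 # outer do ... loop
        r = 1
        j = 1
        while j <= i:             # inner do ... loop: multiply i times
            r *= 2
            j += 1
        assert r == 1 << i        # same value as a left shift
        rows.append(f"2 ^ {i} = {r}")
        i += 1
    return rows


table = pow2_table(5)
print("\n".join(table))
```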

Compile and run the tabulator. For an input of 5, the expected output is:

> 5
2 ^ 0 = 1
2 ^ 1 = 2
2 ^ 2 = 4
2 ^ 3 = 8
2 ^ 4 = 16
2 ^ 5 = 32

5. Additional Resources

5.1. Books

What Is WebAssembly?
By Colin Eberhardt
O’Reilly Media, Inc., 2019.

WebAssembly: The Definitive Guide
By Brian Sletten
O’Reilly Media, Inc., 2021.
ISBN: 978-1492089841

5.2. Online Resources

6. License and Credits

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
Free use of the source code presented here is granted under the terms of the GPL version 3 License.
This document was prepared using the AsciiDoctor text processor.
Icons by Flaticon.
The author utilized Gemini, a large language model by Google, for drafting assistance and technical review of these tutorial notes.