best is a statically typed Lisp interpreter. Every construct is an
s-expression; a mandatory declare form immediately before every defun
carries the function's full type signature. A static type checker runs as a
separate pass over the AST before any evaluation begins, so type errors are
caught at compile time and cannot occur at runtime in a well-typed program.
(declare factorial (int) int)
(defun factorial (n)
(if (== n 0)
1
(* n (factorial (- n 1)))))
(print (factorial 10)) ; => 3628800| Tool | Minimum version | Purpose |
|---|---|---|
| Rust | 1.70 | compiles all hand-written code |
| flex(1) | 2.6 | generates the lexer from best.l |
| bison(1) | 3.0 | generates the parser from best.y |
The cc crate (a Cargo build-dependency) compiles the C files that flex(1) and
bison(1) emit -- no separate C toolchain setup is needed beyond having gcc(1) or clang(1)
on PATH, which is the default on Debian.
sudo apt install flex bison build-essential
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | shClone the repository and run make(1):
git clone <repo-url>
cd best
make # debug build -> target/debug/best
make release # release build -> target/release/bestmake(1) is a thin wrapper around cargo build. The build script (build.rs)
invokes flex --posix and bison --yacc automatically -- you do not need to
run them by hand. Generated C files are written to Cargo's OUT_DIR and are
never committed to the repository.
To run all bundled example programs:
make examplesTo verify the project compiles without producing a binary:
make checkTo remove all build artifacts:
make clean./target/debug/best examples/factorial.best # 3628800
./target/debug/best examples/fibonacci.best # 55
./target/debug/best examples/while_loop.best # 10 9 8 ... 1
./target/debug/best examples/strings.best # Hello, World / 12| Type | Literals |
|---|---|
int |
0, 42, 1000 |
float |
3.14, 0.5 |
bool |
true, false |
str |
"hello", "world" |
void |
(no literal; returned by statements) |
; Variable declaration (immutable type, mutable value)
(let x int 5)
; Assignment
(assign x 10)
; Conditional -- both branches must have the same type
(if (> x 0) (print x) (print 0))
; Loop -- body is one or more expressions
(while (> x 0)
(print x)
(assign x (- x 1)))
; Print any non-void value
(print x)
(print "hello")A declare form carrying the full type signature must immediately precede
the corresponding defun. Any other form between them is an error.
(declare add (int int) int)
(defun add (a b)
(+ a b))Function bodies are a single expression (use if, while, or nested calls to
compose multiple steps).
| Category | Operators |
|---|---|
| Arithmetic | + - * / |
| Comparison | == != < > <= >= |
| Boolean | and or not |
| String | concat length |
; begins a comment that runs to the end of the line.
S-expression syntax was chosen for four concrete reasons.
-
Homoiconicity. Code and data share the same representation. This makes macros, code generation, and metaprogramming natural extensions of the language rather than bolted-on mechanisms.
-
No ambiguity by construction. S-expressions are fully parenthesised. There is no grammar ambiguity to resolve, no shift/reduce conflicts to suppress, and no need for precedence or associativity declarations. The grammar has five rules and generates a conflict-free LALR(1) automaton without any tuning.
-
No operator precedence.
(+ 1 (* 2 3))is explicit about evaluation order. There is nothing to memorise and nothing to get wrong. -
The AST is the source. Because the written form and the parsed form are structurally identical, a pretty-printer, a source formatter, or a code-generating tool is trivial to write -- just serialise the AST back to s-expressions.
The grammar is written in strict POSIX Yacc format (IEEE Std 1003.1). This is a
deliberate portability choice: a POSIX Yacc .y file is a self-contained,
formally standardised artifact. Anyone with any POSIX-compliant yacc(1) tool --
Berkeley yacc(1), byacc(1), bison(1) in yacc-compatibility mode -- can take best.y and
regenerate the parser independently of this project. No vendor lock-in, no
proprietary grammar extensions.
The .y file therefore contains no bison(1)-specific directives. There is no
%define, no %code, no %language, no %locations, no %skeleton, and no
%empty. The yyerror function is defined by hand as required by the POSIX
standard. The empty alternative in the program rule is expressed as a bare |
rather than the bison(1)-specific %empty keyword.
The --posix flag instructs flex(1) to enforce compliance with the POSIX
1003.2-1992 lex specification. The best.l file therefore:
- Contains no
%optiondirectives (a flex(1) extension). - Defines
yywrap()explicitly to return 1 at end-of-input, as required by POSIX lex. - Uses only constructs defined in the POSIX lex standard: named character-class
definitions in the definitions section (
DIGIT,ID_CHAR), string-literal rules, bracket expressions, and the+and*quantifiers.
The result is a lexer specification that is portable to any POSIX lex implementation.
The --yacc flag puts bison(1) into strict POSIX compatibility mode. It produces
the standard output file names y.tab.c and y.tab.h, and rejects any grammar
construct that is not valid POSIX Yacc.
The parsing algorithm is LALR(1) -- the algorithm mandated by the POSIX Yacc standard. LALR(1) has three properties that matter here:
-
Static ambiguity detection. Any ambiguity in the grammar produces a shift/reduce or reduce/reduce conflict that bison(1) reports at grammar-compile time and refuses to silently resolve. The grammar is proven unambiguous before the parser is ever run on source code. PEG parsers, by contrast, use ordered choice to silently discard ambiguous alternatives, giving no static guarantee.
-
Linear time. LALR(1) parsers run in O(n) time with no backtracking.
-
Left-recursion support. The
expr_listrule is left-recursive (expr_list : expr_list expr | expr), which is the natural form for a sequence. LALR(1) handles left recursion directly. Recursive-descent and LL parsers cannot.
LALR(1) vs canonical LR(1). LALR(1) is strictly weaker than canonical LR(1): there exist grammars that are LR(1) but not LALR(1). The Dragon Book gives a concrete example (Example 4.44 in the first edition, 4.58 in the second): a grammar over
{a, b, c, d, e}that generates exactly four strings. An LR(1) automaton builds two distinct states with identical item sets but different lookaheads -- no conflicts. The LALR algorithm merges states whose item sets are the same, collapsing those lookaheads and introducing a reduce/reduce conflict that did not exist before. The grammar is LR(1) but not LALR(1). In practice this gap is irrelevant for best: a five-rule s-expression grammar cannot produce the kind of overlapping lookahead sets that trigger it. POSIX Yacc mandates LALR(1), and we accept that constraint knowing the grammar is well within it.
The grammar itself is minimal -- five rules -- because s-expression syntax is structurally uniform. All syntactic structure is expressed through nesting, not operator precedence or associativity rules. There are no shift/reduce conflicts.
| Algorithm | Ambiguity | Backtracking | Left recursion | Standard |
|---|---|---|---|---|
| LALR(1) | Static error | None | Native | POSIX |
| PEG | Silent discard | None | Requires rewrite | No |
| LL(*) | Runtime detection | Possible | Requires rewrite | No |
| Earley | Runtime detection | N/A (chart) | Native | No |
ANTLR4 (Adaptive LL(*)) was considered and rejected: it defers ambiguity detection to runtime, has no stable officially maintained Rust target, and is not standardised.
For a thorough academic treatment of these trade-offs see Laurence Tratt, Which Parsing Approach? (2020).
All hand-written code outside best.l and best.y is Rust. The only C in the
project is generated by flex(1) and bison(1); it is never touched by hand. This gives
memory safety, a strong type system, and zero-cost abstractions for the AST,
type checker, and evaluator without requiring a C runtime in the final binary.
The semantic actions in best.y call Rust functions declared extern "C". The
AST nodes are allocated on the Rust heap using Box::into_raw and passed
through the grammar as opaque void * pointers. Ownership is reclaimed on the
Rust side with Box::from_raw when the node is consumed. No C data structures,
no C logic beyond what bison(1) requires syntactically.
build.rs invokes flex(1) and bison(1) as build steps, then compiles the generated C
files using the cc crate (a build-time dependency only -- zero runtime cost).
The final binary is a standalone executable with no shared-library dependencies
beyond libc.
The type checker is a separate pass over the AST that runs before any evaluation begins. This separation is intentional:
- Correctness. All type errors are reported before the program produces any output or side effects. A program either type-checks completely or does not run at all.
- Clarity. Type errors carry source locations and precise messages. The evaluator can then assume well-typedness and omit redundant checks.
The checker does a pre-pass to collect all declare signatures before type-
checking any bodies. This allows mutual recursion and forward calls between
functions regardless of definition order, while still requiring declare to
immediately precede its defun.
best is Turing-complete via the combination of while (unbounded
iteration), assign (mutable state), if (conditional branching), and
recursion. Recursive functions achieve the same expressive power as general
recursion.
best/
best.l POSIX lex lexer grammar
best.y POSIX yacc parser grammar
build.rs invokes flex and bison; compiles generated C via cc crate
Cargo.toml cc as build-dependency only
Makefile convenience targets: build, release, examples, clean
examples/
factorial.best
fibonacci.best
while_loop.best
strings.best
src/
main.rs entry point; calls parse_file, type checker, evaluator
ast.rs Expr type; all extern "C" FFI constructors
error.rs BestError with source location
env.rs TypeEnv and EvalEnv
typecheck.rs static type checker pass
eval.rs tree-walking evaluator