ats is a language that is a
little infamous among type system likers, for its complexity, its weird
syntax, and its thin documentation. well i won’t be beat, so here’s a
little implementation of simply-typed lambda calculus. right now it
doesn’t actually use many ats-specific features, since i just didn’t end
up needing them.
it does leak memory without gc though, so maybe i should have. oh
well one thing at a time
installation, et cetera
instead of trying to install ats myself, i just used nix, since i already have it installed as
the least annoying package manager for idris. i just run
nix shell nixpkgs#ats2
and i’m in a shell with
patscc
and patsopt
available. if you want to
try building it yourself, that’s, uh, your choice.
to compile the program, i wrote a makefile. just a small one!
ATSCC := patscc
ATSFLAGS := \
-DATS_MEMALLOC_GCBDW -lgc \
-atsccomp \
'gcc -I$${PATSHOME} -I$${PATSHOME}/ccomp/runtime \
-L$${PATSHOME}/ccomp/atslib/lib -std=gnu99'
%: %.dats
$(ATSCC) $< $(ATSFLAGS) -o $@
(in some installs the compiler is just called atscc
for
some reason.)
the first line of ATSFLAGS
tells it which allocator to
use. in this case, libgc
. the rest tells it to use
-std=gnu99
, so that it can find alloca()
; the
rest is the same as the default.
one last thing. the first two lines of the program have to be this. i
didn’t look into it too hard but they seem to define… something… and if
you forget to include them the c output by ats isn’t even syntactically
correct. ouch.
#include "share/atspre_define.hats"
#include "share/atspre_staload.hats"
types
let’s start by defining the syntax for types. it’s going to be a
single base type, and functions.
\[
\newcommand\Typ{\mathit{typ}}
\newcommand\OR{\mathrel|}
\Typ ::= \bullet \OR \Typ_1 \to \Typ_2
\]
so obviously, my first attempt was to just write this:
datatype typ = Base | Arr of (typ, typ)
i’m going to need to check two types for equality, so let’s do that
next.
fun eq_typ (a : typ, b : typ) .<a>. :<> bool =
case (a, b) of
| (Base(), Base()) => true
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
| (_, _) => false
alas: (it is talking about the .<a>.
)
/home/niss/ats/lam.dats: 149(line=7, offs=33) -- 150(line=7, offs=34):
error(2): the static identifier [a] is unrecognized.
surprise! as far as i can tell ats only allows termination checking
to be based on static (compile time) information. so i need to
index the datatype with, uh, something. when we define functions on
syntax like this, they basically always recurse on a subterm. so the
height of the syntax tree will do. what i actually have are
these more messy definitions.
datatype typ(int) =
| Base(1)
| {h1, h2 : nat} Arr(1 + max(h1, h2)) of (typ(h1), typ(h2))
typedef typ = [h : nat] typ(h)
the base type \(\bullet\) has height
1, and an arrow type has a height one more than the max height of its
subterms. nat
is a version of int
constrained
to be at least zero, rather than a separate type, so that’s why it says
int
one place and nat
the other. the
typechecker has a linear arithmetic solver for reasoning about
inequalities and expressions like 1 + max(h1, h2)
.
i don’t want to be talking about “types of height \(h\)” unless i really have to, though, so
there’s a definition for just a type too. the syntax
[x : t1] t2
is an existential type that is automatically
wrapped and unwrapped, so in this case it is a typ(h)
for
some, unknown, h
. also, names can be overloaded by their
arity so typ(h)
and typ
can both be defined in
the same scope.
so now eq_typ
has to have a more intricate type to talk
about height. the definition is still the same though.
fun eq_typ {ha : nat} .<ha>. (a : typ(ha), b : typ) :<> bool =
case (a, b) of
| (Base(), Base()) => true
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
| (_, _) => false
overload = with eq_typ
since i only actually need to care about one of the arguments’ height
for the termination proof, the other can still just be a
typ
. the last line lets me write a = b
instead
of eq_typ(a, b)
, which is nice.
next, i want to be able to print types. ideally this would be a
function that returns a string, but string stuff involves pointer
weirdness i haven’t worked out yet. so it just prints. the only two
precedence levels in these types are “left of an arrow” and “not left of
an arrow”, so that can just be a bool
. unfortunately
print
has no information about its termination behaviour so
there’s no point bothering with that stuff here either. 😕
fun print_typ (parens : bool, t : typ) : void =
case t of
| Base() => print "•"
| Arr(a, b) => begin
if parens then print("(");
print_typ(true, a);
print(" → ");
print_typ(false, b);
if parens then print(")");
end
fn print_typ (t : typ) : void = print_typ(false, t)
overload print with print_typ
terms
types are done, now let’s move on to terms. we just want names,
lambdas, and application.
\[
\newcommand\Term{\mathit{term}}
\newcommand\Var{\mathit{var}}
\Term ::= \Var \OR \lambda \Var : \Typ. \Term \OR \Term_1 \; \Term_2
\]
for the same reason as before, terms are indexed by their height, but
also by the number of variables in scope, since i want de
bruijn indices to be intrinsically well-scoped. why even have a type
system otherwise. so a term(n, h)
has \(n\) variables and height \(h\).
datatype term(int, int) =
| {n, i : nat | i < n} Var(n, 1) of int(i)
| {n, h : nat} Lam(n, 1 + h) of (typ, term(1 + n, h))
| {n, h1, h2 : nat} App(n, 1 + max(h1, h2)) of (term(n, h1), term(n, h2))
typedef term(n : int) = [h : nat] term(n, h)
a variable is a de bruijn
index, which is a natural less than n
. to express this
we need to bridge the static world, where n
lives, and the
dynamic, where Var
’s field lives. so i’m using a singleton
type int(i)
, whose only value is i
itself.
with this type, you can constrain the static i
, and in turn
the value of the field.
contexts
typedef ctx(n: int) = list(typ, n)
a typing context is just a list of types of the appropriate length.
since there’s no binding in types here, that’s all we need.
looking up into a context is just list lookup, with some constraints
to make it work.
fun lookup {n, i : nat | i < n} .<n>. (g : ctx(n), i : nat(i)) :<> typ =
let val+ list_cons(x, g) = g in
if i = 0 then x else lookup(g, i - 1)
end
from the constraints \(n \ge 0, i \ge 0, i
< n\), the compiler can see that, as \(n \ge 1\), the list is nonempty and
list_cons
is the only case. it can also propagate
the \(i \ne 0\) condition from the
else
to see the recursive call is ok too, and this function
is total.
typechecker
i’m going to write the type checker using exceptions for error
handling, because using an error monad without do
notation
was just too annoying. but from the outside it is still going to use a
result
type.
datatype result(t@ype, t@ype) =
| {a, e : t@ype} Ok(a, e) of a
| {a, e : t@ype} Err(a, e) of e
the funky looking t@ype
refers to a type of unknown
size, as opposed to type
which can only be instantiated
with pointer-sized types. that will not come up again here but
“t@ype
” is definitely the funniest part of ats’s
syntax.
the errors we need here are:
- a function takes an argument of type \(A\), but is given one of type \(B\);
- a term of a non-arrow type is being applied.
datatype err =
| Clash of (typ, typ)
| NotArr of typ
fn print_err (e : err) : void = … // incredibly uninteresting
overload print with print_err
typedef typ_res = result(typ, err)
fn print_res (r : typ_res) : void = …
overload print with print_res
so onto the typechecker. like i said, it’s going to use a local
exception for errors, but then catch it on the outside so that it has a
pure type.
fn typ_of {n : nat} (g : ctx n, t : term(n)) :<> typ_res =
let
exception ERR of err
fun go {n, h : nat} .<h>. (g : ctx(n), t : term(n, h)) :<!exn> typ =
case t of
| Var(i) => lookup(g, i)
| Lam(a, t) => Arr(a, go(list_cons(a, g), t))
| App(s, t) => begin
case go(g, s) of
| Arr(a1, b) =>
let val a2 = go(g, t) in
if a1 = a2 then b else $raise ERR(Clash(a1, a2))
end
| b => $raise ERR(NotArr(b))
end
in
$effmask_exn(try Ok(go(g, t)) with ~ERR(e) => Err(e))
end
fn typ_of (t : term(0)) :<> typ_res = typ_of(list_nil, t)
most of this is fine. a bit noisy i guess. but there are a couple of
new things:
- in the type of
go
, the :<!exn>
part
means it can throw exceptions, but has no other effects.
$effmask_exn(e)
hides the exn
effect in
e
, because ats doesn’t distinguish different exceptions in
effects so it can’t tell that ERR
is the only one that is
ever raised.
- exceptions are linear, and the pattern
~ERR(e)
is
consuming the outermost constructor.
- exceptions are the only reason we needed
alloca
in this
program 🥴
the end
ok, a quick main
function, and that’s it!
implement main0() = begin
print("type of [λx:•. x] is: ");
print(typ_of(Lam(Base, Var(0))));
print("\ntype of [λx:•. x x] is: ");
print(typ_of(Lam(Base, App(Var(0), Var(0)))));
print("\n")
end
and it works, at least on these two examples.
type of [λx:•. x] is: • → •
type of [λx:•. x x] is: expected an arrow type, but got [•]
uh. thank’s for reading. maybe i’ll write another one if i think of
another nice little program that actually uses more of ats’s
weirdness.
full code
#include "share/atspre_define.hats"
#include "share/atspre_staload.hats"
datatype typ(int) =
| Base(1)
| {h1, h2 : nat} Arr(1 + max(h1, h2)) of (typ(h1), typ(h2))
typedef typ = [h : nat] typ(h)
fun eq_typ {ha : nat} .<ha>. (a : typ(ha), b : typ) :<> bool =
case (a, b) of
| (Base(), Base()) => true
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
| (_, _) => false
overload = with eq_typ
fun print_typ (parens : bool, t : typ) : void =
case t of
| Base() => print "•"
| Arr(a, b) => begin
if parens then print("(");
print_typ(true, a);
print(" → ");
print_typ(false, b);
if parens then print(")");
end
fn print_typ (t : typ) : void = print_typ(false, t)
overload print with print_typ
datatype term(int, int) =
| {n, i : nat | i < n} Var(n, 1) of int(i)
| {n, h : nat} Lam(n, 1 + h) of (typ, term(1 + n, h))
| {n, h1, h2 : nat} App(n, 1 + max(h1, h2)) of (term(n, h1), term(n, h2))
typedef term(n : int) = [h : nat] term(n, h)
typedef ctx(n: int) = list(typ, n)
fun lookup {n, i : nat | i < n} .<n>. (g : ctx(n), i : int(i)) :<> typ =
let val+ list_cons(x, g) = g in
if i = 0 then x else lookup(g, i - 1)
end
datatype result(t@ype, t@ype) =
| {a, e : t@ype} Ok(a, e) of a
| {a, e : t@ype} Err(a, e) of e
datatype err =
| Clash of (typ, typ)
| NotArr of typ
fn print_err (e : err) : void =
case e of
| Clash(a, b) => begin
print("clash between types [");
print(a);
print("] and [");
print(b);
print("]");
end
| NotArr(t) => begin
print("expected an arrow type, but got [");
print(t);
print("]");
end
overload print with print_err
typedef typ_res = result(typ, err)
fn print_res (r : typ_res) : void =
case r of Ok(t) => print(t) | Err(e) => print(e)
overload print with print_res
fn typ_of {n : nat} (g : ctx n, t : term(n)) :<> typ_res =
let
exception ERR of err
fun go {n, h : nat} .<h>. (g : ctx(n), t : term(n, h)) :<!exn> typ =
case t of
| Var(i) => lookup(g, i)
| Lam(a, t) => Arr(a, go(list_cons(a, g), t))
| App(s, t) => begin
case go(g, s) of
| Arr(a1, b) =>
let val a2 = go(g, t) in
if a1 = a2 then b else $raise ERR(Clash(a1, a2))
end
| b => $raise ERR(NotArr(b))
end
in
$effmask_exn(try Ok(go(g, t)) with ~ERR(e) => Err(e))
end
fn typ_of (t : term(0)) :<> typ_res = typ_of(list_nil, t)
implement main0() = begin
print("type of [λx:•. x] is: ");
print(typ_of(Lam(Base, Var(0))));
print("\ntype of [λx:•. x x] is: ");
print(typ_of(Lam(Base, App(Var(0), Var(0)))));
print("\n")
end