Lisp interpreter
We know enough about how Lisp works and what we could use it for. It sure would be grand if we actually had something to make that happen. I’ll explain how to write one, plus a suite of modifications we could implement to varying effects (type tagging, garbage collectors, foreign calls, etc).
The interpreters will be in C. It is a language high-level enough to be portable and conveniently usable (don’t worry, we’ll deal with assembly in the next article), whilst still being simple and hardware-oriented enough to require us to think about the specifics of the implementation (such as parsing input, encoding data, and memory management) - details that most write-ups on evaluators tend to gloss over.
Edit 20250727: Yeah so I was mostly done with this file before I came up with a tiny tweak I could add to the interpreter, and now it’s been over a year and there’s a lot more stuff going on. The concepts page will be fun to amend, but I’ll probably have to rewrite this file from scratch yet again. Yet, it’s the most recent and complete version I have, so here it is.
Edit 13092025: Restructured it to keep most of the contents, and I appended the new stuff at the end. Note to self, please pick a consistent timestamp format and fucking stick to it, this is impossible to deal with.
Outline
Before we get started, we should know what exactly it is we will be making. This will be a Lisp interpreter, meaning our program will take Lisp code as input and do whatever that code specifies. This will require us to keep track of data, evaluation, and environments. Being homoiconic (‘code and data are the same thing in our language’), we don’t need much to store stuff - the expressions we are evaluating and the data they operate on can use the same memory, and with a bit of care, so can the environments they use. We need said environments to keep track of the values of variables (a problem that gets significantly more complicated in the compiler). The evaluation is easy - the execution order of the interpreter mirrors that of the statement being interpreted, meaning we don’t need to think much about it (in practical terms, this means the C callstack is the one used to accomplish recursion etc).
We will not be tagging pointers nor collecting garbage in the initial implementation, to keep it simple. We will have separate memory for atoms and lists - this works out well, since we will be implementing a purely functional subset of Lisp (ie no set!
or similar). We will kludge together some arithmetic primitives to show the principle, but it will not be a particularly powerful language in that regard - type tagging would allow us to do a lot more there.
Memory
Well, let’s get started. We will need some standard libraries, some memory for atoms and lists, and pointers (rather, indices) to that memory telling us how much of it is still available. Implementing pointers as array indices is convenient, because it allows us to use a simple int
type for all internal data (well, except atoms - we could, but we’d be wasting a lot of memory, and would need extra conversion routines):
#include<stdio.h>
#include<stdlib.h> // exit() on memory overflows
#define MEMSIZE 10000
int lm,ls,mem[MEMSIZE];
char str[MEMSIZE],la;
We have memory, so let’s write some routines for using it - implemented as #define
macros, since it makes little sense to write entire functions for simple parts like this. We’d like car
and cdr
for list traversal, iscons
and isstr
telling us whether an element is an atom or a list, and getstr
that takes our internal representation and returns a C-style null-terminated ASCII string (since our atomic memory is already implemented with those, that’s just a matter of returning a valid pointer). Since we don’t tag pointers (technically, everything inside the list memory is a pointer), we’ll need some other way of figuring out what they point to. Since we have a fixed amount of memory anyway, we could say that any value less than MEMSIZE is pointing to something inside list memory, and anything above that is inside the atomic memory - as if the two were laid out one after another in memory (which they also happen to be, but trying to make use of that fact will require some nasty code tricks that are not worth it here). As for car
and cdr
, a pointer to a concell points to its first element, and the second is placed right after - meaning, implementing the two takes a simple dereferencing.
#define car(x) mem[x]
#define cdr(x) mem[x+1]
#define iscons(x) (0<x&&x<MEMSIZE)
#define isstr(x) (MEMSIZE<=x&&x<2*MEMSIZE)
#define getstr(x) (str+x-MEMSIZE)
We’ll need a way to make use of this memory. Lists are a bit easier - the cons
primitive allocates memory for a concell (via the lm
index), then puts new values inside that memory. We can conveniently also check for memory overflows here:
int cons(int car,int cdr){
mem[lm++]=car;mem[lm++]=cdr;
if(lm>MEMSIZE){printf("out of cons memory!\n");exit(-1);}
return lm-2;
}
Atomic memory is harder - appending a new string to our existing memory is easy enough, however the signifying property of atoms is that they only occur once - meaning, if we try inserting the atom defun-macro
when it is already present in memory, we should return a reference to the previous instance instead of making a new one.
To do this, we will put the string we want to internalize (technical term for the above) at the end of our string memory, without advancing the memory pointer (the ls
in the first piece of code that tells us how much atomic memory we’ve used up). intern()
can then go through all previous atoms and try matching it (essentially reimplementing strcmp
). If we did not find the new string, we advance our memory pointer to capture it, making it a proper atom. If we did find a match, however, we return a reference to that without advancing the pointer - meaning that it is not considered part of memory, and gets overwritten the next time we try interning a string. The code for this was inspired by SectorLISP and looks a bit messy:
int intern(){
for(int i=0;i<ls;){ // go through all atoms already in memory
char c=str[i++];
for(int j=0;;j++){ // compare against the candidate at str[ls]
if(c!=str[ls+j])break;
if(!c)return i-j-1+MEMSIZE; // full match - return the existing atom
c=str[i++];
}
while(c)c=str[i++]; // mismatch - skip to the end of this stored atom
}
// no match found, intern new atom
int ret=ls;while(str[ls++]);
if(ls>MEMSIZE){printf("out of atomic memory!\n");exit(-1);}
return ret+MEMSIZE;
}
Speaking of atomic memory, we’d like it to contain atoms for the built-in functions when our interpreter is started. This is technically not necessary (they will get filled in as we encounter them during execution), but putting them in ourselves on a blank slate means we know exactly where they will be in memory - which makes it a lot easier to check for them in the body of the interpreter. While we’re at it, let’s combine this with some more general setup, like zeroing the memory and initialising the pointers:
void init(){
const char initmem[71]="nil\0t\0defun\0lambda\0cond\0atom\0"
"eq\0car\0cdr\0cons\0quote\0print\0mem\0+\0-\0*\0/\0%\0";
lm=2,ls=71;
for(int i=0;i<MEMSIZE;i++)mem[i]=str[i]=0;
for(int i=0;i<ls;i++)str[i]=initmem[i];
}
In Lisp, nil
is a special value meaning ‘end of list’. We’d have to encode this somehow. Since it is supposed to be nothing, it would be convenient if we could encode it as a pointer whose value is zero. But oh noes, that is still a valid pointer! What if there was something in that memory?! Once again, the solution to this is stupidly simple: don’t put anything in that memory. Our memory is initialised to all zeroes (ie everything points to nil), and by setting the initial memory pointer to 2 (ie skipping the first concell - since concells each store two values, they take two indices), we make sure the nil
concell remains empty (well okay, we actually end up putting stuff there later on, but as long as we make sure to check for nil, it all works out). Likewise, we make “nil” (meaning, the letters N then I then L then a null byte) the first atom in the atomic memory - thus, if we made an atom with zero offset, it’ll be treated as the atom nil
(meaning, it’ll be mapped to the string “nil” - ideally we’d keep the two concepts separate, but it’s helpful to have a meaningful value for both. Also, I was not arsed to recalculate the hardcoded magic constants below.)
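To make the encoding a bit more concrete, here is roughly what building the one-element list (t) by hand would look like right after init() - a purely illustrative sketch, the indices simply follow the layout above:
// after init(): lm=2, and str holds "nil\0t\0defun\0..." - so the atom t
// lives at string offset 4, and its value as a Lisp datum is MEMSIZE+4
int t_atom=MEMSIZE+4;
int list=cons(t_atom,0); // stores mem[2]=MEMSIZE+4, mem[3]=0, and returns 2
// iscons(list) holds, car(list) is the atom t, cdr(list) is nil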
Parsing
The code/data we are to operate on is supplied as a string. We’ll need a method of converting it into the nested lists of atoms that Lisp internally works with. For this we need a parser. Usually this is split into a tokenizer and the like, but we won’t need that for a simple syntax like ours. We will, however, need some helper functions:
FILE*fd;
char getnext(){return la=getc(fd);}
int iswhite(char a){return a==' '||a=='\n'||a=='\t';}
char white(){while(iswhite(la))getnext();return la;}
The char la
we declared all the way at the beginning is the ‘lookahead character’ - ie, we store the last byte we read so we don’t have to keep asking for it. getnext
is the function that actually asks for it, using a file pointer - we set that one in our main
function to point to wherever our code is. The unfortunately named iswhite
function tells us whether a character is whitespace - spaces, tabs, and newlines, all characters we don’t actually consider parts of our code. white
uses that function to skip over them. Here is a slight modification to the last function from the ExBot source to also skip over comments (denoted with ;
):
char white(){
while(1){
if(iswhite(la))getnext();
else if(la==';')while(la!='\n'&&la!=EOF)getnext(); // comments run to the end of the line
else break;
}return la;
}
Now, for the actual parser. Technically it also includes a tokenizer - however, the only tokens we care about are “parenthesis” and “not that”, so that part’s comparatively easy.
If the next token is a closing paren, the list we are currently reading is over - we return zero to nil-terminate it. We will also treat EOF (end-of-file) as a closing paren in order to close off any lists that might still be open by the end of input. Otherwise, we allocate memory for a concell - whatever else we parse, it will be part of a list, thus it will need a list entry.
If the next token is an opening paren, the next thing is a list. Fortunately, we can simply call the parse function recursively to parse it and get a pointer. We allocated memory first because otherwise it would’ve been overwritten by the recursive call.
Everything else we treat as an atom. We fetch until the next whitespace or parenthesis (that’s the tokenizer part), putting input at the end of string memory, then calling intern
from before to take care of the rest.
Regardless of what we parsed in, we put it in our allocated list entry, and call parse
recursively to get us the rest of the list in the cdr
of the entry. Then we return said entry. In code, all that looks as follows:
int parse(){
if(lm>=MEMSIZE||ls>=MEMSIZE){printf("out of parse memory!\n");exit(-1);}
white();if(la==')'||la==EOF){getnext();return 0;} // nil-terminate
int ret=lm;lm+=2; // allocate one concell
if(la=='(')getnext(),mem[ret]=parse(); // recurse for lists
else{char*s=str+ls; // atomic - put here
while(!iswhite(la)&&la!='('&&la!=')'&&la!=EOF&&s<str+MEMSIZE-1)
*s++=la,getnext();
*s=0;mem[ret]=intern(); // null-terminate, then intern
}
mem[ret+1]=parse(); // recurse for rest of list
return ret;
}
Printing
We can convert input; it would be nice if we also had output. In this case it’s slightly more complicated than usual since we have environments, which in some cases can end up being circular. This can easily be handled by just passing the environment as well, and adding some code to make sure we don’t output it. We’ll also use a wrapper to add a newline at the end of the output, since we print everything on a single line, and that would otherwise mess up the output.
void pprint_(int expr,int env){
if(!expr){printf("<nil>");return;}
if(isstr(expr)){printf("<%s>",getstr(expr));return;}
if(expr==env){printf("<env %i>",expr);return;}
if(expr<env)env=expr; // to make sure closures dont loop
putchar('(');
while(iscons(expr)&&expr!=env){ // walk the list, printing each element
pprint_(car(expr),env);
if((expr=cdr(expr))&&expr!=env)putchar(' ');
}
if(expr)pprint_(expr,env); // improper tail - an atom, or a wedged closure environment
putchar(')');
}
void pprint(int expr,int env){pprint_(expr,env);putchar(10);}
merging code from lisp/v1 and spc/lisp, see to it that it makes sense also, wrapper func for newline, and see if i didnt have indenting code somewhere
Wrapper
We’ll jump ahead a bit, assume we have written the eval
function, and see what main
looks like. This way, we’ll be done with everything else and can focus entirely on the implementation of eval
for the rest of the article.
The main
function is the one that puts it all together and lets us actually use the code. It calls init
to set up memory, does its own little set-up routine, and drops into a REPL - ‘read-eval-print-loop’, which reads one unit of code, evaluates it, prints its result, and repeats for the next one (or exits if there’s no more input). The setup in main
keeps track of the current ‘global’ environment and the current expression, and it also conveniently handles input, allowing us to either run interactively in a console, or execute code from a file.
int main(int argc,char**argv){
init();
fd=argc>1?fopen(argv[1],"r"):stdin;
if(!fd){printf("cannot read file\n");return -1;}
while(1){
if(argc==1)printf("> "),fflush(0);
if(feof(fd))break;
getnext();white();getnext();
pprint(
eval(
parse(),
mem[0]),
mem[0]);
}fclose(fd);putchar(10);return 0;
}
Let’s take a look at that mem[0]
up there. The paragraph on ‘variable dereferencing’ below goes into more detail on how the values of variables are stored, but long story short, we use ‘environments’ for this, and every expression can only be evaluated in an appropriate environment. Since we made sure to never use the first two memory units (in order to encode nil
), we have two unused units of memory - let’s assign the first of those the role of the ‘global environment’, containing all variables that are not otherwise contained in a function (meaning, all the stuff we defun
) (more on how and why of environments in the paragraph on ‘closure calls’).
Eval
Ho boy here we go, finally some code that actually does something interesting. eval
is the function that takes an expression and an environment, and figures out what the value of that expression is in that environment. My reference implementation is about a hundred lines of code - which is really not a lot (the Emacs source is 777,642 lines), but it is still a lot to take in at once, so I’ll be splitting it into parts.
Variable dereferencing
First off, environments. The ‘why’ should be pretty clear by now. The ‘what’ is a list of variables and their values (also known as ‘variable bindings’). This makes the ‘how’ pretty easy: we keep a list matching variable names to their values. There’s a couple ways to do this (William Byrd has done some fun stuff with those), but we’ll be keeping it easy - each entry in that list is a concell, its car
is a variable name (ie an atomic symbol), its cdr
that variable’s value.
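As a purely illustrative example (the names and values are made up), an environment in which x was bound to 4 and y was later bound to 5 is simply the list:
((y . 5) (x . 4)) ; every element is one binding cell: car is the name, cdr the value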
Note: Byrd implements environments as nested closure functions. This is convenient in the sense that if you have closures, you don’t need extra code for environments, but it is inconvenient because it requires closures before the concept of an environment. We struggle around the distinction more in the compiler, but for now, we will be defining environments in a way that is easy to handle for the interpreter (ie with loops in C).
So, to begin, let us write an eval
that takes a variable name and goes through the environment, returning its value if it finds one, or the variable name itself if not (since symbols are self-evaluating):
int eval(int expr,int env){
if(!expr)return 0;
if(isstr(expr)){ // variable dereferencing
for(int bind=env;bind;bind=cdr(bind))
if(car(car(bind))==expr)
return cdr(car(bind));
return expr; // if no binding, return self
}
// --- stuff for evaluating lists ---
// if we cannot eval, complain and return nil
printf("> wtf:");pprint(expr,env);return 0;
}
Well, that wasn’t hard! Unfortunately, the bulk of our code will be replacing that innocuous stuff for evaluating lists
comment.
Function calls
How do we evaluate lists? The (very concise) syntax of Lisp tells us that in a list, the first expression is the operator (ie function to apply), and the remaining elements are the operands (ie the arguments to that function). So, the first thing we do is evaluate the first element of the list. The result could be a primitive operator (we manually inserted atoms for the built-ins to simplify checking for these), another variable (ideally, we’d keep dereferencing until we get something usable, but for now, we can also just abandon evaluation), or a compound lambda
expression (aka a closure). lambda
s are a bit special in how they are called, so we will leave those for later. In almost all cases, we have to also evaluate the arguments to the operator. We will first check whether we have one of the special cases and handle those manually, then we will evaluate all arguments and have all other functions work on those. In code, we have the following:
int eval(int expr,int env){
// --- stuff for variable dereferencing ---
int op=eval(car(expr),env);
if(isstr(op)){ // builtins
op-=MEMSIZE; // move index w/in string memory - easier to check
// check special cases
switch(op){
case 45:break; // quote
case 6:break; // defun
case 12:break; // lambda
case 19:break; // cond
}
// not a special case. eval args
int args=cdr(expr),res;
if(args){args=lm;
for(int t=cdr(expr);t;t=cdr(t)){
res=lm;lm+=2;
if(lm>MEMSIZE){printf("out of evlist memory!\n");exit(-1);}
car(res)=eval(car(t),env);
cdr(res)=lm; // the next cell will be allocated right after this one
}cdr(res)=0; // nil-terminate the freshly evaluated argument list
}
// check regular cases
switch(op){
case 51:break; // print
case 57:break; // mem
case 40:break; // cons
case 32:break; // car
case 36:break; // cdr
case 24:break; // atom
case 29:break; // eq
case 61:break; // add
case 63:break; // sub
case 65:break; // mult
case 67:break; // div
case 69:break; // mod
default:printf("not primfn: %s\n",str+op);return 0;
}
}
if(car(op)==12+MEMSIZE){ // lambda
// --- handle closures ---
}
// --- fallthrough ---
}
Wonderful! Technically, we don’t even need all the break
s since the code there would return
values directly, and we won’t get to fall through. Now we only need the code for all that.
Builtins
Quote
Let’s do quote
first, since it’s easiest. To quote
an expression, we simply return it:
case 45: // quote
return car(cdr(expr));
I told you it was easy. We need the cdr
to skip the quote
keyword, and the car
to only return the first argument. PicoLisp instead returns all arguments, which should not be hard to modify.
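If we wanted the PicoLisp behaviour instead, the case would (an untested sketch) simply drop the car:
case 45: // quote, PicoLisp-style: return the entire (unevaluated) argument list
return cdr(expr);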
Cond
Recall that cond
takes a list of pairs - the first element of a pair is a condition, and if that condition does not evaluate to nil
, we return the (evaluated) second element. This is also relatively straight-forward to accomplish with a for
loop:
case 19: // cond
for(int cond=cdr(expr);cond;cond=cdr(cond))
if(eval(car(car(cond)),env))
return eval(car(cdr(car(cond))),env);
return 0;
The car
s and cdr
s get a bit more involved, but otherwise it’s pretty simple.
Defun
Now things get interesting. defun
modifies the global environment, which we do not have access to. Fortunately (as mentioned previously), said global environment is always in the same place in our memory - meaning, we can write directly to it:
case 6: // defun
mem[0] =
cons(
cons(
car(cdr(expr)),
eval(
car(cdr(cdr(expr))),
env)),
mem[0]);
return car(cdr(expr));
Lambda
Now things get complicated. lambda
has two arguments, a list of variables, and a function body to evaluate to. The issue is that not every variable inside the function body is also a function argument. So-called ‘free variables’ would have to be looked up in the environment - however, that look-up would have to happen when we evaluate the lambda
, not when we call the resulting closure! If this is confusing, the page regarding concepts of Lisp explains it in more detail (and with examples!). It suggests solving this by going through the function body and replacing all free variables with their values, then explains that this is very inefficient and stupid. Instead, it suggests storing the values separately, and storing a reference to that within the closure.
We do not collect garbage in this implementation. This is pretty bad, since it means we keep gobbling up memory we might not need, but here it ends up working out for us, since it means a variable we’ve defined never gets removed or redefined, merely overshadowed (ie we create a new binding that gets found first) (this is also why I opted to write a purely functional interpreter first - if we had assignment or a variable number of expressions in the body, this wouldn’t work either). This means that to create a closure, we put a reference to the current environment into the lambda
. There’s clever ways to do this, but instead we will mangle our syntax a little - the first argument of a lambda
are the names of its arguments, the second is the function body, and we will manually insert a third argument that is a reference to the environment it was created in. We can also save up on a bit of memory by replacing the nil
that terminates the expression with that environment - it means we don’t need to allocate more memory, but it violates the principle that ‘lists are always nil
-terminated’ - hence why we needed extra checks in the pprint
function. Oh, also, if there already happens to be an environment there, we won’t do this - that would mean we’d already closed it before, and we should not overwrite its environment.
That was a lot of technical talk. The code to accomplish it is significantly shorter:
case 12: // lambda
if(!cdr(cdr(cdr(expr))))
cdr(cdr(cdr(expr)))=env;
return expr;
Car
Now for the easier functions. car
takes its first argument, and returns its car
:
case 32: // car
return car(car(args));
I told you these were easier.
Cdr
cdr
does the same as car
, but returns the cdr
of its first argument:
case 36: // cdr
return cdr(car(args));
Cons
cons
takes its first two arguments, and cons
es them:
case 40: // cons
return cons(car(args),car(cdr(args)));
Atom
atom
tells us whether its argument is an atomic symbol (in most dialects of Lisp, it’s usually called something like symb?
). Unfortunately, it is slightly more complicated than the previous ones, as the value that the isstr
macro returns in C is not immediately usable in Lisp (well, it is, but we’d rather not pass arbitrary stuff). Fortunately, it is easy to make it passable by replacing it with a reference to the atom t
:
case 24: // atom
return isstr(car(args))?MEMSIZE+4:0;
Eq
Oh, the debates there have been on how equivalence should be defined in Lisp! We will be implementing the simplest pointer comparison, equivalent to Scheme’s eq?
:
case 29: // eq
return car(args)==car(cdr(args))?MEMSIZE+4:0;
Arithmetic helpers
We tossed in arithmetics to have something useful for our Lisper to operate on. Unfortunately, our current implementation does not actually distinguish whether its atoms are numeric (‘consisting of digits’) or not. The solution is easy: we stick our fingers in our ears, start loudly singing ‘Eternal Flame’, and naively assume that anything passed to an arithmetic primitive will in fact be a number. This will break in fascinating manners in a lot of cases, but it will do the right thing with the right inputs.
We will need two helper functions to convert between the digit-by-digit-based representation we use for atoms and actual numeric types - in other words, convert between ASCII and integers. Unfortunately, someone at MIT decided that atoi
should be a reserved function in C for some unfathomable reason, so we shall have to name our functions latoi
and litoa
.
Joke explanation: as you can imagine, C already features a function to accomplish precisely this for exactly the same reasons, and it’s called atoi
(“Ascii TO Integer”) by the same logic. We will be reimplementing it ourselves because reimplementing standard libraries from scratch is easy and fun.
latoi
(atom to integer) is the easier one. We successively take digits (defined as a byte between 48 (ASCII code for ‘0’) and 57 (ASCII code for ‘9’) inclusive). With each digit, the running result is multiplied by 10 (our base), and the value of our digit is added to it. We also check whether the first byte is 45 (ASCII code for ‘-’), and remember to invert the result if so.
int latoi(int index){
char*s=getstr(index),inv=0;
int ret=0;
if(*s=='-')inv=1,s++;
while('0'<=*s&&*s<='9')
ret=ret*10+*s++ -'0';
return inv?-ret:ret;
}
litoa
(integer to atom) is somewhat more complicated, because given a number, it is significantly harder to calculate the first digit (#include<math.h>\nfloor((double)x/pow(10.0,floor(log10((double)x))))
) than it is to calculate the last (x%10
). We can circumvent this in two ways - we either store the number backwards (relative to the calculation - thus reversing it twice), or we store it as we calculate it and then invert it. Storing it backwards requires us to know the length of the number first, which requires either more inconvenient math (floor(log10((double)x))+1
), or annoying code duplication to make sure we pass through the string correctly, not to speak of the actual conversion.
We’ll take the easier way of storing the string as we calculate it, and then using some esoteric shuffles to reverse the string in-place. Which looks as follows:
int litoa(int x){
char*s=str+ls,swap,i,neg=x<0;
if(neg)x=-x; // work on the magnitude, the sign gets re-attached below
if(!x)*s++='0';
while(x)*s++=(x%10)+'0',x/=10;
if(neg)*s++='-'; // the sign ends up first once the digits are reversed
*s=0,i=s-(str+ls);
for(int j=0;j<i>>1;j++){
swap=str[ls+j];
str[ls+j]=str[ls+i-j-1];
str[ls+i-j-1]=swap;
}return intern();
}
Arithmetic primitives
With this, we can finally implement the arithmetic primitives in eval
. To do so, we initialise a return variable to some value (see principles for an explanation of ‘which’ and ‘why’) (for the ‘return variable’, we will be clever and save on some lexical memory by reusing the int res
that we used to evaluate arguments), then go through all arguments, using the helpers to convert them to numbers and combining with the return value appropriately. We then convert the return value to an atomic value that the lisper can understand, and return:
case 61: // add
for(res=0;args;args=cdr(args))
res+=latoi(car(args));
return litoa(res);
case 65: // mult
for(res=1;args;args=cdr(args))
res*=latoi(car(args));
return litoa(res);
// etc
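For completeness, here is one way that ‘etc’ could look - a sketch that assumes at least one argument was passed and does not guard against division by zero:
case 63: // sub - the first argument minus all the others
for(res=latoi(car(args)),args=cdr(args);args;args=cdr(args))
res-=latoi(car(args));
return litoa(res);
case 67: // div
for(res=latoi(car(args)),args=cdr(args);args;args=cdr(args))
res/=latoi(car(args));
return litoa(res);
case 69: // mod
for(res=latoi(car(args)),args=cdr(args);args;args=cdr(args))
res%=latoi(car(args));
return litoa(res);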
Foreign functions
The keen-eyed will have noticed that we defined primitive atoms print
and mem
, and case-checks for them in eval
. These aren’t ‘standard’ Lisp functions, and shouldn’t really be here (ie, feel free to skip them) - but they are somewhat convenient, so we’ll add them anyway. They also showcase how we could add bindings to external (relative to the program being interpreted) code.
print
is a wrapper around pprint
, and lets us output intermediate values:
case 51: // print
pprint(car(args),env);
return car(args);
In case our interpreter breaks or misbehaves, it is very useful to be able to take a look at its memory and see what exactly it is doing. mem
does exactly this, outputting memory and letting us figure out what is happening:
case 57: // mem
printf("%i cons, %i atoms, env is %i\n",lm,ls,env);
for(res=0;res<lm;res++){
printf("%i:",i);
if(isstr(mem[i]))printf("'%s' ",getstr(mem[i]));
else printf("%i ",mem[i]);
}printf("\n\n");res=0;
while(str[res]&&res<ls){
printf("%i:<",res);
while(str[res])putchar(str[res++]);
printf("> ");res++;
}printf("\n\n");return car(args);
Closure calls
Phew, that was something! One more block of code, and we’re done! We explained how we create closures, but we haven’t figured out how to call them.
Closures are user-defined functions, ie we make them via lambda
. This means that any closure has an associated environment, letting us reference the values of variables as they were at the time the closure was created (once again, I’ve written more about why we are doing this in other places). With that, all we need to do is extend this environment to include the ‘formal arguments’, binding them to the variable names the function specified, and evaluate the body in that new environment.
To help with the last paragraph, here’s an illustrative graphic, rendered in glorious ASCII art:
; lambda function
; / \
; function / names of \ closure env
; name / formal args \ / wedged here
; | | | body X
; v | / \ / \ v |
(defun add2 (lambda (x y) (+ x y) ))
(add2 4 5) ; <- function call
; ^ \ /
; | \ function arguments
; |
; function name
And here’s the code for it:
if(car(op)==12+MEMSIZE){ // lambda application
int closenv=cdr(cdr(cdr(op))); // the env 'lambda gave us
int names=car(cdr(op)); // the argument names the function wants
int vals=cdr(expr); // the arguments we pass it
while(names){
closenv=
cons( // extend closure env
cons( // by new binding:
car(names), // bind this name
eval(car(vals),env)), // to this value
closenv);
names=cdr(names);vals=cdr(vals);
}
return eval(car(cdr(cdr(op))),closenv);
}
Addons
Well, believe it or not, the above is all we need for a functioning Lisp interpreter. It’s not very efficient, nor capable, but it does the right thing more often than not. That being said, we’ll now look at some ways to extend that basic layout to make it just a bit more useful.
Unfortunately, implementing these will occasionally require large-scale edits to the existing code. Instead of writing it out all over again, I’ll be showing small examples on how to integrate it, and leaving the rest implied.
I will be showing three additions to the current code.
- First is type tagging. The current distinction between strings and lists is all we need to get this working, but it sure could use some more - numbers, for example, are very obvious in their inefficiency.
- Second are tail calls. As you might have noticed, we do not have much regarding flow control - most of it is accomplished via recursion. It would do us well to make recursion efficient. In fact, tail calls are one of the few hard requirements in the Scheme reports.
- Last but most important, garbage collection. All computation we do will take up more memory, yet once we are done with said computation, we will no longer be needing most of it. Garbage collection is the process of automatically reclaiming those bytes. It is described last, as it requires knowledge of types, and is quite neat to integrate once we have tail calls working.
Type Tagging
The data we store is inherently referential - a concell that stores a string (or atom or symbol or whatever they are for our current purposes) does not store the actual bytes that make up the string, instead it tells us where those bytes are. A minor but crucial detail is that the data in that concell also tells us that it is in fact a string it is storing. The latter is the type information. Currently we are brute-forcing it by specifying some cut-off value, below which the value is a concell and above which it is a string, but this is quite unwieldy - the cut-off value itself is arbitrary, it does not allow resizing the memory segments (which we won’t be doing anyway, but it’s a useful concept to plan for), and is very impractical if we have more than two types. In addition, numbers are an efficiency nightmare. Can we do better?
If we thought ahead a bit and picked some more meaningful value, like 32768, we would have a slightly better distinction - for any value, if the highest bit is set, it is a string, otherwise a concell (32768 in binary is 0b1000000000000000). This simplifies the check a bit - can we extend it to more types? What if we used the lowest bits instead - anything whose lowest bits are 0b…10 is a string? With two bits, we could distinguish four different types. If we were clever, we could say that one of those isn’t a type, but indicates that we should use the next two bits to figure out the type - giving us seven types! In theory we could do even more, but at that point it ends up being more useful to implement compound types using Lisp code inside our programs.
We will be using five types, as listed below. The first three are easy and common enough. The ones ending in 11
are the extended types - vectors (cdr-coded lists) and strings (vectors but in string memory), and two error values for the garbage collector. I could have used three bits for the extended types and removed the error values, but they are just so practical in figuring out what is happening.
Last bits:  00    01      10      0011    0111    1011  1111
Type:       pair  number  symbol  vector  string  err1  err2
To figure out where a symbol is, instead of subtracting the previous cutoff value, we remove the last two bits - this is almost trivial on a computer via bitshifting. Pairs (aka concells) are slightly more complicated - we only remove one bit instead of two. This means pairs can only point to every second element in the concell memory - however, since a pair always has two elements, this suits us quite well (and happens to double the amount of concells we can have). Numbers are special - they are the only non-referential type, meaning, the bytes that ‘store the number’ do in fact store the number itself, and not some reference telling us where to find it. Hurray for efficiency! Symbols and vectors work similarly to their simpler siblings - the difference is in what functions we call on them (note: I am in the process of working out an entire memory management system for those, so the details of this are intentionally left vague). The error values are not supposed to be used - if the program happens upon them during normal execution, we know something went wrong, and can use them to figure out what.
(note 08092025: while I was writing the above, I realized I don’t actually need vector types - since those have custom access functions, I could reuse the regular types, and put all the vector-specific code in those access functions. We’ll see how that goes - I might have to rewrite large parts of the interpreter again…)
(note: not sure I ever pointed it out because it seemed obvious, but a “vector” type means it’s a thing that contains a bunch of data you are allowed to change, and that data is laid out sequentially in memory (“one bit after another”). Compare concells being pairs of references to data (ie there’s always two of them), vs vectors being a list of arbitrary many references.)
Implementing those types is mainly a matter of adding a bunch more macros at the beginning of the file that take care of the bitshifting:
#define iscons(x) (x&&((x&3)==0)) // nil not considered a valid concell
#define isnum(x) ((x&3)==1)
#define issym(x) ((x&3)==2)
#define isvec(x) ((x&15)==3)
#define isstr(x) ((x&15)==7)
#define tagnum(x) ((x<<2)|1)
#define tagsym(x) ((x<<2)|2)
#define tagvec(x) ((x<<4)|3)
#define tagstr(x) ((x<<4)|7)
#define getsym(x) ((x>>2)+str)
#define getstr(x) ((x>>4)+str) // XXX length check? +1 so it skips size?
#define car(x) mem[x>>1] // neat!
#define cdr(x) mem[(x>>1)+1]
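As a quick sanity check of the encoding (the concrete numbers here are only illustrative):
// tagsym(4) == 18 - the atom "t" at string offset 4; issym(18) holds, getsym(18) is str+4
// tagnum(42) == 169 - numbers carry their value inside the tag itself; isnum(169) holds
// a concell at raw index 6 gets the tag 12 (6<<1); iscons(12) holds, car(12) is mem[6]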
The code for using types is largely the same - the main thing that changes is that we have a separate thing in eval
for handling numbers (they always evaluate to themselves), and a few more type checks. Importantly, we must also parse numbers:
int intern(); // as before
int chkstr(){
int ret=0;
for(int i=ls; str[i]; i++){ // go through the last token we parse()d
if(str[i]<'0' || '9'<str[i]) // if theres something thats NOT a digit
return intern(); // defer to intern - as before
ret = ret*10 + str[i]-'0'; // otherwise accumulate the digit - see latoi
}
return tagnum(ret); // tag result as a number, and return
}
int parse(){
// closing-paren/EOF check, overflow checks, and concell allocation (int ret=lm;lm+=2;) as before
if(la=='('){ // a nested list - recurse, as before
getnext(),mem[ret]=parse();
}else{ // an atom - copy the token into string memory, as before
char*c=str+ls;
while(!iswhite(la)&&la!='('&&la!=')'&&la!=EOF)*c++=la,getnext();
*c=0;
mem[ret]=chkstr(); // the important bit - this was intern() before!
}
mem[ret+1]=parse(); // recurse for the rest of the list, as before
return ret;
}
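On the eval side, the structural change for numbers is a single extra check near the top - sketched here against the earlier eval, with isstr becoming issym for variables:
int eval(int expr,int env){
if(!expr)return 0; // nil, as before
if(isnum(expr))return expr; // new: numbers are self-evaluating
if(issym(expr)){ // variable dereferencing, as before (was isstr)
for(int bind=env;bind;bind=cdr(bind))
if(car(car(bind))==expr)return cdr(car(bind));
return expr;
}
// --- list evaluation continues unchanged ---
printf("> wtf:");pprint(expr,env);return 0;
}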
Tail calls
Most control flow in Lisp is accomplished via recursion. Instead of a for-loop, we write a function that takes a control variable as an argument, and if it is below some value, it calls itself, with the control variable one larger than before. In our current implementation, this is accomplished by having the function that evaluates the first iteration call itself to evaluate the second iteration - in other words, the stack frames (environments, temporary variables, etc) that the Lisp code is using are implemented as actual stack frames in the interpreter.
This is not really a good idea. For the above example, it works well enough, but the majority of practical applications feature endless loops, or very long for-loops. With our current approach, neither of those will be possible, since every new iteration will create a new stack frame, until we run out of memory. Ideally, we’d prefer to reuse stack frames - if the only thing a function does is call itself with slightly different arguments (as a ‘lispified’ for-loop would), we don’t really need to store the current arguments anymore, do we?
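As a purely illustrative sketch of that idiom (the names are mine), a ‘lispified’ for-loop in our dialect counts up by calling itself in tail position:
(defun loop (lambda (i)
(cond ((eq i 10) i) ; done - return the counter
(t (loop (+ i 1)))))) ; tail call - nothing left to do in this frame
(loop 0)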
There are multiple interesting ways to accomplish this in assembly (that I will probably list when I get around to writing the self-hosting repl, but that’s gonna take a couple myriads at this rate), but in an interpreter, this is best accomplished with a trampoline function. If the only thing a piece of code does is to call another piece of code, instead of using recursion in the interpreter, we could have that code return the code it wants to call (isn’t homoiconicity great?), and somehow tell the interpreter that the return value is not really a return value, but more code it should evaluate. Thus, the interpreter can discard everything it had saved for the current piece of code (which is where the garbage collector will come in), and focus entirely on the new piece of code. This is accomplished by wrapping a so-called trampoline function around the previous evaluator - it is called a ‘trampoline’ because the evaluator can bounce off function calls from it, cleaning up in between.
Implementation details to watch out for. What if the new function is to be executed in a new environment, like let
does? That one’s pretty easy - it’s just another thing we need to tell the trampoline about. If it gets a new environment, it keeps it around until it’s done bouncing, then cleans it up as well. We allow imperative code - what if the return value is a list of instructions, like begin
does? This one actually has two cases: that list can be in a tail call context - in which case we add a thingy to the trampoline so it executes a list of instructions in a tail call context -, or it can be just a regular list of instructions that we’d like to call one after another, where every one of those could itself contain a tail call. The latter is best solved with a separate helper function that executes them in a list - and that one recursively calls the trampoline, replicating the original behavior we had.
The only thing left to do is to figure out which primitives fall in which group. As with almost all problems in life, this one is best solved with an annotated list. Here is one of all regular tail calls:
- the last expression in a lambda’s body
- the consequents in if/cond/case
Here are the iterated lists:
- body of begin
And here are the iterated lists that are also tail calls:
- body of let
XXX fairly sure the above is wrong. Copied it from notes_20250720, check how its handled in the current codebase.
Here is the code that makes the magic happen. Please note that the code from the previous eval
function is now in evsing
(‘evaluate a single expression’), and the new eval
implements the trampoline that keeps calling evsing
to get either a new expression to evaluate (potentially in a new environment), or the end result that it should return. Unlike the previous implementation of eval
, it now receives the environment as a pointer argument (ie instead of telling it what the environment is, we tell it where we stored what the environment is) - this is because eval
could change its environment (eg via defun
). Likewise, evsing
receives the trampoline argument as a pointer. In practice, it serves as an additional return argument.
The same trick is used if we need to also specify a new environment for the new code to be evaluated in, with a small caveat - unlike the environment we were called with, this new environment is specific only to the current code, thus once the current code is done with it, we should remove it. For that, we keep a separate reference to it, which is later handled by the garbage collector.
int evsing(int,int*,int*),eval(int,int*);
int evlist(int expr, int*env){
while(cdr(expr)){ // while theres instructions (plural) left
eval(car(expr),env); // evaluate the first one
expr=cdr(expr); // move on to the rest
}
return eval(car(expr),env); // evaluate the last one, and return its value
}
int eval(int expr, int*env){
// the unfortunately named 'tramp' is the trampoline argument, telling us
// whether the current expression is a tail call and/or list of exprs.
// initialised to 1 since we have to eval at least once.
// envptr points to the environment to evaluate stuff in
// tmpenv stores a reference to a temporary environment, in case it is not
// the one we were called in (relevant for the garbage collector)
int tramp=1,tmpenv=0,*envptr=env;
while(tramp){
if(tramp==1){ // if tail call
tramp=0; // reset flag
expr=evsing(expr,envptr,&tramp); // do the tail call
if(iscons(tramp)){ // if function returned an environment
tmpenv=tramp;envptr=&tmpenv; // store env, adjust pointer
tramp=2; // all primitives that create a tmpenv implicitly also do an
// iterated tailcall
}
}
if(tramp==2){ // if above resulted in an iterated tailcall
while(cdr(expr)){ // as evlist()
eval(car(expr),envptr);
expr=cdr(expr);
}
// last expression is always a regular tail call
expr=car(expr);
tramp=1;
}
}
return expr;
}
int evsing(int expr, int*env, int*tramp){ // previously known as `eval`
// nil / number / variable dereferencing and int op=eval(car(expr),*env); as before
if(isstr(op)){ // builtins
switch(op){
// case for begin: return the body list with *tramp=2 (iterated tail call)
// case for quote: unchanged
// case for cond: evaluate the conditions, then return the chosen consequent
//   with *tramp=1 instead of recursing into eval ourselves
// case for let: build the new bindings, return the body, and hand the new
//   environment back through *tramp (the trampoline turns it into tmpenv)
}
int args=0; // XXX evaluate the arguments as before
// since we already evaluated all arguments, nothing that comes here is
// relevant for tail calls - code is unchanged
}
if(car(op)==12+MEMSIZE){ // lambda
// create the closure environment (closenv) as before, but call via the
// trampoline as we did with `let`:
// XXX bindings
*tramp=closenv; // the temporary environment for the call
return car(cdr(cdr(op))); // the function body, for the trampoline to evaluate
}
}
XXX pretty sure the code is wrong and/or incomplete
Garbage collection
At long last, the feature I’ve been teasing for so long. As mentioned multiple times, some operations (mainly let
and function calls) create temporary environments (also relevant for defun/set
). Once those operations are done, those environments are no longer valid - meaning, we do not (and, if implemented correctly, can not) use them, but they are still littering around and wasting our memory. Fortunately, the trampoline function provides a convenient wrapper of evaluation, which greatly simplifies figuring out when and where to clean up (seriously, the code before that was a mess). Unfortunately, garbage collection is still very much not trivial.
The core realization that makes the current system possible is that the memory use of temporary environments (and also in general) is equivalent to a stack. This has a very technical and complicated meaning, but the short version is that if we have two objects, the one that was created first will be first in memory (think about how the implementation of cons
functions). This means that, if we store the location of the last object created when a function is called, we know that when the function returns, everything between that object and the current last one must have been made by the function. Meaning, that’s what we have to clean up.
(note: I’m currently reworking memory management for a more “proper” implementation of strings. The tricky part is that they do not follow that pattern (when a string of some length is required, it is put in the first memory slot that it can fit in), which makes matters a lot more complicated.)
As usual, this is a good starting point, but it is somewhat simplified. For example, the return value of a function is created after the function has been called (duh), therefore it is contained in the segment we are removing. Moreover, since that return value is usually a list, its contents are not contiguous - meaning, it is not a single value we have to keep track of, but a recursive data structure we have to extract. What do?
As mentioned in the article on fundamentals, the solution for this was inspired by SectorLisp’s ABCollector. Let’s take another look at the memory layout when the garbage collector is called:
stuff from before the stuff the function unused
function was called allocated (clean up) memory
|######################|-----1-----2--3-----45|.............|
^ ^ ^ ^^
these make up the return value
We may not touch stuff from before the call, we want to remove all the dashes the function no longer needs, but somehow still keeps the numbers. What if we just moved them to that segment labeled “unused memory”?
|######################|-----1-----2--3-----45|12345........|
----- shwoop ----->
Then, remove the segment in the middle, and move the ‘return value’ directly after the “stuff from before”:
|######################|12345.................|.............|
<--------- fwip -------
And voila, the only additional memory taken up after the function returns is the return value - which happens to be the only thing we actually wanted to keep!
If you have been paying attention until now, you are probably a bit doubtful about this solution. “Concells are referential,” you think, “we can’t just move them? The references would still be to the old memory location!” You would be correct - most of the complexity in this solution is in the “moving” part, and explaining the code is not exactly trivial. Generally speaking, to ‘move’ a concell, we will allocate a new one at the end (which functions as usual), store the values of the concell we want to copy in the new one, replace the values of the original with error codes (so we can tell if we have already cleaned up a cell instead of crashing when we encounter circular data), then replace the values of the new concell by ‘moving’ the concells it points to. This is accompanied by case checks (string references, nils, numbers, and data in the “stuff from before” segment are left as they are, cells that have already been cleaned are not cleaned again), some layout trickery (instead of replacing the values to refer to where their contents were moved, we replace them with where their values will be after the ‘fwip’), and a wrapper that handles environment values (more on that later - it gets complicated).
Implementation-wise, the garbage collector is done as a function that takes as arguments a concell to keep (ie the return value) and an index telling it how much memory was in use before the function call, and returns the new location of the value. It looks like this:
int _gc(int keep, int before, int memsize){
// helper function. keep is element to collect, before is ptr to beginning
// of garbage memory, and memsize is the size of the memory area the function
// used.
if(keep<before || issym(keep) || isnum(keep) || isstr(keep))
return keep; // symbols, strings, numbers, and anything from before the
// function call is left alone
if(car(keep)==0xffff)exit(-1); // never should've come here! somehow, this
// cell was cleaned up in a *previous* cycle. in other words, we don goofed.
if(car(keep) == 0x7fff) // this concell has already been processed
return cdr(keep); // return where we put it
int t = cons(car(keep), cdr(keep)); // allocate new concell
car(keep) = 0x7fff; cdr(keep) = t-memsize; // replace current concell with a
// code telling us we already handled it, and a reference to the new location
car(t) = _gc(car(t),before,memsize); // collect any children we might have
cdr(t) = _gc(cdr(t),before,memsize);
return t-memsize; // second phase will shift this cell back by memsize units
}
int collect(int keep,int before){
if(before==lm)return keep; // function call did not use up memory
int after=lm; // current top of used memory. clean from before to here.
keep = _gc(keep,before*2,(after-before)*2); // collect return value. args are
// multiplied by two, since one concell takes up two slots in memory.
while(lm>after)mem[before++]=mem[after++];
for(int i=before;i<after;i++)mem[i]=0xffff; // replace cells we cleaned up
// with something recognizable - see second 'if' in _gc
lm=before; // relocate top of memory
return keep;
}
XXX bindings to the trampoline
Memory guards
This does as advertised. Unfortunately, it doesn’t work - using it in anything but the simplest tests will either crash in the “concell cleared in previous pass” check, or mangle values and then crash. The reason is that Scheme (and our dialect of it) are not purely functional - the set
primitive can modify memory outside of the current stack frame to point to data inside it. For example:
(defun ret (+ 2 3))
This creates a new binding, and sets its value to 5
. The creation of the binding ret
and the calculation of its value are ‘atomic’ (inseparable) in Lisp, but not so in the implementation of defun
. We can also explicitly make them separate in Lisp:
(defun ret)
(set ret (+ 2 3))
Once again, we create a new binding, and set its value to 5
. However, this time the creation of the binding and the calculation of its value are separate. Crucially, this means that the binding is not the return value of set
- therefore, our garbage collector gladly removes it, and ret now refers to an invalid memory location. We could fix this by making set
return the binding it modifies, unfortunately this does not fix the problem:
(defun ret)
(defun mangle (lambda () (set ret (+ 2 3)) ()))
(mangle)
What do? It is not a coincidence that I named the variable ret
- just like the pointer arguments to the trampoline function, we can use set
to treat ‘global’ (here meaning ‘external to the current stack frame / temporary environment’) as additional return values. The garbage collector should account for that. This is accomplished via so-called memory ‘guards’ - every time set
modifies a binding, we store a reference to it, and adapt the garbage collector to also account for those while cleaning up.
This is messier than it sounds, but not for the obvious reasons. The naive implementation won’t work - it would shuffle around the new value, but not update the binding, therefore the guards refer to the bindings themselves, not simply their values. If a guard is inside the memory segment we are about to delete, the program set
s the value of a binding inside the current temporary environment, therefore we just delete the binding as well as its guard.
The mess arises regarding where those guards are stored exactly. We cannot preallocate memory for them, as we do not know how many we will need (we could in theory first run a code analysis telling us how many times set
could be used, but that’s the realm of code optimization and analysis, and I am not dealing with that), we cannot store them in the global environment, as we would have to expand it in order to do so, and expanding the global environment is the reason we need guards in the first place. We could allocate space for one guard along with every binding we use - in practice, this is equivalent to collecting the entire environment on each function return (instead of just the temporary environment that the function used), meaning the “stuff from before the function was called” segment above shrinks to zero - and at that point, why even bother? The relocation step (professionally known as “the shwoop and fwip”) will take up half our memory to clean up the other half - at that point, not cleaning up memory is almost more efficient.
The solution I came up with was to allocate memory for guards from the other end of our memory.
stuff from before the stuff the function ~u~us~d~
function was called allocated (clean up) ~me~o~y~
|######################|-----1-----2--3-----45|.........gggg|
^ ^ ^ ^^ ^
these make up the return value guards! guards! arrest this value!
This has the obvious disadvantage of mangling our memory model even further, and the not so obvious disadvantage that bounds checking gets more complicated (only slightly though). Unfortunately, this solution actually works pretty well in practice, and on a more baremetal implementation like the self-hosting repl, it mirrors quite well how the memory is handled internally by the computer.
(note: we need an extra value to keep track of how many guards we’ve set, akin to lm
and ls
all the way at the beginning. I called it lg
, it’s initialized to MEMSIZE
.)
So how do we implement this? If set
finds a binding, we first remove any guards it might have (we no longer need its contents), set its new value, and then create a guard for it. When the GC does its stuff, besides extracting the return value, it also goes through all the guards and extracts their values as well.
Some notes on how all of the above works:
- Removing guards is recursive - the value of a binding might have a guard somewhere inside it that we cannot immediately access, therefore we have to traverse its contents recursively. As with the GC helper function, this can lead to infinite loops, therefore we put values aside before recursively cleaning them. Unlike the GC helper function, we cannot allocate extra memory during the guard processing, so instead we will store the values in C variables (essentially, we are using the C callstack as a temporary replacement for the Lisp callstack).
- ‘Removing’ guards means they are no longer valid, but unfortunately we are using an array instead of a linked list (aforementioned mangling of our memory model), therefore the guard does not just disappear, it leaves a hole in the guard list. If we don’t take care of that, the list of guards will grow longer and longer until we run out of memory, despite the majority of it being garbage as well. Fortunately, compacting an array is very easy - we go through it, and if we encounter a hole, we shift all other values to fill it. Instead of shifting them at once, we will keep track of how many holes we have encountered (ie how far we have to shift), and shift each value when we get to it. The code for this looks a bit complicated because the guard list expands backwards into memory, therefore we also process it backwards (again, memory is a bit mangled).
- The more important issue here is when to compact the list. I went back and forth on this a couple times, and decided to do it when creating guards. The choice sounds weird because this doesn’t seem related, but I noticed guard removal (ie the thing that makes holes) is always followed by guard creation (the thing that fills them) not long after (see implementation of set
). The first draft had it as part of GC - which made sense, since we are cleaning out garbage, and calculating the value of a binding requires a function call, ie the garbage collector is always called after guard removal in this scenario as well. However, that solution displeased me, as GC is done way more frequently than guard management - therefore, the code for testing for holes and compacting them would almost always run in situations where we already know there are none.
- To avoid merely repeating the previous problem, a guard does not point to some value that shouldn’t be deleted, but it points to the memory cell that contains the reference the binding uses to refer to its value. This sounds a lot more complicated than it is - “we guard bindings, not values” (in C terms, “instead of int*guard
we use int**guard
”). This way, after extracting the values, the GC can also update the bindings that need them.
- It also happens to conveniently take care of another issue - if the binding itself was part of the temporary environment we are deleting, the GC would delete it. The guard would then point to an error value that we can check for.
- To complicate matters a little bit - if you think back to how the trampoline works, the function could return the temporary environment, in which case we explicitly do not want to delete it! This is also rather easily solved - we pass the environment that should be valid after the GC as an argument, and collect it like we do with the guards. Because of the cutoff in the helper function (it doesn’t handle anything below the current frame), this does in fact only handle any additions from the current frame instead of the entire environment.
All right, let’s get coding! First off, set
. As mentioned above, we do some weird dereferencing for the guards - I’ll use a macro for that to keep the code a bit easier to read:
#define gcdr(x) ((car(x)>>1)+1)
int evsing(int expr,int*env,int*tramp){
// stuff
case 77: // set
for(int bind=*env;bind;bind=cdr(bind)){ // go thru env until we find a
if(car(car(bind))==car(cdr(expr))){ // binding for the current var
removeguard(gcdr(bind)); // remove all guards for it
cdr(car(bind))=evlist(cdr(cdr(expr)),env); // assign new value
guard(gcdr(bind)); // make new guard for it
return car(car(bind)); // return variable name (convenient ig?)
} // if we did not find a match, we fall through to 'defun', which then
} // *creates* a binding in the current tmpenv
case 6: // defun
// etc
// other stuff
}
Guard deletion, with aforementioned “store in C variable before recursing” trick:
void removeguard(int guard){ // guard is the raw index of a guarded value slot
if(!iscons(mem[guard]))return; // nothing volatile here, nothing to clean up
for(int at=lg;at<MEMSIZE;at++) // go thru guard list, remove any that match
if(mem[at]==guard)mem[at]=0;
// recurse into the value's two slots. we replace the slot w/ zero before
// calling recursively to avoid endless loops, then restore it from a C variable.
int t=mem[guard];mem[guard]=0;
removeguard(t>>1); // the car slot of the value
removeguard((t>>1)+1); // the cdr slot of the value
mem[guard]=t;
}
Guard creation, including the guard list compaction mechanism:
void guard(int guard){
if(!iscons(mem[guard]))return; // dont need to guard non-volatile data
for(int at=lg;at<MEMSIZE;at++) // check if guard already exists
if(mem[at]==guard)return; // if so, dont do anything
int shift=0;
for(int i=MEMSIZE-1;i>lg;i--){
if(shift)mem[i]=mem[i-shift]; // shift guard by however many slots we freed
while(!mem[i] && i>lg) // check if guard invalid, if so increase shift
shift++,lg++,mem[i]=mem[i-shift];
}
mem[--lg]=guard; // create new guard
}
And the updated garbage collector, with the new environment argument:
int collect(int keep,int*env,int before){
if(before==lm)return keep; // function call did not use up memory
int after=lm; // current top of used memory. clean from 'before' to here.
keep = _gc(keep,before*2,(after-before)*2); // collect return value. args are
// multiplied by two, since one concell takes up two slots in memory.
*env = _gc(*env,before*2,(after-before)*2); // collect env as well
for(int at=lg;at<MEMSIZE;at++){
if(mem[at]==0); // guard was deleted. ignore (not worth collecting here)
else if(mem[mem[at]-1]==0x7fff) // the binding cell itself was moved - point the
mem[at]=(mem[mem[at]]>>1)+1; // guard at the cdr slot of its new location
else if(mem[mem[at]]==0xffff) // binding was deleted, invalidate guard
mem[at]=0;
else if(mem[mem[mem[at]]>>1]==0x7fff) // the binding *value* was moved - adjust
mem[mem[at]]=mem[(mem[mem[at]]>>1)+1]; // the slot to refer to the new location
// otherwise, guard is still valid, but the binding was not collected
// (this is the scenario we made guards for). collect manually.
else mem[mem[at]]=_gc(mem[mem[at]],before*2,(after-before)*2);
}
while(lm>after)mem[before++]=mem[after++];
for(int i=before;i<after;i++)mem[i]=0xffff; // replace cells we cleaned up
// with something recognizable - see second 'if' in _gc
lm=before; // relocate top of memory
return keep;
}
XXX we might wanna elaborate on how to call this in the trampoline