PoprC Runtime Operation Part 1: Cells and Closures

Posted on November 1, 2013

Because Popr is a functional language, execution of a Popr expression can be thought of as a series of reductions to reach the final result. PoprC relies on functions in the runtime (rt.c) to handle the reduction of Popr expressions. It can reduce functions with static (known) arguments. If this were all PoprC could do, it would be an interpreter, and not a compiler. It can also reduce functions with variables as arguments, producing a trace which is converted into LLVM IR. This results in code with all static parts fully reduced, and only the dynamic (not known in advance) portions implemented by using calls to the runtime.

The runtime is also responsible for memory management. Unallocated memory is arrange as a free ring (a doubly-linked list with no head or tail) of cells. Each cell stores data required to implement a closure, which is a function and its arguments, but a cell also has more information required for memory management, alternatives, reduced values, alternatives, and a temporary pointer used during graph copying.

Here is exactly what is stored in the cell type (cell_t):

typedef bool (reduce_t)(cell_t **cell, type_rep_t type);
struct __attribute__((packed)) cell {
  reduce_t *func;
  cell_t *alt;
  cell_t *tmp;
  uint32_t n;
  uint16_t size;
  union {
    uint16_t type;
    uint16_t out;
  };
  union {
    /* unevaluated */
    cell_t *arg[3];
    /* reduced */
    struct __attribute__((packed)) {
      alt_set_t alt_set;
      union {
        intptr_t val[2]; /* value */
        cell_t *ptr[2];  /* list */
      };
    };
    /* unallocated */
    struct __attribute__((packed)) {
      cell_t *prev, *next;
    };
  };
} __attribute__((aligned(4)));

In the description of PoprC, the term closure is used to refer to a function which may be missing arguments, which can be reduced. Closures are stored in one or more cells.

cell_t is complicated, because closures have a life cycle with three stages, and to minimize memory usage, some parts of cells are reused to store different information in some stages.

The first stage is the unallocated cell; it only requires a previous (prev) and next (next) pointer, and a way to indicate that it is unallocated is useful to prevent errors, which is done by setting func to 0. When the runtime is initialized, it builds a ring of unallocated cells in a statically allocated array (cells). This could later be extended to allow dynamically growing and shrinking the memory pool, or if the code is embedded, all heap memory could be consumed by the cell ring at startup.

The next two stages share some common fields. The func field stores a pointer to the function that the closure will execute on reduction. alt stores the next alternative. tmp is a temporary pointer used to implement graph copying. n is a reference count. size indicates the size of the arg array, and consequently is one greater than the size of the val and ptr arrays.

The second stage is the unevaluated closure. In this stage, in addition to the common fields, the out field indicates how many of the arguments are outputs instead of inputs, and the arg array stores pointers to other cells, which are the arguments. In this stage, during building (when the graph is being constructed, before reduction), some arguments might be missing. Arguments are filled in from the left (c->arg[c->size - 1]), with the position of the next argument stored in c->arg[0]. Bit 0 of func is set to indicate that the closure is not ready.

The third stage is the reduced closure. In this stage, the type field indicates a type (and also has some other bits that store metadata), alt_set is a bit field that controls the interaction of alternatives (which I will describe in another post.) The reduced cell could be either a vector of unboxed values stored in the val array, or a list, which stores pointers to its members in the ptr array.

With all that out of the way, you might wonder what happens if I want a list of more than two items. Multiple contiguous cells can be used together to extend the arg, val, and ptr arrays.

So now you should understand cell_t and closures, upon which the runtime is built. In the next article in this series, I’ll explain the allocator and memory management functions in the runtime.