The FSB Virtual Machine

The FSB Virtual Machine (combined reference manual and source code) is Copyright © 2008, Thomas Lord (lord@emf.net, Berkeley CA.)

See the Copyright and License Information appendix for more information.

Chapters C Code
Strategy:

There is also a Detailed Table of Contents

Implementation:
Appendices:

This reference manual contains snippets of C code illustrating parts of the API and how the API can be implemented. Pieced together in the right order, these fragments add up to a complete reference implementation. [disclaimer: "will add up to" -t]

You can view the pieced-together C program You can also follow links to view the fragments in the order they are put together:


<The Whole Program>
from:

<Copyright and Disclaimer>
<Needed System Header Files>
<What is Being Built?>
<The Abort Path>
<Declare Memory Types>
<Type Code Definitions>
<Define Memory Management Constants>
<GC-Bit Values>
<Low-level Page Code>
<Object Page Indexes>
<Low-Level Object Header Access>
<Low Level Object Accessors>
<Declare Register Constants>
<Declare Free Object Constants>
<fsbx__new_regs>
<Growing Stacks>
<Low Level Stack Addressing>
<Exceptional Object Representations>
<Shrinking Stacks.>
<Routines for Manipulating Unallocated Objects>
<Allocating Objects>
<Grabbing Free Large Objects>
<Releasing Large Objects.>
<Register Allocation Test Program>
            


Links (in Lieu of a Proper Bibliography, in No Particular Order)
  • The Lua Programming Language
    Inspired me that, after all, simplicity was still achievable.
  • Splay Trees
    These are very useful, especially in a "largely functional" (i.e., "mostly non-mutating") programming model.
  • Skew Heaps
    Used here to hold sets of free (unallocated) large objects. The heap property is used to ensure that allocations (of large objects) are taken from the lowest possible address.


Strategy:

Architectural Overview

(top)

The FSB virtual machine emulates a hypothetical computer designed to host dynamically typed, high-level programming languages. Some of its highlights include:

Memory Subsystem Introduction

Object Oriented Memory

(top)

The FSB features an object oriented memory in several senses:

Page Objects and Sub-page Objects

An FSB heap contains up to 216 page objects, each of which is 64K bytes in size. Thus, the FSB machine has a 232 bit address space.

Smaller objects are carved from pages, by dividing the pages into equally sized pieces (except for a page header, in some cases):

Obj
Size
Code
Heap Page Formats
12 64K obj
11 32K obj 32K obj
10 16K obj 16K obj 16K obj 16K obj
9 8K obj 8K obj 8K obj 8K obj 8K obj 8K obj 8K obj 8K obj
8 hdr (4k) 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj 4K obj
7 hdr (4k) ... 30 x 2K byte objects ...
6 hdr (4k) ... 60 x 1K byte objects ...
5 hdr (4k) ... 120 x 512 byte objects ...
4 hdr (4k) ... 240 x 256 byte objects ...
3 hdr (4k) ... 480 x 128 byte objects ...
2 hdr (4k) ... 960 x 64 byte objects ...
1 hdr (4k) ... 1920 x 32 byte ...
0 hdr (4k) ... 3840 x 16 byte objects ...

All pointers in an FSB system point to an object, or to an internal field within an object. Given a pointer, even a pointer to an internal field, programs can quickly locate both the object referred to by the pointer, and the page on which the object resides.

Registers and Register Pages

(top)

The FSB features three object registers, each of which always contains a pointer to a (special) register object. Register objects are page-size objects (64K bytes each) and comprise the "register files" of the FSB virtual CPU. 3-layer memory hierarchy: registers, direct pages, other pages

The FSB additionally includes a single, 4-bit task selector register. This is used for the cooperative multi-tasking features described below.

Register Format and Cooperative Multi-tasking

(top)

Each of the three register objects contains:

Each FSB instruction cycle begins by examining the four bit task selector register and using that value to select one of the three registers and, from that register, one of the four program counters and code-block pointers. The instruction in that code-block, at the offset indicated by that program counter, is the next instruction to execute.

thread selector value which instruction to run next illustration
0x0 reg[0].pc/block[0]
0x1 reg[0].pc/block[1]
0x2 reg[0].pc/block[2]
0x3 reg[0].pc/block[3]
0x4 reg[1].pc/block[0]
0x5 reg[1].pc/block[1]
0x6 reg[1].pc/block[2]
0x7 reg[1].pc/block[3]
0x8 reg[2].pc/block[0]
0x9 reg[2].pc/block[1]
0xA reg[2].pc/block[2]
0xB reg[2].pc/block[3]
0xC unused, reserved
0xD unused, reserved
0xE unused, reserved
0xF unused, reserved

The Format of Memory Within Objects

(top)

Every object is one of the 13 standard object sizes and thus is some multiple of 16 bytes. The (aligned) 16-byte regions within an object are called quads. Quads are further divided into pairs, words, half-words, and bytes:

object memory
quad[0] (first 16 bytes) etc...
pair[0] pair[1] etc...
word[0] word[1] word[0] word[1] etc...
half-word[0] half-word[1] half-word[0] half-word[1] half-word[0] half-word[1] half-word[0] half-word[1] etc...
byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] byte[0] byte[1] etc...
Alignment Note:
Machine values (values which can be stored in the memory of an object) come in sizes ranging from 1 to 16 bytes, in power-of-2 increments. A machine value of size 2k bytes can be stored only at a location that is aligned on a 2k boundary relative to the 0th byte of an object.

Object Internals and Reference Counting

(top)

The minimum size of an object in memory is 16 bytes. The maximum size is 64K bytes. Heap objects can be allocated at any power-of-two number of bytes in between those sizes.

"Ordinary" objects, called referencable objects, are reference counted and the FSB assists in reclaiming objects whose reference count drops to 0. Cycles among referencable objects must be explicitly broken by applications. (Note, though, that applications can also create classes of object that are collected by graph-tracing garbage collection.)

"Limb" objects are objects with the invariant property that exactly one object -- a referencable object -- contains a pointer to the limb. The referencing object also contains type and size information about the limb. The lifetime of a limb is identical to the lifetime of the (unique) object that points to the limb. When the referencing object is reclaimed, the limb is reclaimed as well.

Every referencable object in memory begins with a mandatory two-word header. (Limbs do not require headers because they are not reference counted and their type information is stored elsewhere.) Here is the format of the two-word header found in all referencable objects:

(top)
Small Object Headers (all referencable heap objects)
quad[0] of the object
pair[0]:
the object header
pair[1]:
varies
word[0]:
the object's (machine) type*
word[1]:
reference count and gc bits**

* The type code encodes information sufficient to infer what machine type of value is stored in each location within the object. (The type code stored in an object can vary over the object's lifetime.) Type codes are described in detail elsewhere in this document.

** word[1] of the object header contains a 28-bit reference count and set of 4 GC bits that govern the use of the reference count (these are described in greater detail, elsewhere). Objects are normally reference counted with a requirement that cycles must be explicictly broken to reclaim an object. Objects which are managed instead by graph-tracing collection can be created by a separate mechanism (also described elsewhere).

Objects larger than 16 bytes (four words) have a larger object header: a four word (single quad) header:

(top)
Tagged Object Headers (referencable heap objects 32 bytes or larger)
quad[0] of the multi-quad object
pair[0]:
object header
pair[1]:
additional object header
word[0]:
the object's (machine) type
word[1]:
reference count and gc bits
word[0]:
user tag and object size*
word[1]:
object hash value**

* An object size code is a four bit value. The object is 2(size-code + 3) bytes in size. The user tag is an application-controlled 28-bit tagging field.

** The object hash value is a cache containing either 0 or a hash value for the object. As an exception, in free, small objects the hash slot holds a freelist pointer linking free objects on the same page.

(top)

Objects 64-bytes and larger are called huge objects and are further subdivided into a head (not to be confused with an "object header") and a tail. The head of a huge object is exactly one quad in size. The tail of a huge object is the oversall size of the object, minus two quads:

Huge Object Format
quad[0]
object header
quad[1]
object head
quad[2 .. (obj-size / 16) - 2]
object tail
Note: The division of objects into head and tail is largely a convenience of applications however it plays an important role in the type tagging of objects, explained elsewhere in this document.
(top)

Here is an illustration giving an overview of the preceding object format tables:

C Declarations for the Memory System

(top)

In this section, we provide C code defining a low-level C interface to an implementation of an FSB memory subsystem.

C Code

<Declare Memory Types>
from: <The Whole Program>

<Declare the Integer Types>
<Declare the Character Type>
<Declare the Floating Point Types>
<Declare Pointer Types>
<Declare Bytes and Half Words>
<Declare Words>
<Declare Pairs>
<Declare Quads>
<Declare Register Sets>
          


Code: Integer Types

(top) (code section)
C code

Porting Alert

This code makes unwarranted assumptions about the byte-sizes of various C types. It may need modification if the program is ported.


<Declare the Integer Types>
from: <Declare Memory Types>

typedef char fsb_int8_t;
typedef unsigned char fsb_uint8_t;
typedef int fsb_int16_t;
typedef unsigned int fsb_uint16_t;
typedef int fsb_int32_t;
typedef unsigned int fsb_uint32_t;
typedef int fsb_int64_t;
typedef unsigned int fsb_uint64_t;
          


(code section)

The C programming language lacks integer types of guaranteed sizes. In these declarations we make reasonable guesses. The primitive C types used must be large enough to hold the indicated number of bits.

In theory, this reference implementation of FSB should work correctly even if the primitive C types used are "too large" -- contain extra bits. However, little work has been done to make sure that is the case so bugs are likely to be found.

Code: The Character Type

(top) (code section)
C code

<Declare the Character Type>
from: <Declare Memory Types>

typedef fsb_uint32_t fsb_codept_t;
          


(code section)

The FSB character set has 21 low-order bits defined by Unicode, plus 11 "bucky bits".

Code: Floating Point Types

(top) (code section)
C code

<Declare the Floating Point Types>
from: <Declare Memory Types>

typedef float fsb_float_t;
typedef double fsb_double_t;
          


(code section)

We assume that these types are IEEE 32 and 64 bit (single and double) precision floating point numbers.

Code: Pointer Types

(top) (code section)
C code

<Declare Pointer Types>
from: <Declare Memory Types>

union fsb_byte;
union fsb_half_word;
union fsb_word;
union fsb_pair;
union fsb_quad;
            
typedef union fsb_byte_ptr fsb_byte_ptr_t;
union fsb_byte_ptr
{
  union fsb_byte * fsb_bytes;
};

typedef union fsb_half_word_ptr fsb_half_word_ptr_t;
union fsb_half_word_ptr
{
  union fsb_byte * fsb_bytes;
  union fsb_half_word * fsb_half_words;
};

typedef union fsb_word_ptr fsb_word_ptr_t;
union fsb_word_ptr
{
  union fsb_byte * fsb_bytes;
  union fsb_half_word * fsb_half_words;
  union fsb_word * fsb_words;
};

typedef union fsb_pair_ptr fsb_pair_ptr_t;
union fsb_pair_ptr
{
  union fsb_byte * fsb_bytes;
  union fsb_half_word * fsb_half_words;
  union fsb_word * fsb_words;
  union fsb_pair * fsb_pairs;
};

typedef union fsb_quad_ptr fsb_quad_ptr_t;
union fsb_quad_ptr
{
  union fsb_byte * fsb_bytes;
  union fsb_half_word * fsb_half_words;
  union fsb_word * fsb_words;
  union fsb_pair * fsb_pairs;
  union fsb_quad * fsb_quad;
};
          


(code section)

All FSB pointers are pointers into virtual machine pages and always point at or into an FSB object. The FSB memory model implies that such pointers always point to a (virtual machine) quad, pair, word, half-word, or byte.

Code: Bytes and Half Words

(top) (code section)
C code

<Declare Bytes and Half Words>
from: <Declare Memory Types>

#define FSB_BYTE_FIELDS(MULT) \
  fsb_int8_t fsb_int8[MULT]; \
  fsb_uint8_t fsb_uint8[MULT]

typedef union fsb_byte fsb_byte_t;
union fsb_byte
{
  FSB_BYTE_FIELDS (1);
};

#define FSB_HALF_WORD_FIELDS(MULT) \
  FSB_BYTE_FIELDS (2 * (MULT)); \
  fsb_byte_t fsb_byte[2 * (MULT)]; \
  fsb_int16_t fsb_int16[MULT]; \
  fsb_uint16_t fsb_uint16[MULT]

typedef union fsb_half_word fsb_half_word_t;
union fsb_half_word
{
  FSB_HALF_WORD_FIELDS (1);
};
          


(code section)

The FSB machine's memory model is defined in terms of virtual bytes, half-words, words, pairs, quads and pages. We can't, strictly speaking, rely on the C implementation using exactly those sizes in a convenient way. Therefore, it's helpful to wrap an abstraction barrier around those data types.

Code: Words

(top) (code section)
C code

<Declare Words>
from: <Declare Memory Types>

#define FSB_WORD_FIELDS(MULT) \
  FSB_HALF_WORD_FIELDS (2 * (MULT)); \
  fsb_half_word_t fsb_half_word[2 * (MULT)]; \
  fsb_int32_t fsb_int32[MULT]; \
  fsb_uint32_t fsb_uint32[MULT]; \
  fsb_codept_t fsb_codept[MULT]; \
  fsb_float_t fsb_float[MULT]; \
  fsb_byte_ptr_t fsb_byte_ptr[MULT]; \
  fsb_word_ptr_t fsb_word_ptr[MULT]; \
  fsb_pair_ptr_t fsb_pair_ptr[MULT]; \
  fsb_quad_ptr_t fsb_quad_ptr[MULT]

typedef union fsb_word fsb_word_t;
union fsb_word
{
  FSB_WORD_FIELDS (1);
};
          


(code section)

Code: Pairs

(top) (code section)
C code

<Declare Pairs>
from: <Declare Memory Types>

#define FSB_PAIR_FIELDS(MULT) \
  FSB_WORD_FIELDS (2 * (MULT)); \
  fsb_word_t fsb_word[2 * (MULT)]; \
  fsb_int64_t fsb_int64[MULT]; \
  fsb_uint64_t fsb_uint64[MULT]; \
  fsb_double_t fsb_double[MULT]

typedef union fsb_pair fsb_pair_t;
union fsb_pair
{
  FSB_PAIR_FIELDS (1);
};
          


(code section)

Pairs are the smallest memory type large enough to hold double precision floats and 64 bit integers.

Code: Quads

(top) (code section)
C code

<Declare Quads>
from: <Declare Memory Types>

#define FSB_QUAD_FIELDS(MULT) \
  FSB_PAIR_FIELDS (2 * (MULT)); \
  fsb_pair_t fsb_pair[2 * (MULT)]

typedef union fsb_quad fsb_quad_t;
union fsb_quad
{
  FSB_QUAD_FIELDS (1);
};
          


(code section)

All heap objects are some multiple number of quads in size. The smallest heap objects are exactly one quad. The object header of all objects 32-bytes in size or larger is a quad.

Code: Register Sets

(top) (code section)
C code

<Declare Register Sets>
from: <Declare Memory Types>

            
#define FSB_N_REGISTERS (3)

typedef struct fsb_registers fsb_registers_t;
struct fsb_registers
{
  /* An invariant is maintained:
   * Each register contains either NULL or 
   * a pointer to a page-sized register object.
   */
  fsb_quad_ptr_t fsb_reg[FSB_N_REGISTERS];
  unsigned int fsb_task_selector:4;
};
          


(code section)

See Registers and Register Pages and Register Format and Cooperative Multi-tasking

Execution Model Introduction

(top)

A single node FSB machine operates cyclicly, executing one instruction at a time. In each cycle, the task selector register is consulted to select one of the three registers, and within that register, one of the four program counter / block pairs. The block is an ordinary object; the program counter is an index into the tail of that object.

The tail location indexed by the program counter can contain any object. That indicated object is the next instruction to run. All objects can be executed as instructions.

The default behavior of objects, when they are executed as instructions, is to push a reference to themselves on stack[0] of the register indicated by the task selector register, and to increment the program counter indicated by the task selector.

A distinguished class of objects, called syms, has a different behavior. When a sym is executed, a corresponding built-in function (or primitive instruction; there is no distinction) is called. This function can perform any combination of computations and side effects permitted by the micro-code API.

Instruction execution is a series of atomic steps, transitioning the machine state from one consistent state to another. Interrupts may be delivered (in environment-specific circumstances) only between instructions. Interrupts are delivered, generally speaking, by saving the current task selector, program counter and block, and transferring control to an appropriate interrupt handler.

Note: Cooperative Tasking and Multiple Stacks

The FSB's multiple stacks, program counters, and block pointers combine nicely with the FSB's model of interrupt handling. For example, a system which wants to emulate a unix-like kernel with multiple "user processes" can cheaply use one FSB stack for interrupt services and/or kernel services, using the others for currently schedulable user processes.

FSB Restrictions on Modifying Memory

(top)

(top)

FSB instructions are constrained in what side effects they may perform in a single instruction cycle. In particular, they can directly modify registers and register objects without restriction. In some circumstances, they can modify objects directly referenced by register objects -- these are called first level objects. Under no circumstances can instructions modify other objects:

Illustration: Restrictions on Instruction Mutations

If an object has a reference count of 1, it is called a linear object. First level linear objects may be freely modified by instructions.

Only certain locations within non-linear objects can be modified. The machine type of an object distinguishes mutable locations within an object from immutable locations. Instructions are permitted only to modify explicitly mutable locations within first level, non-linear objects.

True Concurrency

(top)

The restrictions on side effects are key to the FSB approach to true concurrency: the question of what happens when an FSB machine is implemented atop a multi-processor shared memory system. The FSB design is based on the premise that communication between processors that share memory is relatively expensive: the more computation that takes place in memory private to a processor, the better.

Illustration: True Concurrency

(top)

The FSB architecture encourages as much work as practical to be done on register stacks, private to each processor. The architecture encourages mutations to be primarily limited to linear objects for which no locking is required.

Truly shared first level objects being modified require page-level locking. The rest of the shared heap is immutable. One interesting application of this property is to customize paging algorithms: the FSB will access no page that is not directly addressed by a register object. Paging algorithms, rather than responding to "random access" to pages, can observe when FSB programs move a pointer to a heap object into a register: a logical point in time at which to "page in" an object not resident in memory.

Type Codes

(top)

The FSB machine uses a system of dynamic type tags. The type of the value stored in each register and heap location can be computed at run-time, given only a pointer to that location. This section presents the type tagging system, and the details of object formats.

FSB type tagging is neither especially simple, or especially intuitive. It is carefully balanced to reduce the cost of a number of common operations. While the typing system is relatively surprising in its details, it is also designed to be easy to use effectively either "by hand" or via a compiler.

Diving in to the deep end, then, here is the low-level anatomy of an FSB type code

Type Quarks

Every referencable heap object includes at least a small object header which includes, in its first word, a 32-bit type code.

FSB type codes are regarded as a two 4-element lists of 4-bit values. The first list of 4-bit values is called the head type of the object, the second list of 4-bit values is called the tail type of the object. Roughly speaking, the head type of an object records the value types stored in the head of the object, the tail type the tail of the object however there are exceptions to that too-simple rule.

Each 4-bit value in a type code is called a type quark.

type code format
head type
(four quarks)
tail type
(four quarks)
quark 0
(four bits)
quark 1
(four bits)
quark 2
(four bits)
quark 3
(four bits)
quark 0
(four bits)
quark 1
(four bits)
quark 2
(four bits)
quark 3
(four bits)
C code

<Type Code Definitions>
from: <The Whole Program>

<Declare Type Quarks>
<Declare Type Pairs>
<Declare Type Quads>
<Declare Type Words>
          


Code: Declaring Type Quarks

(top)
C code

<Declare Type Quarks>
from: <Type Code Definitions>

enum fsb_ty_quark
{
  fsb_ty_n = 0,  /* pronounced "nil" -- must be 0 */           
  fsb_ty_b = 1,  /* pronounced "signed byte" */                
  fsb_ty_B = 2,  /* pronounced "unsigned byte" */              
  fsb_ty_s = 3,  /* pronounced "signed short" */               
  fsb_ty_S = 4,  /* pronounced "unsigned short" */             
  fsb_ty_i = 5,  /* pronounced "signed int" */                 
  fsb_ty_I = 6,  /* pronounced "unsigned int" */               
  fsb_ty_l = 7,  /* pronounced "signed long" */                
  fsb_ty_L = 8,  /* pronounced "unsigned long" */              
  fsb_ty_c = 9,  /* pronounced "character" */                  
  fsb_ty_f = 10, /* pronounced "float" */                      
  fsb_ty_d = 11, /* pronounced "double" */                     
  fsb_ty_o = 12, /* pronounced "object" */                     
  fsb_ty_x = 13, /* pronounced "cell" */                       
  fsb_ty_m = 14, /* pronounced "limb" */                       
  fsb_ty_y = 15, /* pronounced "sym" */
  /* No more!  There must be exactly 16 type quarks */      
};

typedef struct fsb_type_quark fsb_type_quark_t;
struct fsb_type_quark
{
  unsigned int fsb_type_quark:4;
};

#define fsb_ty_quark(N) ((fsb_type_quark_t){ (N) })
#define fsb_n fsb_ty_quark (fsb_ty_n)
#define fsb_b fsb_ty_quark (fsb_ty_b)
#define fsb_B fsb_ty_quark (fsb_ty_B)
#define fsb_s fsb_ty_quark (fsb_ty_s)
#define fsb_S fsb_ty_quark (fsb_ty_S)
#define fsb_i fsb_ty_quark (fsb_ty_i)
#define fsb_I fsb_ty_quark (fsb_ty_I)
#define fsb_l fsb_ty_quark (fsb_ty_l)
#define fsb_L fsb_ty_quark (fsb_ty_L)
#define fsb_c fsb_ty_quark (fsb_ty_c)
#define fsb_f fsb_ty_quark (fsb_ty_f)
#define fsb_d fsb_ty_quark (fsb_ty_d)
#define fsb_o fsb_ty_quark (fsb_ty_o)
#define fsb_x fsb_ty_quark (fsb_ty_x)
#define fsb_m fsb_ty_quark (fsb_ty_m)
#define fsb_y fsb_ty_quark (fsb_ty_y)
          


Type quarks are often but not always used to indicate a value of the type they are named after. For example, fsb_i is the usual tag for a word containing a signed 32-bit integer, but their are exceptions.

Code: Type Pairs

(top)
C code

<Declare Type Pairs>
from: <Type Code Definitions>


typedef struct fsb_type_pair fsb_type_pair_t;
struct fsb_type_pair
{
  unsigned int fsb_type_pair:8;
};

#define fsb_ty_pair_code(L,R)  (((L) << 4) | (R))
#define fsb_ty_pair(L,R)       ((struct fsb_type_pair){fsb_ty_pair_code ((L).fsb_type_quark, (R).fsb_type_quark)})
#define fsb_ty_pair_left(PT)   fsb_ty_quark (((PT).fsb_type_pair >> 4) & 0xf)
#define fsb_ty_pair_right(PT)  fsb_ty_quark ((PT).fsb_type_pair & 0xf)

#undef FSB_X
#undef FSB_Y
#undef FSB_Z
      
#define FSB_X(A,B) \
  static const fsb_type_pair_t fsb_ ## A ## B \
    = ((struct fsb_type_pair){(fsb_ty_ ## A << 4) | fsb_ty_ ## B})
      
#define FSB_Y(X) \
  FSB_X(X,n); FSB_X(X,b); FSB_X(X,B); FSB_X(X,s); \
  FSB_X(X,S); FSB_X(X,i); FSB_X(X,I); FSB_X(X,l); \
  FSB_X(X,L); FSB_X(X,c); FSB_X(X,f); FSB_X(X,d); \
  FSB_X(X,o); FSB_X(X,x); FSB_X(X,m); FSB_X(X,y)
      
#define FSB_Z \
  FSB_Y(n); FSB_Y(b); FSB_Y(B); FSB_Y(s); \
  FSB_Y(S); FSB_Y(i); FSB_Y(I); FSB_Y(l); \
  FSB_Y(L); FSB_Y(c); FSB_Y(f); FSB_Y(d); \
  FSB_Y(o); FSB_Y(x); FSB_Y(m); FSB_Y(y)

FSB_Z;
          


Having a short-hand notation for type pairs is handy.

Code: Type Quads

(top)
C code

<Declare Type Quads>
from: <Type Code Definitions>

typedef struct fsb_type_quad fsb_type_quad_t;
struct fsb_type_quad
{
  unsigned int fsb_type_quad:16;
};

#define fsb_ty_quad(L,R)         ((struct fsb_type_quad){((L).fsb_type_pair << 8) | (R).fsb_type_pair})
#define FSB_TY_QUAD(W,X,Y,Z)	 fsb_ty_quad (fsb_ ## W ## X, fsb_ ## Y ## Z)
#define fsb_ty_quad_left(QT)     ((struct fsb_type_pair){((QT).fsb_type_quad >> 8) & 0xff})
#define fsb_ty_quad_right(QT)    ((struct fsb_type_pair){((QT).fsb_type_quad >> 8) & 0xff})
#define fsb_ty_quad_left_l(QT)   fsb_ty_pair_left (fsb_ty_quad_left (QT))
#define fsb_ty_quad_left_r(QT)   fsb_ty_pair_right (fsb_ty_quad_left (QT))
#define fsb_ty_quad_right_l(QT)  fsb_ty_pair_left (fsb_ty_quad_right (QT))
#define fsb_ty_quad_right_r(QT)  fsb_ty_pair_right (fsb_ty_quad_right (QT))
          


Naturally, a type quad is two type pairs or, equivalently, four type quarks.

Code: Type Words

(top)
C code

<Declare Type Words>
from: <Type Code Definitions>

typedef struct fsb_type fsb_type_t;
struct fsb_type
{
  unsigned int fsb_type:32;
};

#define fsb_type(HD,TL)         ((struct fsb_type){((HD) << 16) | (TL)})

#define fsb_type_head(QT)       fsb_ty_quad (((QT) >> 16) & 0xffff)
#define fsb_type_tail(QT)       fsb_ty_quad ((QT) & 0xffff)

#define fsb_type_head_l(QT)     fsb_ty_quad_left (fsb_type_head (QT))
#define fsb_type_head_ll(QT)    fsb_ty_quad_left_l (fsb_type_head (QT))
#define fsb_type_head_lr(QT)    fsb_ty_quad_left_r (fsb_type_head (QT))
#define fsb_type_head_r(QT)     fsb_ty_quad_right (fsb_type_head (QT))
#define fsb_type_head_rl(QT)    fsb_ty_quad_right_l (fsb_type_head (QT))
#define fsb_type_head_rr(QT)    fsb_ty_quad_right_r (fsb_type_head (QT))
#define fsb_type_tail_l(QT)     fsb_ty_quad_left (fsb_type_tail (QT))
#define fsb_type_tail_ll(QT)    fsb_ty_quad_left_l (fsb_type_tail (QT))
#define fsb_type_tail_lr(QT)    fsb_ty_quad_left_r (fsb_type_tail (QT))
#define fsb_type_tail_r(QT)     fsb_ty_quad_right (fsb_type_tail (QT))
#define fsb_type_tail_rl(QT)    fsb_ty_quad_right_l (fsb_type_tail (QT))
#define fsb_type_tail_rr(QT)    fsb_ty_quad_right_r (fsb_type_tail (QT))

/* some of those have synonyms: */
#define fsb_type_head_lead(QT)  fsb_type_head_l(QT)
#define fsb_type_head_trail(QT) fsb_type_head_r(QT)
#define fsb_type_tail_lead(QT)  fsb_type_tail_l(QT)
#define fsb_type_tail_trail(QT) fsb_type_tail_r(QT)
#define fsb_type_head_0(QT)     fsb_type_head_ll(QT)
#define fsb_type_head_1(QT)     fsb_type_head_lr(QT)
#define fsb_type_head_2(QT)     fsb_type_head_rl(QT)
#define fsb_type_head_3(QT)     fsb_type_head_rr(QT)
#define fsb_type_tail_0(QT)     fsb_type_tail_ll(QT)
#define fsb_type_tail_1(QT)     fsb_type_tail_lr(QT)
#define fsb_type_tail_2(QT)     fsb_type_tail_rl(QT)
#define fsb_type_tail_3(QT)     fsb_type_tail_rr(QT)
          


Type words are the complete type codes stored in the first word of referencable objects

Parsing Ordinary Type Quads

(top)

Ordinarilly, a type quad describes the machines types of the values stored in a quad-sized region of memory.

If a quad is being used to store four values, each having a word-sized machine type, then the quarks of the type quad are in one-to-one correspondence with the words of the memory quad:

quark N value stored in word N illustration
b fsb_int8[4]
B fsb_uint8[4]
s fsb_int16[2]
S fsb_uint16[2]
i fsb_int32[1]
I fsb_uint32[1]
c fsb_codept[1]
f fsb_float[1]
o fsb_byte_ptr[1]*
x fsb_byte_ptr[1]*
m fsb_quad_ptr[1]**

* always NULL or points at or into a referencable object.

** always NULL or points at a limb.

Parsing Extended Width Type Quads

(top)

Any single-word type quark in position 0 or 2 can be extended to two words by storing a nil quark in positions 1 or 3, respectively.

64-bit integers and doubles always require two words. "object" and "cell" quarks change meaning, subtlely, when extended to two words:

quarks N/N+1 value stored in words N/N+1 illustration
bn fsb_int8[8]
Bn fsb_uint8[8]
sn fsb_int16[4]
Sn fsb_uint16[4]
in fsb_int32[2]
In fsb_uint32[2]
ln fsb_int64[1]
Ln fsb_uint64[1]
cn fsb_codept[2]
fn fsb_float[2]
dn fsb_double[1]
on fsb_uint32[1]
+fsb_word[1]*
xn fsb_uint32[1]
+fsb_word[1]*
mn fsb_uint32[1]‡‡
+fsb_quad_ptr[1]**
yn*** fsb_uint32[1]
+fsb_word[1]

* always NULL or points at or into a referencable object.

** always NULL or points at a limb.

*** this is a "sym reference", described elsewhere in this document

contains type information about the following word

‡‡ contains type information about the limb pointed at by the following word

Parsing Mixed Width Type Quads

(top)

Ordinary and extended width types can be mixed, as in these examples:

quarks 0..3 value stored in words 0..3 illustration
Ifdn fsb_uint32[1]+fsb_float[1]
fsb_double[1]
dnIf fsb_double[1]
fsb_uint32[1]+fsb_float[1]

Parsing Quad Width Type Quads

(top)

Just as type quarks can be extended to two words, they can be extended to four words:

quarks 0..3 value stored in words 0..3 illustration
bnnn fsb_int8[16]
Bnnn fsb_uint8[16]
snnn fsb_int16[8]
Snnn fsb_uint16[8]
innn fsb_int32[4]
Innn fsb_uint32[4]
lnnn fsb_int64[2]
Lnnn fsb_uint64[2]
cnnn fsb_codept[4]
fnnn fsb_float[4]
dnnn fsb_double[2]
onnn fsb_uint32[1]+fsb_word[1]
fsb_uint32[1]+fsb_word[1]
xnnn fsb_uint32[1]+fsb_word[1]
fsb_uint32[1]+fsb_word[1]
mnnn fsb_uint32[1]+fsb_quad_ptr[1]
fsb_uint32[1]+fsb_quad_ptr[1]
ynnn fsb_uint32[1]+fsb_word[1]
fsb_uint32[1]+fsb_word[1]

See an earlier table for more details.

Tagged Object Head Types

(top)
Tagged Object Head Types

Objects 32-bytes in size and larger are tagged objects which include a type code in the first word, and application data in the final four words. If the head type of the of the object's type code is an ordinary, extended, mixed, or quad width type quad, then it directly describes the type of the head of the object as illustrated in this example.

Huge Object Tail Types

(top)
Huge Object tail Types

Objects larger than 32-bytes in size are huge objects which include a type code in the first word, and application data in both the head and tail of the object. If both the the head and tail types of the of the object's type code are ordinary, extended, mixed, or quad width type quads, then the tail quad directly describes the type of the tail of the object. Tail types of this sort describe the layout of each quad of the tail. The tail is a repeating sequence of this pattern:

Exceptional Types

(top)

Many type quads fall into one of the categories ordinary, extended width, mixed width, or quad width and are therefore used to describe the physical layout of an object head or tail.

There are, however, many type quads which do not describe a valid physical layout, as in these examples:

exceptional type code reason for being exceptional
dddd four double-precision floats can't fit in four words of memory
nIII nil does not take up any space -- only three words are accounted for

Some (not all) exceptional type codes are given special meanings. These are described in detail in an appendix although some uses for them are described in the next section.

Small, Immutable Objects ("SMOBs")

(top)

Immediate objects are immutable objects which can be stored in 64 bits (8 bytes) or fewer of memory.

Immediate objects can be stored in aligned, 2-word fields within an object. Immediate objects can only be stored in locations tagged with one of the type pairs: fsb_on or fsb_xn. (See the explanation of extended width types for an introduction to type pairs.)

Immediate objects are also called smobs, a contraction of Small iMmutable OBjectS.

Most numeric types are smobs. Syms are smobs. Perhaps confusingly: references to heap objects are also smobs. Note that heap objects themselves are not smobs but "references" (type-tagged pointers) to heap objects are smobs.

Smobs use a system of staggered tagging. At least the first byte of a smob is used to discern its type. In some cases, additional bits of the smob must also be consulted.

Smobs are discussed in detail in an appendix. Here some examples are illustrated.

(top)
32-bit integer smobs
byte 0 1 2 3 4 5 6 7
contains ni* - - - fsb_int32_t
32-bit unsigned integer smobs
byte 0 1 2 3 4 5 6 7
contains nI* - - - fsb_uint32_t
single precision floating point smobs
byte 0 1 2 3 4 5 6 7
contains nf* - - - fsb_float_t
pair of signed 16-bit integers
byte 0 1 2 3 4 5 6 7
contains ns* - - - fsb_int16_t fsb_int16_t
object reference
byte 0 1 2 3 4 5 6 7
contains no* - - - fsb_byte_ptr_t
reference to an object known to be a linear object**
byte 0 1 2 3 4 5 6 7
contains oL* - - - fsb_byte_ptr_t
sym***
byte 0 1 2 3 4 5 6 7
contains ny* sym parameter (integer) sym index

* These two-letter codes refer to type pairs.

** When an object is known to be linear it is handy to record this fact in the reference to the object. That way, the linearity of the object can be recognized without having to examine the object's reference count.

*** This is not the only ways syms are represented. See the exceptional objects appendix.

Limbs

(top)

If a pair within an object is tagged with the type tag fsb_mn then the pair is a limb handle:

(top)
32-bit integer smobs
byte 0 1 2 3 4 5 6 7
contains limb type* limb size* limb cursor* fsb_quad_ptr_t*

* See the Limb Formats appendix for more detail.

Types and Mutability: Cells

(top)

Linear objects may be freely modified if they are addressed directly via a register object.

Only certain fields of non-linear objects can modified (and again, only when those objects are directly addresed via register objects). In particular, word pairs of type fsb_mn can be modified to contain any SMOB. Individual words of the un-extended type fsm_m can be modified to contain NULL or an fsb_byte_ptr_t which points to or within some referencable object.

Register Format in Detail

(top)

Each of the three FSB registers points to a register page. This chapter describes the internal structure of a register page.

Register Format

The head of a register page contains four stack pointers and program counters. The tail of the page is divided into several regions. Each region is divided into pairs each of which contains an array of SMOBs.

Each register contains four stacks (details discussed below) and four "overflow area links" for those stacks.

Each register contains a variety of system registers (described in detail below) and an index: a set of 2048, randomly accessible, general purpose registers.

Stack Details

(top)
Stack Details

Each register contains four stack pointers, four stacks, and four "displays". For each stack there is an overflow variable, pointing to heap allocated objects that are used to save values pushed deeply within the stack.

Virtual machine instructions can directly address only the top 256 stack entries and only while those entries are all present in the stack object (rather than stored in the heap, in an overflow object). In other words, each stack pointer addresses a floating 256 variable "stack window" at the top of the stack.

Each stack area in a register object has room for 768 stack entries. When the corresponding stack pointer is between 256 and 511 (inclusive), the stack is is said to be normalized meaning that a full 256 stack variables are currently addressable, and it is safe to push at least 256 new values on the stack, both with no danger of needing to write to or read from stack overflow objects.

When a stack is not normalized, the FSB machine will sometimes automatically move values within the stack area and move value to or from stack overflow objects in order to re-normalize the stack.

Each display is a 256-variable area of per-stack, application-controlled variables that do not move.

PC and Code Ptr Details

(top)
PC and Code Pointer Details

Each register can be used to execute up to four tasks, each with its own program counter and code pointer.

Code pointers point to arbitrary, heap-allocated objects and, roughly speaking, represent a "subroutine".

Each program counter is a 16-bit unsigned integer index to a tail field of the corresponding code object.

At all times, the next instruction scheduled for a task is the object stored in the task's code object tail field, indexed by the task's pc.

Primitive operations can effect control transfers by modifying either or both their task's code pointer and program counter.

Free Lists

(top)
Free Lists

Each register contains a list of 13 free lists: one for each object size.

Small object size free-lists are heaps of pages that contain non-empty free-lists of objects of that size.

Large object size free-lists are heaps of unallocated large objects.

Page IDs

(top)

All heap-resident objects either are or are part of page-sized objects. The maximum size of a heap object is one (virtual) page (64Kbytes).

An FSB instance contains at least one page (one register object). It contains at most 216 pages.

Pages are "first accessed" by an FSB node in some definite order and, at that time, each new page acquires a reference count.

When first accessed, pages are assigned sequential numbers, starting from 1. These numbers are page ids.

If a page's reference count drops to 0, it's page id may be re-used for any later page. If page ids are re-used, they are re-used from lowest-numbered to highest-numbered. Thus, overall, implementations of the FSB machine make a reasonable effort to keep page ids densely packed, near 0.

Given a pointer to a page, or an object within a page, or an internal field -- the corresponding page id can be quickly and inexpensively computed.

Page ids 0..3 are special: they are never assigned to any actual page. (Strictly speaking, therefore, the FSB machine has a limit of 216 - 4 pages.)

Page Ids on 32-bit Systems
While not entirely portable, a natural implementation of page ids on 32-bit systems is to use the upper 16 bits of a pointer to a page:
C code

<Low-level Page Code>
from: <The Whole Program>

<Page ID Computations>
<Raw Allocation of Memory for Pages>
<Raw Page Re-use>
          


Code: Page IDs

C code

Porting Alert

The reference implementation assumes that pointers are 32 bits (or, more specifically, that all pointers to FSB pages are unique in bits 16..31). There are many "truly portable" ways to implement page IDs in C but this implementation will work well on almost all popular systems and has the virtue that it can be coded in but a few lines of code.


<Page ID Computations>
from: <Low-level Page Code>

/*
 * Note that fsbx__page_to_id works for a
 * pointer to *any* location within a page,
 * not just pointers to the start of pages.
 *
 * For example, fsbx__page_to_id can tell you
 * the page id of an object, given a pointer to
 * that object.
 */
static inline fsb_uint16_t
fsbx__page_to_id (fsb_quad_ptr_t page)
{
  return (fsb_uint16_t)(0xffff & ((unsigned long)page.fsb_quad >> FSB_PAGE_LOG2));
}

static inline fsb_quad_ptr_t
fsbx__id_to_page (fsb_uint16_t id)
{
  return (fsb_quad_ptr_t){ .fsb_quad = (fsb_quad_t *)((unsigned long)id << FSB_PAGE_LOG2) };
}
          


Code: Raw Page Allocation

The page id code works only if page pointers are guaranteed to be aligned on 2FSB_PAGE_LOG2 boundaries. We'd best make sure that this is true when we allocate pages:

C code

Porting Alert

The reference implementation assumes an environment that provides an environment that provides the posix_memalign function and takes advantage of the alignment guarantees of that function.

Here is how the lowest levels allocate a new page from the system:


<Raw Allocation of Memory for Pages>
from: <Low-level Page Code>

extern fsb_quad_ptr_t fsbx__alloc_raw_pagemem (void);
#ifdef FSB_LIB
fsb_quad_ptr_t
fsbx__alloc_raw_pagemem (void)
{
  while (1)
    {
      void * chunk = 0;
      fsb_quad_ptr_t answer;

      (void)posix_memalign (&chunk, FSB_SIZEOF_PAGE, FSB_SIZEOF_PAGE);

      if (!chunk)
        fsbx__abort ("page allocation failed");

      if ((unsigned long)chunk & ((1UL << FSB_PAGE_LOG2) - 1))
        fsbx__abort ("improperly aligned page allocated");

      memset (chunk, 0, FSB_SIZEOF_PAGE);

      answer = (fsb_quad_ptr_t) { .fsb_quad = (fsb_quad_t *)chunk };

      if (fsbx__page_to_id (answer) > 3)
        return answer;
    }
}
#endif
          


Code: Raw Page Reuse

C code

Porting Alert

A more complicated allocator might "release pages back to the underlying system" in different ways.


<Raw Page Re-use>
from: <Low-level Page Code>

extern void fsbx__free_raw_pagemem (fsb_quad_ptr_t);
#ifdef FSB_LIB
void
fsbx__free_raw_pagemem (fsb_quad_ptr_t page)
{
  free ((void *)page.fsb_quad);
}
#endif
          


Syms and Compilation

(top)

In each instruction cycle an FSB node examines the task selector to pick one of the 12 program counters and code pointers. It executes the primitive operation represented by the object stored in the tail of the code object, at the offset indicated by the program counter. This chapter presents the details of that process.

Primitives as Functions

The complete state of an FSB machine is, in essence, a directed graph of objects within page, rooted in the three registers. The FSB machine maintains a number of invariants that constrain the structure of this graph (e.g, reference counting constraints).

An FSB's state graph is, by definition, finite. For example, the system can contain at most 216 pages of memory, 64Kbyte each.

Therefore, we can usefully think of "all valid FSB machine states" as a set of graphs:

FSB_states := the set of all valid FSB state graphs

From a programmer's perspective, an FSB state graph is represented by a set of registers: three registers pointing to register objects plus a task selector. Mathematically, we can equivocate by defining them to be the same thing:

FSB_regiseter_set_states := FSB_states

Execution of FSB programs is, semantically, a sequential execution of atomic primitive operations. Mathematically, a primitive is simply a function, albeit, a function that (realistically) can fail if we try to compute it:

FSB_primitives := FSB_states → { FSB_states, ⊥ }
Read: The set of primitives is identical with the set of functions mapping FSB states to FSB states, or mapping an FSB state to a "failure state" ("⊥" aka "bottom").

The execution of a primitive has a simple mathematical model:

Executing a primitive:
statet1 := prim (statet0)
Read: the machine state at time t1 is the state returned by the primitive function when given the state at time t0 as an argument.

The sequential execution of two primitives is the same as executing a single primitive which is the composition of the two primitive functions:

Sequential execution is function composition:
statet1 := prima (statet0)
statet2 := primb (statet1)
statet2 = primb (prima (statet0))
statet2 = primb ∘ prima (statet0)

That function composition models sequential execution of primitives is a useful property:

Primitives as C Functions; GCC as Optimizing Assembler

Primitive operations are mathematically functions. It is useful to model them in C as C functions. This section describes using C functions as primitive operations, but does not present the C API actually used. Rather, a slightly simplified API is shown. A subsequent section fills in the details and gives the actual C API.

Consider a hypothetical primitive operation such as push_nil. It's job is very simple: It must increment the stack pointer (the new top of stack is guaranteed to be initialized to nil already). It must also increment the program counter. In pseudocode, it is easy to represent this as a C function that takes a set of registers as its parameter, and returns a new set of registers as its result:

fsb_registers_t
prim_push_nil (fsb_registers_t regs)
{
  SET_SP (regs, 1 + GET_SP (regs));
  SET_PC (regs, 1 + GET_PC (regs));
  return regs;
}
      

A code block is, in effect, an array of primitive operations. For example, a code block defining a subroutine to push two nils onto the stack might look like:

[ push_nil push_nil ]
      

Because of the compositional property of primitives, such a block is trivial to compile into C:

[ push_nil push_nil ]
⇒
fsb_registers_t
compiled_block (fsb_registers_t regs)
{
  return prim_push_nil (prim_push_nil (regs));
}
      

One unproven hope of such compilation is that the C compiler can perform useful optimizations, particularly if inlining is used. For example, can primitive operations be define inline:

static inline fsb_registers_t
prim_push_nil (fsb_registers_t regs)
{
  SET_SP (regs, 1 + GET_SP (regs));
  SET_PC (regs, 1 + GET_PC (regs));
  return regs;
}
      
such that compiled code like:
  return prim_push_nil (prim_push_nil (regs));
      
is, in effect, expanded and optimized by the C compiler to as follows?:
  SET_SP (regs, 2 + GET_SP (regs));
  SET_PC (regs, 2 + GET_PC (regs));
  return regs;
      

Nota bene: While the reference implementation within makes a half-hearted attempt to so use GCC's inline function facility and optimizations, nevertheless, no attempt has yet been made to examine the code actually being generated. The expectation is that the generated code is not yet especially good and that more work is required to fully put this trick into practice.

Note of interest: The FSB's restrictions on memory mutation, in theory, are constraints on the aliasing of memory locations that are subject to reads and writes by primitive operations. The trick will be to make those constraints usefully obvious to the C compiler.

Extension Primitives and Primitive Names

Ordinary primitive operations are implemented as compiled functions, callable from C. Primitives may be built-in to an FSB instance or dynamically loaded. The FSB machine facilitates the "unloading" of primitive functions which can no longer be referenced from any object.

Each FSB instance is limited as to the number of primitives that may be present at any one time: there may be at most 224 - 1. However, over the lifetime of an FSB instance, a larger number of primitive functions may be referenced (any number) with only the constraint that no more than 224 - 1 can be concurrently present. (This limitation is "reasonable" in relation to the FSB machine's overall address size limit and it is a helpful restriction when designing the representation of SMOBs.)

All primitive operations are assigned globally unique names. To facilitate the distributed allocation of these names, the FSB requires that all primitives be named within an XML namespace, designated by an arbitrary URI, reserving the URI method "fsb:" for special interpretations.

In addition to an XML namespace, each primitive also has a unique XML "QName" within that namespace. The combination of a primitive's XML namespace and qname should be globally unique.

Finally, each primitive also has an arbitrary "pretty print" string which is some white-space-free string of graphical, ascii characters, unique among primitives within the primitive's XML namespace.

For example, the full name of the "push-nil" primitive described above might be:
primitive name example
namespace uri fsb:experimental-std:/0
QName push-nil
pretty-print name '()

Primitive Operands: Primitives as Functors

The model of primitives developed above is simple, but in some ways too simple. Our first approximation of a C API permits only primitives that take no "arguments" other than the state of a machine.

Consider the programming challenge: "Push the 32-bit integer 5294 onto a stack." What sequence of primitives might accomplish this? One could imagine a specialized primitive:

fsb_registers_t
prim_push_5294 (fsb_registers_t regs)
{
  SET_SP (regs, 1 + GET_SP (regs));
  SET_STACK_ENTRY_INT32 (regs, 0, 5294);
  SET_PC (regs, 1 + GET_PC (regs));
  return regs
}
      
however it would be awkward to require such a primitive for every integer, every floating point number, every character, and so forth.

When a "call" to a primitive operation is encoded in a code block it is stored as a 64 bit SMOB. At most 224 - 1 primitives can be loaded, so the choice of which primitive is to be invoked can be encoded in as few as 24 bits of that 64 bit object. The type tag system is arranged such that, of the remaining 40 bits, the encoded call to primitive may include up to 32 additional bits of data, that data to be passed to the primitive as its operands

For example, in the presence of operands, we need only a single function to push a 32-bit integer (any 32-bit integer) on the stack:

fsb_registers_t
prim_push_i32 (fsb_int32_t n, fsb_registers_t regs)
{
  SET_SP (regs, 1 + GET_SP (regs));
  SET_STACK_ENTRY_INT32 (regs, 0, n);
  SET_PC (regs, 1 + GET_PC (regs));
  return regs
}
      

In a sense, our mathematical model is too simple. Rather than functions, primitives that accept an operand argument could be considered a kind of functor. On the other hand, one can regard the combination of a primitive with a particular operand value as, itself, our simpler kind of primitive-as-function. For example, we might still hope to see:

  return prim_push_i32 (100, prim_push_i32 (200, regs))
      
to be compiled as:
  SET_SP (regs, 2 + GET_SP (regs));
  SET_STACK_ENTRY_INT32 (regs, 1, 200);
  SET_STACK_ENTRY_INT32 (regs, 0, 100);
  SET_PC (regs, 2 + GET_PC (regs));
  return regs
      
rather than as:
  SET_SP (regs, 1 + GET_SP (regs));
  SET_STACK_ENTRY_INT32 (regs, 0, 200);
  SET_PC (regs, 1 + GET_PC (regs));
  SET_SP (regs, 1 + GET_SP (regs));
  SET_STACK_ENTRY_INT32 (regs, 0, 100);
  SET_PC (regs, 1 + GET_PC (regs));
  return regs
      

C Calling Conventions for Primitives

Operands to primitives must be one of but a few types:

Operand Types
fsb_int8_t
fsb_uint8_t
fsb_int16_t
fsb_uint16_t
fsb_int32_t
fsb_uint32_t
fsb_float_t

e.g., fsb_registers_t prim (fsb_int8, fsb_registers_t);

All machine numeric type that fit in 32 bits can be used as operand types.

fsb_quad_ptr_t

fsb_registers_t prim (fsb_quad_ptr_t, fsb_registers_t)

The pointer points at or into some referencable object in the state graph of the register passed as an argument. The primitive is not automatically granted a reference count to the object: if the primitive mutates the heap but needs the object pointer to remain valid, it must first add (and later remove, before returning) a reference count to the object.

The primitive may freely mutate this object itself if the object has a reference count of 1 (i.e., it is a linear object). Regardless of the reference count, the primitive may mutate (with page locking) any cell of the object.

fsb_pair_ptr_t

fsb_registers_t prim (fsb_pair_ptr_t, fsb_registers_t)

The pointer points into the code object from which the primitive was invoked. In particular, it points to a SMOB. This facility allows primitives to manage their operands as (up to) 32-bits of opaque data such as a pointer to a data structure from a "foreign" C library.

fsb_uint28_t

fsb_registers_t prim (fsb_uint28_t, fsb_registers_t)

As an optimization, a special 28-bit operand type is defined. This is because the tagging system includes a representation of primitives in which the primitive is represented in 32 bits rather than 24, but the operand in only 28 bits. That representation has an advantage on many underlying real machines: a 32-bit representation of a primitive is (usually) enough storage to contain a pointer directly to a C function that implements the pointer. Thus, a primitive represented with exactly 28 bits of operand can (often) be invoked more quickly than a primitive represented with as many as 32-bits of operand.

28 is a particularly useful number of bits for an operand to have, as the next section about an addressing mode demonstrates.

Three Address Primitives

A convenient class of primitives are the three address operands, each of which expects a 28 bit operand.

The 28 bits are interpreted as a choice of stack and, within that stack, an index identifying two source operands and one destination:

Three Address Ops Code

There are other useful ways to break down 28-bit parameters into adresses into register objects. These are illustrated in association with the definitions of corresponding primitives.

Object IDs

(top)

Each 64K page is subdivided into, at most, 3,840 objects. Thus, a 12-bit integer suffices to identify any object within a sub-divided page. That integer is called the object page index.

The combination of a 16-bit page id with a 12-bit object page index is sufficient to uniquely identify any heap object. This gives rise to a useful property:

Although the FSB machine has a 32-bit address space, pointers to heap objects can be compressed to 28-bits, by converting them to a page id and object page index.

Porting Alert

The reference implementation, as usual in the lowest-level code, assumes a 32-bit architecture. Additionally, it assumes that objects, being quad-aligned, are 4-bit aligned.

C code

<Object Page Indexes>
from: <The Whole Program>


static inline fsb_uint32_t
fsbx__object_id (fsb_quad_ptr_t obj)
{
  return (fsb_uint16_t)(0xfffffff & ((unsigned long)obj.fsb_quad >> FSB_QUAD_LOG2));
}

static inline fsb_uint16_t
fsbx__object_index (fsb_quad_ptr_t obj)
{
  return (fsb_uint16_t)(0xfff & fsbx__object_id (obj));
}

static inline fsb_uint16_t
fsbx__object_page_id (fsb_quad_ptr_t obj)
{
  return (fsb_uint16_t)(0xffff & (fsbx__object_id (obj) >> 12));
}

static inline fsb_quad_ptr_t
fsbx__page_and_index_to_obj (fsb_uint16_t page_id, fsb_uint16_t obj_index)
{
  fsb_quad_ptr_t pg = fsbx__id_to_page (page_id);
  return (fsb_quad_ptr_t){ .fsb_quad = pg.fsb_quad + obj_index };
}

static inline fsb_quad_ptr_t
fsbx__object_page (fsb_quad_ptr_t obj)
{
  return fsbx__page_and_index_to_obj (fsbx__object_page_id (obj), 0);
}

          


User Tags: Object Properties

(top)

Referencable heap objects 32-bytes and larger include a 28-bit user tag in their third word of memory.

User Tags and Object Properties

A user tag is either NULL (0), or a 28-bit object id (in C, an fsb_obj_id_t value.

If a user tag is not NULL, the object containing the tag holds a reference count on the object referred to. This referred to object is called the referring object's properties:

Sixteen-bit Sparse Arrays

(top)

The FSB has built-in support for sparse arrays indexed by sixteen-bit integers. These are particularly handy in implementing the FSB's relational features, which are explained in a later chapter. Here, we simply explain how these sparse arrays work.

The particular variety of sparse array that we use is a two-level tree. Keys can be located with at most 8 comparisons. While 8 may seem a rather high number, the tree is also dynamically re-organized so that more popular keys will tend to require fewer comparisons than less popular keys. Finally, the data structure rewards both locality of reference in key space and numerically-dense-near-an-origin keys. The internal uses of sparse arrays in the FSB take advantage of these performance attributes.

Here is a series of illustrations which helps to explain the 16-bit sparse array data structure:

Sixteen-bit Sparse Arrays

A sparse array object's tail is a series of quads, each quad a key-value pair. This object is the "cache" for the sparse array. All keys having some number of low-order bits in common map to the same set of key value cache entries in a hash table object. A key can be found or stored in any position from that set.

Each key-value pair contains a 16-bit key and 64-bit value, as well as data about a (possibly present) limb which is used as a kind of table-entry overflow bucket.

"Short limbs" contain (possibly sparse) key-value pairs. "Long limbs" contains 256 values for keys whose low-order 8-bits are all the same.

When a key is inserted, it is put into a cache entry (in the top-level object) if one is available. Failing that, if there is a limb that can hold the key and room in the limb, they key is inserted there. If there is a full limb, smaller in size than the region that points to it, the size of that limb is doubled and the key inserted. If there is a full limb the same size as its region, and the top-level object already contains the next larger size region, the limb is split in two and the attempt to insert the key is restarted. If there is no room to split a limb, then the size of the top level object is doubled, and the key is inserted into one of the newly created cache entries.

A Note About Optimization

The above description is deliberately vague about the order in which cache entries are filled and the exact manner of searching short limbs. It also omits any mention of rearranging values, such as by swapping a key found on a limb with a key found in a cache entry.

Multiple policies are possible in each of those areas, leading to a wide variety of performance characteristsics over a given set of inputs.

Object-id Sparse Arrays

(top)

URHERE

Four-bit Sparse Arrays of Objects

(top)

Page Index Look-asides

(top)

Objects as Dynamic Relations

(top)

Weak References

(top)

Object Allocation

(top)

Object Death

(top)

Object Probate and Finalizers

(top)

Object Recycling

(top)

Page Recovery

(top)

Page Reclamation

(top)

The Conservation Properties of the Object Population

(top)

Parent/Child Pointers

(top)

Object Queue Pointers

(top)

Sub-object Pointers

(top)

Cons Pairs

(top)

Graph-Tracing Memory Mangement

(top)

Fast First-generation Object Allocation

(top)

Copy-On-Write References

(top)

Protected Objects

(top)

Modes and Traps

(top)

Events

(top)

I/O

(top)

Lazy Evaluation

(top)

Implementation:

Code: Allocating Registers

C code

A Test Program

The first test program does nothing but allocate and initialize registers.


<Register Allocation Test Program>
from: <The Whole Program>

#ifdef FSB_REG_ALLOC_TEST
int
main (int argc, char * argv[])
{
  fsb_registers_t regs = fsbx__new_regs ();
  return 0;
}
#endif
          


Code: fsbx__new_regs

C code

Three new pages from the underlying system will do. Zero is as reasonable an initial value for the task selector as any. Some individual fields on the register pages have to be allocated, though:


<fsbx__new_regs>
from: <The Whole Program>

extern fsb_registers_t fsbx__new_regs (void);
#ifdef FSB_LIB
fsb_registers_t
fsbx__new_regs (void)
{
  fsb_registers_t regs;
  int x;
  regs.fsb_task_selector = 0;
  for (x = 0; x < FSB_N_REGISTERS; ++x)
    {
      regs.fsb_reg[x] = fsbx__alloc_raw_pagemem ();
      <Initialize a Register Page>
    }
  return regs;
}
#endif
          


Initializing Register Pages

C code

The three register pages are "sui generis" objects. We have to initialize the object header fields "by hand" as a result.


<Initialize a Register Page>
from: <fsbx__new_regs>

  fsbx__set_object_head_type (regs.fsb_reg[x], FSB_REG_HEAD_TYPE);
  fsbx__set_object_tail_type (regs.fsb_reg[x], FSB_REG_TAIL_TYPE);
  fsbx__set_object_refcnt (regs.fsb_reg[x], 1);
  fsbx__set_object_gc_bits (regs.fsb_reg[x], FSB_GC_REGISTER_PAGE);
  fsbx__set_object_size (regs.fsb_reg[x], FSB_SZ_PAGE);
          


Code: Growing Stacks

(top)
C code

There is only a single mechanism for increasing the depth of an FSB stack: pushing between 1 and 256 nil values. (To push a non-nil value, programs first increase the stack depth and then copy the desired value to the newly allocated location.)

Increasing the depth of the stack can cause an oveflow condition if there is not enough room in the register object to hold the new entries. In that case, some stack entries in the register object are moved to an overflow object, to make room in the register object:


<Growing Stacks>
from: <The Whole Program>

extern fsb_registers_t fsbx__lower_stack (fsb_registers_t regs, int r, int s);

static inline fsb_registers_t
fsbx__push (fsb_registers_t regs, int r, int s, int amt)
{
  fsb_uint16_t maybe_new_sp = amt + fsbx__sp (regs, r, s);
  if (maybe_new_sp < 768)
    {
      return fsbx__set_sp (regs, r, s, maybe_new_sp);
    }
  else
    {
      regs = fsbx__lower_stack (regs, r, s);
      return fsbx__set_sp (regs, r, s, amt + fsbx__sp (regs, r, s));
    }
}

<fsbx__lower_stack>
          


"Lowering the stack" means moving stack entries from the bottom of the stack area in the register object, to the top of the