CHAPTER 32
AUTOMATIC MEMORY MANAGEMENT FOR FIXED-SIZED BLOCKS
Early LISP and ICL implementations offered only fixed-sized automatic
memory management. All blocks were of one size, a block big enough
to hold two addresses. Fixed-sized memory management incurs the least
overhead of all automatic memory management methods. Automatic
memory management is also known as ~garbage ~collection.
Because all blocks are the same size, one free list for all free
blocks is sufficient, and any new block can be allocated quickly, just
by picking always the first block off the free list.
The limitation of fixed block size becomes a deficit only when you want
to build structures that can be accessed randomly.
If you want a structure consisting of 100 elements, to
be accessed randomly, it is best to allocate one block of size 100.
In contrast, with fixed sized blocks, of size about two, it would take
a list structure consisting of 100 blocks where each block points to
the next, as in figure 32.1(b). This list structure can be accessed
serially (from left to right) as well as can the 100-bin array. Random
access is expensive though. It takes K instructions to skip over the
first K blocks, so as to arrive at the K+1'st block's datum.
In the early implementations of ICL, only fixed-sized blocks were
allowed, and so random access could be expensive. It was surprising
though how little random access was really needed. The natural
tendency to use arrays and indices for computations are often
avoidable, leading to solutions just as clear and efficient on
most occasions. (Lists most often would suffice, and trees would
cover the few other cases).
32.1
Two Basic Kinds Of Blocks: The P- And D- Blocks
ICL's fixed sized blocks are two bins, or eight bytes long. They
look like figure 32.2(a and b). Both have zero as the high-order bit
in the first bin. The next bit over distinguishes between the two
kinds of blocks.
The D-type (~data) block can hold 32-bit data in the
second bin, like an INT or a REAL. The P-type (~pointer) block can
hold no 32-bit data, but can hold a second pointer instead.
Any pointer can be zero (NIL), to represent "pointing to no block".
Each type of block has some spare bits in the first bin's high-order
byte. Those are used, for example, to distinguish between ICL's
three kinds of list blocks, a left-append, a right-append, and a
concatenation block, as shown in figure 22.1.
BOX: What is the difference between P-blocks and D-blocks?
BOX:
BOX: Why are D-blocks useful?
32.2
Marking
The strategy of nearly all garbage collectors is as follows. To
discover which blocks are no longer in use, we instead first discover
the opposite; which blocks are in fact ~in ~use. Having identified
all the blocks that are in use, all the ~other blocks are known to be
unused, and hence can be put onto the free list (for later use).
How do we discover which blocks are in use? First, we agree that a
block is ~in ~use if the user program could ~possibly arrive at that
block. Figure 32.3(a) shows a program variable, say P, of type POINT.
It points off to a block. The program can clearly get to that block,
just by writing ~P.X. The second block can also be reached from P,
by writing ~P.Y.
Starting from the variable P, we can arrive at those two blocks just
by ~following ~pointers. Therefore, those blocks are ~not garbage,
and should not be reused.
Figure 32.3(b) shows a more complex situation. Starting from P, we can
say that all blocks visited by taking all possible paths ~are ~in ~use.
The block A is in use, whereas the block B is not in use. You can't
get to block B from the variable P.
32.2.1
MARK Procedure
We discover which blocks are reachable from a given variable by
definining a recursive procedure called MARK. MARK walks all possible
paths starting from the variable:
define MARK( B: BLOCK ):
IF B is zero THEN do nothing
ELSE
MARK( the block referenced by B's first bin );
IF B's "P/D" bit is set (meaning that B's second
bin is a pointer)
THEN
MARK( the block referenced by B's second bin );
FI
FI
enddefn
MARK calls itself recursively with the pointer(s) present in the given
block B. For safety, since any pointer can be NIL (zero), MARK reacts
only if the given pointer B is non-zero. Figure 32.3(c) shows the
order in which the blocks are encountered by MARK, starting from the
variable P.
While MARK visits all accessible blocks, it presently leaves behind
no information regarding which blocks it saw. We would like MARK
to set a bit in each block visited. This way, given any block, we can
determine if it is ~in ~use simply by looking at this bit. This bit
is called the ~mark bit (figure 32.2(c)). Blocks visited by MARK will
be found to have the mark bit set, whereas all other blocks not reached
by MARK will have zero in that bit.
Here is the updated rendition of MARK that sets the mark bit:
define MARK( B: BLOCK ):
IF B is zero THEN do nothing
ELSE
IF B's ~mark ~bit is already set THEN do nothing
ELSE
Set B's ~mark ~bit
" As before ... "
MARK( the block referenced by B's first bin );
IF B's "P/D" bit is set THEN
MARK( the block referenced by B's second bin );
FI
FI
FI
enddefn
We process each block only once, when we turn on the mark bit. Second
encounters with the same block do nothing. Thus, the cost of MARKing
is proportional to the number of blocks encountered.
In figure 32.3(c), all the blocks with numbers wind up marked, whereas
all the other blocks are not marked (and hence could be rendered as
~free).
BOX: What does the procedure MARK do?
BOX:
BOX: How do you know if a given block was reached by MARK?
BOX:
BOX: The cost of marking all blocks in use is proportional
BOX: to what?
BOX:
BOX: Now that we've marked all blocks in use, how might we
BOX: gather all the free blocks?
32.3
The Roots - Where All The Variables Live
We've so far considered MARKing initiated from only ~one program
variable. We must do this for ~all program variables.
We therefore call MARK once for each program variable, so that all
blocks referenced from any variable become marked. In figure 32.3(d),
we see a second variable Q. It points off to the block labelled B.
Since we MARK from each variable, that B block will become marked. The
block labelled C, and the one that points to it, remain unmarked, since
neither of our two variable P and Q can get to them. (It is possible
that a third variable might reference the C block. If such is the
case, then the C block will wind up marked).
The garbage collector has to get a hold of all program variables.
Where are all the program's variables? ~Global variables may reside
in one large block which is permanently allocated. (That block is not
part of the universe of fixed-sized blocks). Figure 32.4(a) shows the
global variables.
Local variables reside on the "stack" (Chapter 18). We must take a
perhaps surprising step. We must replace the "stack" by ~two stacks,
called the D-stack and the P-stack. These two stacks grow and shrink
together, as though they were one stack. However, all local variables
that hold pointers reside on the P-stack (the ~pointer stack), whereas
all other, non-pointer local variables reside on the D-stack (the
~data stack).
In any strongly typed language, like ICL, each variable has an
associated type. Thus, we always know on a ~per variable basis
whether the variable's type is represented by a pointer or by a non-
pointer. Thus, it is easy for the compiler to allocate each
local variable on the appropriate stack.
The P-stack (for locals) and the block of global pointer variables
are called the ~roots. All accessible blocks are reachable from at
least one of those variables. Blocks inaccessible from any of
the roots are indeed truly not reachable from the program's point of
view. No matter how the program executes, its variables can never
be set to reference these inaccessible blocks.
BOX: What are the roots?
BOX:
BOX: Why might two distinct stack be required, in order
BOX: to implement garbage collection?
BOX:
BOX: In a strongly typed language, like ICL, why can you
BOX: always tell in which of the two stacks a given
BOX: variable should reside?
32.4
The Sweep
Having MARKed all blocks reachable from any variable, all the unmarked
blocks are now known to be unreachable. We form a free list
consisting of precisely the unmarked blocks.
Let's step back for a moment. The way the overall system works is
as follows:
1) Upon each request for a new block (during a program's
execution), pick the block off the free list
2) However, if the free list is empty (zero), then call the
garbage collector, to make up a new free list:
The garbage collector does:
a) Mark all blocks reachable from the roots
b) Make the new free list consisting exclusively
of unmarked blocks.
Upon return from the garbage collector, the new block is
taken from the (new) free list.
The process of collecting up all unmarked blocks to form a new free
list is called the ~sweep.
The sweep occurs by examining each and every block in the fixed-block
space. All blocks reside in a contiguous, giant block of memory, as
shown in figure 32.5(a). Start from one end of the giant block,
and examine each (2-bin) block in sequence, until arriving at the other
end of the giant block. For each block encountered this way, do the
following:
1) If the block is marked, turn off the mark bit and go to the
next block.
or
2) If the block is unmarked, put this block onto the free list.
(Go on to the next block).
Figure 32.5(b) shows the 2-bin blocks, some marked and others not.
Figure 32.5(c) shows the free list built up, consisting only of
unmarked blocks.
An unmarked block is put onto the free list by:
the block's first bin := FREE_LIST ;
FREE_LIST := the address of this block ;
The last block on the free list will have zero in its first bin,
indicating that there are no more blocks on the free list. (This
zero occurs naturally, as the value in FREE_LIST is initially zero.
We got into the garbage collector because of that zero condition).
32.5
Summary
The overall block allocation goes as follows:
IF FREE_LIST is empty (zero) THEN call the garbage
collector FI
ANSWER:= FREE_LIST;
FREE_LIST:= the contents of the first bin in the block
referenced by FREE_LIST ;
The garbage collector goes as follows:
"Marking..."
for each program variable
DO
MARK( the block referenced by that variable ) ;
END
"The Sweep..."
From the first block to the last block in the giant block
containing all fixed-sized blocks
DO
IF the block is marked THEN turn off the mark bit
ELSE
"Put the unmarked block onto the freelist..."
the block's first bin := FREE_LIST ;
FREE_LIST := the address of this block
FI
END
Notice that the sweep has the side effect of turning off the mark bits
in all blocks. This resets all blocks so as to be ready for the next
garbage collection, when reachable blocks' mark bits will once again
be turned on.
BOX: What is the sweep?
BOX:
BOX: Does the sweep occur before or after marking?
BOX:
BOX: What would happen to a block if you forgot to mark it?
BOX:
BOX: How do you know when to call the garbage collector?
32.6
The Cost
The main computation in the garbage collector is the MARKing of
reachable blocks. The second phase, the sweep, is relatively
inexpensive. Thus, it is the number of blocks ~in ~use that dominates
the cost of this garbage collection (GC).
What is the ~cost ~per ~block? Let:
USED be the number of blocks in use, and
FREE be the number of blocks ~not in use.
We can deduce the following:
1) The cost of this garbage collection is roughly proportional to
USED
(That's how many blocks MARK encounters).
2) This one garbage collection delivers FREE free blocks.
The cost for each block is thus:
USED / FREE
By having FREE be large relative to USED, we get a low cost per block.
Figure 32.6 shows the situation when FREE is relatively large and when
it is small. In the latter figure, we have to pay an amount
proportional to USED (a big expense), merely to get FREE (a small)
number of blocks.
It is therefore most efficient to use as much (real) memory as possible
for the space of fixed-sized blocks. For the same amount in use, the
cost per block goes as:
1 / FREE
An analogy can be drawn with the act of breathing. With large FREE,
we need to garbage collect less often because each garbage collection
delivers more blocks. This is like taking fewer, deeper breaths. With
a smaller number for FREE, we garbage collect more often and receive
fewer free blocks from each GC, like taking more, shallower breaths.
In practice with ICL on micro-VAX IIIs (3 Mips), we've observed garbage
collections lasting 5 seconds. (There may be minutes between
garbage collections). Typically 250000 blocks are in use and
250000 are freed by each GC. We've observed an application whose CPU
time was 27 minutes, using this ~half ~million block space. With ~one
~million blocks, the CPU time dropped to 19 minutes. The "1/FREE" is
thus exposed clearly. However, enlarging this to two million blocks
made little further difference.
This garbage collector works well only with real memory. For
applications requiring significantly more memory, the disk must be
used. We will show two methods for handling large databases.
32.6.1
Using The Disk - Part I
The first way to use the disk is the most efficient, although it
requires a little of the programmer's attention. The programmer must
specify where ~fractures may occur, allowing parts of structures to be
swapped out to the disk. This is often effortless in ICL, with the
help of coercions.
Experience with VLSI integrated circuit applications in ICL shows that
hundreds of designs, with some or all containing millions of figures
can be handled efficiently this way. This is done without incurring
large costs in garbage collections. With this method, the 5 or 10
second garbage collection is a fact ~no ~matter ~how ~large ~the
~database ~gets! (See Chapter 33).
32.6.2
Using The Disk - Part II
The other method for using the disk indeed makes extensive use of
virtual memory, and has strategies for putting off the first major
garbage collection. Long garbage collection times are tolerated,
although it is hoped that the application will finish soon enough to
avoid the ~big ~one.
Unlike the first method, whose cost is independent of database size,
this method incurs a cost that is linear with the size of the
database. The advantage of this method is the relief of the
programmer from having any awareness of the disk. (See Chapter 36).