CHAPTER 32


		    AUTOMATIC MEMORY MANAGEMENT FOR FIXED-SIZED BLOCKS



	  Early LISP and ICL implementations offered only fixed-sized automatic
	  memory management.  All blocks were of one size, a block big enough
	  to hold two addresses.  Fixed-sized memory management incurs the least
	  overhead of all automatic memory management methods.  Automatic
	  memory management is also known as ~garbage ~collection.

	  Because all blocks are the same size, one free list for all free
	  blocks is sufficient, and any new block can be allocated quickly, just
	  by picking always the first block off the free list.

	  The limitation of fixed block size becomes a deficit only when you want
	  to build structures that can be accessed randomly.
	  If you want a structure consisting of 100 elements, to
	  be accessed randomly, it is best to allocate one block of size 100.

	pic

	  In contrast, with fixed sized blocks, of size about two, it would take
	  a list structure consisting of 100 blocks where each block points to
	  the next, as in figure 32.1(b).  This list structure can be accessed
	  serially (from left to right) as well as can the 100-bin array.	 Random
	  access is expensive though.	 It takes K instructions to skip over the
	  first K blocks, so as to arrive at the K+1'st block's datum.

	  In the early implementations of ICL, only fixed-sized blocks were
	  allowed, and so random access could be expensive.  It was surprising
	  though how little random access was really needed.	The natural
	  tendency to use arrays and indices for computations are often
	  avoidable, leading to solutions just as clear and efficient on
	  most occasions.	 (Lists most often would suffice, and trees would
	  cover the few other cases).


32.1
Two Basic Kinds Of Blocks:  The P- And D- Blocks

	pic

	  ICL's fixed sized blocks are two bins, or eight bytes long.  They
	look like figure 32.2(a and b).  Both have zero as the high-order bit
	in the first bin.  The next bit over distinguishes between the two
	kinds of blocks.

	The D-type (~data) block can hold 32-bit data in the
	second bin, like an INT or a REAL.  The P-type (~pointer) block can
	hold no 32-bit data, but can hold a second pointer instead.

	Any pointer can be zero (NIL), to represent "pointing to no block".

	Each type of block has some spare bits in the first bin's high-order
	  byte.  Those are used, for example, to distinguish between ICL's
	three kinds of list blocks, a left-append, a right-append, and a
	concatenation block, as shown in figure 22.1.

		BOX:	What is the difference between P-blocks and D-blocks?
		BOX:
		BOX:	Why are D-blocks useful?


32.2
Marking

	The strategy of nearly all garbage collectors is as follows.  To
	discover which blocks are no longer in use, we instead first discover
	the opposite; which blocks are in fact ~in ~use.  Having identified
	all the blocks that are in use, all the ~other blocks are known to be
	unused, and hence can be put onto the free list (for later use).

	pic

	How do we discover which blocks are in use?  First, we agree that a
	block is ~in ~use if the user program could ~possibly arrive at that
	block.  Figure 32.3(a) shows a program variable, say P, of type POINT.
	It points off to a block.  The program can clearly get to that block,
	just by writing ~P.X.  The second block can also be reached from P,
	by writing ~P.Y.

	Starting from the variable P, we can arrive at those two blocks just
	by ~following ~pointers.  Therefore, those blocks are ~not garbage,
	and should not be reused.

	Figure 32.3(b) shows a more complex situation.  Starting from P, we can
	say that all blocks visited by taking all possible paths ~are ~in ~use.
	The block A is in use, whereas the block B is not in use.  You can't
	  get to block B from the variable P.


32.2.1
MARK Procedure

	  We discover which blocks are reachable from a given variable by
	  definining a recursive procedure called MARK.	 MARK walks all possible
	  paths starting from the variable:

		    define	MARK( B: BLOCK ):

			  IF	B is zero  THEN  do nothing

			  ELSE

				MARK( the block referenced by B's first bin );

			IF  B's "P/D" bit is set (meaning that B's second
						    bin is a pointer)
			THEN
				MARK( the block referenced by B's second bin );
				FI
			  FI
		    enddefn

	  MARK calls itself recursively with the pointer(s) present in the given
	  block B.	For safety, since any pointer can be NIL (zero), MARK reacts
	  only if the given pointer B is non-zero.  Figure 32.3(c) shows the
	  order in which the blocks are encountered by MARK, starting from the
	  variable P.

	  While MARK visits all accessible blocks, it presently leaves behind
	  no information regarding which blocks it saw.	 We would like MARK
	  to set a bit in each block visited.  This way, given any block, we can
	  determine if it is ~in ~use simply by looking at this bit.  This bit
	  is called the ~mark bit (figure 32.2(c)).  Blocks visited by MARK will
	  be found to have the mark bit set, whereas all other blocks not reached
	  by MARK will have zero in that bit.

	  Here is the updated rendition of MARK that sets the mark bit:

		    define	MARK( B: BLOCK ):

			  IF	B is zero  THEN  do nothing

			  ELSE
				IF  B's ~mark ~bit is already set  THEN  do nothing

			ELSE
			    Set B's ~mark ~bit

				    " As before ... "

				    MARK( the block referenced by B's first bin );

			    IF  B's "P/D" bit is set	THEN

					  MARK( the block referenced by B's second bin );
			    FI
			FI
		    FI
		enddefn


	We process each block only once, when we turn on the mark bit.  Second
	encounters with the same block do nothing.  Thus, the cost of MARKing
	is proportional to the number of blocks encountered.

	In figure 32.3(c), all the blocks with numbers wind up marked, whereas
	all the other blocks are not marked (and hence could be rendered as
	~free).

		BOX:	What does the procedure MARK do?
		BOX:
		BOX:	How do you know if a given block was reached by MARK?
		BOX:
		BOX:	The cost of marking all blocks in use is proportional
		BOX:	to what?
		BOX:
		BOX:	Now that we've marked all blocks in use, how might we
		    BOX:	gather all the free blocks?


32.3
The Roots - Where All The Variables Live

	  We've so far considered MARKing initiated from only ~one program
	variable.  We must do this for ~all program variables.

	We therefore call MARK once for each program variable, so that all
	blocks referenced from any variable become marked.  In figure 32.3(d),
	we see a second variable Q.  It points off to the block labelled B.
	Since we MARK from each variable, that B block will become marked.  The
	block labelled C, and the one that points to it, remain unmarked, since
	neither of our two variable P and Q can get to them.  (It is possible
	that a third variable might reference the C block.  If such is the
	case, then the C block will wind up marked).

	pic

	The garbage collector has to get a hold of all program variables.
	Where are all the program's variables?  ~Global variables may reside
	  in one large block which is permanently allocated.	(That block is not
	  part of the universe of fixed-sized blocks).	Figure 32.4(a) shows the
	  global variables.

	  Local variables reside on the "stack" (Chapter 18).	 We must take a
	  perhaps surprising step.  We must replace the "stack" by ~two stacks,
	  called the D-stack and the P-stack.  These two stacks grow and shrink
	  together, as though they were one stack.  However, all local variables
	  that hold pointers reside on the P-stack (the ~pointer stack), whereas
	  all other, non-pointer local variables reside on the D-stack (the
	  ~data stack).

	  In any strongly typed language, like ICL, each variable has an
	  associated type.  Thus, we always know on a ~per variable basis
	  whether the variable's type is represented by a pointer or by a non-
	pointer.  Thus, it is easy for the compiler to allocate each
	local variable on the appropriate stack.

	The P-stack (for locals) and the block of global pointer variables
	are called the ~roots.  All accessible blocks are reachable from at
	least one of those variables.  Blocks inaccessible from any of
	the roots are indeed truly not reachable from the program's point of
	  view.  No matter how the program executes, its variables can never
	  be set to reference these inaccessible blocks.

		    BOX:	What are the roots?
		    BOX:
		    BOX:	Why might two distinct stack be required, in order
		    BOX:	to implement garbage collection?
		    BOX:
		    BOX:	In a strongly typed language, like ICL, why can you
		    BOX:	always tell in which of the two stacks a given
		    BOX:	variable should reside?


32.4
The Sweep

	  Having MARKed all blocks reachable from any variable, all the unmarked
	  blocks are now known to be unreachable.	 We form a free list
	  consisting of precisely the unmarked blocks.

	  Let's step back for a moment.  The way the overall system works is
	as follows:

	   1)	Upon each request for a new block (during a program's
		    execution), pick the block off the free list

	     2)   However, if the free list is empty (zero), then call the
		    garbage collector, to make up a new free list:

			 The garbage collector does:

				a)	  Mark all blocks reachable from the roots

				b)	  Make the new free list consisting exclusively
					  of unmarked blocks.

			  Upon return from the garbage collector, the new block is
			  taken from the (new) free list.

	  The process of collecting up all unmarked blocks to form a new free
	  list is called the ~sweep.

	pic

	  The sweep occurs by examining each and every block in the fixed-block
	  space.  All blocks reside in a contiguous, giant block of memory, as
	  shown in figure 32.5(a).  Start from one end of the giant block,
	  and examine each (2-bin) block in sequence, until arriving at the other
	  end of the giant block.  For each block encountered this way, do the
	  following:

	     1)   If the block is marked, turn off the mark bit and go to the
		    next block.
	  or
	     2)   If the block is unmarked, put this block onto the free list.
		    (Go on to the next block).

	  Figure 32.5(b) shows the 2-bin blocks, some marked and others not.
	  Figure 32.5(c) shows the free list built up, consisting only of
	  unmarked blocks.

	  An unmarked block is put onto the free list by:

		    the block's first bin := FREE_LIST ;

		FREE_LIST := the address of this block ;

	The last block on the free list will have zero in its first bin,
	indicating that there are no more blocks on the free list.  (This
	zero occurs naturally, as the value in FREE_LIST is initially zero.
	We got into the garbage collector because of that zero condition).


32.5
Summary

	The overall block allocation goes as follows:

		IF  FREE_LIST is empty (zero)  THEN  call the garbage
							      collector  FI

		ANSWER:= FREE_LIST;
		FREE_LIST:= the contents of the first bin in the block
			    referenced by FREE_LIST ;


	The garbage collector goes as follows:

		"Marking..."

		for each program variable
		DO
			MARK( the block referenced by that variable ) ;
		END

		"The Sweep..."

		From the first block to the last block in the giant block
		containing all fixed-sized blocks
		DO
			IF  the block is marked  THEN  turn off the mark bit
			ELSE
			    "Put the unmarked block onto the freelist..."

			    the block's first bin := FREE_LIST ;

				    FREE_LIST := the address of this block
				FI
		    END

	  Notice that the sweep has the side effect of turning off the mark bits
	  in all blocks.	This resets all blocks so as to be ready for the next
	  garbage collection, when reachable blocks' mark bits will once again
	be turned on.

		BOX:	What is the sweep?
		BOX:
		BOX:	Does the sweep occur before or after marking?
		BOX:
		BOX:	What would happen to a block if you forgot to mark it?
		BOX:
		BOX:	How do you know when to call the garbage collector?


32.6
The Cost

	The main computation in the garbage collector is the MARKing of
	reachable blocks.  The second phase, the sweep, is relatively
	inexpensive.  Thus, it is the number of blocks ~in ~use that dominates
	the cost of this garbage collection (GC).

	What is the ~cost ~per ~block?  Let:

		USED	be the number of blocks in use, and
		FREE	be the number of blocks ~not in use.

	We can deduce the following:

	   1)	The cost of this garbage collection is roughly proportional to
		USED

		(That's how many blocks MARK encounters).

	     2)   This one garbage collection delivers FREE free blocks.

	  The cost for each block is thus:

		    USED / FREE


	pic

	  By having FREE be large relative to USED, we get a low cost per block.
	  Figure 32.6 shows the situation when FREE is relatively large and when
	  it is small.  In the latter figure, we have to pay an amount
	  proportional to USED (a big expense), merely to get FREE (a small)
	  number of blocks.

	  It is therefore most efficient to use as much (real) memory as possible
	  for the space of fixed-sized blocks.  For the same amount in use, the
	  cost per block goes as:

		    1 / FREE


	  An analogy can be drawn with the act of breathing.	With large FREE,
	  we need to garbage collect less often because each garbage collection
	  delivers more blocks.	 This is like taking fewer, deeper breaths.  With
	  a smaller number for FREE, we garbage collect more often and receive
	  fewer free blocks from each GC, like taking more, shallower breaths.

	  In practice with ICL on micro-VAX IIIs (3 Mips), we've observed garbage
	collections lasting 5 seconds.  (There may be minutes between
	garbage collections).  Typically 250000 blocks are in use and
	250000 are freed by each GC.  We've observed an application whose CPU
	  time was 27 minutes, using this ~half ~million block space.  With ~one
	  ~million blocks, the CPU time dropped to 19 minutes.  The "1/FREE" is
	  thus exposed clearly.	 However, enlarging this to two million blocks
	  made little further difference.

	  This garbage collector works well only with real memory.	For
	  applications requiring significantly more memory, the disk must be
	  used.  We will show two methods for handling large databases.


32.6.1
Using The Disk - Part I

	  The first way to use the disk is the most efficient, although it
	  requires a little of the programmer's attention.  The programmer must
	specify where ~fractures may occur, allowing parts of structures to be
	swapped out to the disk.  This is often effortless in ICL, with the
	help of coercions.

	Experience with VLSI integrated circuit applications in ICL shows that
	hundreds of designs, with some or all containing millions of figures
	can be handled efficiently this way.  This is done without incurring
	large costs in garbage collections.  With this method, the 5 or 10
	second garbage collection is a fact ~no ~matter ~how ~large ~the
	~database ~gets!  (See Chapter 33).


32.6.2
Using The Disk - Part II

	The other method for using the disk indeed makes extensive use of
	virtual memory, and has strategies for putting off the first major
	garbage collection.  Long garbage collection times are tolerated,
	although it is hoped that the application will finish soon enough to
	avoid the ~big ~one.

	Unlike the first method, whose cost is independent of database size,
	this method incurs a cost that is linear with the size of the
	database.  The advantage of this method is the relief of the
	programmer from having any awareness of the disk.  (See Chapter 36).