CHAPTER 12
A GENERAL PARSER
The presentation of our parser has two parts. The first part, this
chapter and Chapter 14, covers the essentials of the parser, including
the use of general rewrite rules, and the polynomial upper bound for
context-free grammars. The upper bound is a function of the length
of the input string. (Chapter 13, in the middle, provides a proof of
correctness.)
The second part, Section 16.2, optimizes the parser so as to be
efficient for grammars having many rules. Sometimes, our parser will
be used with computer-generated grammars, which can have many rules
indeed. This latter part does not improve the theoretical upper
bounds, but it does yield faster execution times in practice.
The parser presented in this chapter alone is similar to an earlier
one by Thompson[], which was itself based on one by Kay[]. We modify
it in Chapter 14 so that it always terminates and always runs in
polynomial time as a function of the length of the input string (for
context-free grammars).
Thompson has successfully used it to implement (formalizations of)
natural languages, including English and Japanese [].
We begin by lifting some datatypes from Chapter 9 and refitting them
for convenient parsing.
12.1
The Types PHRASE and CHOICE_OF_PHRASES
Let's redeclare the types PHRASE and CHOICE_OF_PHRASES introduced
in chapter 9:
TYPE PHRASE = [ LEFT: CHOICE_OF_PHRASES POS: INT ] ;
CHOICE_OF_PHRASES = { PHRASE } ;
Relative to Chapter 9, we replace the field formerly called CHAR
with the POS (Part-Of-Speech) field, of type INT.
Any part-of-speech, including characters on your terminal, is
represented by a unique integer. We will write things like "<DIGIT>"
to represent the unique integer chosen to represent the part-of-speech
<DIGIT>. The POS field thus holds nonterminal parts-of-speech as
well as terminals.
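For readers who want to experiment, the two declarations above might be sketched in Python roughly as follows. One assumption runs throughout these sketches: parts-of-speech are modeled as plain strings such as "<DIGIT>", purely for readability, rather than the unique INTs the text describes.

```python
from collections import namedtuple

# PHRASE has a LEFT field (a CHOICE_OF_PHRASES: the collection of PHRASE
# blocks to our immediate left, or None at the far left) and a POS field
# (the part-of-speech, terminal or nonterminal).
Phrase = namedtuple("Phrase", ["left", "pos"])

# A CHOICE_OF_PHRASES is just a set of PHRASE blocks; a tuple is used
# here so that existing data stay immutable, matching the text's
# subjective (non-destructive) style.
digit = Phrase(left=None, pos="<DIGIT>")
choice = (digit,)
```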
We also differ from Chapter 9 in that we make "the rest of the phrase",
next to the POS, appear to the ~left instead of to the right. There,
where phrases point rightward, ~new blocks are created going from
right to left.
However, we want to take user input from left-to-right. ~Newer blocks
are going to be created to the right. They now reference older blocks
to the left, via the LEFT field.
Figure 12.1 shows a CHOICE_OF_PHRASES together with each of its
constituent PHRASEs. Each PHRASE's LEFT field points to the same
place, to another CHOICE_OF_PHRASES. The leftmost part of figure 12.2
also shows the same CHOICE_OF_PHRASES. Further to the right, we see
another, similar looking CHOICE_OF_PHRASES. That rightmost
CHOICE_OF_PHRASES (omitting the bottom <FORM>) looks identical to
figure 12.1, and here we see what all the LEFT fields point to, the
middle CHOICE_OF_PHRASES.
BOX: Why do we have PHRASE point leftward?
BOX:
12.1.1
Examples Of CHOICE_OF_PHRASES
Figure 12.1 was created when the user typed a 1. The "numbers" and
"formulas" grammars (Chapter 1) were active. The rule
1 -> <DIGIT>
created the <DIGIT> phrase. The rule
<DIGIT> -> <NUMBER>
produced the <NUMBER> phrase. The <ATOM>, <TERM>, and <FORM> phrases
were generated by the rules:
<NUMBER> -> <ATOM>
<ATOM> -> <TERM>
<TERM> -> <FORM>
The leftmost part of figure 12.2, like figure 12.1, shows this. The
middle CHOICE_OF_PHRASES in figure 12.2 is created when the user types
the next character, "+". (The rightmost CHOICE_OF_PHRASES does not
exist yet).
Finally, the user types a 2, and the rightmost CHOICE_OF_PHRASES
comes into existence. Except for the very bottom <FORM> phrase, this
CHOICE_OF_PHRASES, triggered by the 2, looks identical to the leftmost
CHOICE_OF_PHRASES, and was created by the same rule applications
(except using
2 -> <DIGIT>
instead of
1 -> <DIGIT> ).
The bottom <FORM> phrase was generated by the rule:
<FORM> + <TERM> -> <FORM>
This odd block has a different LEFT field than the other PHRASE blocks.
This LEFT points to the same place that the leftmost CHOICE_OF_PHRASES
points to, to the left of the "1".
When a rule applies, it contributes its give-phrase in such a way as
to share the same left and right neighbors as the matched occurrence
of its want-phrase. This is what it means to rewrite one phrase
to another, leaving unchanged everything to the left and right.
The want-phrase and the give-phrase share ~right neighbors by
appearing in the same CHOICE_OF_PHRASES. They share ~left neighbors
by sharing the same CHOICE_OF_PHRASES to the left (via the leftmost
PHRASE block's LEFT field).
12.2
What Is A ~Visible ~Phrase?
Start at the rightmost CHOICE_OF_PHRASES in figure 12.2.
You can travel leftward. Upon each CHOICE_OF_PHRASES, you choose
~one of the PHRASE blocks therein and continue leftward from it.
A visible phrase is any leftward going sequence of PHRASE blocks that
you encounter on such a leftward trip.
More precisely, a visible phrase is a leftward going sequence of
PHRASE blocks, where PHRASE(k) must be a left neighbor of
PHRASE(k-1):
PHRASE(k) is a member of the CHOICE_OF_PHRASES PHRASE(k-1).LEFT
Figure 12.3 shows such a sequence. Many visible phrases
are represented in a CHOICE_OF_PHRASES and in a PHRASE.
----------------(parentheses in previous paragraph mean subscripting)----
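Under the modeling assumptions of the earlier sketch (phrases as `Phrase(left, pos)` records, a CHOICE_OF_PHRASES as a tuple), this definition can be turned into a small enumerator: a hypothetical `visible_phrases` that walks LEFT links and yields each leftward-going sequence, leftmost POS first.

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])  # left: tuple of Phrases or None

def visible_phrases(choice):
    """Yield every visible phrase emanating from `choice`, as a list of
    POS names in left-to-right order."""
    for p in choice or ():
        yield [p.pos]                  # the one-block phrase ending at p
        for prefix in visible_phrases(p.left):
            yield prefix + [p.pos]     # every leftward extension of it

# A two-block chain: "1" with "+" to its right.
c1 = (Phrase(None, "1"),)
c2 = (Phrase(c1, "+"),)
```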
We will be intensely interested in the ~rightmost PHRASE block of any
visible phrase. The rightmost block is the most interesting because it
is the ~last of the PHRASE blocks created. A PHRASE block always
points leftward to an older block.
The first moment that a visible phrase comes to exist is when its
rightmost PHRASE block is created. We will apply a rule of grammar
upon a visible phrase only when its rightmost block comes to exist.
We say that a visible phrase ~emanates from a PHRASE block if that
block is the visible phrase's rightmost PHRASE block. We also say
that a visible phrase emanates from a CHOICE_OF_PHRASES, when the
CHOICE_OF_PHRASES contains the rightmost PHRASE block.
Our parsing strategy is to:
1) Represent the given string as a visible phrase.
2) Apply all rules of grammar so that the results of all possible
rewrites also appear as visible phrases.
3) If the given string is a valid member of the grammar's
language, then among all those visible phrases will be an
occurrence of the grammar's root part-of-speech, spanning all
the way across the input string.
The parsing is done when that full-spanning root part-of-speech
appears.
BOX: What is the definition of a ~visible ~phrase?
BOX:
BOX: Will the original input string be a visible phrase?
BOX:
BOX: For legal input, what else will be a visible phrase?
12.2.1
Transition From Syntax To Semantics
In Chapter 1, we said that the end of syntax analysis and the beginning
of semantic processing occurred when a rule like the following is
applied:
something -> a program
For us, that rule would be:
<ROOT_POS> -> a program
This rule would trigger at least when the full-spanning occurrence of
<ROOT_POS> appears.
We should really write this rule as:
<START> <ROOT_POS> <FINISH> -> the program
and surround the given input string with <START>...<FINISH>. (<START>
and <FINISH> should be entirely new parts-of-speech, appearing in no
other rule).
This will limit the invocation of this program, the
commencement of semantics, to apply only for occurrences of the root
part-of-speech that span the entirety of the given input string. This
guards against other conceivable occurrences of the root part-of-speech,
which might occur over substrings of the given string.
The program will execute as soon as the phrase:
<START> <ROOT_POS> <FINISH>
appears as a visible phrase.
12.2.2
The Sequence Of Rewrites
Imagine the input string, and then imagine the next string derived
by applying one rewrite. You can imagine a sequence of full-spanning
strings, each one derived from the previous string by applying one
rewrite, somewhere. Such a sequence can end with a full-spanning
occurrence of the root part-of-speech, if the original string is a
member of the grammar's language.
Each string in that sequence will in fact be a full-spanning visible
phrase in the finished, rightmost CHOICE_OF_PHRASES.
12.2.3
More Growth
Figure 12.2 shows all the CHOICE_OF_PHRASES that exist after the
user types the "1+2". Figure 12.4 shows the situation after the user
types a "*" and "3", beyond the original "1+2". The middle
CHOICE_OF_PHRASES and everything to its left is identical to figure
12.2. Remember, we always apply ~subjective ~modification, and so
no existing data ever change.
The "1+2" in figure 12.2 is merely grown upon. All possible
interpretations of the "1+2" phrase already exist in figure 12.2.
Figure 12.4 is a rightward augmentation of figure 12.2.
The rightmost CHOICE_OF_PHRASES is similar to figure 12.1, with the
addition of three new phrases, a <TERM>, a <FORM>, and another <FORM>
(bottom). Of those last three phrases, the <TERM> was generated by the
rule:
<TERM> * <ATOM> -> <TERM>
upon matching the <TERM> below "2" and the <ATOM> below "3".
The <FORM> (second from the bottom) was generated by the application
of the rule:
<TERM> -> <FORM>
upon our latest <TERM>. Naturally, both the <TERM> and <FORM> have
the same span (the same left neighbors and ultimately the same right
neighbors). The very bottom <FORM> was generated by the rule
<FORM> + <TERM> -> <FORM>
It matched the <FORM> below "1" and the new <TERM> below "2*3". This
latest <FORM> spans the entire input string, because the matched phrase
did.
Figures 12.1, 12.2, and 12.4 show all possible interpretations within
the phrases "1", "1+2", and "1+2*3" respectively. All possible rules
have applied at all possible places. We will always build such rich
CHOICE_OF_PHRASES.
We need a way to see all possible visible phrases represented in a
CHOICE_OF_PHRASES. Rules need to see occurrences of their want-phrases
so that they can apply.
BOX: Is there a PHRASE block for "1+2" in the string "1+2*3"
BOX: in figure 12.4?
BOX:
BOX: Is there a <TERM> phrase for "1+2" in the figure?
BOX:
BOX: Are any of the phrases spanning "1+2" modified or
BOX: deleted by the time we take in the "*3"?
12.3
Seeing All Visible Phrases Emanating From A CHOICE_OF_PHRASES
We use the type CHOICE_OF_PHRASES to represent an ambiguous phrase, the
structure representing many individual, unambiguous phrases. Each
rule of grammar "wants" to see occurences of its want-phrase in the
CHOICE_OF_PHRASES. That is, since rules are supposed to apply
everywhere as much as possible, we say that a rule "wants" to see all
visible phrases matching its want-phrase.
Given a CHOICE_OF_PHRASES in a variable C, e.g., as in figure 12.5(a),
the following quantifier (loop generator) sets P1 to each individual
PHRASE in C:
FOR P1 $E C ;
Figure 12.5(a) shows the values taken on by P1, given C.
Let's concentrate for the moment on the rule:
<TERM> -> <FORM>
This rule "wants" to see all possible <TERM>s. If we limit our
attention to the CHOICE_OF_PHRASES C, then we can see only one <TERM>.
In general, there may be more than one <TERM> in a CHOICE_OF_PHRASES
(e.g., in the rightmost CHOICE_OF_PHRASES in figure 12.4).
With the following concise notation, we can cause one iteration not
per PHRASE in C, but per PHRASE whose POS is <TERM>:
FOR P1 $E C; WITH P1.POS = <TERM> ;
The "FOR" part causes one iteration per PHRASE in C, and the "WITH"
part ~filters ~out all iterations except for those where
P1.POS = <TERM>. Only P1's whose POS equals <TERM> survive (figure
12.5(b)). (Sections 22.3.1 and 22.3.3 present these constructions more
completely).
We always apply a quantifier to something else. The quantifier causes
iterations, and the something else is executed once per iteration.
We can apply a quantifier to a sentence, e.g.,
FOR P1 $E C;
DO something END
or
FOR P1 $E C; WITH P1.POS = <TERM>;
DO something END
The latter example could be rewritten more traditionally as:
FOR P1 $E C;
DO
IF P1.POS = <TERM> THEN something FI
END
The "WITH" notation is more concise, and we hope clearer.
The "<TERM> -> <FORM>" rule wants to match all <TERM>s, and for each
match, it gives a <FORM> having the same span as the matched <TERM>.
We can write this tendency as a program:
FOR P1 $E C; WITH P1.POS = <TERM>;
DO
GIVE( [LEFT: P1.LEFT POS: <FORM>] );
END
For each matched <TERM>, we call GIVE. We pass to GIVE the replacement
phrase. The replacement phrase, the rule's give-phrase, has a LEFT
and a POS field. In this example, the POS (Part-Of-Speech) is <FORM>
and the LEFT is P1.LEFT, the ~LEFT field of the matched phrase.
GIVE inserts the new given phrase into the CHOICE_OF_PHRASES C. For
the moment, we can imagine that GIVE is defined by:
DEFINE GIVE( P: PHRASE ):
C:= C $> P ;
ENDDEFN
C thus acquires the new phrase, and now holds both the original <TERM>
phrase and the new <FORM> phrase passed to GIVE.
The "<TERM> -> <FORM>" rule has been presented as a program, a program
which sees all <TERM>s in the CHOICE_OF_PHRASES C and then inserts
a corresponding <FORM> for each one. The inserted <FORM> has the same
span as the found <TERM>: Both have the same LEFT (P1.LEFT) and both
reside in the same CHOICE_OF_PHRASES C.
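As a sketch, with the same Python modeling assumptions as before and C as a plain list, the "<TERM> -> <FORM>" program might read:

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])

C = [Phrase(left=None, pos="<TERM>")]   # the CHOICE_OF_PHRASES C

def give(p):
    """The text's first definition of GIVE:  C := C $> P."""
    C.append(p)

# FOR P1 $E C; WITH P1.POS = <TERM>;  DO GIVE(...)  END
for p1 in list(C):                      # iterate over a snapshot of C
    if p1.pos == "<TERM>":
        # the given <FORM> shares the matched phrase's LEFT, hence its span
        give(Phrase(left=p1.left, pos="<FORM>"))
```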
Let's consider another rule, whose want-phrase has length greater than
one:
<FORM> + <TERM> -> <FORM>
Consider figure 12.6. We match the rightmost part-of-speech in the
want-phrase first:
FOR P1 $E C; WITH P1.POS = <TERM> ;
Figure 12.6 shows the value in P1 for the one iteration caused by this
quantifier. (We use the quantifier because in general, there may be
more than one match). Given that P1, we can
acquire all possible phrases to its left, by writing:
FOR P2 $E P1.LEFT ;
P2 is set to all PHRASEs in the CHOICE_OF_PHRASES P1.LEFT. We filter
out everything except "+"s via:
FOR P2 $E P1.LEFT ; WITH P2.POS = '+' ;
Of course in this example, this quantifier also causes only one
iteration. The one value for P2 appears in the figure at the "+".
We put both quantifiers together using the ICL operator "!!":
FOR P1 $E C; WITH P1.POS = <TERM> ;
!!
FOR P2 $E P1.LEFT; WITH P2.POS = '+' ;
The "!!" means:
for each iteration caused by the first quantifier,
make the second quantifier cause all of its iterations.
This is a ~nesting of loops.
(Section 22.3.2 documents this operator). For example, the quantifier:
FOR P1 $E C;
!!
FOR P2 $E P1.LEFT;
sets P1 to each phrase in C, and for each of those P1's, it sets P2 to
all phrases to the immediate left of P1. For each P1, we set P2 to
each phrase in the CHOICE_OF_PHRASES to the left. This quantifier can
be written less concisely but more familiarly as:
FOR P1 $E C;
DO
FOR P2 $E P1.LEFT ;
DO
something
END
END
We match all possible "<FORM> + <TERM>" phrases starting from C via
the quantifier:
FOR P1 $E C; WITH P1.POS = <TERM> ;
!!
FOR P2 $E P1.LEFT; WITH P2.POS = '+' ;
!!
FOR P3 $E P2.LEFT; WITH P3.POS = <FORM> ;
P1 is taken from C, P2 is taken from P1.LEFT, and P3 is taken from
P2.LEFT. The triple P3,P2,P1 represents the matched phrase, on each
iteration. This is a three-dimensional quantifier, and can be written
less concisely as:
FOR P1 $E C;
DO IF P1.POS=<TERM> THEN
FOR P2 $E P1.LEFT;
DO IF P2.POS='+' THEN
FOR P3 $E P2.LEFT;
DO IF P3.POS=<FORM> THEN
something
FI
END
FI
END
FI
END
We complete the rule by applying that 3-D quantifier to a GIVE:
FOR P1 $E C; WITH P1.POS = <TERM> ;
!!
FOR P2 $E P1.LEFT; WITH P2.POS = '+' ;
!!
FOR P3 $E P2.LEFT; WITH P3.POS = <FORM> ;
DO GIVE( [LEFT: P3.LEFT POS: <FORM>] ); END
Each match (iteration) GIVEs a <FORM> whose LEFT is that of P3, the
leftmost matched item.
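The 3-D quantifier translates directly into three nested loops. A sketch, using the same Python modeling as before, with a hand-built "<FORM> + <TERM>" chain:

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])

# Build the visible phrase <FORM> + <TERM>, rightmost block residing on C.
c_form = (Phrase(None, "<FORM>"),)
c_plus = (Phrase(c_form, "+"),)
C = [Phrase(c_plus, "<TERM>")]

given = []                              # record what the rule GIVEs

def give(p):
    given.append(p)
    C.append(p)

# FOR P1 $E C ... !! FOR P2 $E P1.LEFT ... !! FOR P3 $E P2.LEFT ...
for p1 in list(C):
    if p1.pos == "<TERM>":
        for p2 in p1.left:
            if p2.pos == "+":
                for p3 in p2.left:
                    if p3.pos == "<FORM>":
                        # the new <FORM> spans from P3.LEFT to C
                        give(Phrase(left=p3.left, pos="<FORM>"))
```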
In general, we translate a rule:
<POS(1)> <POS(2)> ... <POS(n)> -> <GIVE_POS>
into
FOR P(n) $E C; WITH P(n).POS = POS(n) ;
!!
FOR P(n-1) $E P(n).LEFT; WITH P(n-1).POS = POS(n-1) ;
!!
...
!!
FOR P(1) $E P(2).LEFT; WITH P(1).POS = POS(1);
DO
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS>] );
END
For example, a rule whose want-phrase has length 1 becomes:
FOR P(1) $E C; WITH P(1).POS = POS(1) ;
DO
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS>] );
END
---------------- Parentheses in previous paragraph mean subscripting! ---
----------------(except after "GIVE")------------------------------------
Let's see how the 3-D quantifier just shown for the "<FORM>+<TERM>"
rule behaves. Figure 12.7, which represents "1+2*3", shows the values
taken on by the variables P1, P2, and P3. The outer quantifier, the
P1 quantifier, has two iterations, defining P1(1) and P1(2). (Those
are the only <TERM>s emanating from C).
---------------- Parentheses in previous paragraph around "1" and "2"
---------------- mean subscripting! ----
For P1(1), the second quantifier, which looks for a "+", causes zero
iterations. (There is no "+" in the CHOICE_OF_PHRASES P1(1).LEFT,
only a "*"). Thus P1(1) appears in zero iterations of the full 3-D
quantifier. However, P1(2), whose LEFT is different from that of
P1(1), does land us at a place where the second quantifier ("+") can
now generate one iteration. From there, the innermost quantifier,
which is looking for a <FORM>, itself causes one iteration. The full
3-D quantifier thus generates a total of one iteration, matching only
one phrase. (There are no other occurrences of the phrase
"<FORM>+<TERM>" emanating from C).
---------------- Parentheses in previous paragraph around "1" and "2"
---------------- mean subscripting! ----
Consider figure 12.8, which represents "1+2+3". The first quantifier
causes one iteration. The second quantifier from there matches the
second plus, causing one iteration. The innermost, <FORM>, quantifier
causes two iterations, defining two values for P3. The whole 3-D
quantifier thus causes a total of two iterations.
P3 is different in those two grand iterations. P3.LEFT is also
different on each iteration. Notice how this 3-D quantifier matches
two phrases, where each has a different span (LEFT). This will cause
the "<FORM>+<TERM>" rule to apply twice, once under the "2+3" and once
under the "1+2+3".
BOX: How are visible phrases seen?
BOX:
BOX: How do you translate the rule:
BOX:
BOX: <ATOM> -> <TERM>
BOX:
BOX: into a program?
BOX:
BOX: How about the rule:
BOX:
BOX: ( <FORM> ) -> <ATOM> ?
12.3.1
The Grammar Is A Program
We have seen how to render a grammar rule as a program. The program
finds all matches in the CHOICE_OF_PHRASES C for the rule's want-
phrase. For each match, the rule GIVEs its give-phrase. The given
phrase will reside on C, just like the rightmost end of the matched
phrase does (P1, in C). Also, both the matched and the new phrase
share the same LEFT neighbors (P3.LEFT). Each iteration rewrites
(subjectively) the matched phrase into the give-phrase, preserving
left and right neighbors.
Imagine such a program for each rule of grammar. Put them all
together, one after another, in any order. That overall program is
said to be the "grammar program".
Our grammar program will apply all rules upon all the phrases emanating
from C. Each application augments C, so that the rules' give-phrases
come to exist in C.
12.3.2
Assuring That All Rules Apply
Given a CHOICE_OF_PHRASES that includes the "1" phrase in figure 12.1,
a call to the grammar will introduce the <DIGIT> phrase. Another call
will now see the <DIGIT>, and will introduce the <NUMBER> phrase.
How many times must we call the grammar in order to assure that all
rules have applied?
We need to call the grammar each time a ~new phrase comes to exist.
Remember, the grammar has to see all possible phrases, in our effort
to apply all rules in all possible ways. If we give the grammar a
crack at all new phrases, since every phrase is new at some time, all
rules will have been applied to all phrases. A new phrase comes to
exist in GIVE, when GIVE puts the new phrase onto C.
Let's make GIVE call the grammar:
DEFINE GIVE( P: PHRASE ):
C:= C $> P ;
call the grammar program
ENDDEFN
Now, whenever a phrase comes to exist (GIVE), the grammar will be
employed and will try to apply all rules to it.
In the end, if you see an occurrence of some rule's want-phrase
emanating from C, then we know that the rule's give-phrase also
emanates from C, sharing the same LEFT with the want-phrase. That is,
that occurrence of the want-phrase has a rightmost PHRASE block. At
the time GIVE put it onto C, GIVE also called the grammar, and hence
the grammar saw the want-phrase. At that time, the grammar introduced
(via GIVE) its give-phrase.
The GIVE procedure and the grammar are mutually recursive. Each calls
the other.
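A minimal sketch of this mutual recursion, under the same Python modeling, with two coercion rules standing in for a full grammar:

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])

C = []

def give(p):
    """Insert P into C, then let the grammar try every rule on it."""
    global C
    C = C + [p]
    the_grammar(p)

def the_grammar(p):
    # Two coercion rules:  1 -> <DIGIT>   and   <DIGIT> -> <NUMBER>
    for want, gives in [("1", "<DIGIT>"), ("<DIGIT>", "<NUMBER>")]:
        if p.pos == want:
            give(Phrase(left=p.left, pos=gives))

give(Phrase(left=None, pos="1"))   # recursion bottoms out at <NUMBER>
```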
12.3.3
A Necessary Refinement
We almost have a working parser. We will verify its completeness and
its termination shortly. However, our most recent definition of GIVE
causes some redundancies.
GIVE not only inserts the given phrase P onto the global variable C,
but it also calls the grammar. The grammar sees ~all phrases in C
each time. That is, if C contains a phrase X now, then each time the
grammar calls GIVE, GIVE's subsequent call to the grammar will have
the grammar see the phrase X again, because X is still on C. All
phrases in C will be seen again and again, each time GIVE calls the
grammar.
We really want the grammar to see each phrase only once. The grammar
should examine not C, but only the one phrase P passed into GIVE.
Let's change GIVE as follows:
DEFINE GIVE( P: PHRASE ):
C:= C $> P ;
"call the grammar passing in P..."
the_grammar( P );
ENDDEFN
GIVE now passes P into the grammar. We must now change slightly the
grammar program. The grammar, a function, can itself be defined as:
DEFINE the_grammar( P: PHRASE ):
program for rule #1
program for rule #2
...
program for rule #n
ENDDEFN
The grammar now takes in a parameter P. The grammar should use P
instead of C as it tries to find phrase matches. The program for the
"<FORM>+<TERM>" rule has as its first line:
FOR P1 $E C; WITH P1.POS = <TERM> ;
!!
...
Let's change that to be:
FOR P1 $E {P}; WITH P1.POS = <TERM> ;
We have replaced C with a new CHOICE_OF_PHRASES that contains only the
one PHRASE P. All the rest of the rule's program is unchanged.
Now the grammar sees only the new PHRASE given to GIVE, and not all of
C.
Here is the whole program for this rule:
FOR P1 $E {P}; WITH P1.POS = <TERM> ;
!!
FOR P2 $E P1.LEFT; WITH P2.POS = '+' ;
!!
FOR P3 $E P2.LEFT; WITH P3.POS = <FORM> ;
DO GIVE( [LEFT: P3.LEFT POS: <FORM>] ); END
For efficiency, let's replace the first line again. Notice that the
FOR quantifier on that line causes only one iteration. The WITH
clause might filter out that one iteration. Thus, the first line
causes zero or one iterations.
Let's change the first line and its subsequent "!!" into an IF
statement. (IF statements always cause zero or one "iterations" over
the IF's THEN-clause). The rule now appears as:
P1:= P;
IF P1.POS = <TERM> THEN
" The rest is unchanged. "
FOR P2 $E P1.LEFT; WITH P2.POS = '+' ;
!!
FOR P3 $E P2.LEFT; WITH P3.POS = <FORM> ;
DO GIVE( [LEFT: P3.LEFT POS: <FORM>] ); END
FI
The
P1:= P ;
IF P1.POS = <TERM> THEN
...
replaces the old, and equivalent:
FOR P1 $E {P}; WITH P1.POS = <TERM> ;
!!
...
Our general program for any one rule:
<POS(1)> <POS(2)> ... <POS(n)> -> <GIVE_POS>
appears as:
P(n) := P ;
IF P(n).POS = POS(n) THEN
FOR P(n-1) $E P(n).LEFT; WITH P(n-1).POS = POS(n-1) ;
!!
FOR P(n-2) $E P(n-1).LEFT; WITH P(n-2).POS = POS(n-2) ;
!!
...
!!
FOR P(1) $E P(2).LEFT; WITH P(1).POS = POS(1);
DO
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS>] );
END
FI
Coercion rules, whose want-phrases are of length 1, thus come out with
only an IF and no FORs:
P(1) := P;
IF P(1).POS = POS(1) THEN
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS>] );
FI
---------------- Parentheses in previous paragraph mean subscripting! ---
---------------- except after "GIVE" ------------------------------------
Recall that the call to GIVE puts the new give-phrase onto C. Thus,
the new phrase spans from P(1).LEFT to C, the same span covered by the
matched want-phrase.
---------------- Parentheses in previous paragraph mean subscripting! ---
BOX: Why is an IF-statement instead of a FOR-quantifier used to see
BOX: the rightmost token in a phrase?
BOX:
BOX: How many FOR-quantifiers are nested ("!!") within a rule
BOX: whose want-phrase's length is L?
BOX:
BOX: Why do we examine visible phrases from right-to-left?
BOX:
BOX: Why is every visible phrase seen by the grammar?
12.4
Growing An Ambiguous Phrase From User Input
We produce figure 12.1 via the following:
C:= NIL;
GIVE( [POS:'1'] );
C is now figure 12.1. Recall that C is a global variable augmented
by GIVE.
The "1" PHRASE block in that figure is the one
passed to GIVE from here. The others exist because GIVE calls the
grammar so as to see the "1" phrase. The grammar GIVEs all the other
PHRASE blocks in this figure.
We take in the next character and place it to the right of C as follows:
LEFT:= C;
C:= NIL;
GIVE( [LEFT:LEFT POS:'+'] );
The "+" PHRASE block has as its LEFT the result of the previous step,
figure 12.1.
We take in each successive character and do exactly the same thing:
LEFT:= C; "What will be to our left: the result of the
previous step"
C:= NIL; "Clear the slate"
GIVE( [LEFT:LEFT POS: the new character ] );
The whole input is processed by the loop:
LEFT:= NIL;
FOR each input character CHAR
DO
C:= NIL;
GIVE( [LEFT:LEFT POS:CHAR] );
LEFT:= C;
END
At the end, C has the answer.
The important thing is that the PHRASE block for the K'th character
has as its LEFT a CHOICE_OF_PHRASES that includes the (K-1)'th
character. That is, when GIVing the K'th character, LEFT held the
previous value of C, which contained the (K-1)'th character.
The final value in C thus contains the given input string
as a visible phrase. It is a full-spanning phrase because it
emanates from C, the far right. The LEFT of the leftmost character is
NIL. There is nothing further to the left.
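Putting the pieces together under the same Python modeling: a toy grammar in which digits coerce to <FORM> and the rule "<FORM> + <FORM> -> <FORM>" applies (a hypothetical simplification of the book's formulas grammar), driven by the input loop just shown.

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])

C = []

def give(p):
    global C
    C = C + [p]
    the_grammar(p)

def the_grammar(p):
    # 1 -> <FORM>   and   2 -> <FORM>   (coercions)
    if p.pos in ("1", "2"):
        give(Phrase(p.left, "<FORM>"))
    # <FORM> + <FORM> -> <FORM>, matched right-to-left
    if p.pos == "<FORM>" and p.left:
        for p2 in p.left:
            if p2.pos == "+" and p2.left:
                for p3 in p2.left:
                    if p3.pos == "<FORM>":
                        give(Phrase(p3.left, "<FORM>"))

def parse(text):
    """The input loop from the text; returns the final C."""
    global C
    left = None
    for char in text:
        C = []
        give(Phrase(left, char))
        left = tuple(C)
    return C

result = parse("1+2")
```

A full-spanning phrase is one whose leftmost LEFT is None; finding a <FORM> with that property in the final C signals a successful parse.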
At this point, our parser is done for context-free rules, except for
one more modification which will be necessary in order to
guarantee that this parser finishes in finite time for all context-
free grammars (Chapter 14).
BOX: Why is the input string a visible phrase in C, after
BOX: taking all the user input?
BOX:
BOX: Should we ignore ~white ~space (like spaces)? (Section
BOX: 17.2).
12.5
General Rewrite Rules
General rewrite rules are rules whose give-phrases have length greater
than one, like:
<X> -> <A> <B>
or
<X> <Y> -> <A> <B>
or
<X> <Y> <Z> -> <A> <B>
Each general rewrite rule is turned into a program very much like other
rules. Any rule:
<POS(1)> <POS(2)> ... <POS(n)> -> ??
becomes a program:
P(n):= P;
IF P(n).POS = <POS(n)> THEN
FOR P(n-1) $E P(n).LEFT; WITH P(n-1).POS = <POS(n-1)> ;
!!
...
!!
FOR P(1) $E P(2).LEFT; WITH P(1).POS = <POS(1)> ;
DO
??
END
This shows how the want-phrase is turned into a program. The "??"
represents the give-phrase (or some arbitrary action). For context-
free rules, the "??" is simply
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS>] );
---------------- Parentheses in previous paragraph mean subscripting! ---
---------------- except after "GIVE" ------------------------------------
Consider the following rule's give-phrase, of length greater than one:
... -> <GIVE_POS(1)> <GIVE_POS(2)> ... <GIVE_POS(k)>
The give part for this rule will look almost like the situation when
taking user input.
---------------- Parentheses in previous paragraph mean subscripting! ---
Each <GIVE_POS> is GIVEn, from left-to-right. (We will
come back and dissect the following):
" Leftmost PHRASE block ... "
RIGHTHAND:= C; "(Save C, for use in the rightmost GIVE)"
C:= NIL;
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS(1)>] );
" Second to leftmost PHRASE block ... "
LEFT:= C; "(This looks identical to taking user input)"
C:= NIL;
GIVE( [LEFT: LEFT POS: <GIVE_POS(2)>] );
...
" Second to rightmost PHRASE block ... "
LEFT:= C; "(This looks identical to taking user input)"
C:= NIL;
GIVE( [LEFT:LEFT POS: <GIVE_POS(k-1)>] );
" Rightmost PHRASE block ... "
LEFT:= C;
C:= RIGHTHAND; "Restore C to original value"
GIVE( [LEFT:LEFT POS: <GIVE_POS(k)>] );
This looks identical to taking user input, except for the leftmost and
rightmost GIVEs.
---------------- Parentheses in previous paragraph around "1", "2"
---------------- "k", and "k-1" mean subscripting! ---------------
The final (rightmost) GIVE is exactly like the GIVE in a context-free
rule. Both augment C, the C in force when the grammar was called.
Thus, C winds up containing the rightmost PHRASE block of both the
matched phrase and the new phrase given by the general rewrite rule.
Let's examine the give-phrase generation more closely.
Let RIGHTHAND and LEFT be (local) variables of type CHOICE_OF_PHRASES.
The ~leftmost block is always generated by:
RIGHTHAND:= C; "(Save C, for use in the rightmost GIVE)"
C:= NIL;
GIVE( [LEFT: P(1).LEFT POS: <GIVE_POS(1)>] );
We set C to NIL so that this GIVE creates a new CHOICE_OF_PHRASES.
---------------- Parentheses in previous paragraph around "1"
---------------- mean subscripting! -----------------------
The ~rightmost block is always generated by:
LEFT:= C;
C:= RIGHTHAND; "(Restore C to original value)"
GIVE( [LEFT: LEFT POS: <GIVE_POS(k)>] );
We set C back to its original value (RIGHTHAND) so that this
rightmost PHRASE block comes to reside on the same CHOICE_OF_PHRASES
that holds the want-phrase's rightmost PHRASE block. This assures
that the want- and give-phrases will ultimately share the same
right neighbors.
---------------- Parentheses in previous paragraph around "k"
---------------- mean subscripting! ------------------------------------
Any ~middle block is always generated by:
LEFT:= C; "This looks exactly like taking user input"
C:= NIL;
GIVE( [LEFT: LEFT POS: <GIVE_POS(i)>] );
Figure 12.9 shows the want-phrase and the newly generated give-
phrase. Notice that for all but the rightmost part-of-speech in the
give-phrase, each one gets its own CHOICE_OF_PHRASES, a new point of
view.
---------------- Parentheses in previous paragraph around "i"
---------------- mean subscripting! ------------------------------------
Finally, C is restored by setting it to RIGHTHAND for the last part-
of-speech. Thus, our new give-phrase, like the want-phrase, will
emanate from the same CHOICE_OF_PHRASES, C (the original). The
give- and want-phrases have the same LEFTs in their leftmost PHRASE
blocks. (Notice that we supply the want-phrase's LEFT, P(1).LEFT, as
the LEFT for our first, leftmost PHRASE block in the give-phrase).
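The give-phrase generation can also be sketched in Python (same modeling as before; the matched want-phrase and the rule "... -> <A> <B>" are hypothetical stand-ins):

```python
from collections import namedtuple

Phrase = namedtuple("Phrase", ["left", "pos"])

C = ()

def give(p):
    """GIVE without the grammar call, for this sketch:  C := C $> P."""
    global C
    C = C + (p,)

# Suppose a rule "... -> <A> <B>" has matched; P(1).LEFT is match_left,
# and the matched phrase's rightmost block (<Y>) already resides on C.
match_left = (Phrase(None, "<X>"),)
C = (Phrase(match_left, "<Y>"),)

righthand = C                         # save C for the rightmost GIVE
C = ()
give(Phrase(match_left, "<A>"))       # leftmost: same LEFT as the match
left = C
C = righthand                         # restore C to its original value
give(Phrase(left, "<B>"))             # rightmost: joins <Y>'s choice
```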
This concludes the presentation of Thompson's and Kay's parser. We
still need to modify it so as to achieve a polynomial and not
exponential (or infinite) cost. That appears in Chapter 14.
The following chapter studies the parser closely enough to prove that
it always works.
BOX: How are general rewrite rules similar to the process
BOX: of taking user input?
BOX:
BOX: What is the variable RIGHTHAND used for?