CHAPTER 15


					  SEMANTICS



	  This chapter brings semantics into the parsing process.  In conjunction
	  with the change that assured termination (previous chapter), something
	  new must arise, ~ambiguous ~semantics.	They provide for the
	  representation of exponentially many meanings in only a polynomial
	  amount of memory.


15.1
Introducing The SEM Field

	  Our discussion of the parser to this point has ignored semantics
	  entirely.	 We incorporate semantics by augmenting the type PHRASE
	  so as to have a new, SEM (semantics) field:

		    TYPE	PHRASE =  [ LEFT: CHOICE_OF_PHRASES
						POS:	INT
						SEM:	BASIC_PROCESS ] ;

	  As discussed in chapter 4, we always render semantics in terms of the
	  type BASIC_PROCESS.  Even though we write the rule:

		    <FORM:a>  +  <TERM:b>	  ->	     <FORM:	 <*a*> + <*b*>  >

	  we actually get:

		    <FORM:a>  +  <TERM:b>	  ->

						 <FORM:  //[a;b;]	 <*a*> + <*b*>  \\ >

	  All semantic data are constructed by the //...\\, even though you are
	  freed from having to specify this.

	  Let's see how to turn this rule into a program, including
	semantics.  The want-phrase is unchanged, i.e., this rule is programmed
	by:

		P1:= P;
		IF  P1.POS = <TERM>  THEN

			FOR P2 $E P1.LEFT;  WITH  P2.POS = '+' ;
			!!
			FOR P3 $E P2.LEFT;  WITH  P3.POS = <FORM> ;
			DO
			    ??
			END

	The give-phrase, the "??", is specified by using GIVE, as follows:

		GIVE(  [LEFT: P1.LEFT  POS: <FORM>

			SEM:  //[P1;P3;]  <*P1.SEM*> + <*P3.SEM*>  \\   ]  );

	The parameter to GIVE is of type PHRASE, as always.  What's new here is
	  the inclusion of the SEM field in the PHRASE.	 The SEM field, the

		    //[P1;P3;] ... \\

	  is a process which retains access to P1 and P3, so that it can perform
	  the necessary semantics by using P1 and P3, as in:

		    <*P1.SEM*>  +	 <*P3.SEM*>

	  The invocation of our semantics thus will execute this sum, where the
	  two integers are acquired by invoking the semantics of P1 and of P3.

	  In fact, we actually translate:

		    <FORM:a>  +  <TERM:b>	 ->	   <FORM: ... >

	  into

		    P1:= P;
		    IF  P1.POS = <TERM>	 THEN

				FOR P2 $E P1.LEFT;  WITH  P2.POS = '+';
				!!
				FOR P3 $E P2.LEFT;  WITH  P3.POS = <FORM>;
				DO
					  ~A:= ~P3.SEM;	"New"
					  ~B:= ~P1.SEM;	"New"

					  GIVE(  [LEFT: P3.LEFT	 POS: <FORM>

						    SEM:  ~//[A;B;]  ...   ~\\  ]  );
				END
		    FI

	  Notice that the variables mentioned in the rule, A and B, hold the SEM
	  fields of the matched want-phrase.

	  Thus, the user's semantic specification can continue to use the
	variables A and B, the ones the user specified in the rule's want-
	  phrase.

		    Recall that the execution of

				//[A;B;]  ...  \\

	

		    is very fast. It makes merely a new block that holds the
		    contents of A and B, and a pointer to the "..." program (figure
		    15.1). That "..." program is ~not executed now. (This was
		    introduced in Chapter 2).

	  In general, we translate:

		    <POS(1):V(1)>	 ...	<POS(n):V(n)>	  ->

				<GIVE_POS: semantic_program >

	  into

		    P(n):= P;
		    IF  P(n).POS = <POS(n)>  THEN

				FOR P(n-1) $E P(n).LEFT;  WITH P(n-1).POS = <pos(n-1)>;
				!!
				...
				!!
				FOR P(1) $E P(2).LEFT;	WITH	P(1).POS = <pos(1)> ;
				DO
					  V(1):= P(1).SEM;
					  V(2):= P(2).SEM;
					  ...
					  V(n):= P(n).SEM;

					  GIVE(  [LEFT: P(1).LEFT  POS: <GIVE_POS>

						    SEM:  //[V(1);...;V(n);]

								   Semantic_Program	\\
						   ]	);
				END

	

	  Figure 15.2 shows the PHRASE block we're passing to GIVE.  Its SEM
	field is itself a block.  That semantic block's first bin points to the
	  already compiled semantic_program.  The remaining bins hold the values
	  of variables, V(1)...V(n), which will come alive with those values upon
	  invocation of this SEM block.
	---------------- Parentheses in previous paragraph except after "GIVE"
	----------------	mean subscripting! ------------------------------------

	  (The program block "Semantic_Program" is not recreated now.  It has
	  existed since we first gave our grammar to the computer).

	

	  Figure 15.3a shows an occurence of the following rule's want-phrase:

		<FORM:a>  +  <TERM:b>	    ->	     <FORM:  <*a*> + <*b*>  >

	Each of the <FORM> and <TERM>'s semantic blocks are shown, although
	  we don't show their insides, the variables nor the pointers to some
	program blocks.

	Figure 15.3(b) shows the situation after this rule applies.  The new
	<FORM> PHRASE block (on the bottom) shares the same span with the
	want-phrase.  Its SEM field is an BASIC_PROCESS, a block containing
	the address of a program and the values in variables.  That block is
	the second to rightmost block in the figure, labelled "SEMANTICS".
	Notice that those values in A and B are the semantics taken from the
	want-phrase's <FORM> and <TERM> blocks respectively.	The semantic
	  program is:

		    <*A*> + <*B*>

	  It means invoke each of A and B so as to acquire their INTeger values,
	  and deliver their sum.

		    BOX:	How are semantics included in our parser?
		    BOX:
		    BOX:	How is it that the semantic specification can continue
		    BOX:	to use the variables specified in the want-phrase?
		    BOX:
		    BOX:	Where does the implicit "//...\\" get introduced.


15.1.1
General Rewrite Rules

	  Each element in the give-phrase of a ~general ~rewrite ~rule
	  (introduced in Section 1.3) specifies its own semantics, e.g.,

		    <A:a>  <B:b>	     ->	 <X:...>  <Y:...>	 <Z:...>

	  Each of X, Y, and Z gets its own semantics.  As we GIVE each of X, Y,
	  and Z, we now supply a value for the SEM field each time, in the PHRASE
	  block that is passed to GIVE.

	  There is no notion of semantics associated with the rule as a whole.
	  Each GIVEn PHRASE block has its own independent semantics.  Those
	  semantics can read the variables "a" and "b".


15.2
Ambiguous Semantics Arise When Duplicate PHRASE Blocks Are Supressed

	  Recall that GIVE is now defined by:

		    DEFINE	GIVE( P:PHRASE ):
			 BEGIN  VAR	 P1 = PHRASE ;

				IF  FOR P1 $E C;	NEVER	 (P.POS=P1.POS) &
								 (P.LEFT=P1.LEFT)

				THEN	  C:= C $> P;

					  the_grammar(P);
				FI
			 END
		    ENDDEFN

	  That is, we put P on C and call the grammar with P ~as ~long ~as ~P
	  ~isn't ~a ~duplicate of something already in C.  If P is a duplicate,
	then we have so far ignored P entirely, including its semantics.

	We chose to supress duplicate PHRASE blocks because they added
	nothing new.  They only created duplicates of existing
	phrases.  We argued that all the new phrases caused by a duplicate
	PHRASE block would be duplicates of those generated by the original.

	We defined "duplicate" to mean "matching in both POS and LEFT".
	Now that we've introduced semantics, "duplicate" no longer means
	  "identical".  "Duplicates" can have different semantics.	However,
	  GIVE and the grammar ~never read semantics (until the last rule
	  applies).	 Thus, GIVE's and the grammar's execution still depend in no
	  way upon semantics.

	  Suppose we didn't supress duplicates at all.  Upon GIVing the original,
	many new phrases resulted.  The semantics in those phrases were also
	built up.  If we let GIVE call the grammar upon receiving the 
	duplicate, the grammar will build exact duplicates of the phrases and
	semantics grown by the original.

	

	Figure 15.4(a) shows the semantics delivered with the original PHRASE
	block (OLD).  We show arrows coming from above, representing other
	semantics built up in terms of the original PHRASE block's semantics.
	  Figure 15.4(b) shows the semantics of the duplicate PHRASE block (NEW).
	  The same arrows above the original semantics appears over this new
	  semantic block.	 After all, GIVE behaves identically the first and
	  second times around.	The same semantic structure built over NEW
	  was built over OLD upon receipt of the original.

	  Let's call all points-of-view that can see OLD "View(OLD)".  All
	the structure above NEW is identical to that above OLD.  That is, by
	letting the duplicate PHRASE block pass thru GIVE, we gain two
	structures:

		View(OLD) and View(NEW)

	Any point of view in View(OLD) has a corresponding point of view in
	View(NEW).  (The "view" structures are identical).

	As long as one point of view in View(OLD) can participate in an
	overall understanding of the user's input, then so will the
	  corresponding point-of-view in View(NEW).  ~Both will be part of the
	  understanding, or neither will.

	

	  Figure 15.5(a) shows View(OLD) and View(NEW).	 Part (b) shows how we
	  can represent that exact same ambiguity, by burying the ambiguity
	  under one view.

	  We represent both View(OLD) and View(NEW) in only one structure by
	  representing:

		    View( OLD ~or NEW )

	  That is, you can use the structure View(OLD) to act also as the
	  structure View(NEW).	You put both OLD and NEW under the same (old)
	  structure.  The "OLD ~or NEW" is an instance of ambiguous semantics.

	  The block that ties NEW and OLD together is called the ~OR-block, as
	  it represents the choice of NEW ~or OLD.

		    BOX:	What causes ambiguous semantics?
		    BOX:
		    BOX:	Why does GIVE behave identically when given a PHRASE
		    BOX:	block and later when given a duplicate of it?


15.2.1
Turning View(OLD) Into View( OLD ~or NEW )

	  We represent OLD ~or NEW like any other semantics.	It is the
	  BASIC_PROCESS:

		    //[OLD;NEW;]	f( OLD , NEW )   \\

	  The BASIC_PROCESS's invocation will perform

		f( OLD , NEW ).

	F may invoke OLD or invoke NEW.  We leave that unspecified for now.
	If f invokes OLD, then we have represented View(OLD).  If f invokes
	NEW, then we have represented View(NEW).  (Part 7, Section 25.5.4 and
	Chapter 28 reconsider F).

	We modify the OLD semantic block ~in ~place so as to represent now
	"OLD ~or NEW".  We proceed in three steps.  First, we copy the OLD
	semantic block, and call it COPY_OLD:

		COPY_OLD := COPY( OLD ) ;

	Figure 15.4(c) shows COPY_OLD as well as View(OLD), OLD, and NEW.
	We set OR to be the new OR block:

		OR:= //[NEW;COPY_OLD;]  f(NEW,COPY_OLD)   \\ ;

	This is shown in figure 15.4(d) with thick lines.  We leave unspecified
	the function "f" called within OR.  It may choose to invoke NEW or
	COPY_OLD.

	Finally, we write over the OLD block:

		@(OLD) := OR ;

	Figure 15.4(e) shows the result.  All points of view that could see OLD
	now see (at the same address) the OR block.  This is an objective
	modification.  Thus, what was OLD has become NEW ~or OLD.  View(OLD)
	now acts as View(NEW ~or OLD).

	These semantics concerns are implemented in our final definition for
	GIVE:

		DEFINE	GIVE( P:PHRASE ):

		   BEGIN   VAR  P1 = PHRASE;  OLD,NEW,OR = BASIC_PROCESS;

			IF  FOR P1 $E C;  NEVER  (P.POS = P1.POS)  &
						 (P.LEFT=P1.LEFT) ;

			THEN	"No duplicate found."

				C:= C $> P ;

				the_grammar(P);

			ELSE	"This ELSE clause is what's new ..."

					  "P is the duplicate of P1.	(That FOR-loop
					   above leaves in P1 the (first) PHRASE block
					   that matches P)"

					  "The OLD and NEW semantic blocks..."

					  OLD:= P1.SEM;
					  NEW:= P.SEM;

					  "Prepare to modify OLD objectively..."

					  COPY_OLD:= COPY( OLD );

					  OR:= //[NEW;COPY_OLD;]  f(NEW,COPY_OLD)	 \\ ;

					  @(OLD) := OR;
				FI
			 END
		    ENDDEFN

	  You may wonder why we bothered to make a COPY of the OLD block, why we
	  put it into COPY_OLD, and why the OR block references COPY_OLD instead
	  of OLD.  Figure 15.4(f) shows the situation if we didn't copy OLD.  Upon
	  executing the objective modification:

		    @(OLD) := OR ;

	  the OR block's third bin, which is intended to represent the old
	  semantics, references now the OR block itself.  The old semantics used
	  to reside there, but those data have been lost.  Our use of COPY_OLD
	  preserves the old semantics, albeit at a different address.

		    BOX:	How does GIVE retain the semantics of both the
		    BOX:	original and duplicate block?
		    BOX:
		    BOX:	What might the function "f" do?  (Part 7 introduces
		    BOX:	more examples, in particular Sections 25.5.4 and
		    BOX:	28.1).
		    BOX:
		    BOX:	Why must OLD be copied?


15.2.2
	What If The OLD and OR Blocks Are Of Different Sizes?

	  You might wonder how this can work, particularly if the OLD, NEW,
	  and OR blocks are of different sizes.  Figure 15.4 shows them all being
	  three bins large.  In fact the sizes may differ.  It has been clearest
	  to show semantic blocks (of type BASIC_PROCESS) as single blocks (of
	  varying sizes).

	

	  However, the implementation actually represents
	  BASIC_PROCESSs as a linked list of fixed size blocks, as shown in
	  figure 15.6(b).	 We use always blocks of size two.	These blocks are
	  linked so that one points to the next, and the last one points to the
	  program.	Figure 15.6(c) shows OLD, NEW, and COPY_OLD as they are
	  really represented.  Part(d) shows the OR BASIC_PROCESS, referencing
	  COPY_OLD and NEW.  Part (e) shows the situation after executing the
	  objective modification:

		    @(OLD) := OR ;

	  Only the first block need be overwritten, as that is sufficient to make
	  the point of view OLD now represent the OR BASIC_PROCESS.

15.3
Examples Of Ambiguous Semantics


15.3.1
Example 1

	  The ICL expression

		    A + B # C

	  is syntactically ambiguous, as it represents both possible groupings:

		    (A+B) # C   and
		    A + (B#C)

	  (We haven't seen the grammar rules involving "#", which allow for this
	  ambiguity).

	

	  Figure 15.7 shows the semantic structure built up for A+B#C.  The top
	  block is the OR block.  It represents the semantics of
	  (A+B)#C ~or A+(B#C).

		    These semantic pictures omit the program blocks.	Instead of a
		    pointer to a program block, we have written just "(+)" or
		    "(#)", to represent the semantic programs associated with the
		    "+" rule and the "#" rule.

	  Notice in figure 15.7 how the semantic block (A+B)#C references the
	  semantic block (A+B) and the semantic block for C.	Similarly, the
	  slanted A+(B#C) block references the A and the (B#C) blocks.  Also
	  note that both overall interpretations share the A, B, and C blocks.


15.3.2
Example 2

	  Let's consider a more complex example of ambiguous semantics.  For
	  this purpose, let's assume that even the "+" operator alone is also
	  ambiguous, so that:

		    1+2+3	  could be either		(1+2)+3  or	 1+(2+3)
	  and
		    1+2+3+4	  could be either of	((1+2)+3)+4	 or
								(1+(2+3))+4	 or
								(1+2)+(3+4)	 or
								1+((2+3)+4)	 or
								1+(2+(3+4)).

	  (This ambiguity can be obtained by replacing the rule:

				<FORM>  +  <TERM>		->	  <FORM>

		    by

				<FORM>  +  <FORM>		->	  <FORM>

		    We introduce this ambiguity only for the purposes of this
		    example.  However, ICL does indeed have a class of operators
		    that introduces this kind of ambiguity, the ~infix notation
		    for calling functions:

				paramter_1	\Function_name  parameter_2

		    The grouping of:

				value1  \Function1  value2  \Function2  value3

		    is ambiguous, just like the example with the "+" and "#".  Do
		    we call Function1 first, or Function2?)

	  Consider two of the five possibilities for 1+2+3+4:

		    ((1+2)+3) + 4	 and
		    (1+(2+3)) + 4


	

	  The distinction between these two interpretations occurs within the
	  1+2+3 part, independent of the 4 part.	Figure 15.8 shows the semantics
	  for these two interpretations.  The distinction between the
	  two interpretations for 1+2+3 is ~buried within the semantics for
	  (1+2+3)+4.  That is, the "?" in "?+4" is ambiguous.

	  We have not shown the semantics for 1+(2+3+4) nor (1+2)+(3+4).	The
	  former, the:

		    1 + (2+3+4)

	  represents actually two interpretations, as did our old (1+2+3)+4:

		    1 + ((2+3)+4)
		    1 + (2+(3+4))

	  Again, the ambiguity in this case involves only the 2+3+4, independent
	  of the 1.


15.3.3
Example 3

	  Consider an even more complex example:

		    1+2+3+4+5

	  We will limit our attention to only those interpretations that group
	  the 1+2+3 together:

		    (1+2+3) + (4 + 5)	 or
		   ((1+2+3) + 4) + 5

	  Both these interpretations ~share the same representation for the
	  (1+2+3).	Here we have an ambiguity ~within an ambiguity.	 The two
	  lines just shown form the outer ambiguity.  Within each of those two
	  interpretations, the 1+2+3 part represents the inner ambiguity,
	  representing both:

		    (1+2)+3	  and
		    1+(2+3)

	

	  Figure 15.9 illustrates this nested ambiguity.  There is an OR block in
	  the center of the figure, which represents both interpretations for
	  1+2+3.  That OR block is referenced from above, twice, once within the:

		    ( (1+2+3) + 4 ) + 5

	  and once within the

			(1+2+3) + (4+5).


15.4
Exponentially Many Meanings In Only Polynomial Space

	

	  Figure 15.10 is an abstraction of figure 15.9, showing only the OR
	  blocks.  The essential ingredient is that both interpretations
	  represented by the upper OR block come to share the other OR block
	  below.  Figure 15.10(b) shows a similar nesting, this time involving
	  three OR blocks.

	  This nesting of ambiguities within ambiguities gives rise to
	  exponentially many distinct meanings represented in only polynomial
	  memory.  Each OR block represents a two-way choice.
	  If there are N OR blocks nested like this, then there are 2-to-th-Nth
	  possible overall choices.

	  Our parser represents at least exponentially many phrases in only
	  polynomial memory.  We see the same thing here with semantics.	Both
	  exponentials are possible because of sharing.	 Distinct meanings
	  share much in common.	 Only the essential differences among
	  meanings are represented, thus giving us this wonderful savings.

	  Since our parser consumes only polynomial execution time, the amount of
	  memory used to represent all phrases and all semantics is also bounded
	  above by the polynomial.  Within that polynomial memory, we know that
	  at least exponentially many distinct phrases and meanings must be
	  represented, as there is at least exponentially many ways to introduce
	  parentheses around the expression of lenght N:

		    1+2+ ... +N


	  Our examples for 1+2+3+4 and 1+2+3+4+5 showed only some of the
	  possible interpretations.  There are many more, and they are also
	  represented with a great deal of sharing.  For example, the two
	  interpretations:

		    (1+(2+3))+4	and
		    1+((2+3)+4)

	  share in common the (2+3).

	  How do we handle ultimately such voluminous semantics?  If
	  exponentially many meanings are indeed represented, won't it take an
	  exponential amount of time to process all of them?	Certainly if we
	  enumerate all the possible meanings, exponential time will be consumed.
	  However, many kinds of very useful processing can be done in only
	  polynomial time!  Part 7 continues with these considerations.

		    BOX:	How can exponentially many meanings be represented
		    BOX:	in only polynomial space?
		    BOX:
		    BOX:	Why are there at least exponentially many possible ways
		    BOX:	to place parentheses within a string like "1+2+...+N"?